Python map_async: where is the overhead coming from?
I am using map_async as intended: mapping an iterable over multiple processing cores using:
    cores = mp.cpu_count()
    pool = mp.Pool(cores)
    r = pool.map_async(func, offsets, callback=mycallback)
    r.wait()
func returns a dict, and the callback 'merges' these dicts using:
    from collections import defaultdict

    ddict = defaultdict(set)

    def mycallback(w):
        for l in w:
            for key, value in l.items():
                for v in value:
                    ddict[key].add(v)
The offsets iterable I have tested contains anywhere from 1,000 to 50,000 elements.
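For reference, a minimal self-contained version of the setup described above looks something like this (func here is only a placeholder, since the real one is not shown):

    import multiprocessing as mp
    from collections import defaultdict

    def func(offset):
        # Placeholder for the real work: return a dict of key -> values.
        return {offset % 10: [offset, offset + 1]}

    ddict = defaultdict(set)

    def mycallback(w):
        # map_async calls the callback once, with the full list of results.
        for l in w:
            for key, value in l.items():
                for v in value:
                    ddict[key].add(v)

    if __name__ == '__main__':
        offsets = range(10000)
        cores = mp.cpu_count()
        pool = mp.Pool(cores)
        r = pool.map_async(func, offsets, callback=mycallback)
        r.wait()
        pool.close()
        pool.join()
        print(len(ddict))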
If I remove r.wait(), there is no way to get the output back from the map_async call.
Using r.wait(), I am seeing processing times that are both worse than the serial implementation and do not scale; that is, the parallel implementation's time increases exponentially, while the serial version's increases linearly.
I know func is sufficiently expensive, because in both the serial and parallel versions it pegs the processing cores.
Where have I introduced overhead by using map_async? It is not in the callback function: removing it and replacing it with result.append does not affect the time.
Edit in response to comments:
I am moving large dicts around, anywhere from 1,000 to 100,000 elements, with value sets of 3-5 elements each. So, pickling could well be the issue. What alternative data structures would you suggest, short of moving to shared memory?
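One quick way to test the pickling hypothesis, without touching the multiprocessing code, is to time a pickle round trip on a synthetic dict of sets shaped like the description above (the sizes here are only illustrative):

    import pickle
    import time

    # ~100,000 keys, each with a set of 4 values, roughly matching the description.
    sample = {i: {i, i + 1, i + 2, i + 3} for i in range(100000)}

    start = time.time()
    blob = pickle.dumps(sample, protocol=pickle.HIGHEST_PROTOCOL)
    restored = pickle.loads(blob)
    print("pickled size: %d bytes, round trip: %.3f s" % (len(blob), time.time() - start))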
apply_async with a similar callback, save for the "for l in w" line, returns the same results. Its speed is better than map_async for some problem sets and worse for others. Using a managed dict and a joinable queue is worse still. Some timing tests are given below, all using 2 cores; when I add additional cores I see an exponential increase, so I can only assume that the increase is caused by process spawning or by pickling the return data.
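For what it's worth, the apply_async variant referred to above presumably looks roughly like this (again with a placeholder func); the per-result callback drops the "for l in w" loop because apply_async hands the callback one result at a time:

    import multiprocessing as mp
    from collections import defaultdict

    def func(offset):
        # Placeholder for the real neighbor search.
        return {offset % 10: [offset, offset + 1]}

    ddict = defaultdict(set)

    def mycallback(result):
        # Called once per task with a single dict, not a list of dicts.
        for key, value in result.items():
            for v in value:
                ddict[key].add(v)

    if __name__ == '__main__':
        offsets = range(10000)
        pool = mp.Pool(2)  # the timings below were taken with 2 cores
        for offset in offsets:
            pool.apply_async(func, (offset,), callback=mycallback)
        pool.close()
        pool.join()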
func takes a data point and looks up its neighbors. It is an identical function in all cases, except that for the parallel code I need to pass offsets telling it which data points to search. It is essentially a KDTree search function.
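The actual search function is not shown, but a KDTree lookup for a single offset might look roughly like this (using scipy.spatial.cKDTree; the data and radius are purely illustrative):

    import numpy as np
    from scipy.spatial import cKDTree

    points = np.random.rand(10000, 2)   # stand-in for the real data points
    tree = cKDTree(points)

    def func(i):
        """Find neighbors of data point i and return them as a dict entry."""
        neighbors = tree.query_ball_point(points[i], r=0.01)
        return {i: [j for j in neighbors if j != i]}   # drop the point itself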
Homogeneously distributed data points:

    1,000 data points:  serial 0.098659992218 | apply_async 0.120759010315 | map_async 0.080078125
    10,000 data points: serial 0.507845163345 | apply_async 0.446543931961 | map_async 0.477811098099   <== parallel is an improvement here

Randomly distributed data points:

    10,000 data points: serial 0.584854841232 | apply_async 1.03224301338  | map_async 0.948460817337
    50,000 data points: serial 3.66075992584  | apply_async 4.95467185974  | map_async 5.37306404114
Can you change func() to return dictionaries of sets instead of dictionaries of lists? Then your callback function could be rewritten like this:
    def mycallback(w):
        for l in w:
            for key, value in l.items():
                ddict[key].update(value)
That should help both the serial and the parallel processing times.
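For illustration only (the real func is not shown), the change amounts to building sets rather than lists in the return value, so the callback can fold each value set into ddict[key] with a single update call instead of looping over it element by element:

    def func(offset):
        # Placeholder body; what matters is that the values are sets, not lists.
        return {offset % 10: {offset, offset + 1}}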
Unfortunately, I think @Dougal is right that the cost is in pickling/unpickling the data as it is passed between processes. It might be faster to write the data out to disk and read it back, instead of passing it around in memory, because of the overhead of pickling. You could use a format like:
    key  value1 value2 value3 ...
    key2 valuea valueb valuec ...
    ...
which should be easy both to write and to read.
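A rough sketch of that approach (the file path and the whitespace delimiter are just placeholders, and note that keys and values come back as strings unless converted):

    def write_results(path, result):
        # One line per key: "key value1 value2 value3 ..."
        with open(path, 'w') as f:
            for key, values in result.items():
                f.write(' '.join([str(key)] + [str(v) for v in values]) + '\n')

    def read_results(path):
        result = {}
        with open(path) as f:
            for line in f:
                parts = line.split()
                result[parts[0]] = set(parts[1:])   # keys/values are strings here
        return result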