python - map_async: where is the overhead coming from?


I am using map_async as intended: mapping an iterable over multiple processing cores using:

    import multiprocessing as mp

    cores = mp.cpu_count()
    pool = mp.Pool(cores)

    r = pool.map_async(func, offsets, callback=mycallback)
    r.wait()

func returns a dict, and the callback 'merges' the dicts using:

    from collections import defaultdict

    ddict = defaultdict(set)

    def mycallback(w):
        for l in w:
            for key, value in l.items():
                for v in value:
                    ddict[key].add(v)
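Put together, a minimal self-contained version of what I am running looks roughly like this (the body of func here is just a stand-in for the real per-offset work):

    import multiprocessing as mp
    from collections import defaultdict

    ddict = defaultdict(set)

    def func(offset):
        # stand-in for the real work; like the real func, it returns a dict of lists
        return {offset: [offset + 1, offset + 2, offset + 3]}

    def mycallback(w):
        # map_async hands the callback the full list of per-task results
        for l in w:
            for key, value in l.items():
                for v in value:
                    ddict[key].add(v)

    if __name__ == '__main__':
        offsets = range(10000)
        cores = mp.cpu_count()
        pool = mp.Pool(cores)
        r = pool.map_async(func, offsets, callback=mycallback)
        r.wait()
        pool.close()
        pool.join()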

The offsets iterable I have tested with contains anywhere from 1,000 to 50,000 elements.

If I remove r.wait(), there is no way to ensure the output of the map_async call has been returned before the program continues.

Using r.wait(), I am seeing processing times that are both inferior to the serial implementation and that do not scale: the parallel implementation's time increases exponentially, while the serial version's increases linearly.

I know that func is sufficiently expensive: in both the serial and parallel runs it pegs the processor cores.

Where have I introduced overhead in using map_async? It is not in the callback function: removing it and replacing it with result.append does not affect the time.

Edit in response to comments:

  1. I am moving large dicts around, anywhere from 1,000 to 100,000 elements, with value sets of 3-5 elements each. So pickling could well be the issue. What alternative data structures would one suggest, short of moving to shared memory?

  2. apply_async with a similar callback (minus the for l in w line) returns the same results. Its speed is better than map_async for some problem sets and worse for others. Using a managed dict and a joinable queue is worse still.

  3. Some time tests, using 2 cores. As I add additional cores I see an exponential increase, so I can only assume that the increase is caused by process spawning or by pickling the returned data (a rough way to check the pickling cost is sketched after the timings below).

func takes a data point and looks up its neighbors. It is the identical function in all cases, except that the parallel version needs to be passed offsets telling it which data points to search. It is essentially a KDTree search function.
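For context, it is along these lines (a sketch assuming scipy's cKDTree; the radius and data layout are illustrative, not my actual parameters):

    import numpy as np
    from scipy.spatial import cKDTree

    def find_neighbors(points, offsets, radius=0.05):
        # Build a tree over all points, then query only the data points
        # this worker is responsible for (selected by offsets).
        tree = cKDTree(points)
        result = {}
        for i in offsets:
            neighbors = tree.query_ball_point(points[i], radius)
            result[i] = [j for j in neighbors if j != i]
        return result

    # e.g. points = np.random.rand(10000, 2); find_neighbors(points, range(5000))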

Homogeneously distributed:

1,000 data points: serial 0.098659992218 | apply_async 0.120759010315 | map_async 0.080078125

10,000 data points (parallel shows an improvement): serial 0.507845163345 | apply_async 0.446543931961 | map_async 0.477811098099

Randomly distributed:

10,000 data points: serial 0.584854841232 | apply_async 1.03224301338 | map_async 0.948460817337

50,000 data points: serial 3.66075992584 | apply_async 4.95467185974 | map_async 5.37306404114
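As mentioned in point 3, one way to check whether pickling the returned data accounts for the gap is to time a pickle round trip of a representative result (the dict size below is an assumption based on point 1):

    import pickle
    import time

    # A dict roughly the size described in point 1: up to ~100,000 keys,
    # each with a small list of 3-5 values (sizes are assumptions).
    result = {i: [i + 1, i + 2, i + 3, i + 4] for i in range(100000)}

    start = time.time()
    blob = pickle.dumps(result, pickle.HIGHEST_PROTOCOL)
    restored = pickle.loads(blob)
    elapsed = time.time() - start

    print("pickled size: %d bytes, round trip: %.3f s" % (len(blob), elapsed))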

Can you change func() to return dictionaries of sets instead of dictionaries of lists? Then the callback function could be rewritten like this:

    def mycallback(w):
        for l in w:
            for key, value in l.items():
                ddict[key].update(value)

That should help both the serial and parallel processing times.
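For illustration, the worker side under that change might look like this (the value-generating loop is a placeholder for the real neighbor lookup):

    from collections import defaultdict

    def func(offset):
        # Build the per-offset result as a dict of sets rather than lists,
        # so the callback can merge each key with a single update() call.
        out = defaultdict(set)
        for v in range(offset + 1, offset + 4):  # placeholder values
            out[offset].add(v)
        return dict(out)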

Unfortunately, I think @Dougal is right about the cost of pickling/unpickling the data when passing it between processes. It might be faster to write the data to disk and read it back again, instead of passing it around in memory, because of the overhead of pickling. You could use a format like:

    key value1 value2 value3 ...
    key2 valuea valueb valuec ...
    ...

which should be easy to both write and read.
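For example, the write and read sides of that format could look something like this (the file handling details are just one way to do it):

    def write_results(path, ddict):
        # One line per key: "key value1 value2 value3 ..."
        with open(path, 'w') as f:
            for key, values in ddict.items():
                fields = [str(key)] + [str(v) for v in values]
                f.write(' '.join(fields) + '\n')

    def read_results(path):
        # Keys and values come back as strings; convert as needed for your data.
        merged = {}
        with open(path) as f:
            for line in f:
                parts = line.split()
                merged[parts[0]] = set(parts[1:])
        return merged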

