Message80168
I think something is wrong with the implementation of the multiprocessing module.
I've run this very simple test on my machine (Core 2, Vista):
import multiprocessing as mul
from time import time

def f(x):
    return x*x

if __name__ == '__main__':
    print "-------- testing multiprocessing on ", mul.cpu_count(), "cores ----------"
    print ""
    elements = 100000
    pool = mul.Pool(processes=mul.cpu_count())
    t1 = time()
    res_par = pool.map(f, range(elements))
    t2 = time()
    res_seq = map(f, range(elements))
    t3 = time()
    res_app = [pool.apply_async(f, (x,)) for x in range(elements)]
    res_app = [result.get() for result in res_app]
    t4 = time()
    print len(res_seq), "elements", "map() time", (t3-t2), "s"
    print len(res_par), "elements", "pool.map() time", (t2-t1), "s"
    print len(res_app), "elements", "pool.apply_async() time", (t4-t3), "s"
    print
    raw_input("press enter to exit...")
__________________________________________
Results:
-------- testing multiprocessing on 2 cores -----------
100000 elements map() time 0.0269 s
100000 elements pool.map() time 0.108 s
100000 elements pool.apply_async() time 10.567 s
--------------------------------------------------------
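The apply_async figure stands out most. As far as I can tell from CPython's pool implementation, each apply_async call travels as its own task message through the worker queues, while pool.map batches the iterable into chunks (roughly four chunks per worker when no chunksize is given). A small sketch of that message-count difference (the helper name default_chunksize is mine, mirroring the heuristic I believe Pool.map uses):

```python
import math

def default_chunksize(n_items, n_workers):
    # Mirrors the heuristic CPython's Pool.map appears to use when no
    # chunksize is passed: split the work into ~4 chunks per worker.
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize

n = 100000
workers = 2
cs = default_chunksize(n, workers)
# pool.map: a handful of batched messages; apply_async: one per task.
print("pool.map sends ~%d task messages (chunksize %d)" % (math.ceil(n / cs), cs))
print("apply_async sends %d task messages" % n)
```

With 100000 trivial tasks, that is a few batched messages versus 100000 individual round-trips, which would explain the two orders of magnitude between the pool.map and apply_async timings above.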
IMHO, execution on 2 cores should be 1.x - 2 times faster than non-parallel
execution, at least in simple cases.
If you don't believe this, check the http://www.parallelpython.com/
module (demo example sum_primes.py), which fits this idea very well.
So how can it be that the parallel pool.map() method executes about 5
times SLOWER than the ordinary map() function?
So please correct the multiprocessing package to work in a more or less
performance-predictable way (like parallelpython).
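For comparison, here is a variant of the benchmark (rewritten in Python 3 syntax; the function name expensive is mine, a hypothetical stand-in for sum_primes-style work) where each task carries real CPU cost instead of a single multiplication, so per-task pickling and IPC no longer dominate:

```python
# Sketch, not the original benchmark: with a trivial f(x) = x*x the
# pool loses to plain map because communication overhead dominates.
# Give each task real CPU work and pool.map can pull ahead.
import multiprocessing as mul
import time

def expensive(x):
    # Simulated CPU-bound work.
    total = 0
    for i in range(10000):
        total += (x + i) * (x + i)
    return total

if __name__ == "__main__":
    xs = range(200)
    t0 = time.time()
    seq = [expensive(x) for x in xs]
    t1 = time.time()
    with mul.Pool(processes=2) as pool:
        # An explicit chunksize batches tasks into fewer IPC messages.
        par = pool.map(expensive, xs, chunksize=25)
    t2 = time.time()
    assert seq == par
    print("sequential %.3f s, pool.map %.3f s" % (t1 - t0, t2 - t1))
```

Whether the pool actually wins depends on how expensive each task is relative to the cost of shipping its arguments and result between processes.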
Date | User | Action | Args
2009-01-19 13:50:52 | 0x666 | set | recipients: + 0x666
2009-01-19 13:50:52 | 0x666 | set | messageid: <1232373052.48.0.519652663771.issue5000@psf.upfronthosting.co.za>
2009-01-19 13:50:51 | 0x666 | link | issue5000 messages
2009-01-19 13:50:50 | 0x666 | create |