Author asksol
Recipients asksol, jafo, jnoller, nirai, sbt, ysj.ray
Date 2012-06-07.09:09:21
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1339060162.62.0.969065415856.issue10037@psf.upfronthosting.co.za>
In-reply-to
Content
Well, I still don't know exactly why restarting the socket read made it work, but the patch solved an issue where newly started pool processes would get stuck in a socket read forever (this happened to roughly 1 in 500 new processes).

This and a dozen other pool-related fixes are in billiard, my fork of multiprocessing. For example, the case you describe in your comment:
# trying res.get() would block forever
works in billiard, where res.get() raises WorkerLostError instead of
blocking.
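
For reference, the hang itself can be reproduced with the standard library alone. What follows is only a rough sketch, not the test case from this issue: the names die_abruptly and result_is_lost are made up for illustration, the "fork" start method is an assumption (POSIX only, chosen so the snippet needs no __main__ guard), and a timeout is passed to res.get() so the demo terminates instead of blocking forever.

```python
import multiprocessing
import os


def die_abruptly(_):
    # Simulate a worker that is lost mid-task (for example, killed by a
    # signal). os._exit() skips all cleanup, so no result is ever sent back.
    os._exit(1)


def result_is_lost(timeout=2.0):
    """Return True if res.get() times out after the worker process dies."""
    # Assumption: the "fork" start method is available (POSIX only).
    ctx = multiprocessing.get_context("fork")
    pool = ctx.Pool(processes=1)
    try:
        res = pool.apply_async(die_abruptly, (None,))
        try:
            # Without a timeout, this call would never return: the pool
            # replaces the dead worker, but the lost task is never completed.
            res.get(timeout=timeout)
        except multiprocessing.TimeoutError:
            return True
        return False
    finally:
        pool.terminate()
        pool.join()
```

Calling result_is_lost() should come back True after about two seconds: the pool notices the dead worker and starts a replacement, but the pending result is never marked as failed.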

https://github.com/celery/billiard/

Earlier commit history for the pool can be found in Celery:
https://github.com/ask/celery/commits/2.5/celery/concurrency/processes/pool.py

My eventual goal is to merge these fixes back into Python, but everyone
except Python 3.x users would have to use billiard for quite some time anyway, so I'm in no hurry.


I think this issue can be closed. The worker handler is simply borked, and we could open a new issue to decide how to fix it (merging billiard.Pool or something else).

(btw, Richard, you're sbt? I was trying to find your real name to give
you credit for the no_execv patch in billiard)
History
Date User Action Args
2012-06-07 09:09:22 asksol set recipients: + asksol, jafo, jnoller, nirai, ysj.ray, sbt
2012-06-07 09:09:22 asksol set messageid: <1339060162.62.0.969065415856.issue10037@psf.upfronthosting.co.za>
2012-06-07 09:09:22 asksol link issue10037 messages
2012-06-07 09:09:21 asksol create