Author dan.oreilly
Recipients Albert.Strasheim, aljungberg, asksol, bquinlan, brian.curtin, dan.oreilly, gdb, gkcn, hongqn, jcea, jnoller, neologix, pitrou, python-dev, vlasovskikh, vstinner
Date 2014-08-24.19:56:15
Message-id <1408910176.55.0.130100895672.issue9205@psf.upfronthosting.co.za>
In-reply-to
Content
>> So, concurrent.futures is fixed now. Unless someone wants to patch multiprocessing.Pool, I am closing this issue.

I realize I'm 3 years late on this, but I've put together a patch for multiprocessing.Pool. Should a process in a Pool exit unexpectedly (meaning *not* because it hit the maxtasksperchild limit), the Pool will be closed/terminated and all cached/running tasks will fail with a BrokenProcessPool. These changes also prevent the Pool from going into a bad state if the "initializer" function raises an exception (previously, the Pool would keep starting new processes forever, each of which would immediately die because of the exception).
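
For reference, here is a minimal sketch of the behavior concurrent.futures already has and that the patch aims to mirror in multiprocessing.Pool. It is not code from the patch. The helper name crash_outcome is made up for illustration, and the sketch assumes a POSIX system where the "fork" start method is available and a Python version where ProcessPoolExecutor accepts mp_context.

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def crash_outcome():
    """Run a task whose worker dies abruptly; return what the caller observes.

    Hypothetical helper, for illustration only.
    """
    # Assumption: "fork" start method (POSIX only).
    ctx = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as executor:
        # os._exit() kills the worker without raising an exception or
        # running cleanup, simulating an unexpected process exit.
        future = executor.submit(os._exit, 1)
        try:
            future.result(timeout=30)
        except BrokenProcessPool as exc:
            # The executor notices the dead worker and fails the pending
            # future instead of leaving the caller waiting forever.
            return type(exc).__name__
    return "no error"
```

Calling crash_outcome() should report a BrokenProcessPool rather than hang, which is the outcome the patch is going for with Pool.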

One concern with the patch: because of how these changes alter timing, the Pool seems to be particularly susceptible to issue6721 in certain cases. If processes in the Pool are being restarted due to maxtasksperchild just as the worker is being closed or joined, there is a chance the worker will be forked while some of the debug logging inside of Pool is running (and holding locks on either sys.stdout or sys.stderr). When this happens, the worker deadlocks on startup, which hangs the whole program. I believe the current implementation is susceptible to this as well, but I could reproduce it much more consistently with this patch. I think it's rare enough in practice that it shouldn't prevent the patch from being accepted, but I thought I should point it out.

(I do think issue6721 should be addressed, or at the very least the internal I/O locks should always be reset after forking.)
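
To make the "reset after forking" idea concrete, here is a rough sketch, not a proposed fix for issue6721. It uses an ordinary threading.Lock as a stand-in for an internal I/O lock, and os.register_at_fork (available in newer Python versions, POSIX only) to replace that lock in the child. The names _io_lock, _reset_lock_in_child and child_can_take_lock are hypothetical.

```python
import os
import threading

# Hypothetical stand-in for an internal I/O lock (e.g. one guarding a stream).
_io_lock = threading.Lock()


def _reset_lock_in_child():
    # Only the forking thread exists in the child, so a lock that another
    # thread held at fork time can never be released there. Replace it.
    global _io_lock
    _io_lock = threading.Lock()


os.register_at_fork(after_in_child=_reset_lock_in_child)


def child_can_take_lock(timeout=2.0):
    """Fork while another thread holds _io_lock; return True if the
    child process managed to acquire the lock."""
    holding = threading.Event()
    release = threading.Event()

    def holder():
        with _io_lock:
            holding.set()   # signal that the lock is now held
            release.wait()  # keep holding it until told to stop

    thread = threading.Thread(target=holder)
    thread.start()
    holding.wait()

    pid = os.fork()
    if pid == 0:
        # Child: without the reset hook this acquire would time out.
        acquired = _io_lock.acquire(timeout=timeout)
        os._exit(0 if acquired else 1)

    # Parent: collect the child's exit status, then let the holder finish.
    _, status = os.waitpid(pid, 0)
    release.set()
    thread.join()
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

With the hook registered, the child gets a fresh lock and does not block; removing the os.register_at_fork call makes the child's acquire time out, which is a small-scale version of the startup deadlock described above.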