Author neologix
Recipients Albert.Strasheim, aljungberg, asksol, bquinlan, brian.curtin, gdb, gkcn, haypo, hongqn, jnoller, neologix, pitrou, vlasovskikh
Date 2011-05-13.10:40:02
Message-id <1305283204.11.0.0887437824261.issue9205@psf.upfronthosting.co.za>
In-reply-to
Content
Antoine, I've got a couple of questions concerning your patch:
- IIUC, the principle is to create a pipe for each worker process, so that when the child exits, the read end (the sentinel) becomes readable (EOF) in the parent, which tells you that a child has exited. Then, before reading from the result queue, you perform a select on the list of sentinels to check that all workers are alive. Am I correct?
If I am, then I have the following questions:
- have you done any benchmarking to measure the performance impact of calling select at every get (I'm not saying it will necessarily be noticeable, I'm just curious)?
- isn't there a race if a process exits between the time select returns and the get?
- is there a distinction between a normal exit and an abnormal one? The reason I'm asking is that with multiprocessing.Pool, you can pass a maxtasksperchild argument which makes workers exit after having processed a given number of tasks, so I'm wondering how that would be handled with the current patch (on the other hand, I think your patch only applies to concurrent.futures, not to raw Queues, right?).
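
To make sure we are talking about the same mechanism, here is a minimal POSIX-only sketch of the sentinel idea as I understand it. It is not taken from the patch: spawn_worker and dead_workers are hypothetical names, and the real code may well be organised differently.

```python
import os
import select


def spawn_worker(task):
    """Fork a child that runs task(); return (pid, sentinel_fd).

    Hypothetical helper, not the patch's API. The sentinel is the read
    end of a pipe whose write end is held open only by the child.
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: drop the read end, keep the write end open until exit.
        os.close(r)
        try:
            task()
        finally:
            # The kernel closes w when the process exits, normally or not.
            os._exit(0)
    # Parent: must close its copy of the write end, otherwise the read
    # end would never report EOF.
    os.close(w)
    return pid, r


def dead_workers(sentinels, timeout=0):
    """Return the sentinels that are readable, i.e. whose child has exited."""
    readable, _, _ = select.select(sentinels, [], [], timeout)
    return readable
```

A caller would run dead_workers(sentinels) just before each get on the result queue. Note that this sketch does nothing about the window between select returning and the get, and that EOF on the sentinel looks the same for a normal and an abnormal exit.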

Finally, I might be missing something completely obvious, but I have the feeling that POSIX already provides something that could help solve this issue: process groups.
We could create a new process group for a process pool, and checking whether children are still alive would be as simple as waitpid(-group, os.WNOHANG) (I don't know anything about Windows, but Google returned WaitForMultipleObjects which seems to work on multiple processes). You'd get the exit code for free.
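
A rough sketch of what I have in mind, POSIX only. The names spawn_in_group and reap_group are made up for illustration, and I am assuming that the first worker acts as group leader and that task() returns the desired exit code (or None for 0).

```python
import os


def spawn_in_group(task, pgid=0):
    """Fork a worker and put it in process group pgid.

    pgid=0 makes the new child the leader of a fresh group
    (the group id is then the child's pid).
    """
    pid = os.fork()
    if pid == 0:
        status = 1  # reported if setpgid or task() raises
        try:
            os.setpgid(0, pgid)
            status = int(task() or 0)
        finally:
            os._exit(status)
    # The parent sets the group too, so the child is in the group as
    # soon as this function returns, whichever side runs first.
    try:
        os.setpgid(pid, pgid or pid)
    except OSError:
        pass  # child has already set it itself, or has already exited
    return pid


def reap_group(pgid):
    """Collect, without blocking, every worker of the group that has exited.

    Returns a list of (pid, code); code is the exit status, or the
    negated signal number if the worker was killed by a signal.
    """
    exited = []
    while True:
        try:
            pid, status = os.waitpid(-pgid, os.WNOHANG)
        except ChildProcessError:
            break  # no children left in this group
        if pid == 0:
            break  # workers remain, but none has exited yet
        if os.WIFEXITED(status):
            code = os.WEXITSTATUS(status)
        else:
            code = -os.WTERMSIG(status)
        exited.append((pid, code))
    return exited
```

With this, the pool would call reap_group(leader_pid) before each get, and a nonzero or negative code would distinguish an abnormal exit from a worker that simply reached its task limit.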
History
Date                 User      Action  Args
2011-05-13 10:40:04  neologix  set     recipients: + neologix, bquinlan, pitrou, haypo, jnoller, hongqn, brian.curtin, asksol, vlasovskikh, gdb, Albert.Strasheim, aljungberg, gkcn
2011-05-13 10:40:04  neologix  set     messageid: <1305283204.11.0.0887437824261.issue9205@psf.upfronthosting.co.za>
2011-05-13 10:40:03  neologix  link    issue9205 messages
2011-05-13 10:40:02  neologix  create