classification
Title: multiprocessing.pool processes started by worker handler stops working
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.3, Python 3.2, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: jnoller Nosy List: asksol, jafo, jnoller, nirai, ysj.ray
Priority: critical Keywords: needs review, patch

Created on 2010-10-06 11:25 by asksol, last changed 2012-03-13 02:04 by jafo.

Files
File name Uploaded Description
multiprocessing-worker-poll.patch asksol, 2010-10-06 11:25
Messages (3)
msg118062 - Author: Ask Solem (asksol) (Python committer) Date: 2010-10-06 11:25
While working on an "autoscaling" (yes, people call it that...) feature for Celery, I noticed that the processes created by the _handle_workers thread don't always work. I have reproduced this in general by just using the maxtasksperchild feature and letting the workers terminate themselves, so this seems to have always been an issue (just not easy to reproduce unless workers are created with some frequency).

I'm not quite sure of the reason yet, but I finally managed to track it down to the workers being stuck while receiving from the queue.

The patch attached seems to resolve the issue by polling the queue before trying to receive.

I know this is short, I may have some more data later.
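
The idea of polling before receiving can be illustrated with a minimal sketch. This is not the attached patch: the helper name poll_then_recv and the default timeout are assumptions made for illustration, and only the documented Connection.poll() and Connection.recv() calls are used.

```python
from multiprocessing import Pipe


def poll_then_recv(reader, timeout=1.0):
    """Return (True, item) if data arrived within `timeout`, else (False, None).

    `reader` is the receiving end of a multiprocessing pipe.
    """
    # poll() waits at most `timeout` seconds, so the caller never blocks
    # indefinitely the way a bare recv() can.
    if reader.poll(timeout):
        return True, reader.recv()
    return False, None
```

A caller that gets (False, None) can check for shutdown or other state and then try again, instead of sitting inside a blocking receive.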
msg122262 - Author: ysj.ray (ysj.ray) Date: 2010-11-24 07:11
Could you give some example code that reproduces this issue?
msg155556 - Author: Sean Reifschneider (jafo) * (Python committer) Date: 2012-03-13 02:04
The attached patch does change the semantics somewhat, but I don't fully understand how much.  In particular:

It turns the "get()" call into "get(timeout=1.0)" if inqueue doesn't have a _reader attribute.
If inqueue does have a _reader attribute and "inqueue._reader.poll(timeout)" is false, "get()" isn't called at all.
It introduces a continue.
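
The semantics listed above can be sketched roughly as follows. This is a reconstruction from the description in this message, not the patch itself: PipeQueue, make_poll and drain are hypothetical names, and PipeQueue is only a stand-in for a pipe-backed queue that exposes a _reader attribute.

```python
import queue
from multiprocessing import Pipe


class PipeQueue:
    """Hypothetical stand-in for a pipe-backed queue exposing `_reader`."""

    def __init__(self):
        self._reader, self._writer = Pipe(duplex=False)

    def put(self, item):
        self._writer.send(item)

    def get(self):
        return self._reader.recv()  # blocks until an item arrives


def make_poll(inqueue):
    """Return poll(timeout) -> (ready, item) for either kind of queue."""
    if hasattr(inqueue, "_reader"):
        def poll(timeout):
            # Only call get() once the pipe reports pending data.
            if inqueue._reader.poll(timeout):
                return True, inqueue.get()
            return False, None
    else:
        def poll(timeout):
            # No pipe to inspect: fall back to a timed get().
            try:
                return True, inqueue.get(timeout=timeout)
            except queue.Empty:
                return False, None
    return poll


def drain(inqueue, sentinel=None, timeout=1.0):
    """Collect items until the sentinel arrives, retrying on empty polls."""
    poll = make_poll(inqueue)
    results = []
    while True:
        ready, item = poll(timeout)
        if not ready:
            continue  # nothing arrived within the timeout; poll again
        if item is sentinel:
            return results
        results.append(item)
```

In this sketch the loop never enters a blocking get() when nothing is pending, which is the change in semantics being asked about: the worker wakes up at least once per timeout instead of waiting indefinitely.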

I'd want Jesse to pronounce on this.
History
Date                 User         Action  Args
2012-03-13 02:04:01  jafo         set     assignee: jnoller
                                          messages: + msg155556
                                          nosy: + jafo, jnoller
2011-06-12 18:36:09  terry.reedy  set     versions: - Python 3.1
2011-04-15 19:53:45  nirai        set     nosy: + nirai
2010-11-24 07:11:02  ysj.ray      set     nosy: + ysj.ray
                                          messages: + msg122262
2010-10-06 11:25:57  asksol       create