This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: multiprocessing maxtasksperchild=1 + logging = task loss
Type: behavior
Stage: needs patch
Components: Library (Lib)
Versions: Python 3.4, Python 2.7

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: nelson, sbt, vinay.sajip, xtreak
Priority: normal
Keywords:

Created on 2015-01-20 02:02 by nelson, last changed 2022-04-11 14:58 by admin.

Files
File name: bug-demo.py
Uploaded: nelson, 2015-01-20 02:02
Description: demonstration of bug
Messages (2)
msg234336 - Author: Nelson Minar (nelson) Date: 2015-01-20 02:02
I have a demonstration of a problem where the combination of multiprocessing with maxtasksperchild=1 and the Python logging library causes tasks to occasionally get lost. The bug might be related to issue 22393 or issue 6721, but I'm not certain; issue 10037 and issue 9205 may also be relevant. I've attached sample code; it can also be found at https://gist.github.com/NelsonMinar/022794b6a709ea5b7682

My program uses Pool.imap_unordered() to execute 200 tasks. Each worker task writes a log message and sleeps a short time. The master process uses a timeout on next() to log a status message occasionally.
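
A minimal sketch of that setup; the pool size, sleep duration, and timeout below are illustrative assumptions, and the attached bug-demo.py is the authoritative demonstration:

    import logging
    import multiprocessing
    import time

    def task(n):
        logging.getLogger().info('task %d starting', n)  # each worker logs via logging
        time.sleep(0.01)
        return n

    def main():
        logging.basicConfig(level=logging.INFO)
        pool = multiprocessing.Pool(processes=4, maxtasksperchild=1)
        results = pool.imap_unordered(task, range(200))
        done = 0
        while True:
            try:
                results.next(timeout=1)  # IMapIterator.next() accepts a timeout
                done += 1
            except multiprocessing.TimeoutError:
                logging.getLogger().info('%d of 200 tasks done', done)  # periodic status
            except StopIteration:
                break
        pool.close()
        pool.join()
        logging.getLogger().info('completed %d tasks', done)

    if __name__ == '__main__':
        main()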

When it works, 200 jobs are completed quickly. When it breaks, roughly 195 of 200 jobs will have completed and next() never raises StopIteration.

If everything logs to logging.getLogger() and maxtasksperchild=1 is set, it usually breaks. It appears that jobs sometimes just get lost and don't complete. We've observed that with maxtasksperchild=1, sometimes a new worker process gets created but no work is assigned to it. When that happens, the task queue never runs to completion.

If we log straight to stderr or don't set maxtasksperchild, the run completes.
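
For reference, a sketch of the log-straight-to-stderr variant of the worker (same illustrative task shape as above):

    import sys
    import time

    def task(n):
        # Write directly to stderr instead of going through the logging
        # module; runs configured this way complete all 200 tasks.
        sys.stderr.write('task %d\n' % n)
        time.sleep(0.01)
        return n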

The bug has been observed in Python 2.7.6 and Python 3.4.0 on Ubuntu 14.04.

This is a distillation of much more complex application-specific code. Discussion of the bug and the original code can be found at:

https://github.com/openaddresses/machine/issues/51
https://github.com/openaddresses/machine/blob/7c3d0fba8ba0915af2101ace45dfaf5519d5ad85/openaddr/jobs.py

Thank you, Nelson
msg234392 - Author: Nelson Minar (nelson) Date: 2015-01-20 20:47
Doing some more testing, I noticed that if I ask multiprocessing itself to also log, the problem stops occurring. If I configure multiprocessing.log_to_stderr() instead, the error still occurs.

Here's the multiprocessing logging configuration that makes the problem go away; it goes right at the top of the main() function.

    import logging          # module-level imports assumed by this snippet
    import multiprocessing

    mp_logger = multiprocessing.get_logger()
    mp_logger.propagate = True  # hand records off to the root logger's handlers
    mp_logger.setLevel(logging.DEBUG)
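
For contrast, a minimal sketch (DEBUG level assumed) of the multiprocessing.log_to_stderr() configuration that does not make the problem go away:

    import logging
    import multiprocessing

    # Attach a stderr handler to multiprocessing's internal logger;
    # with this configuration the lost-task failure still occurs.
    multiprocessing.log_to_stderr(logging.DEBUG)
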
History
Date                 User       Action  Args
2022-04-11 14:58:12  admin      set     github: 67467
2018-09-17 12:52:44  xtreak     set     nosy: + xtreak
2015-01-20 20:47:40  nelson     set     messages: + msg234392
2015-01-20 04:49:40  ned.deily  set     nosy: + vinay.sajip, sbt; stage: needs patch
2015-01-20 02:02:52  nelson     create