I ran into this bug while answering this Stack Overflow question: https://stackoverflow.com/questions/68890437/cannot-use-result-from-multiprocess-pool-directly
I have minimized the code needed to reproduce the behavior. It boils down to this: if an exception occurs during the bootstrapping phase of a new child (which only happens with the "spawn" start method), or inside the initializer function (with any start method), the dead worker is simply cleaned up and a replacement is started, which then fails in exactly the same way. The result is an infinite loop of creating child workers, workers exiting due to an exception, and re-populating the pool with new workers.
```
import multiprocessing
multiprocessing.set_start_method("spawn")
# the bootstrapping failure only occurs with spawn

def task():
    print("task")

if __name__ == "__main__":
    with multiprocessing.Pool() as p:
        p.apply(task)
else:
    raise Exception("raise in child during bootstrapping phase")

#################################################
# or
#################################################

import multiprocessing
# multiprocessing.set_start_method("fork")
# fork or spawn doesn't matter here

def task():
    print("task")

def init():
    raise Exception("raise in child during initialization function")

if __name__ == "__main__":
    with multiprocessing.Pool(initializer=init) as p:
        p.apply(task)
```
If Pool._join_exited_workers could determine whether a worker exited before bootstrapping, or before the initialization function completed, that would indicate a likely significant problem. I'm fine with exceptions raised in the worker's target function not being re-raised in the parent, but it seems the Pool should stop trying if it repeatedly fails to create new workers.
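As a stopgap, and assuming the initializer is safe to run in the parent (i.e. it has no child-only side effects), one can fail fast by invoking the initializer once in the parent before building the pool. The `safe_pool` helper below is hypothetical, not part of multiprocessing; it is only a sketch of that idea:

```python
import multiprocessing

def init():
    raise Exception("raise in child during initialization function")

# Hypothetical helper (not part of multiprocessing): run the initializer once
# in the parent so a broken initializer raises here, instead of silently
# killing every worker the pool tries to start.
def safe_pool(processes=None, initializer=None, initargs=()):
    if initializer is not None:
        # Fail fast; assumes the initializer has no child-only side effects.
        initializer(*initargs)
    return multiprocessing.Pool(processes, initializer=initializer, initargs=initargs)
```

This does nothing for failures that only occur in the child (like the bootstrapping case above); it just surfaces deterministic initializer bugs before the respawn loop can start.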