Issue38744
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2019-11-08 15:00 by AMDmi3, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Messages (11) | |||
---|---|---|---|
msg356243 - (view) | Author: Dmitry Marakasov (AMDmi3) * | Date: 2019-11-08 15:00 | |
System: FreeBSD 12.0-RELEASE, amd64. This simple program from multiprocessing import Pool from time import sleep Pool().map(sleep, [0.01] * 10) works fine with python 3.7, but is likely (about 20-50% probability on my 4 core box) to hang with python 3.8. Example backtraces after interruption: % python3.8 1.py ^CException ignored in: <Finalize object, dead> Traceback (most recent call last): File "/usr/local/lib/python3.8/multiprocessing/util.py", line 201, in __call__ Process ForkPoolWorker-2: Process ForkPoolWorker-4: res = self._callback(*self._args, **self._kwargs) File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 689, in _terminate_pool cls._help_stuff_finish(inqueue, task_handler, len(pool)) File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 674, in _help_stuff_finish inqueue._rlock.acquire() KeyboardInterrupt: Process ForkPoolWorker-3: Process ForkPoolWorker-1: % python3.8 1.py ^CException ignored in: <Finalize object, dead> Traceback (most recent call last): File "/usr/local/lib/python3.8/multiprocessing/util.py", line 201, in __call__ Process ForkPoolWorker-3: Process ForkPoolWorker-4: Process ForkPoolWorker-1: res = self._callback(*self._args, **self._kwargs) File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 689, in _terminate_pool cls._help_stuff_finish(inqueue, task_handler, len(pool)) File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 674, in _help_stuff_finish inqueue._rlock.acquire() KeyboardInterrupt: Process ForkPoolWorker-2: % python3.8 1.py ^CException ignored in: <Finalize object, dead> Process ForkPoolWorker-2: Process ForkPoolWorker-3: Traceback (most recent call last): File "/usr/local/lib/python3.8/multiprocessing/util.py", line 201, in __call__ Process ForkPoolWorker-1: res = self._callback(*self._args, **self._kwargs) File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 689, in _terminate_pool cls._help_stuff_finish(inqueue, task_handler, len(pool)) File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 674, in _help_stuff_finish Traceback (most recent call last): Traceback (most recent call last): File "/usr/local/lib/python3.8/multiprocessing/process.py", line 313, in _bootstrap self.run() File "/usr/local/lib/python3.8/multiprocessing/process.py", line 313, in _bootstrap self.run() File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 114, in worker task = get() File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 114, in worker task = get() File "/usr/local/lib/python3.8/multiprocessing/queues.py", line 355, in get with self._rlock: File "/usr/local/lib/python3.8/multiprocessing/queues.py", line 355, in get with self._rlock: File "/usr/local/lib/python3.8/multiprocessing/synchronize.py", line 95, in __enter__ return self._semlock.__enter__() File "/usr/local/lib/python3.8/multiprocessing/synchronize.py", line 95, in __enter__ return self._semlock.__enter__() KeyboardInterrupt KeyboardInterrupt inqueue._rlock.acquire() KeyboardInterrupt: Traceback (most recent call last): File "/usr/local/lib/python3.8/multiprocessing/process.py", line 313, in _bootstrap self.run() File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run self._target(*self._args, **self._kwargs) File "/usr/local/lib/python3.8/multiprocessing/pool.py", line 114, in worker task = get() File "/usr/local/lib/python3.8/multiprocessing/queues.py", line 356, in get res = self._reader.recv_bytes() File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes buf = self._recv_bytes(maxlength) File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes buf = self._recv(4) File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 379, in _recv chunk = read(handle, remaining) KeyboardInterrupt Process ForkPoolWorker-4: Related issue in FreeBSD bugzilla: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241801 |
|||
msg356546 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2019-11-13 18:59 | |
What happens if you instead write: with Pool() as pool: pool.map(sleep, [0.01] * 10) |
|||
msg356551 - (view) | Author: Dmitry Marakasov (AMDmi3) * | Date: 2019-11-13 20:18 | |
> What happens if you instead write: It survived more than 4000 runs, it looks like it doesn't experience the problem. |
|||
msg364003 - (view) | Author: Arkadiusz Miśkiewicz (arekm) | Date: 2020-03-12 08:54 | |
That test program hangs even on Linux (kernel 4.9.184, glibc 2.30): $ python3 a.py Process ForkPoolWorker-13: Process ForkPoolWorker-12: Process ForkPoolWorker-16: Process ForkPoolWorker-15: Process ForkPoolWorker-11: Process ForkPoolWorker-6: Process ForkPoolWorker-7: Process ForkPoolWorker-14: Process ForkPoolWorker-5: Process ForkPoolWorker-2: Process ForkPoolWorker-10: Process ForkPoolWorker-3: Process ForkPoolWorker-8: Process ForkPoolWorker-1: Process ForkPoolWorker-4: Exception ignored in: <Finalize object, dead> Traceback (most recent call last): File "/usr/lib64/python3.8/multiprocessing/util.py", line 201, in __call__ res = self._callback(*self._args, **self._kwargs) File "/usr/lib64/python3.8/multiprocessing/pool.py", line 689, in _terminate_pool Process ForkPoolWorker-9: cls._help_stuff_finish(inqueue, task_handler, len(pool)) File "/usr/lib64/python3.8/multiprocessing/pool.py", line 674, in _help_stuff_finish inqueue._rlock.acquire() KeyboardInterrupt: $ python3 --version Python 3.8.2 |
|||
msg364178 - (view) | Author: Arkadiusz Miśkiewicz (arekm) | Date: 2020-03-14 14:59 | |
Bisecting led to this commit and reverting it on 3.8 branch makes the hang go away. 7c994549dcffd0d9d3bb37475e6374f356e7240e is the first bad commit commit 7c994549dcffd0d9d3bb37475e6374f356e7240e Author: Pablo Galindo <Pablogsal@gmail.com> Date: Sat Mar 16 22:34:24 2019 +0000 bpo-35493: Use Process.sentinel instead of sleeping for polling worker status in multiprocessing.Pool (#11488) * bpo-35493: Use Process.sentinel instead of sleeping for polling worker status in multiprocessing.Pool * Use self-pipe pattern to avoid polling for changes * Refactor some variable names and add comments * Restore timeout and poll * Use reader object only on wait() * Recompute worker sentinels every time * Remove timeout and use change notifier * Refactor some methods to be overloaded by the ThreadPool, document the cache class and fix typos Lib/multiprocessing/pool.py | 88 +++++++++++++++++++--- .../2019-01-09-23-43-08.bpo-35493.kEcRGE.rst | 3 + 2 files changed, 80 insertions(+), 11 deletions(-) create mode 100644 bpo-35493.kEcRGE.rst">Misc/NEWS.d/next/Library/2019-01-09-23-43-08.bpo-35493.kEcRGE.rst |
|||
msg364182 - (view) | Author: Arkadiusz Miśkiewicz (arekm) | Date: 2020-03-14 15:18 | |
bpo-35493 |
|||
msg364183 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2020-03-14 15:36 | |
Check out https://bugs.python.org/issue39360 foe more context on this issue. |
|||
msg364184 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2020-03-14 15:41 | |
Also, notice that writing Pool().map(sleep, [0.01] * 10) is out of contact as the Pool object can be collected immediately and there is no proper termination and cleanup. The correct way is to use the context manager: with Pool() as pool: pool.map(sleep, [0.01] * 10) |
|||
msg364211 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2020-03-15 02:15 | |
I am mentoring @BTaskaya and I asked to take a look a this. After some debugging, we found that the problem is the following: When the pull is not used via the context manager or terminate() is called, there is a system in multiprocessing.util that handles finalization of all pools via an atexit handler (the Finalize) class. This class registers the _terminate_pool handler in the registry of finalizers of the module, and that registry is called on interpreter exit via _exit_function. The problem is that the "happy" path with the context manager or manual call to finalize() does some extra steps that _terminate_pool does not: def terminate(self): util.debug('terminating pool') self._state = TERMINATE self._worker_handler._state = TERMINATE self._change_notifier.put(None) self._terminate() In this code self._terminate() calls _terminate_pool. The step that is not executed when the atexit() handler calls _terminate_pool is pinging the _change_notifier queue to unblock the maintenance threads. The fix is moving the "self._change_notifier.put(None)" to the _terminate_pool function. |
|||
msg364212 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2020-03-15 02:15 | |
BTaskaya will create a PR to fix this soon. |
|||
msg364256 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2020-03-15 19:54 | |
Closed via PR 19009 |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:23 | admin | set | github: 82925 |
2020-03-15 19:54:21 | pablogsal | set | versions: + Python 3.9 |
2020-03-15 19:54:16 | pablogsal | set | status: open -> closed resolution: fixed messages: + msg364256 stage: resolved |
2020-03-15 02:15:35 | pablogsal | set | messages: + msg364212 |
2020-03-15 02:15:13 | pablogsal | set | messages: + msg364211 |
2020-03-14 15:41:12 | pablogsal | set | messages: + msg364184 |
2020-03-14 15:36:32 | pablogsal | set | messages: + msg364183 |
2020-03-14 15:18:58 | arekm | set | messages: + msg364182 |
2020-03-14 14:59:40 | arekm | set | messages: + msg364178 |
2020-03-12 08:54:01 | arekm | set | nosy:
+ arekm messages: + msg364003 title: python 3.8 hang in multiprocessing.Pool() locking on FreeBSD -> python 3.8 hang in multiprocessing.Pool() locking on FreeBSD / Linux |
2019-11-18 02:52:20 | koobs | set | nosy:
+ koobs |
2019-11-13 20:18:30 | AMDmi3 | set | messages: + msg356551 |
2019-11-13 19:05:34 | pitrou | set | nosy:
+ pablogsal |
2019-11-13 18:59:19 | pitrou | set | messages: + msg356546 |
2019-11-08 19:42:59 | ned.deily | set | nosy:
+ pitrou, davin |
2019-11-08 15:00:44 | AMDmi3 | create |