Issue 38501
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python Developer Guide.
Created on 2019-10-16 18:31 by larsoner, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 11488 | | pablogsal, 2020-04-10 11:03 | |
PR 19466 | merged | pablogsal, 2020-04-11 00:52 | |
PR 19467 | merged | miss-islington, 2020-04-11 02:05 | |
PR 19468 | merged | miss-islington, 2020-04-11 02:05 | |
Messages (14) | |||
---|---|---|---|
msg354813 - (view) | Author: Eric Larson (larsoner) * | Date: 2019-10-16 18:31 | |
The following code hangs on Python 3.8.0rc0 on Ubuntu 19.10 when exiting the interpreter:

    from multiprocessing import Pool

    class A(object):
        def __init__(self):
            self.pool = Pool(processes=2)

    solver = A()

When you eventually press Ctrl-C, the traceback is:

    ^CProcess ForkPoolWorker-2:
    Error in atexit._run_exitfuncs:
    Process ForkPoolWorker-1:
    Traceback (most recent call last):
      File "/usr/lib/python3.8/multiprocessing/util.py", line 277, in _run_finalizers
        finalizer()
      File "/usr/lib/python3.8/multiprocessing/util.py", line 201, in __call__
        res = self._callback(*self._args, **self._kwargs)
      File "/usr/lib/python3.8/multiprocessing/pool.py", line 689, in _terminate_pool
        cls._help_stuff_finish(inqueue, task_handler, len(pool))
      File "/usr/lib/python3.8/multiprocessing/pool.py", line 674, in _help_stuff_finish
        inqueue._rlock.acquire()
    KeyboardInterrupt
    Traceback (most recent call last):
      File "/usr/lib/python3.8/multiprocessing/process.py", line 313, in _bootstrap
        self.run()
      File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
        self._target(*self._args, **self._kwargs)
      File "/usr/lib/python3.8/multiprocessing/pool.py", line 114, in worker
        task = get()
      File "/usr/lib/python3.8/multiprocessing/queues.py", line 355, in get
        with self._rlock:
      File "/usr/lib/python3.8/multiprocessing/synchronize.py", line 95, in __enter__
        return self._semlock.__enter__()
    KeyboardInterrupt
    Traceback (most recent call last):
      File "/usr/lib/python3.8/multiprocessing/process.py", line 313, in _bootstrap
        self.run()
      File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
        self._target(*self._args, **self._kwargs)
      File "/usr/lib/python3.8/multiprocessing/pool.py", line 114, in worker
        task = get()
      File "/usr/lib/python3.8/multiprocessing/queues.py", line 356, in get
        res = self._reader.recv_bytes()
      File "/usr/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes
        buf = self._recv_bytes(maxlength)
      File "/usr/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
        buf = self._recv(4)
      File "/usr/lib/python3.8/multiprocessing/connection.py", line 379, in _recv
        chunk = read(handle, remaining)
    KeyboardInterrupt

A similar problem seems to occur during garbage collection when the pool is part of a class and that class has not yet called `self.pool.terminate()` and `self.pool.close()`. Hopefully fixing the `atexit` behavior will similarly fix the `gc.collect()` behavior. Cross-ref in SciPy: https://github.com/scipy/scipy/issues/10927. It also appears to cause hangs on Travis 3.8-dev builds: https://travis-ci.org/scipy/scipy/jobs/598786785 |
|||
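For anyone arriving here from similar code, a minimal variant of the reproducer above that shuts the pool down explicitly before interpreter exit avoids the hang. The `shutdown` method name and `square` worker function are illustrative, not from the report:

```python
from multiprocessing import Pool

def square(x):
    # Defined at module level so worker processes can unpickle it.
    return x * x

class A(object):
    def __init__(self):
        self.pool = Pool(processes=2)

    def shutdown(self):
        # Stop the workers explicitly instead of relying on garbage
        # collection at interpreter exit, which is what hangs.
        self.pool.close()
        self.pool.join()

if __name__ == '__main__':
    solver = A()
    print(solver.pool.map(square, [1, 2, 3]))  # [1, 4, 9]
    solver.shutdown()
```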
msg354824 - (view) | Author: Paul Silisteanu (Paul Silisteanu) | Date: 2019-10-17 01:16 | |
Same issue on Windows 10 and macOS Catalina with Python 3.8.0. Here is the error I get on Windows 10:

    C:\Users\x>py -3.8
    Python 3.8.0 (tags/v3.8.0:fa919fd, Oct 14 2019, 19:37:50) [MSC v.1916 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from multiprocessing import get_context
    >>>
    >>> class A(object):
    ...     def __init__(self):
    ...         self.pool = get_context(None).Pool(2)
    ...
    >>> solver=A()
    >>> quit()
    Process SpawnPoolWorker-2:
    Error in atexit._run_exitfuncs:
    Process SpawnPoolWorker-1:
    Traceback (most recent call last):
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\process.py", line 313, in _bootstrap
        self.run()
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\process.py", line 108, in run
        self._target(*self._args, **self._kwargs)
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\pool.py", line 114, in worker
        task = get()
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\queues.py", line 355, in get
        with self._rlock:
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\synchronize.py", line 95, in __enter__
        return self._semlock.__enter__()
    Traceback (most recent call last):
    KeyboardInterrupt
    Traceback (most recent call last):
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\util.py", line 277, in _run_finalizers
        finalizer()
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\util.py", line 201, in __call__
        res = self._callback(*self._args, **self._kwargs)
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\pool.py", line 689, in _terminate_pool
        cls._help_stuff_finish(inqueue, task_handler, len(pool))
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\pool.py", line 674, in _help_stuff_finish
        inqueue._rlock.acquire()
    KeyboardInterrupt
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\process.py", line 313, in _bootstrap
        self.run()
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\process.py", line 108, in run
        self._target(*self._args, **self._kwargs)
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\pool.py", line 114, in worker
        task = get()
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\queues.py", line 356, in get
        res = self._reader.recv_bytes()
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\connection.py", line 216, in recv_bytes
        buf = self._recv_bytes(maxlength)
      File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\multiprocessing\connection.py", line 305, in _recv_bytes
        waitres = _winapi.WaitForMultipleObjects(
    KeyboardInterrupt

    C:\Users\x> |
|||
msg361464 - (view) | Author: Stefano Rivera (stefanor) * | Date: 2020-02-06 01:19 | |
A bisect points at https://github.com/python/cpython/pull/11488 being the change that introduced this hang (bpo-35493) |
|||
msg366133 - (view) | Author: Stefano Rivera (stefanor) * | Date: 2020-04-10 15:54 | |
Looks like it was fixed by PR 19009 (bpo-39360). |
|||
msg366134 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2020-04-10 15:57 | |
> Looks like it was fixed by PR 19009 (bpo-39360).

Can we close the issue then? If not, could you provide a reproducer and also check if the bug is fixed on master or 3.8 HEAD? Thanks! |
|||
msg366135 - (view) | Author: Stefano Rivera (stefanor) * | Date: 2020-04-10 16:00 | |
On Linux, the reproducer in https://bugs.python.org/issue38501#msg354813 fails on ac10e0c93218627d1a639db0b7b41714c5f6a883^ and passes on ac10e0c93218627d1a639db0b7b41714c5f6a883, which is why I say PR 19009 fixes it. Not sure if there are any special considerations for Windows here. HEAD is good too. So from my PoV, good to close this. |
|||
msg366136 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2020-04-10 16:10 | |
> Not sure if there are any special considerations for Windows here.

There should not be anything Windows-specific here. Also note that although "it works", the reproducer in https://bugs.python.org/issue38501#msg354813 is out of contract because it does not close or terminate the pool correctly and does not use it as a context manager, so if you are dealing with equivalent code I would suggest making sure you properly close/terminate the pool to avoid similar problems (or other problems related to late finalization). |
|||
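The properly managed usage described above can be sketched in two equivalent forms: the context manager (whose `__exit__` calls `terminate()`) and explicit `close()`/`join()`. The `double` worker function is a placeholder:

```python
from multiprocessing import Pool

def double(x):
    # Placeholder worker function; module-level so it pickles.
    return 2 * x

if __name__ == '__main__':
    # Preferred: the context manager; __exit__ calls terminate().
    with Pool(processes=2) as pool:
        print(pool.map(double, [1, 2, 3]))  # [2, 4, 6]

    # Equivalent manual management: close() signals no more work,
    # then join() waits for the workers to exit.
    pool = Pool(processes=2)
    try:
        print(pool.map(double, [4, 5]))  # [8, 10]
    finally:
        pool.close()
        pool.join()
```

Either form guarantees the worker processes are gone before the interpreter starts shutting down, which is exactly what the reproducer in msg354813 fails to do.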
msg366137 - (view) | Author: Eric Larson (larsoner) * | Date: 2020-04-10 16:31 | |
If that's out of contract, perhaps there should be a big, visible warning at the top of the multiprocessing docs stating that creating one of these objects requires either using a context manager or ensuring a manual `.close()`?

1. Either it used to be in contract not to close it manually, or the contract was wrongly represented: the first Python 2.7 example in the docs (https://docs.python.org/2/library/multiprocessing.html#introduction) is:

    from multiprocessing import Pool

    def f(x):
        return x*x

    if __name__ == '__main__':
        p = Pool(5)
        print(p.map(f, [1, 2, 3]))

So I suspect this (difficult to track down) problem might hit users without adequate warning.

2. I'm surprised it's actually out of contract when the 3.8 docs state that close will be called automatically:

> close()
>
> Indicate that no more data will be put on this queue by the current process. The background thread will quit once it has flushed all buffered data to the pipe. This is called automatically when the queue is garbage collected.

and

> terminate()
>
> Stops the worker processes immediately without completing outstanding work. When the pool object is garbage collected terminate() will be called immediately.

Or perhaps I misunderstand what this is saying? |
|||
msg366142 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2020-04-10 17:54 | |
> If that's out of contract, perhaps there should be a big, visible warning at the top of the multiprocessing docs stating that creating one of these objects requires either using a context manager or ensuring a manual `.close()`?

Why? This is a resource like any other and it requires proper resource management. Would you also put a big warning on `open()` stating that opening a file requires either using a context manager or ensuring a manual `close()`?

> the first Python 2.7 example in the docs

Python 2.7 is not supported and the pool has changed *a lot* since Python 2. For instance, the pool now does more correct resource management, it does not leak threads, and it supports safer mechanisms such as use as a context manager. The reason it didn't hang that much in Python 2.7 is likely because some threads were being leaked.

> This is called automatically when the queue is garbage collected.

Yeah, and CPython does not promise that the __del__ method of any object will be called, so it is not assured that the finalizer will call close(): https://docs.python.org/3/reference/datamodel.html#object.__del__

"It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits" |
|||
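Since __del__ is not guaranteed to run at interpreter exit, an object that owns a pool can register its own exit-time cleanup instead. A sketch using `weakref.finalize`, which runs at garbage collection or, via atexit, at interpreter shutdown; the `Solver` and `_shutdown` names are hypothetical, not from the thread:

```python
import weakref
from multiprocessing import Pool

def _shutdown(pool):
    # Runs at garbage collection or, via atexit, at interpreter
    # exit -- unlike __del__, whose invocation is not guaranteed.
    pool.close()
    pool.join()

class Solver:
    def __init__(self):
        self.pool = Pool(processes=2)
        # Tie pool cleanup to this object's lifetime with
        # weakref.finalize instead of relying on __del__.
        self._finalizer = weakref.finalize(self, _shutdown, self.pool)

if __name__ == '__main__':
    s = Solver()
    # ... use s.pool ...
    s._finalizer()  # run cleanup now, or let atexit run it later
```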
msg366145 - (view) | Author: Eric Larson (larsoner) * | Date: 2020-04-10 18:21 | |
> Why? This is a resource like any other and it requires proper resource management. Would you also put a big warning on "open()" stating that opening a file requires either using a context manager or ensure a manual close()?

One potential reason would be that the consequences of bad resource management in this case are different than in the open() case: here the interpreter hangs, or Travis runs for your repo (SciPy) get stuck with over-50-minute errors, which is how we started looking into how to track this down.

> > the first Python 2.7 example in the docs
> Python 2.7 is not supported and the pool has changed *a lot* since Python 2.

Indeed, my point is more about potential prevalence: this (now incorrect) problematic usage pattern was the first example in the multiprocessing docs for a long time, suggesting there might be a lot of code in the wild that still makes use of it.

> Yeah, and CPython does not promise that the __del__ method of any object will be called, so is not assured that the finalized will call close():

Ahh, I didn't know that. This might be another good reason to add a warning about the risks of not ensuring closure yourself: others might have the same gap in knowledge I had, and assume the docs are saying that things will be taken care of by garbage collection on exit when they will not. But you of course cannot document every possible problematic situation users could put themselves in, so I leave it up to your judgment whether it's worth it in this case. |
|||
msg366147 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2020-04-10 18:53 | |
> One potential reason would be that the consequences of bad resource management in this case are different than in the open() case, i.e., here the interpreter hangs -- or Travis runs for your repo (SciPy) get stuck with over-50-minute errors, which is how we started looking for how to track it down.

Agreed, but it is the same problem: resources need to be managed. I am OK, though, with adding some targeted warning to the multiprocessing pool docs indicating what can happen if the resource is not properly managed.

> Indeed, my point is more about potential prevalence: this (now incorrect) problematic usage pattern was the first example in the docs for multiprocessing for a long time, indicating that there might be a lot of code in the wild that might (still) make use of it.

Right, we put great effort into making the code such that even incorrect usages do not hang (and believe me, it is *very* challenging), but we cannot sacrifice correct-usage fixes or big improvements just so that incorrect usages keep working even while they leak resources. Hopefully, more and more people will start using the context manager, or at least be aware that they are doing something wrong by leaking the pool.

----

In conclusion, I agree that adding some targeted warning to the pool docs about this is in order. When I prepare the PR, would you like to review the message to confirm that it is clear enough and makes sense? |
|||
msg366178 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2020-04-11 02:05 | |
New changeset 7ec43a73092d43c6c95e7dd2669f49d54b57966f by Pablo Galindo in branch 'master': bpo-38501: Add a warning section to multiprocessing.Pool docs about resource managing (GH-19466) https://github.com/python/cpython/commit/7ec43a73092d43c6c95e7dd2669f49d54b57966f |
|||
msg366179 - (view) | Author: miss-islington (miss-islington) | Date: 2020-04-11 02:11 | |
New changeset 2e49b5254aa0888dd21a7929be14b6cfe03d56ef by Miss Islington (bot) in branch '3.7': bpo-38501: Add a warning section to multiprocessing.Pool docs about resource managing (GH-19466) https://github.com/python/cpython/commit/2e49b5254aa0888dd21a7929be14b6cfe03d56ef |
|||
msg366180 - (view) | Author: miss-islington (miss-islington) | Date: 2020-04-11 02:11 | |
New changeset ceba0648d70524ec4ebb7a46d5d1162d4938c7ba by Miss Islington (bot) in branch '3.8': bpo-38501: Add a warning section to multiprocessing.Pool docs about resource managing (GH-19466) https://github.com/python/cpython/commit/ceba0648d70524ec4ebb7a46d5d1162d4938c7ba |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:21 | admin | set | github: 82682 |
2020-04-11 02:11:20 | miss-islington | set | messages: + msg366180 |
2020-04-11 02:11:19 | miss-islington | set | messages: + msg366179 |
2020-04-11 02:05:59 | miss-islington | set | pull_requests: + pull_request18823 |
2020-04-11 02:05:52 | miss-islington | set | nosy: + miss-islington, pull_requests: + pull_request18822 |
2020-04-11 02:05:43 | pablogsal | set | messages: + msg366178 |
2020-04-11 00:52:59 | pablogsal | set | pull_requests: + pull_request18821 |
2020-04-10 18:53:20 | pablogsal | set | messages: + msg366147 |
2020-04-10 18:21:32 | larsoner | set | messages: + msg366145 |
2020-04-10 17:54:06 | pablogsal | set | messages: + msg366142 |
2020-04-10 16:31:07 | larsoner | set | messages: + msg366137 |
2020-04-10 16:10:33 | pablogsal | set | status: open -> closed, resolution: fixed, messages: + msg366136, stage: patch review -> resolved |
2020-04-10 16:00:53 | stefanor | set | messages: + msg366135 |
2020-04-10 15:57:36 | pablogsal | set | messages: + msg366134 |
2020-04-10 15:54:00 | stefanor | set | messages: + msg366133 |
2020-04-10 11:03:43 | pablogsal | set | keywords: + patch, nosy: + pablogsal, pull_requests: + pull_request18808, stage: patch review |
2020-04-09 01:52:45 | Andrew Nelson | set | nosy: + Andrew Nelson |
2020-02-06 01:19:21 | stefanor | set | nosy: + stefanor, messages: + msg361464 |
2019-11-09 14:58:29 | ralf.gommers | set | nosy: + ralf.gommers |
2019-10-19 12:27:36 | cjrh | set | nosy: + cjrh |
2019-10-17 01:16:41 | Paul Silisteanu | set | nosy: + Paul Silisteanu, messages: + msg354824 |
2019-10-16 18:31:14 | larsoner | create |