Issue31463
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2017-09-14 06:36 by benjamin.peterson, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Messages (15) | |||
---|---|---|---|
msg302143 - (view) | Author: Benjamin Peterson (benjamin.peterson) * ![]() |
Date: 2017-09-14 06:36 | |
This hangs reliably for me on master: $ ./python -m test.regrtest test_multiprocessing_fork test_subprocess It looks like test_multiprocessing_fork doesn't clean up some of its subprocesses and then test_subprocess hangs on waitpid(0) forever. |
|||
msg315486 - (view) | Author: Miro Hrončok (hroncok) * | Date: 2018-04-19 11:17 | |
This started to bother us in Fedora for various Python versions, so chances are something changed on the system level. However for us it seams test_multiprocessing_fork hangs by itself, so this might or might not be relevant to # python3.7 -m test.regrtest test_multiprocessing_fork test_subprocess Run tests sequentially 0:00:00 load avg: 1.16 [1/2] test_multiprocessing_fork /usr/lib64/python3.7/multiprocessing/semaphore_tracker.py:55: UserWarning: semaphore_tracker: process died unexpectedly, relaunching. Some semaphores might leak. warnings.warn('semaphore_tracker: process died unexpectedly, ' Exception in thread Thread-26: Traceback (most recent call last): File "/usr/lib64/python3.7/threading.py", line 917, in _bootstrap_inner self.run() File "/usr/lib64/python3.7/threading.py", line 865, in run self._target(*self._args, **self._kwargs) File "/usr/lib64/python3.7/multiprocessing/resource_sharer.py", line 139, in _serve signal.pthread_sigmask(signal.SIG_BLOCK, range(1, signal.NSIG)) File "/usr/lib64/python3.7/signal.py", line 60, in pthread_sigmask sigs_set = _signal.pthread_sigmask(how, mask) ValueError: signal number 32 out of range (hangs) ^CProcess Process-184: Traceback (most recent call last): File "/usr/lib64/python3.7/multiprocessing/process.py", line 297, in _bootstrap self.run() File "/usr/lib64/python3.7/multiprocessing/process.py", line 99, in run self._target(*self._args, **self._kwargs) File "/usr/lib64/python3.7/test/_test_multiprocessing.py", line 3328, in child_access w = conn.recv() File "/usr/lib64/python3.7/multiprocessing/connection.py", line 251, in recv return _ForkingPickler.loads(buf.getbuffer()) File "/usr/lib64/python3.7/multiprocessing/connection.py", line 960, in rebuild_connection fd = df.detach() File "/usr/lib64/python3.7/multiprocessing/resource_sharer.py", line 57, in detach with _resource_sharer.get_connection(self._id) as conn: File "/usr/lib64/python3.7/multiprocessing/resource_sharer.py", line 87, in get_connection c = Client(address, authkey=process.current_process().authkey) File "/usr/lib64/python3.7/multiprocessing/connection.py", line 498, in Client answer_challenge(c, authkey) File "/usr/lib64/python3.7/multiprocessing/connection.py", line 741, in answer_challenge message = connection.recv_bytes(256) # reject large message File "/usr/lib64/python3.7/multiprocessing/connection.py", line 216, in recv_bytes buf = self._recv_bytes(maxlength) File "/usr/lib64/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes buf = self._recv(4) File "/usr/lib64/python3.7/multiprocessing/connection.py", line 379, in _recv chunk = read(handle, remaining) KeyboardInterrupt Warning -- multiprocessing.process._dangling was modified by test_multiprocessing_fork Before: <_weakrefset.WeakSet object at 0x7f4b63782dd8> After: <_weakrefset.WeakSet object at 0x7f4b637821d0> Warning -- threading._dangling was modified by test_multiprocessing_fork Before: <_weakrefset.WeakSet object at 0x7f4b63782780> After: <_weakrefset.WeakSet object at 0x7f4b63782da0> Test suite interrupted by signal SIGINT. 2 tests omitted: test_multiprocessing_fork test_subprocess Total duration: 1 min 35 sec Tests result: INTERRUPTED # python3.7 --version Python 3.7.0b3 # python3.6 -m test.regrtest test_multiprocessing_fork test_subprocess Run tests sequentially 0:00:00 load avg: 1.84 [1/2] test_multiprocessing_fork /usr/lib64/python3.6/multiprocessing/semaphore_tracker.py:55: UserWarning: semaphore_tracker: process died unexpectedly, relaunching. Some semaphores might leak. warnings.warn('semaphore_tracker: process died unexpectedly, ' Exception in thread Thread-20: Traceback (most recent call last): File "/usr/lib64/python3.6/threading.py", line 916, in _bootstrap_inner self.run() File "/usr/lib64/python3.6/threading.py", line 864, in run self._target(*self._args, **self._kwargs) File "/usr/lib64/python3.6/multiprocessing/resource_sharer.py", line 139, in _serve signal.pthread_sigmask(signal.SIG_BLOCK, range(1, signal.NSIG)) File "/usr/lib64/python3.6/signal.py", line 60, in pthread_sigmask sigs_set = _signal.pthread_sigmask(how, mask) ValueError: signal number 32 out of range (hangs) ^CProcess Process-178: Traceback (most recent call last): ... KeyboardInterrupt Warning -- multiprocessing.process._dangling was modified by test_multiprocessing_fork Before: <_weakrefset.WeakSet object at 0x7f1873917c88> After: <_weakrefset.WeakSet object at 0x7f188516d978> Warning -- threading._dangling was modified by test_multiprocessing_fork Before: <_weakrefset.WeakSet object at 0x7f1873917d68> After: <_weakrefset.WeakSet object at 0x7f1873917e48> Test suite interrupted by signal SIGINT. 2 tests omitted: test_multiprocessing_fork test_subprocess Total duration: 51 sec Tests result: INTERRUPTED # python3.6 --version Python 3.6.5 # python3.5 -m test.regrtest test_multiprocessing_fork test_subprocess 0:00:00 load avg: 1.83 [1/2] test_multiprocessing_fork Exception in thread Thread-20: Traceback (most recent call last): File "/usr/lib64/python3.5/threading.py", line 914, in _bootstrap_inner self.run() File "/usr/lib64/python3.5/threading.py", line 862, in run self._target(*self._args, **self._kwargs) File "/usr/lib64/python3.5/multiprocessing/resource_sharer.py", line 139, in _serve signal.pthread_sigmask(signal.SIG_BLOCK, range(1, signal.NSIG)) File "/usr/lib64/python3.5/signal.py", line 60, in pthread_sigmask sigs_set = _signal.pthread_sigmask(how, mask) ValueError: signal number 32 out of range (hangs) ^CProcess Process-178: Traceback (most recent call last): ... KeyboardInterrupt Warning -- multiprocessing.process._dangling was modified by test_multiprocessing_fork Before: <_weakrefset.WeakSet object at 0x7f2bb3923ba8> After: <_weakrefset.WeakSet object at 0x7f2bc4d88b38> Warning -- threading._dangling was modified by test_multiprocessing_fork Before: <_weakrefset.WeakSet object at 0x7f2bb3923be0> After: <_weakrefset.WeakSet object at 0x7f2bb3923f28> Test suite interrupted by signal SIGINT. 2 tests omitted: test_multiprocessing_fork test_subprocess Tests result: INTERRUPTED # python3.5 --version Python 3.5.4 # python3.4 -m test.regrtest test_multiprocessing_fork test_subprocess [1/2] test_multiprocessing_fork Exception in thread Thread-21: Traceback (most recent call last): File "/usr/lib64/python3.4/threading.py", line 911, in _bootstrap_inner self.run() File "/usr/lib64/python3.4/threading.py", line 859, in run self._target(*self._args, **self._kwargs) File "/usr/lib64/python3.4/multiprocessing/resource_sharer.py", line 139, in _serve signal.pthread_sigmask(signal.SIG_BLOCK, range(1, signal.NSIG)) ValueError: signal number 32 out of range (hangs) ^CTraceback (most recent call last): File "<string>", line 1, in <module> File "/usr/lib64/python3.4/multiprocessing/forkserver.py", line 170, in main Warning -- multiprocessing.process._dangling was modified by test_multiprocessing_fork Warning -- threading._dangling was modified by test_multiprocessing_fork Process Process-180: Traceback (most recent call last): ... KeyboardInterrupt Test suite interrupted by signal SIGINT. 2 tests omitted: test_multiprocessing_fork test_subprocess # python3.4 --version Python 3.4.8 |
|||
msg315487 - (view) | Author: Miro Hrončok (hroncok) * | Date: 2018-04-19 11:19 | |
Sorry, I've pressed the button before finishing the thought. ...so this might or might not be relevant to what you observe with master and originally reported. |
|||
msg315580 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2018-04-21 20:17 | |
@Benjamin I can't reproduce here. It's true that test_multiprocessing_fork leaves two processes being itself, but that doesn't seem to disturb test_subprocess. @Miro Could you report the signal number issue separately? Thank you! |
|||
msg315583 - (view) | Author: Benjamin Peterson (benjamin.peterson) * ![]() |
Date: 2018-04-21 20:34 | |
It doesn't hang for me anymore, but does print this warning: 0:00:00 load avg: 1.59 [1/2] test_multiprocessing_fork Lib/multiprocessing/semaphore_tracker.py:55: UserWarning: semaphore_tracker: process died unexpectedly, relaunching. Some semaphores might leak. warnings.warn('semaphore_tracker: process died unexpectedly, ' |
|||
msg315584 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2018-04-21 20:41 | |
There's a test that deliberately kills the semaphore tracker, so that warning should be expected (I don't see it myself, but that might be because of differing warning settings). |
|||
msg315591 - (view) | Author: Miro Hrončok (hroncok) * | Date: 2018-04-22 08:08 | |
> Could you report the signal number issue separately? Thank you! https://bugs.python.org/issue33329 Resetting the Versions back. |
|||
msg317382 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2018-05-23 10:37 | |
> It looks like test_multiprocessing_fork doesn't clean up some of its subprocesses and then test_subprocess hangs on waitpid(0) forever. Oh strange. This issue was supposed to be fixed. I spent a lot of time to fix such bug, but also make sure that not test leak any child process. "test_subprocess hangs on waitpid(0) forever" I recall a similar bug that I fixed recently. I spent time on test.support.reap_children() and test.support.threading_cleanup() to make sure that tests don't leak threads no child processes. It seems like I missed some tests. In Lib/test/libregrtest/runtest.py, I added: def post_test_cleanup(): support.reap_children() which is run after each test. But currently, support.reap_children() calls os.waitpid(-1, os.WNOHANG) to check for child processes which already completed. It doesn't warn if there are child processes which are still running. During my tests, I modified reap_children() to add a sleep(1). Maybe on Linux we could use a cgroup to really check for all child processes? |
|||
msg317399 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2018-05-23 12:52 | |
Let's take another look at the issue: why does test_subprocess need to know about all child processes, rather than those that were launched during test_subprocess? |
|||
msg320064 - (view) | Author: Pablo Galindo Salgado (pablogsal) * ![]() |
Date: 2018-06-20 10:46 | |
I would like to work on this issue |
|||
msg320065 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2018-06-20 10:50 | |
Last time I was bitten by waitpid(): commit 11045c9d8a21dd9bd182a3939189db02815f9783 Author: Victor Stinner <victor.stinner@gmail.com> Date: Thu Oct 5 06:32:53 2017 -0700 bpo-31178: Mock os.waitpid() in test_subprocess (#3896) Fix test_exception_errpipe_bad_data() and test_exception_errpipe_normal() of test_subprocess: mock os.waitpid() to avoid calling the real os.waitpid(0, 0) which is an unexpected side effect of the test. |
|||
msg320066 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2018-06-20 10:51 | |
See also the now hopefully fixed bpo-31509 "test_subprocess hangs randomly on AMD64 Windows10 3.x". |
|||
msg320067 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2018-06-20 10:54 | |
Antoine: "Let's take another look at the issue: why does test_subprocess need to know about all child processes, rather than those that were launched during test_subprocess?" It seems like this bug has been fixed by bpo-31178, commit 11045c9d8a21dd9bd182a3939189db02815f9783, no? |
|||
msg320068 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2018-06-20 10:54 | |
Another old (now fixed) issue of test_subprocess: bpo-31173 "test_subprocess: test_child_terminated_in_stopped_state() leaks a zombie process". |
|||
msg320617 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2018-06-27 20:39 | |
> This hangs reliably for me on master: > $ ./python -m test.regrtest test_multiprocessing_fork test_subprocess FYI right now, this command doesn't hang anymore. > ValueError: signal number 32 out of range This bug has been fixed by bpo-33332. > Let's take another look at the issue: why does test_subprocess need to know about all child processes, rather than those that were launched during test_subprocess? This issue should be fixed by my commit 11045c9d8a21dd9bd182a3939189db02815f9783. It seems like all issues described here have been fixed and so I close the issue. If an issue is not properly fixed, maybe it's better to open a new issue since this one has long history and described multiple unrelated bugs. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:58:52 | admin | set | github: 75644 |
2018-06-27 20:39:46 | vstinner | set | status: open -> closed resolution: fixed messages: + msg320617 stage: resolved |
2018-06-20 10:54:39 | vstinner | set | messages: + msg320068 |
2018-06-20 10:54:08 | vstinner | set | messages: + msg320067 |
2018-06-20 10:51:04 | vstinner | set | messages: + msg320066 |
2018-06-20 10:50:09 | vstinner | set | messages: + msg320065 |
2018-06-20 10:46:53 | pablogsal | set | messages: + msg320064 |
2018-06-20 10:17:31 | pablogsal | set | assignee: pablogsal nosy: + pablogsal |
2018-05-23 12:52:53 | pitrou | set | messages: + msg317399 |
2018-05-23 10:37:18 | vstinner | set | messages: + msg317382 |
2018-05-03 12:19:43 | vstinner | set | nosy:
+ vstinner |
2018-04-22 08:08:49 | hroncok | set | messages:
+ msg315591 versions: - Python 3.4, Python 3.5, Python 3.6 |
2018-04-21 20:41:59 | pitrou | set | messages: + msg315584 |
2018-04-21 20:34:29 | benjamin.peterson | set | messages: + msg315583 |
2018-04-21 20:17:41 | pitrou | set | nosy:
+ pitrou messages: + msg315580 |
2018-04-19 11:19:16 | hroncok | set | messages: + msg315487 |
2018-04-19 11:17:22 | hroncok | set | nosy:
+ hroncok messages: + msg315486 versions: + Python 3.4, Python 3.5, Python 3.6 |
2017-09-14 06:36:00 | benjamin.peterson | create |