classification
Title: infinite waiting in multiprocessing.Pool
Type: behavior Stage:
Components: Versions: Python 3.6, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Dongyan Li, calimeroteknik, coells, davin, ned.deily, pitrou, ronaldoussoren
Priority: normal Keywords:

Created on 2018-09-23 20:30 by coells, last changed 2019-02-25 16:22 by Dongyan Li.

Messages (11)
msg326178 - (view) Author: Tomáš Bouda (coells) Date: 2018-09-23 20:30
I have encountered a possible bug inside multiprocessing.Pool which behaves like race-condition while I don't believe it is a typical one.

Simply put, Pool from time to time freezes. It is occasional and hard to reproduce, but e.g. unit-tests running 3/day freeze several times a week.

We are using Pool heavily in our applications. Usually tens of workers and heavy load for each one of them. This production environment is using Python 2.7 (RHEL) and custom build, etc. However, I reproduced the same behavior in Python 3.6 (OSX) on my local machine.

When I run the following script like 20x, I get one or two frozen instances. You may notice in the output that ForkPoolWorker-42 never calls self.run(). The application than freezes as-is since it is probably waiting for the process.

It is easier to reproduce the behavior using debugger (PyCharm-Pro in my case), however, in our production environment there is just clean run, the bug occurs more often since multiprocessing is used quite a lot in there.

Thanks,
Tomas


--- My script:

import logging
from multiprocessing.pool import Pool
from multiprocessing.util import log_to_stderr

def f(i):
    print(i)

log_to_stderr(logging.DEBUG)

pool = Pool(50)
pool.map(f, range(2))
pool.close()
pool.join()

--- Output:

[DEBUG/MainProcess] created semlock with handle 9
[DEBUG/MainProcess] created semlock with handle 10
[DEBUG/MainProcess] created semlock with handle 13
[DEBUG/MainProcess] created semlock with handle 14
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-1] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-2] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-3] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-4] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-5] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-6] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-7] child process calling self.run()
[INFO/ForkPoolWorker-9] child process calling self.run()
[INFO/ForkPoolWorker-10] child process calling self.run()
[INFO/ForkPoolWorker-8] child process calling self.run()
[INFO/ForkPoolWorker-12] child process calling self.run()
[INFO/ForkPoolWorker-13] child process calling self.run()
[INFO/ForkPoolWorker-11] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-14] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-15] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-16] child process calling self.run()
[INFO/ForkPoolWorker-17] child process calling self.run()
[INFO/ForkPoolWorker-18] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-19] child process calling self.run()
[INFO/ForkPoolWorker-20] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-21] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-22] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-23] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-24] child process calling self.run()
[INFO/ForkPoolWorker-25] child process calling self.run()
[INFO/ForkPoolWorker-26] child process calling self.run()
[INFO/ForkPoolWorker-27] child process calling self.run()
[INFO/ForkPoolWorker-28] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-29] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-30] child process calling self.run()
[INFO/ForkPoolWorker-31] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-32] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-33] child process calling self.run()
[INFO/ForkPoolWorker-34] child process calling self.run()
[INFO/ForkPoolWorker-35] child process calling self.run()
[INFO/ForkPoolWorker-36] child process calling self.run()
[INFO/ForkPoolWorker-37] child process calling self.run()
[INFO/ForkPoolWorker-38] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-39] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-40] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-41] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-43] child process calling self.run()
[INFO/ForkPoolWorker-44] child process calling self.run()
[INFO/ForkPoolWorker-45] child process calling self.run()
[INFO/ForkPoolWorker-46] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-47] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
0
1
[INFO/ForkPoolWorker-48] child process calling self.run()
[DEBUG/MainProcess] closing pool
[DEBUG/MainProcess] joining pool
[DEBUG/MainProcess] worker handler exiting
[DEBUG/MainProcess] task handler got sentinel
[DEBUG/MainProcess] task handler sending sentinel to result handler
[DEBUG/MainProcess] task handler sending sentinel to workers
[DEBUG/ForkPoolWorker-3] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-4] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-5] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-3] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-6] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-4] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-7] worker got sentinel -- exiting
[INFO/ForkPoolWorker-3] process shutting down
[DEBUG/ForkPoolWorker-5] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-10] worker got sentinel -- exiting
[INFO/ForkPoolWorker-4] process shutting down
[DEBUG/ForkPoolWorker-6] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-7] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-3] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-4] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-8] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-10] worker exiting after 0 tasks
[INFO/ForkPoolWorker-5] process shutting down
[INFO/ForkPoolWorker-7] process shutting down
[INFO/ForkPoolWorker-6] process shutting down
[DEBUG/ForkPoolWorker-3] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-8] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-4] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-7] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-6] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-5] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-10] process shutting down
[INFO/ForkPoolWorker-8] process shutting down
[INFO/ForkPoolWorker-3] process exiting with exitcode 0
[INFO/ForkPoolWorker-4] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-10] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-7] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-8] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-5] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-7] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-10] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-9] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-6] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-5] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-8] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-9] worker exiting after 0 tasks
[INFO/ForkPoolWorker-10] process exiting with exitcode 0
[INFO/ForkPoolWorker-6] process exiting with exitcode 0
[INFO/ForkPoolWorker-8] process exiting with exitcode 0
[INFO/ForkPoolWorker-9] process shutting down
[DEBUG/ForkPoolWorker-9] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-12] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-12] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-13] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-11] worker got sentinel -- exiting
[INFO/ForkPoolWorker-12] process shutting down
[DEBUG/ForkPoolWorker-14] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-13] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-11] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-15] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-12] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-14] worker exiting after 0 tasks
[INFO/ForkPoolWorker-13] process shutting down
[DEBUG/ForkPoolWorker-16] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-15] worker exiting after 0 tasks
[INFO/ForkPoolWorker-11] process shutting down
[DEBUG/ForkPoolWorker-12] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-13] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-16] worker exiting after 0 tasks
[INFO/ForkPoolWorker-14] process shutting down
[DEBUG/ForkPoolWorker-17] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-11] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-18] worker got sentinel -- exiting
[INFO/ForkPoolWorker-15] process shutting down
[INFO/ForkPoolWorker-12] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-14] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-16] process shutting down
[DEBUG/ForkPoolWorker-13] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-17] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-15] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-18] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-19] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-16] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-13] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-11] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-14] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-15] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-11] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-19] worker exiting after 0 tasks
[INFO/ForkPoolWorker-17] process shutting down
[INFO/ForkPoolWorker-14] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-16] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-15] process exiting with exitcode 0
[INFO/ForkPoolWorker-19] process shutting down
[INFO/ForkPoolWorker-16] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-20] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-17] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-19] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-18] process shutting down
[DEBUG/ForkPoolWorker-20] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-19] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-18] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-17] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-20] process shutting down
[DEBUG/ForkPoolWorker-21] worker got sentinel -- exiting
[INFO/ForkPoolWorker-19] process exiting with exitcode 0
[DEBUG/MainProcess] result handler got sentinel
[INFO/ForkPoolWorker-17] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-18] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-20] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-21] worker exiting after 0 tasks
[DEBUG/MainProcess] ensuring that outqueue is not full
[DEBUG/ForkPoolWorker-22] worker got sentinel -- exiting
[INFO/ForkPoolWorker-18] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-20] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-21] process shutting down
[DEBUG/ForkPoolWorker-23] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-22] worker exiting after 0 tasks
[INFO/ForkPoolWorker-20] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-21] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-23] worker exiting after 0 tasks
[INFO/ForkPoolWorker-22] process shutting down
[DEBUG/ForkPoolWorker-24] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-21] running the remaining "atexit" finalizers
[DEBUG/MainProcess] result handler exiting: len(cache)=0, thread._state=0
[INFO/ForkPoolWorker-23] process shutting down
[DEBUG/ForkPoolWorker-22] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-25] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-24] worker exiting after 0 tasks
[INFO/ForkPoolWorker-21] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-23] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-25] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-22] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-26] worker got sentinel -- exiting
[INFO/ForkPoolWorker-24] process shutting down
[DEBUG/ForkPoolWorker-23] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-25] process shutting down
[INFO/ForkPoolWorker-22] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-26] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-24] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-27] worker got sentinel -- exiting
[INFO/ForkPoolWorker-23] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-25] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-26] process shutting down
[DEBUG/ForkPoolWorker-27] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-24] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-28] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-25] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-26] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-24] process exiting with exitcode 0
[INFO/ForkPoolWorker-27] process shutting down
[DEBUG/ForkPoolWorker-28] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-29] worker got sentinel -- exiting
[INFO/ForkPoolWorker-25] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-26] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-27] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-30] worker got sentinel -- exiting
[INFO/ForkPoolWorker-28] process shutting down
[DEBUG/ForkPoolWorker-29] worker exiting after 0 tasks
[INFO/ForkPoolWorker-26] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-31] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-30] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-27] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-28] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-29] process shutting down
[DEBUG/ForkPoolWorker-31] worker exiting after 0 tasks
[INFO/ForkPoolWorker-27] process exiting with exitcode 0
[INFO/ForkPoolWorker-30] process shutting down
[DEBUG/ForkPoolWorker-28] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-29] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-31] process shutting down
[DEBUG/ForkPoolWorker-30] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-33] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-32] worker got sentinel -- exiting
[INFO/ForkPoolWorker-28] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-29] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-31] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-33] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-30] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-32] worker exiting after 0 tasks
[INFO/ForkPoolWorker-29] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-31] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-33] process shutting down
[INFO/ForkPoolWorker-30] process exiting with exitcode 0
[INFO/ForkPoolWorker-32] process shutting down
[DEBUG/ForkPoolWorker-34] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-35] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-33] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-31] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-32] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-34] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-35] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-33] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-34] process shutting down
[DEBUG/ForkPoolWorker-32] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-36] worker got sentinel -- exiting
[INFO/ForkPoolWorker-35] process shutting down
[INFO/ForkPoolWorker-33] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-34] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-32] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-37] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-35] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-36] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-38] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-34] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-37] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-39] worker got sentinel -- exiting
[INFO/ForkPoolWorker-36] process shutting down
[DEBUG/ForkPoolWorker-35] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-34] process exiting with exitcode 0
[INFO/ForkPoolWorker-37] process shutting down
[DEBUG/ForkPoolWorker-36] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-38] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-39] worker exiting after 0 tasks
[INFO/ForkPoolWorker-35] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-37] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-38] process shutting down
[DEBUG/ForkPoolWorker-36] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-39] process shutting down
[DEBUG/ForkPoolWorker-37] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-38] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-39] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-36] process exiting with exitcode 0
[INFO/ForkPoolWorker-37] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-38] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-39] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-38] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-9] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-39] process exiting with exitcode 0
[INFO/ForkPoolWorker-9] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-40] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-41] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-40] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-41] worker exiting after 0 tasks
[INFO/ForkPoolWorker-40] process shutting down
[DEBUG/ForkPoolWorker-40] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-41] process shutting down
[DEBUG/ForkPoolWorker-41] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-43] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-40] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-44] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-41] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-43] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-45] worker got sentinel -- exiting
[INFO/ForkPoolWorker-40] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-44] worker exiting after 0 tasks
[INFO/ForkPoolWorker-41] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-45] worker exiting after 0 tasks
[INFO/ForkPoolWorker-43] process shutting down
[INFO/ForkPoolWorker-44] process shutting down
[DEBUG/ForkPoolWorker-43] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-44] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-45] process shutting down
[DEBUG/ForkPoolWorker-43] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-45] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-44] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-43] process exiting with exitcode 0
[INFO/ForkPoolWorker-44] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-45] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-45] process exiting with exitcode 0
[INFO/ForkPoolWorker-50] child process calling self.run()
[INFO/ForkPoolWorker-49] child process calling self.run()
[DEBUG/ForkPoolWorker-46] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-46] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-1] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-47] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-2] worker got sentinel -- exiting
[INFO/ForkPoolWorker-46] process shutting down
[DEBUG/ForkPoolWorker-1] worker exiting after 1 tasks
[DEBUG/ForkPoolWorker-47] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-2] worker exiting after 1 tasks
[INFO/ForkPoolWorker-1] process shutting down
[INFO/ForkPoolWorker-47] process shutting down
[DEBUG/ForkPoolWorker-46] running all "atexit" finalizers with priority >= 0
[INFO/ForkPoolWorker-2] process shutting down
[DEBUG/ForkPoolWorker-1] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-47] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-2] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-46] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-1] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-47] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-46] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-2] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-1] process exiting with exitcode 0
[INFO/ForkPoolWorker-47] process exiting with exitcode 0
[INFO/ForkPoolWorker-2] process exiting with exitcode 0
[DEBUG/MainProcess] task handler exiting
[DEBUG/ForkPoolWorker-50] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-48] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-48] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-50] worker exiting after 0 tasks
[INFO/ForkPoolWorker-48] process shutting down
[INFO/ForkPoolWorker-50] process shutting down
[DEBUG/ForkPoolWorker-49] worker got sentinel -- exiting
[DEBUG/ForkPoolWorker-48] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-50] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-49] worker exiting after 0 tasks
[DEBUG/ForkPoolWorker-50] running the remaining "atexit" finalizers
[DEBUG/ForkPoolWorker-48] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-49] process shutting down
[INFO/ForkPoolWorker-50] process exiting with exitcode 0
[INFO/ForkPoolWorker-48] process exiting with exitcode 0
[DEBUG/ForkPoolWorker-49] running all "atexit" finalizers with priority >= 0
[DEBUG/ForkPoolWorker-49] running the remaining "atexit" finalizers
[INFO/ForkPoolWorker-49] process exiting with exitcode 0
msg326187 - (view) Author: Tomáš Bouda (coells) Date: 2018-09-23 21:51
After more digging, I found that the following happens:

popen_fork.py -> _launch(self, process_obj) -> self.pid = os.fork()

When I let process (both child and parent) print resulting pid, on freezing I can see:
a) 50-times pid > 0
b) 49-times pid == 0

That means the parent is aware of 50 children, while only 49 of them get to the next line. Not sure if the one remaining process crashes on segfault, but parent apparently hangs later in os.waitpid() on this valid pid of the missing child.
msg326401 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-09-25 20:05
I couldn't reproduce with Python 3.6.5 on Ubuntu 18.04.

Does it happen if you reduce logging?  Or if you replace f() with:

def f(i):
    os.write(1, "{}\n".format(i).encode())
msg326407 - (view) Author: Tomáš Bouda (coells) Date: 2018-09-25 21:40
It's very difficult to reproduce.

In this example to get stuck on 3.6/OSX I need to attach debugger. However, the freeze happens regardless of print/logging, even def f(): pass can get stuck. os.write() made no difference and frozen, as well, as I've just tried.

It might be even possible that there's another problem in debugger itself, however on 2.7/RHEL the actual (production) code is embedded in unittest with "discover" mode, run from shell, no debugger attached. 

I couldn't reproduce it today morning. Later on afternoon in another script it occured 2 times in a row. In the past months, I have seen the problem under various conditions. Originally, I used ProcessPoolExecutor before, where it happened rather often, so I rewrote the code to use Pool directly, the problem is rare, now, but still occurs.

The production code has many variants including logging/custom prints/no prints at all, from time to time it happens regardless of anything else. However, there's also a heavy load and high OS resource demand (tens of workers, tens of GBs read/allocated, many/explicit calls to GC, etc.)
msg326408 - (view) Author: Tomáš Bouda (coells) Date: 2018-09-25 21:44
Oh, I should add that by decreasing number of workers to 4 or 8 the problem disappeared, at least to the extent when I wasn't able to reproduce it on any environment.
msg326448 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-09-26 11:47
If I understand correctly you have two cases:
- the standalone script hangs with 3.6 on OS X
- a much more involved use case hangs with 2.7 on RedHat

It's possible you are hitting two different bugs.  The 2.7 issue may be due to third-party packages or other issues.

It would be nice if another OS X user could try your reproducer script on 3.6+ and find out whether they can reproduce/diagnose the issue at all.
msg326482 - (view) Author: calimeroteknik (calimeroteknik) Date: 2018-09-26 16:22
A friend has found a very simple example that triggers such an issue in a very reproducible manner.

Attached two versions, one where the child process mysteriously disappears in the cpython interpreter.

pypy is unaffected.
msg326483 - (view) Author: calimeroteknik (calimeroteknik) Date: 2018-09-26 16:27
Attaching the version that randomly raises ChildProcessError: [Errno 10] No child processes.
The child process is lost in limbo if we don't sleep/print after creating it.
msg326484 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-09-26 16:30
@calimeroteknik, this doesn't seem to have anything to do with the issue at hand.  Please open a separate issue with your scripts.
msg326499 - (view) Author: Tomáš Bouda (coells) Date: 2018-09-26 19:26
By now I have spent several days trying to reproduce the behaviour in production environment with debugger attached. Unfortunately, no success. On the other hand yesterday the application froze, again, and colleague today experienced the problem in his script, too. (talking about RHEL)

Dealing with this kind of problem is always very frustrating.

By now, I agree with @pitrou that OSX/RHEL could be two different problems. In advance, I tried the approach by @calimeroteknik and this would actually make sense.

If the child process receives a signal (SIGTERM or SIGSEGV), parent waits forever. We do call 3rd party libraries and segfault is indeed possible. I've tried to send signal to a child and script really froze. By now, it seems to be the most probable explanation.

OSX debugger may also be buggy, yesterday I completely broke my system just by trying my original script, leading to a regular segfaults and system restart (never happened before).

Since I can't reproduce the problem under controlled conditions, I am ok with closing this bug. The script by @calimeroteknik seems to be pointing in the same direction and I think this may solve our problem, too.
msg336538 - (view) Author: Dongyan Li (Dongyan Li) Date: 2019-02-25 16:22
I got the same issue with Python 3.7.2 on Windows Build 14393. Seems that the program got stuck on the `waiter.acquire()` method.
History
Date User Action Args
2019-02-25 16:22:18Dongyan Lisetnosy: + Dongyan Li
messages: + msg336538
2018-09-26 19:26:50coellssetmessages: + msg326499
2018-09-26 16:30:16pitrousetfiles: - hang.py
2018-09-26 16:30:14pitrousetfiles: - except-out.py
2018-09-26 16:30:05pitrousetmessages: + msg326484
2018-09-26 16:27:06calimerotekniksetfiles: + except-out.py

messages: + msg326483
2018-09-26 16:22:59calimerotekniksetfiles: + hang.py
nosy: + calimeroteknik
messages: + msg326482

2018-09-26 11:47:33pitrousetnosy: + ronaldoussoren, ned.deily
messages: + msg326448
2018-09-25 21:44:42coellssetmessages: + msg326408
2018-09-25 21:40:13coellssetmessages: + msg326407
2018-09-25 20:05:08pitrousetmessages: + msg326401
2018-09-24 21:37:51brett.cannonsetnosy: + pitrou, davin
2018-09-23 21:51:08coellssetmessages: + msg326187
2018-09-23 20:30:41coellscreate