Using multiprocessing.Queue() with several processes writing very fast results in a deadlock both on Windows and UNIX.
For example, this code:
from multiprocessing import Process, Queue
import time, sys

def simulate(q, n_results):
    for i in range(n_results):
        time.sleep(0.01)
        q.put(i)

def main():
    n_workers = int(sys.argv[1])
    n_results = int(sys.argv[2])
    q = Queue()
    proc_list = [Process(target=simulate,
                         args=(q, n_results),
                         daemon=True) for i in range(n_workers)]
    for proc in proc_list:
        proc.start()
    for i in range(5):
        time.sleep(1)
        print('current approximate queue size:', q.qsize())
        alive = [p.pid for p in proc_list if p.is_alive()]
        if alive:
            print(len(alive), 'processes alive; among them:', alive[:5])
        else:
            break
    for p in proc_list:
        p.join()
    print('final approximate queue size:', q.qsize())

if __name__ == '__main__':
    main()
hangs on Windows 10 (Python 3.6) with 2 workers and 1000 results each, and on Ubuntu 16.04 (Python 3.5) with 100 workers and 100 results each. The printout shows that the queue has reached its full size, yet a number of worker processes are still alive. Presumably they somehow manage to lock themselves out, even though they don't depend on each other, so the cause must lie in the implementation of Queue():
current approximate queue size: 9984
47 processes alive; among them: [2238, 2241, 2242, 2244, 2247]
current approximate queue size: 10000
47 processes alive; among them: [2238, 2241, 2242, 2244, 2247]
The deadlock disappears once multiprocessing.Queue() is replaced with multiprocessing.Manager().Queue(), or at least I wasn't able to reproduce it that way with any reasonable number of processes and results.
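For reference, a minimal sketch of that Manager-based variant (the sleep is dropped and the worker counts hard-coded to keep it short; the function names are illustrative). A Manager().Queue() is a proxy to a queue living in a separate server process, so workers never block on flushing a local feeder-thread buffer at exit:

```python
from multiprocessing import Process, Manager

def simulate(q, n_results):
    # Each worker pushes n_results integers through the manager proxy.
    for i in range(n_results):
        q.put(i)

def run(n_workers=2, n_results=10):
    with Manager() as manager:
        q = manager.Queue()  # queue lives in the manager's server process
        procs = [Process(target=simulate, args=(q, n_results))
                 for _ in range(n_workers)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()  # joins cleanly: nothing buffered in the child to drain
        return q.qsize()

if __name__ == '__main__':
    print(run())  # 2 workers x 10 results -> 20
```

The trade-off is that every put/get becomes an IPC round trip to the manager process, so this is noticeably slower than a plain multiprocessing.Queue().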