classification
Title: multiprocessing: the Resource Tracker process is never reaped
Type: resource usage
Stage:
Components: Library (Lib)
Versions: Python 3.9, Python 3.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: viktor.ivanov
Priority: normal
Keywords:

Created on 2021-07-23 13:29 by viktor.ivanov, last changed 2021-09-15 13:26 by vstinner.

Files
File name Uploaded Description
multi.py viktor.ivanov, 2021-07-23 13:29 Minimal program leaking resource tracker zombies
Messages (1)
msg398053 - Author: Viktor Ivanov (viktor.ivanov) Date: 2021-07-23 13:29
The process started by multiprocessing.resource_tracker is never reaped, so it is left behind as a zombie.

There is a waitpid() call for the ResourceTracker's pid, but it lives in the private method _stop(), which appears to be called only from some test modules.
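For readers unfamiliar with reaping, here is a minimal sketch of what a waitpid() call does for an exited child. It is a generic POSIX illustration, not the code in multiprocessing.resource_tracker; the function name spawn_and_reap and its parameter are made up for this example.

```python
import os


def spawn_and_reap(exit_code=0):
    """Fork a child that exits at once, then reap it with waitpid().

    Hypothetical helper for illustration only (POSIX-only, uses os.fork).
    Returns (child_pid, reaped_pid, exit_code_seen_by_parent).
    """
    child_pid = os.fork()
    if child_pid == 0:
        # Child: exit immediately without running cleanup handlers.
        os._exit(exit_code)
    # Parent: once the child has exited, it stays in the process table
    # as a zombie until some ancestor waits for it. waitpid() blocks until
    # the child has exited and then removes its process table entry.
    reaped_pid, status = os.waitpid(child_pid, 0)
    return child_pid, reaped_pid, os.WEXITSTATUS(status)
```

If the parent never makes such a call and keeps running, the exited child shows up as <defunct> in ps output.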

Most environments have some process that reaps zombies. However, if Python is the "main" process (PID 1) in a container, for example, and runs another Python instance that leaks a ResourceTracker process, zombies start to accumulate.

This is easy to reproduce with a couple of small Python programs, as long as they are not run from a shell or another parent process that reaps forgotten children.

It was originally discovered in a Docker container whose entry point is a Python program (a Celery worker in an Airflow container) that runs other Python programs (dbt).

The minimal code is available on GitHub here: https://github.com/viktorvia/python-multi-issue

The attached multi.py leaks resource tracker processes, but running it from a full-fledged development environment will not show the issue.

Instead, run it via another Python program from a Docker container:

Dockerfile:
---
FROM python:3.9

WORKDIR /usr/src/multi

COPY . ./

CMD ["python", "main.py"]
---

main.py:
---
from subprocess import run
from time import sleep

while True:
    result = run(["python", "multi.py"], capture_output=True)
    print(result.stdout.decode('utf-8'))
    result = run(["ps", "-ef", "--forest"], capture_output=True)
    print(result.stdout.decode('utf-8'), flush=True)
    sleep(1)
---

When the program is run, it accumulates one more zombie on each iteration of the loop:
---
$ docker run -it multi python main.py
[1, 4, 9]

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0 11 11:33 pts/0    00:00:00 python main.py
root         8     1  0 11:33 pts/0    00:00:00 [python] <defunct>
root        17     1  0 11:33 pts/0    00:00:00 ps -ef --forest

[1, 4, 9]

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  6 11:33 pts/0    00:00:00 python main.py
root         8     1  3 11:33 pts/0    00:00:00 [python] <defunct>
root        19     1  0 11:33 pts/0    00:00:00 [python] <defunct>
root        28     1  0 11:33 pts/0    00:00:00 ps -ef --forest

[1, 4, 9]

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  4 11:33 pts/0    00:00:00 python main.py
root         8     1  1 11:33 pts/0    00:00:00 [python] <defunct>
root        19     1  3 11:33 pts/0    00:00:00 [python] <defunct>
root        30     1  0 11:33 pts/0    00:00:00 [python] <defunct>
root        39     1  0 11:33 pts/0    00:00:00 ps -ef --forest

[1, 4, 9]

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  3 11:33 pts/0    00:00:00 python main.py
root         8     1  1 11:33 pts/0    00:00:00 [python] <defunct>
root        19     1  1 11:33 pts/0    00:00:00 [python] <defunct>
root        30     1  4 11:33 pts/0    00:00:00 [python] <defunct>
root        41     1  0 11:33 pts/0    00:00:00 [python] <defunct>
root        50     1  0 11:33 pts/0    00:00:00 ps -ef --forest
---

Running multi.py from a shell script, or from another Python program that handles SIGCHLD by calling wait(), takes care of the zombies.
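As a rough sketch of such a workaround, a parent that may end up as PID 1 could collect exited children itself. The helper name reap_children is hypothetical, and this variant polls after each run instead of installing a SIGCHLD handler; it is a workaround in the parent, not a fix for the resource tracker.

```python
import os


def reap_children():
    """Reap every child that has already exited, without blocking.

    Hypothetical helper for illustration (POSIX-only).
    Returns the list of PIDs that were reaped by this call.
    """
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG makes the call non-blocking.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

In the main.py loop above, calling reap_children() after each run() would be one place to use it: run() has already waited for its own child by then, so only orphans that were reparented to the process remain. Note that waitpid(-1, ...) collects any child, so it can take the exit status away from code that is still tracking a child of its own (for example a live subprocess.Popen object).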
History
Date User Action Args
2021-09-15 13:26:06 vstinner set title: Resource Tracker is never reaped -> multiprocessing: the Resource Tracker process is never reaped
2021-07-23 13:29:53 viktor.ivanov create