Author viktor.ivanov
Recipients viktor.ivanov
Date 2021-07-23.13:29:52
Message-id <1627046993.43.0.654405119923.issue44724@roundup.psfhosted.org>
In-reply-to
Content
The multiprocessing.resource_tracker instance is never reaped, leaving zombie processes.

There is a waitpid() call for the ResourceTracker's pid, but it is in the private method _stop(), which seems to be called only from some test modules.
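
As a possible workaround, the private _stop() method could be called explicitly before the interpreter exits. The following is only a sketch: it assumes the module-level `_resource_tracker` instance and its `_stop()` method behave as they do in the CPython source today, both are private and unsupported, and `stop_resource_tracker` is a hypothetical helper name.

```python
import atexit
from multiprocessing import resource_tracker


def stop_resource_tracker():
    """Hypothetical helper: stop and reap the ResourceTracker, if one is running.

    Relies on private attributes (_resource_tracker, _stop), so it is guarded
    with getattr() and silently does nothing if they are missing.
    """
    tracker = getattr(resource_tracker, "_resource_tracker", None)
    stop = getattr(tracker, "_stop", None)
    if callable(stop):
        # _stop() closes the tracker's pipe and then calls waitpid() on its pid.
        stop()


# Assumption: it is acceptable to stop the tracker at interpreter exit.
atexit.register(stop_resource_tracker)
```

This should only run once all worker processes have finished; if another process still holds the tracker's pipe open, the waitpid() inside _stop() may block.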

Most environments have some process that reaps zombies, but if Python is the "main" process (PID 1) in a container, for example, and runs another Python instance that leaks a ResourceTracker process, zombies start to accumulate. The orphaned tracker is reparented to the main process, which never waits for it.

This is easily reproducible with a couple of small Python programs, as long as they are not run from a shell or another parent process that reaps forgotten children.

It was originally discovered in a Docker container that has a Python program as its entry point (a Celery worker in an Airflow container) running other Python programs (dbt).

The minimal code is available on GitHub here: https://github.com/viktorvia/python-multi-issue

The attached multi.py leaks resource tracker processes, but just running it from a full-fledged development environment will not show the issue.

Instead, run it via another Python program from a Docker container:

Dockerfile:
---
FROM python:3.9

WORKDIR /usr/src/multi

COPY . ./

CMD ["python", "main.py"]
---

main.py:
---
from subprocess import run
from time import sleep

while True:
    # Run the program that leaks a ResourceTracker process.
    result = run(["python", "multi.py"], capture_output=True)
    print(result.stdout.decode('utf-8'))
    # Print the process tree so accumulated <defunct> entries are visible.
    result = run(["ps", "-ef", "--forest"], capture_output=True)
    print(result.stdout.decode('utf-8'), flush=True)
    sleep(1)
---

When the program is run, it accumulates one zombie on each iteration:
---
$ docker run -it multi python main.py
[1, 4, 9]

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0 11 11:33 pts/0    00:00:00 python main.py
root         8     1  0 11:33 pts/0    00:00:00 [python] <defunct>
root        17     1  0 11:33 pts/0    00:00:00 ps -ef --forest

[1, 4, 9]

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  6 11:33 pts/0    00:00:00 python main.py
root         8     1  3 11:33 pts/0    00:00:00 [python] <defunct>
root        19     1  0 11:33 pts/0    00:00:00 [python] <defunct>
root        28     1  0 11:33 pts/0    00:00:00 ps -ef --forest

[1, 4, 9]

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  4 11:33 pts/0    00:00:00 python main.py
root         8     1  1 11:33 pts/0    00:00:00 [python] <defunct>
root        19     1  3 11:33 pts/0    00:00:00 [python] <defunct>
root        30     1  0 11:33 pts/0    00:00:00 [python] <defunct>
root        39     1  0 11:33 pts/0    00:00:00 ps -ef --forest

[1, 4, 9]

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  3 11:33 pts/0    00:00:00 python main.py
root         8     1  1 11:33 pts/0    00:00:00 [python] <defunct>
root        19     1  1 11:33 pts/0    00:00:00 [python] <defunct>
root        30     1  4 11:33 pts/0    00:00:00 [python] <defunct>
root        41     1  0 11:33 pts/0    00:00:00 [python] <defunct>
root        50     1  0 11:33 pts/0    00:00:00 ps -ef --forest
---

Running it from a shell script, or from another Python program that handles SIGCHLD by calling wait(), takes care of the zombies.
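
For illustration, a parent that reaps its children could look roughly like the sketch below. It is not part of the attached reproduction; `reap_children` and `install_reaper` are hypothetical names, and the sketch uses os.waitpid() with WNOHANG rather than a blocking wait().

```python
import os
import signal


def reap_children(signum=None, frame=None):
    """Collect every child that has already exited, without blocking.

    Usable both as a SIGCHLD handler and as a plain function call.
    Returns the list of reaped PIDs.
    """
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG returns (0, 0) instead of blocking.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children left
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped


def install_reaper():
    """Hypothetical helper: call reap_children() whenever a child exits."""
    signal.signal(signal.SIGCHLD, reap_children)
```

Calling install_reaper() at the top of main.py, or simply calling reap_children() after each run(), should clear the defunct entries. One caveat: a handler that waits on any child may also collect the process that subprocess.run() is waiting for, in which case the return code reported by run() may not be reliable.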
History
Date                 User           Action  Args
2021-07-23 13:29:53  viktor.ivanov  set     recipients: + viktor.ivanov
2021-07-23 13:29:53  viktor.ivanov  set     messageid: <1627046993.43.0.654405119923.issue44724@roundup.psfhosted.org>
2021-07-23 13:29:53  viktor.ivanov  link    issue44724 messages
2021-07-23 13:29:52  viktor.ivanov  create