subprocess.Popen.send_signal() should poll the process #82811

Closed
vstinner opened this issue Oct 29, 2019 · 26 comments
Labels
3.9 (only security fixes) · stdlib (Python modules in the Lib dir)

Comments

@vstinner
Member

BPO 38630
Nosy @vstinner, @giampaolo, @njsmith, @SpecLad, @oconnor663
PRs
  • bpo-38630: subprocess: enhance send_signal() on Unix #16984

    Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2020-01-15.16:53:35.210>
    created_at = <Date 2019-10-29.11:18:26.776>
    labels = ['library', '3.9']
    title = 'subprocess.Popen.send_signal() should poll the process'
    updated_at = <Date 2020-12-03.17:05:56.445>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2020-12-03.17:05:56.445>
    actor = 'oconnor663'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-01-15.16:53:35.210>
    closer = 'vstinner'
    components = ['Library (Lib)']
    creation = <Date 2019-10-29.11:18:26.776>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 38630
    keywords = ['patch']
    message_count = 25.0
    messages = ['355645', '355647', '355706', '355856', '355917', '355918', '355928', '356073', '356074', '356075', '356076', '356077', '356084', '356085', '356571', '356593', '358137', '358142', '358146', '358147', '360064', '360065', '382382', '382397', '382430']
    nosy_count = 5.0
    nosy_names = ['vstinner', 'giampaolo.rodola', 'njs', 'SpecLad', 'oconnor663']
    pr_nums = ['16984']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue38630'
    versions = ['Python 3.9']

    @vstinner
    Member Author

    subprocess.Popen.send_signal() doesn't check whether the process has exited since the poll() method was last called. If the process exits just before os.kill() is called, the signal can be sent to the wrong process if the process identifier is recycled.

    The attached PR simply calls poll() once to reduce the time window in which this race condition can occur, but it doesn't fully close the race condition.
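
    A minimal sketch of the idea (simplified; a hypothetical subclass rather than the actual patch):

    import os
    import subprocess

    class PollingPopen(subprocess.Popen):
        def send_signal(self, sig):
            # Reap the child if it already exited, so that returncode is
            # up to date, and skip kill() when the pid may have been recycled.
            self.poll()
            if self.returncode is None:
                os.kill(self.pid, sig)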

    --

    See also the new "pidfd" API which only landed very recently in the Linux kernel to prevent this issue:

    Illumos, OpenBSD, NetBSD and FreeBSD have similar concepts.

    I don't propose to use pidfd here; I mention it just to highlight that this is a real issue and that kernels are evolving to provide more reliable solutions to the kill(pid, sig) race condition ;-)

    @vstinner added the 3.9 (only security fixes) and stdlib (Python modules in the Lib dir) labels Oct 29, 2019
    @vstinner
    Member Author

    I noticed this issue by reading subprocess code while working on bpo-38207 ("subprocess: Popen.kill() + Popen.communicate() is blocking until all processes using the pipe close the pipe") and bpo-38502 ("regrtest: use process groups").

    @vstinner
    Member Author

    subprocess.Popen.send_signal() doesn't check whether the process has exited since the poll() method was last called. If the process exits just before os.kill() is called, the signal can be sent to the wrong process if the process identifier is recycled.

    I'm not sure under which conditions this case happens.

    I understood that the pid is released (and so can be reused) as soon as the child process completes *and* waitpid() has been called.

    Two cases:

    • os.waitpid(pid, 0) can be called directly ("raw call")
    • Popen API used, but called from a different thread: thread A calls send_signal(), thread B calls waitpid()

    I'm mostly concerned by the multithreading case. I noticed the race condition while working on regrtest, which uses two threads: one thread responsible for the process (calling .communicate(), .wait(), etc.), and one "main" thread which sends a signal to every child process to interrupt them.

    My first implementation called .wait() after calling .send_signal() in the main thread. But it seems Popen is not "fully" thread-safe: the "waitpid lock" only protects the os.waitpid() call; it doesn't protect the self.returncode attribute.

    I modified regrtest to call .wait() only in the thread responsible for the process, to avoid such race conditions. The new design seems more reliable. The main thread only calls .send_signal(): it no longer calls .wait() (which calls os.waitpid()).

    @giampaolo
    Contributor

    -1 about the PR solution to suppress ProcessLookupError in case the process is gone. In psutil I solved the “pid reused problem” by using the process creation time to identify a process uniquely (on start).
    A decorator can be used to protect the sensitive methods interacting with the PID/handle (communicate(), send_signal(), terminate(), kill()) and raise an exception (maybe ProcessLookupError(“PID has been reused”)?), as sketched below.
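
    A hypothetical sketch of that decorator (get_create_time() and self._ctime are assumptions, elaborated later in this thread):

    import functools

    def check_pid_not_reused(method):
        # Compare the creation time recorded at spawn time with the
        # current one before touching the pid.
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            if self._ctime is not None and self._ctime != get_create_time(self.pid):
                raise ProcessLookupError("PID has been reused")
            return method(self, *args, **kwargs)
        return wrapper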

    @vstinner
    Member Author

    vstinner commented Nov 4, 2019

    -1 about the PR solution to suppress ProcessLookupError in case the process is gone.

    Did someone propose a pull request to fix this issue by ignoring ProcessLookupError?

    In psutil I solved the “pid reused problem” by using process creation time to identify a process uniquely (on start).

    Well, that requires writing platform-specific code.

    I would prefer to only use platform-specific code when the platform provides something like pidfd: I understand that Linux, FreeBSD, OpenBSD, NetBSD, and Illumos are concerned. I mean reusing an existing solution, rather than writing our own "workaround" (solution?).

    @vstinner
    Member Author

    vstinner commented Nov 4, 2019

    I would prefer to only use platform-specific code when the platform provides something like pidfd: I understand that Linux, FreeBSD, OpenBSD, NetBSD, and Illumos are concerned. I mean reusing an existing solution, rather than writing our own "workaround" (solution?).

    By the way, I suggest we stick to the initial problem in this issue. If someone is interested in exploring the pidfd idea, I recommend opening a different issue ;-)

    @giampaolo
    Contributor

    Did someone propose a pull request to fix this issue by ignoring ProcessLookupError?

    I misread your PR, sorry. I thought that was the effect.

    @njsmith
    Contributor

    njsmith commented Nov 5, 2019

    It sounds like there's actually nothing to do here? (Except maybe eventually switch to pidfd or similar, but Victor says he wants to use a different issue for that.) Can this be closed?

    @vstinner
    Member Author

    vstinner commented Nov 5, 2019

    It sounds like there's actually nothing to do here? (Except maybe eventually switch to pidfd or similar, but Victor says he wants to use a different issue for that.) Can this be closed?

    I opened this issue to propose PR 16984. Did you notice the PR?

    In short, I propose to call poll() before calling os.kill() in send_signal().

    @njsmith
    Contributor

    njsmith commented Nov 5, 2019

    But then deeper in the thread it sounded like you concluded that this didn't actually help anything, which is what I would expect :-).

    @vstinner
    Member Author

    vstinner commented Nov 5, 2019

    But then deeper in the thread it sounded like you concluded that this didn't actually help anything, which is what I would expect :-).

    The PR contains a test which demonstrates one race condition. It fails without the fix.

    @vstinner
    Member Author

    vstinner commented Nov 5, 2019

    (Except maybe eventually switch to pidfd or similar, but Victor says he wants to use a different issue for that.)

    pidfd is not available on all platforms (ex: Windows).

    @njsmith
    Contributor

    njsmith commented Nov 6, 2019

    You can't solve a time-of-check-to-time-of-use race by adding another check. I guess your patch might narrow the race window slightly, but it was a tiny window before and it's a tiny window after, so I don't really get it.

    The test doesn't demonstrate a race condition. What it demonstrates is increased robustness against user code that corrupts Popen's internal state by reaping one of its processes behind its back.

    That kind of robustness might be a good motivation on its own! (I actually just did a bunch of work to make trio more robust against user code closing fds behind its back, which is a similar kind of thing.) But it's a very different motivation than what you say in this bug.

    @njsmith
    Contributor

    njsmith commented Nov 6, 2019

    Hmm, you know, on further thought, it is actually possible to close this race for real, without pidfd.

    What you do is split 'wait' into two parts: first it waits for the process to become reapable without actually reaping it. On Linux you can do this with waitid+WNOWAIT. On kqueue platforms you can do it with kqueue.

    Then, you take a lock, and use it to atomically call waitpid/waitid to reap the process + update self.returncode to record that you've done so.

    In send_signal, we use the same lock to atomically check whether the process has been reaped, and then send the signal if it hasn't.

    That doesn't fix your test, but it does fix the race condition: as long as users only interact with the process through the Popen API, it guarantees that send_signal will never signal the wrong process.
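
    A minimal sketch of that scheme (a hypothetical Popen-like class, Linux-only because of waitid+WNOWAIT; note it still has the concurrent-wait() hazard discussed below):

    import os
    import threading

    class WnowaitPopen:
        def __init__(self, pid):
            self.pid = pid
            self.returncode = None
            self._returncode_lock = threading.Lock()

        def wait(self):
            # Block until the child is reapable, without reaping it.
            os.waitid(os.P_PID, self.pid, os.WEXITED | os.WNOWAIT)
            with self._returncode_lock:
                if self.returncode is None:
                    # Atomically reap the child and record its status.
                    _, status = os.waitpid(self.pid, 0)
                    self.returncode = os.waitstatus_to_exitcode(status)  # Python 3.9+
            return self.returncode

        def send_signal(self, sig):
            with self._returncode_lock:
                # The pid cannot be reaped (and thus recycled) while we
                # hold the lock, so the check-and-kill pair is atomic.
                if self.returncode is None:
                    os.kill(self.pid, sig)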

    @giampaolo
    Contributor

    To further elaborate on the creation time solution, the idea in pseudo-code is the following:

    class Popen:
    
        def __init__(self, ...):
            self._execute_child(...)
            try:
                # Record the creation time right after spawning, to
                # identify the process uniquely.
                self._ctime = get_create_time(self.pid)
            except ProcessLookupError:
                self._ctime = None
    
        def send_signal(self, sig):
            if self._ctime is not None:
                # A different creation time means the pid was recycled.
                if self._ctime != get_create_time(self.pid):
                    raise ProcessLookupError("PID has been reused")
            os.kill(self.pid, sig)
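
    For illustration, the get_create_time() helper could be built on psutil (an assumption; the stdlib has no such function):

    import psutil

    def get_create_time(pid):
        # psutil reads the platform-specific creation timestamp; translate
        # its error into the exception used in the pseudo-code above.
        try:
            return psutil.Process(pid).create_time()
        except psutil.NoSuchProcess:
            raise ProcessLookupError(f"no process with pid {pid}")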

    Technically there is still a race condition between _execute_child() and get_create_time(), but:

    1. the time between the two calls is supposed to be extremely short
    2. an OS should not reuse a PID that fast, precisely in order to minimize the chances of such a race condition occurring
    3. as a general rule, on most systems PIDs are supposed to be assigned sequentially and wrap around. E.g. on Linux:

    ~$ for i in $(seq 20); do ps; done | grep ps
    3199 pts/0 00:00:00 ps
    3200 pts/0 00:00:00 ps
    3201 pts/0 00:00:00 ps
    3202 pts/0 00:00:00 ps
    3203 pts/0 00:00:00 ps
    3204 pts/0 00:00:00 ps
    3205 pts/0 00:00:00 ps
    3207 pts/0 00:00:00 ps
    3208 pts/0 00:00:00 ps
    3209 pts/0 00:00:00 ps
    3210 pts/0 00:00:00 ps
    3213 pts/0 00:00:00 ps
    3214 pts/0 00:00:00 ps
    3215 pts/0 00:00:00 ps
    3216 pts/0 00:00:00 ps
    3217 pts/0 00:00:00 ps
    3218 pts/0 00:00:00 ps
    3219 pts/0 00:00:00 ps
    3221 pts/0 00:00:00 ps
    3223 pts/0 00:00:00 ps

    As for Windows, the termination is based on the process handle instead of the PID, so I assume that means Windows is safe in this regard. There was a prior discussion about this actually:
    https://bugs.python.org/issue36067#msg336243
    Getting process creation time is relatively easy, even though platform-dependent (and should be done in C).
    pidfd API would be nicer to use, but it seems it's not portable and relatively recent.

    @vstinner
    Member Author

    To further elaborate on the creation time solution, the idea in pseudo-code is the following: (...)
    Technically there is still a race condition between _execute_child() and get_create_time(), but: (...)

    Even if the approach seems to be different, PR 16984 seems to have the same advantages and drawbacks.

    PR 16984 relies on the fact that the Linux kernel doesn't reuse a pid until its parent calls os.waitpid(). Even if the child process has completed, the pid remains assigned (and the process status is "zombie").

    PR 16984 has the advantage that it reuses existing code which has been battle tested: we are sure that the code is portable and works well.

    Whereas get_create_time() will likely be platform-specific and will add a little maintenance burden.

    I know that PR 16984 is a partial solution (there is still a race condition if waitpid() is called directly, without using the Popen API), but I don't see a strong reason to prefer get_create_time().

    @SpecLad
    Mannequin

    SpecLad mannequin commented Dec 9, 2019

    What you do is split 'wait' into two parts: first it waits for the process to become reapable without actually reaping it. On Linux you can do this with waitid+WNOWAIT. On kqueue platforms you can do it with kqueue.

    Then, you take a lock, and use it to atomically call waitpid/waitid to reap the process + update self.returncode to record that you've done so.

    In send_signal, we use the same lock to atomically check whether the process has been reaped, and then send the signal if it hasn't.

    It's a good idea, but it would break the scenario where two threads call wait() concurrently. It would create this race condition:

    1. Thread A reaps the process.
    2. Thread B thinks the process is still running, so it calls waitid+WNOHANG on a stale PID, with unpredictable results.
    3. Thread A sets self.returncode.

    What is needed here is a reader-writer lock. subprocess.wait would work like this (pseudocode):

    with lock taken for reading:
        os.waitid(..., WNOHANG)
    with lock taken for writing:
        os.waitid(...)
        self.returncode = ...

    Whereas subprocess.send_signal would work like this:

    with lock taken for reading:
        os.kill(...)

    Unfortunately, it doesn't seem like Python has reader-writer locks in the standard library...
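
    For illustration, a minimal readers-preference reader-writer lock can be built from two ordinary locks (a sketch, not production code):

    import threading

    class RWLock:
        def __init__(self):
            self._readers = 0
            self._readers_lock = threading.Lock()  # guards the reader count
            self._writer_lock = threading.Lock()   # held while readers or a writer are active

        def acquire_read(self):
            with self._readers_lock:
                self._readers += 1
                if self._readers == 1:
                    self._writer_lock.acquire()  # first reader blocks writers

        def release_read(self):
            with self._readers_lock:
                self._readers -= 1
                if self._readers == 0:
                    self._writer_lock.release()  # last reader unblocks writers

        def acquire_write(self):
            self._writer_lock.acquire()

        def release_write(self):
            self._writer_lock.release()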

    @njsmith
    Contributor

    njsmith commented Dec 9, 2019

    Thread B thinks the process is still running, so it calls waitid+WNOHANG on a stale PID, with unpredictable results.

    I'm pretty sure you mean WNOWAIT, right? The actual reaping step (which might use WNOHANG) is already protected by a lock, but there's a danger that a process might try to perform the blocking-until-exit step (which uses WNOWAIT) on a stale pid.

    Good catch!

    I think this can be fixed by using a second lock around all of .wait(). But this is a bit tricky because we also need to account for concurrent .poll() calls. So something like:

    def wait(self):
        # Don't take the big lock unless the process is actually still running
        # This isn't just an optimization; it's critical for poll() correctness
        if self.poll() is not None:
            return self.returncode
    
        with self._blocking_wait_lock:
            with self._returncode_lock:
                revalidate pid licenses
    
            block_until_child_exits()
    
            with self._returncode_lock:
                reap the child and mark it as reaped
    
    def poll(self):
        # if someone is already blocked waiting for the child, then it definitely
        # hasn't exited yet, so we don't need to call waitpid ourselves.
        # This isn't just an optimization; it's critical for correctness.
        if not self._blocking_wait_lock.acquire(blocking=False):
            return None
        try:
            with self._returncode_lock:
                do the actual poll + returncode updating
        finally:
            self._blocking_wait_lock.release()

    Notes:

    If there's already a wait() running, and someone calls poll(), then we have to make sure the poll() can't reap the process out from under the wait(). To fix that, poll() gives up early when a wait() is already running.

    But, this could introduce its own race condition: if a process has exited but we haven't noticed yet, and then someone calls wait() and poll() simultaneously, then the wait() might take the lock, then poll() notices the lock is taken, and concludes that the process can't have exited yet. Of course wait() will immediately reap the process and drop the lock, but by this point poll() has already returned the wrong information.

    The problem is that poll() needs to be able to treat "the blocking wait lock is taken" as implying "the process hasn't exited yet". So to make that implication true, we add a check at the top of wait().

    Of course if a process exits while wait() is running, and someone calls poll() in that brief interval between it exiting and wait() noticing, then poll() could again fail to report that it exited. But I think this is fine: it's ok if poll() is a fraction of a second out of date; that race condition is inherent in its API. The problem would be if poll() fails to notice a process that exited a while ago and is just sitting around waiting to be reaped.

    ... Ugh but there is still one more race condition here. I think this fixes all the cases involving send_signal, wait, and poll interactions with each other, BUT we broke the case of two threads calling poll() at the same time. One thread will take the blocking_wait_lock, then the other will take this as evidence that someone is blocked in wait() and exit early, which isn't appropriate.

    I think we can patch up this last race condition by adding yet one more lock: a simple

    with self._poll_lock:

    around the whole body of poll(), so that only one thread can call it at a time.
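
    Putting the pieces together, a consolidated sketch of the three-lock scheme (hypothetical names, following the pseudocode above; Linux-only because of waitid+WNOWAIT):

    import os
    import threading

    class LockedPopen:
        def __init__(self, pid):
            self.pid = pid
            self.returncode = None
            self._blocking_wait_lock = threading.Lock()  # held while blocked in wait()
            self._returncode_lock = threading.Lock()     # guards reaping and returncode
            self._poll_lock = threading.Lock()           # serializes poll() callers

        def wait(self):
            if self.poll() is not None:  # critical for poll() correctness, see above
                return self.returncode
            with self._blocking_wait_lock:
                with self._returncode_lock:
                    # Revalidate that the pid hasn't been reaped meanwhile.
                    if self.returncode is not None:
                        return self.returncode
                # Block until the child is reapable, without reaping it.
                os.waitid(os.P_PID, self.pid, os.WEXITED | os.WNOWAIT)
                with self._returncode_lock:
                    if self.returncode is None:
                        _, status = os.waitpid(self.pid, 0)
                        self.returncode = os.waitstatus_to_exitcode(status)
            return self.returncode

        def poll(self):
            with self._poll_lock:
                # A held blocking-wait lock implies the child hasn't exited yet.
                if not self._blocking_wait_lock.acquire(blocking=False):
                    return None
                try:
                    with self._returncode_lock:
                        if self.returncode is None:
                            pid, status = os.waitpid(self.pid, os.WNOHANG)
                            if pid == self.pid:
                                self.returncode = os.waitstatus_to_exitcode(status)
                finally:
                    self._blocking_wait_lock.release()
                return self.returncode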

    @SpecLad
    Mannequin

    SpecLad mannequin commented Dec 9, 2019

    I'm pretty sure you mean WNOWAIT, right?

    Right.

    revalidate pid licenses

    What does this mean?

    I think we can patch up this last race condition by adding yet one more lock

    Wouldn't it be sufficient to add

    if self.returncode is not None:
        return self.returncode

    at the top of poll()?

    (And in fact, you don't even need to add it, since an equivalent of this check is already there.)

    I think this makes calling poll() from wait() unnecessary too.

    @njsmith
    Contributor

    njsmith commented Dec 9, 2019

            revalidate pid licenses
    

    It means autocorrect mangled the text... I don't remember what word that's supposed to be, but basically I meant "revalidate the pid hasn't been reaped"

    Wouldn't it be sufficient to add

    if self.returncode is not None:
        return self.returncode

    at the top of poll()?

    No, the race condition is:

    1. Process exits
      [some time passes]
    2. Thread 1 and Thread 2 both call poll() simultaneously
    3. Thread 1 and Thread 2 both see that the process hasn't been reaped
    4. Thread 1 attempts to take the blocking_wait_lock, to make sure no-one is blocked in wait(). It succeeds.
    5. Thread 2 attempts to take the blocking_wait_lock, and it fails (because Thread 1 already has it). Thread 2's poll() returns saying that the process hasn't exited yet (which is wrong!)
    6. Thread 1 calls waitpid(..., WNOHANG) and reaps the process, then releases the lock.

    So adding a check at the top (= step 3) doesn't help. The problem is in steps 4/5.

    @vstinner
    Member Author

    New changeset e85a305 by Victor Stinner in branch 'master':
    bpo-38630: Fix subprocess.Popen.send_signal() race condition (GH-16984)
    e85a305

    @vstinner
    Member Author

    While there was no strong agreement on my change, Nathaniel and Giampaolo said that it should do no harm to merge it :-) So I merged my change.

    It has been said multiple times: my change doesn't fully close the race condition. It only reduces the risk that it happens. I rephrased my change to mention that there is still room for the race condition and that polling only *reduces* the risk; it doesn't fully fix the race condition.

    --

    Changing locking can help. But I'm not sure that it covers *all* cases. For example, what if another function calls os.waitpid() directly for whatever reason and the process completed? The pid recycling issue can also happen in this case, no?

    Linux pidfd really fixes the issue. If someone wants to enhance subprocess to use the new os.pidfd_open(), I suggest opening a separate issue.

    Another approach is to emulate pidfd in userspace. Giampaolo explained that he also compares the process creation time for that (msg356571). I'm not excited about working around the OS issue this way. It requires writing platform-specific get_create_time() code. I would prefer to reuse what the OS provides when available: Linux pidfd.

    --

    This issue title is quite explicit on purpose: "subprocess.Popen.send_signal() should poll the process". I now close this issue: send_signal() does now poll :-)

    Because there is no *strong* consensus on my change, I decided to not backport it to stable branches 3.7 and 3.8.

    Thanks everyone for this very interesting discussion!

    @oconnor663
    Mannequin

    oconnor663 mannequin commented Dec 3, 2020

    This change caused a crash in the Duct library on Python 3.9. Duct uses the waitid(WNOWAIT) strategy that Nathaniel Smith described in this thread. With this change, the child process is reaped by kill() in a spot we didn't expect, and a subsequent call to os.waitid() raises a ChildProcessError. This is a race condition that only triggers if the child happens to exit before kill() is called.

    I just pushed oconnor663/duct.py@5dfae70 to work around this, which I'll release shortly as Duct version 0.6.4.

    Broadly speaking, this change could break any program that uses Popen.kill() together with os.waitpid() or os.waitid(). Checking Popen.returncode before calling the os functions is a good workaround for the single-threaded case, but it doesn't fix the multithreaded case. Duct is going to avoid calling Popen.kill() entirely. There are some longer comments about race conditions in that commit I linked.
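
    A minimal sketch of the failure mode (assumes a Unix "true" executable and that the child exits before kill() runs):

    import os
    import subprocess
    import time

    proc = subprocess.Popen(["true"])
    time.sleep(0.5)  # let the child exit
    proc.kill()      # on 3.9, send_signal() polls first and may reap the zombie
    try:
        os.waitid(os.P_PID, proc.pid, os.WEXITED)
    except ChildProcessError:
        print("child was already reaped by kill()")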

    I don't feel strongly one way or the other about keeping this new behavior, but we should probably document it clearly in Popen.send_signal() and all of its callers that these functions might reap the child.

    @vstinner
    Member Author

    vstinner commented Dec 3, 2020

    On Linux, a pidfd file descriptor can be created with os.pidfd_open() (added in Python 3.9). It would avoid even more race conditions. The best option would be to request a pidfd file descriptor directly when we spawn the process, which is possible on recent Linux kernel versions.
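
    A sketch of signalling through a pidfd (os.pidfd_open needs Linux 5.3+, signal.pidfd_send_signal needs Linux 5.1+; both were added in Python 3.9):

    import os
    import signal
    import subprocess

    proc = subprocess.Popen(["sleep", "60"])
    pidfd = os.pidfd_open(proc.pid)  # stable handle to this exact process
    try:
        # Delivered to the process the fd refers to, even if the numeric
        # pid is recycled in the meantime.
        signal.pidfd_send_signal(pidfd, signal.SIGTERM)
    finally:
        os.close(pidfd)
    proc.wait()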

    we should probably document it clearly in Popen.send_signal() and all of its callers that these functions might reap the child.

    Feel free to propose a PR to document the behavior.

    I'm not sure what you mean by send_signal() "reaping" the child process.

    See also bpo-40550.

    Anyway, this issue and bpo-40550 are closed. I suggest opening a new issue to enhance the documentation.

    @oconnor663
    Mannequin

    oconnor663 mannequin commented Dec 3, 2020

    Filed separately as https://bugs.python.org/issue42558.

    @ezio-melotti transferred this issue from another repository Apr 10, 2022
    @graingert
    Contributor

    Filed here #86724
