
Author sourcejedi
Recipients asvetlov, sourcejedi, yselivanov
Date 2020-10-21.18:44:25
Message-id <1603305865.86.0.468384454532.issue42110@roundup.psfhosted.org>
Content
## Test program ##

import asyncio
import time
import os
import signal
import sys

# This bug happens with the default, ThreadedChildWatcher
# It also happens with MultiLoopChildWatcher,
# but not the other three watcher types.
#asyncio.set_child_watcher(asyncio.MultiLoopChildWatcher())

# Patch os.kill to call sleep(1) first,
# opening up the window for a race condition
os_kill = os.kill
def kill(p, n):
    time.sleep(1)
    os_kill(p, n)

os.kill = kill

async def main():
    p = await asyncio.create_subprocess_exec(sys.executable, '-c', 'import sys; sys.exit(0)')
    p.send_signal(signal.SIGTERM)
    # cleanup
    await p.wait()

asyncio.run(main())


## Test output ##

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/alan-sysop/src/cpython/Lib/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/alan-sysop/src/cpython/Lib/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "<stdin>", line 3, in main
  File "/home/alan-sysop/src/cpython/Lib/asyncio/subprocess.py", line 138, in send_signal
    self._transport.send_signal(signal)
  File "/home/alan-sysop/src/cpython/Lib/asyncio/base_subprocess.py", line 146, in send_signal
    self._proc.send_signal(signal)
  File "/home/alan-sysop/src/cpython/Lib/subprocess.py", line 2081, in send_signal
    os.kill(self.pid, sig)
  File "<stdin>", line 3, in kill
ProcessLookupError: [Errno 3] No such process


## Tested versions ##

* v3.10.0a1-121-gc60394c7fc
* python39-3.9.0-1.fc32.x86_64
* python3-3.8.6-1.fc32.x86_64


## Race condition ##

main thread	vs	ThreadedChildWatcher._do_waitpid() thread

p=create_subprocess_exec(...)
			waitpid(...)  # wait for process exit
<Process p exits>
			<waitpid() returns.  Process p is reaped, and no longer exists>
p.send_signal(signal.SIGTERM)


## Result ##

A signal is sent to p.pid after p.pid has been reaped by waitpid().  os.kill() may then raise ProcessLookupError, because p.pid no longer exists.

In the worst case, the signal is sent successfully, because an unrelated process has since been started with the same PID.
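
The first outcome can be shown without asyncio.  This is a minimal sketch, assuming a POSIX system: `signal_reaped_child` is a hypothetical helper (not part of the report), and the explicit os.waitpid() call stands in for the watcher thread.

```python
import os
import signal
import sys


def signal_reaped_child():
    """Reap a child, then signal its stale PID; return the outcome as a string."""
    # Spawn a child that exits almost immediately.
    pid = os.posix_spawn(sys.executable, [sys.executable, "-c", "pass"], os.environ)
    # Stand-in for ThreadedChildWatcher._do_waitpid(): block until the child
    # exits and reap it.  After this call the kernel may reuse the PID.
    os.waitpid(pid, 0)
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return "ProcessLookupError"
    # Only reached if the PID had already been recycled by another process.
    return "signal sent to an unrelated process"


print(signal_reaped_child())
```

On an otherwise idle system this should normally take the ProcessLookupError branch.  The second branch needs the PID to be recycled in between, which is unlikely but not impossible.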


## How easy is it to reproduce? ##

The window for this race condition turns out to be short, thanks to a mitigation in the subprocess module.  IIUC, the mitigation protects against incorrect parallel use of a subprocess object by multiple threads.  Popen.send_signal() polls the child immediately before calling os.kill():

        def send_signal(self, sig):
            # bpo-38630: Polling reduces the risk of sending a signal to the
            # wrong process if the process completed, the Popen.returncode
            # attribute is still None, and the pid has been reassigned
            # (recycled) to a new different process. This race condition can
            # happens in two cases [...]
            self.poll()
            if self.returncode is not None:
                # Skip signalling a process that we know has already died.
                return
            os.kill(self.pid, sig)
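
The effect of that poll() can be observed directly.  This is a sketch under stated assumptions: Python 3.9 or later (so that send_signal() polls first) and a platform that provides os.waitid(), such as Linux.  `signal_exited_child` is a hypothetical helper, and the exit code 3 is arbitrary.

```python
import os
import signal
import subprocess
import sys


def signal_exited_child():
    """Signal a child that has exited but is not yet reaped; return its returncode."""
    proc = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"])
    # Wait for the child to exit WITHOUT reaping it: WNOWAIT leaves the zombie
    # in place, so Popen has not yet recorded an exit status.
    os.waitid(os.P_PID, proc.pid, os.WEXITED | os.WNOWAIT)
    # send_signal() polls first.  The poll reaps the zombie and records the
    # exit status, so the os.kill() call is skipped.
    proc.send_signal(signal.SIGTERM)
    return proc.returncode


print(signal_exited_child())
```

The mitigation only helps when the Popen object itself does the reaping.  It does not close the window in which another thread reaps the child between self.poll() and os.kill().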




## Possible workarounds ##

* SafeChildWatcher and FastChildWatcher should not have this defect.  However, we use ThreadedChildWatcher and MultiLoopChildWatcher to support running event loops in different threads.

* PidfdChildWatcher should not have this defect.  It is only available on Linux, kernel version 5.3 or above.
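
IIUC, a pidfd-based watcher avoids the race because the pidfd only reports that the child has exited, and the reaping is left to the thread that also sends signals.  The sketch below illustrates that principle only and is not the asyncio implementation.  It assumes Linux with pidfd support and Python 3.9 or later; `watch_with_pidfd` is a hypothetical helper, and the exit code 7 is arbitrary.

```python
import os
import select
import signal
import sys


def watch_with_pidfd():
    """Wait for a child via a pidfd, signal it, then reap it in the same thread."""
    if not hasattr(os, "pidfd_open"):
        return "unsupported"
    pid = os.posix_spawn(
        sys.executable, [sys.executable, "-c", "import sys; sys.exit(7)"], os.environ
    )
    try:
        fd = os.pidfd_open(pid)
    except OSError:
        # The kernel or a sandbox refused pidfd_open(); clean up and give up.
        os.waitpid(pid, 0)
        return "unsupported"
    try:
        # The pidfd becomes readable once the child has exited.  Nothing is
        # reaped here, so the child stays a zombie and keeps its PID.
        select.select([fd], [], [])
        # Signalling the zombie is harmless and cannot reach another process.
        os.kill(pid, signal.SIGTERM)
        # Reap only now, in the same thread that sent the signal.
        _, status = os.waitpid(pid, 0)
        return os.waitstatus_to_exitcode(status)
    finally:
        os.close(fd)


print(watch_with_pidfd())
```

Because the exit notification and the waitpid() call are separate steps, no second thread can free the PID behind the back of the code that sends the signal.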

It would be possible to avoid the ThreadedChildWatcher race by using native code and pthread_cancel(), so that the corresponding waitpid() call is canceled before a signal is sent.  However, the current implementation of pthread_cancel() is itself unsound because of race conditions:

* https://lwn.net/Articles/683118/ "This is why we can't have safe cancellation points"
* https://sourceware.org/bugzilla/show_bug.cgi?id=12683 "Race conditions in pthread cancellation"