
Author njs
Recipients Rhamphoryncus, eryksun, gregory.p.smith, gvanrossum, jdemeyer, njs, steve.dower, vstinner
Date 2019-04-13.01:35:59
Message-id <1555119360.15.0.116620470397.issue36601@roundup.psfhosted.org>
In-reply-to
Content
Yeah, the check makes sense on a system like the comment says IRIX used to be, where getpid() actually returns a thread id *and* signals are delivered to all threads (wtf). But any modern system should do the POSIX thing, where getpid() returns the same value in all threads, and signals are only delivered to one thread anyway. So I agree with everyone else that the original motivation doesn't make sense any more.
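
A quick way to see the POSIX behavior described above from Python. This is only an illustrative sketch, not part of the proposed change; `collect_pids` is a hypothetical helper name, and it assumes a platform where `os.getpid()` follows POSIX semantics.

```python
import os
import threading


def collect_pids(n_threads=4):
    """Record os.getpid() as seen from several threads (hypothetical helper)."""
    pids = []
    lock = threading.Lock()

    def worker():
        # Each thread asks the OS for the process id independently.
        with lock:
            pids.append(os.getpid())

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pids


# On a POSIX-conforming system every thread reports the same process id,
# so a getpid() comparison cannot tell threads apart.
print(set(collect_pids()) == {os.getpid()})
```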

The place where this would change things is in fork/vfork context. For fork it seems unhelpful... as Greg notes, we already reset main_pid in the AfterFork handler, so the only effect of the check is that there's a brief moment in the child, before that handler has run, where signals can be lost. If we removed the (getpid() == main_pid) check, then fork() would work slightly better.
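
To make the fork window concrete, here is a toy model in pure Python. It does not call fork() and is not the real C code: `ToySignalState`, `signals_seen_in_child`, and the pid values 100/101 are all made up for illustration, and signal numbers are plain integers. It only mimics the logic "drop the signal if getpid() != main_pid".

```python
class ToySignalState:
    """Toy stand-in for the interpreter's signal bookkeeping (hypothetical)."""

    def __init__(self, pid, check_pid):
        self.main_pid = pid          # pid recorded when handlers were set up
        self.check_pid = check_pid   # whether the getpid() == main_pid check exists
        self.tripped = []            # signals that were actually recorded

    def handler(self, signum, current_pid):
        # Models the C-level handler: with the check enabled, a signal is
        # silently dropped when the current pid differs from main_pid.
        if self.check_pid and current_pid != self.main_pid:
            return
        self.tripped.append(signum)

    def after_fork_child(self, child_pid):
        # Models the after-fork handler resetting main_pid in the child.
        self.main_pid = child_pid


def signals_seen_in_child(check_pid):
    parent_pid, child_pid = 100, 101  # made-up pids
    state = ToySignalState(parent_pid, check_pid)
    state.handler(15, child_pid)       # arrives before the after-fork handler ran
    state.after_fork_child(child_pid)
    state.handler(2, child_pid)        # arrives after main_pid was reset
    return state.tripped


print(signals_seen_in_child(check_pid=True))   # [2]
print(signals_seen_in_child(check_pid=False))  # [15, 2]
```

With the check in place the first signal is lost; without it both are recorded, which is the "slightly better" behavior mentioned above.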

For vfork, man, I don't even know. Here I do disagree with Greg a little – according to POSIX, trip_signal is *not* safe to call in a vfork'ed process. POSIX says: "behavior is undefined if the process created by vfork() either modifies any data [...] or calls any other function". Yes this is really as extreme as it sounds – you're not allowed to mutate data or use any syscalls. And they explicitly note that this applies to signal handlers too: "If signal handlers are invoked in the child process after vfork(), they must follow the same rules as other code in the child process."

trip_signal sets a flag -> it mutates data -> it technically can invoke undefined behavior if it's called in a child process after a vfork. And it can call write(), which is, again, undefined behavior.

Of course this is so restrictive that vfork() is almost unusable in Python anyway, because you can't do anything in Python without modifying memory.

And worse: according to a strict reading of POSIX, vfork() should call pthread_atfork() handlers, and our pthread_atfork() handlers mutate memory.

So from the POSIX-rules-lawyer perspective, there's absolutely no way any Python process can ever call vfork() without invoking undefined behavior, no matter what we do here.

Do we care?

It looks like subprocess.py was recently modified to call posix_spawn in some cases: https://github.com/python/cpython/blob/a304b136adda3575898d8b5debedcd48d5072272/Lib/subprocess.py#L610-L654

If we believe the comments in that function, it only does this on macOS – where posix_spawn is a native syscall, so no vfork is involved – or on systems using glibc (i.e., Linux), where posix_spawn *does* invoke vfork(). So in this one case, currently, CPython does use vfork(). Also, users might call os.posix_spawn directly on any system.

However, checking the glibc source, their implementation of posix_spawn actually takes care of this – it doesn't use vfork() directly, but rather clone(), and it takes care to make sure that no signal handlers run in the child (see sysdeps/unix/sysv/linux/spawni.c for details).
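
For reference, a direct os.posix_spawn call from user code looks roughly like this. It is a minimal sketch assuming a POSIX platform and a Python version that provides `os.posix_spawn`; `spawn_and_wait` is a hypothetical helper, and whether the libc uses vfork(), clone(), or a native syscall underneath is not visible from here.

```python
import os
import sys


def spawn_and_wait(code="pass"):
    """Run `code` in a fresh interpreter via os.posix_spawn (hypothetical helper)."""
    argv = [sys.executable, "-c", code]
    # The child is created and exec'ed by the C library; no Python code
    # (and so no Python-level signal handling) runs in between.
    pid = os.posix_spawn(sys.executable, argv, dict(os.environ))
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Child was killed by a signal: report it as a negative number.
    return -os.WTERMSIG(status)


print(spawn_and_wait("raise SystemExit(3)"))
```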

AFAICT, the only ways that someone could potentially get themselves into trouble with vfork() on CPython are:

- by explicitly wrapping it themselves, in which case, good luck to them I guess. On Linux they aren't instantly doomed, because Linux intentionally deviates from POSIX and *doesn't* invoke pthread_atfork handlers after vfork(), but they still have their work cut out for them. Not really our problem.

- by explicitly calling os.posix_spawn on some system where this is implemented using vfork internally, but with a broken libc that doesn't handle signals or pthread_atfork correctly. Currently we don't know of any such systems, and if they do exist they have to be pretty rare.

So: I think we shouldn't worry about vfork(), and it's fine to remove the (getpid() == main_pid) check.