Message 333902 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	izbyshev
Recipients	gregory.p.smith, izbyshev, kevans, koobs, nanjekyejoannah, pablogsal, serhiy.storchaka, vstinner
Date	2019-01-17.22:25:48
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1547763948.3.0.708209425596.issue35537@roundup.psfhosted.org>
In-reply-to

Content
> It should be compared to the current code. Currently, _posixsubprocess uses a loop calling execv(). I don't think that calling posix_spawn() in a loop until one doesn't fail is more inefficient. > The worst case would be when applying process attributes and run file actions would dominate performances... I don't think that such case exists. Syscalls like dup2() and close() should be way faster than the final successful execv() in the overall posix_spawn() call. I'm talking about the case when the program exists. I had vfork() used internally by posix_spawn() in mind when I considered repeatedly calling it "prohibitively expensive". While vfork() is much cheaper than fork(), it doesn't mean that its performance is comparable to dup2() and close(). But on systems where posix_spawn is a syscall the overhead could indeed be lesser. Anyway, it should be measured. >> Iterate over PATH entries and perform a quick check for common exec errors (ENOENT, ENOTDIR, EACCES) manually (e.g. by stat()). > I dislike this option. There is a high risk of race condition if the program is created, deleted or modified between the check and exec. It can cause subtle bugs, or worse: security vulnerabilities. IMHO the only valid check, to test if a program exists, is to call exec(). While exec() is of course cleaner, IMHO races don't matter in this case. If our stat()-like check fails, we could as well imagine that it is exec() that failed (while doing the same checks as our stat()), so it doesn't matter what happens with the tested entry afterwards. If the check succeeds, we simply call posix_spawn() that will perform the same checks again. If any race condition occurs, the problem will be detected by posix_spawn(). > Alexey: Do you want to work on a PR to reimplement the "executable_list" and loop used by subprocess with _posixsubproces? I'm currently focused on researching whether it's possible to use vfork()-like approach instead of fork() on Linux, so if anyone wants to implement the PR you suggested, feel free to do it.

> It should be compared to the current code. Currently, _posixsubprocess uses a loop calling execv(). I don't think that calling posix_spawn() in a loop until one doesn't fail is more inefficient.

> The worst case would be when applying process attributes and run file actions would dominate performances... I don't think that such case exists. Syscalls like dup2() and close() should be way faster than the final successful execv() in the overall posix_spawn() call. I'm talking about the case when the program exists.

I had vfork() used internally by posix_spawn() in mind when I considered repeatedly calling it "prohibitively expensive". While vfork() is much cheaper than fork(), it doesn't mean that its performance is comparable to dup2() and close(). But on systems where posix_spawn is a syscall the overhead could indeed be lesser. Anyway, it should be measured.

>> Iterate over PATH entries and perform a quick check for common exec errors (ENOENT, ENOTDIR, EACCES) manually (e.g. by stat()).

> I dislike this option. There is a high risk of race condition if the program is created, deleted or modified between the check and exec. It can cause subtle bugs, or worse: security vulnerabilities. IMHO the only valid check, to test if a program exists, is to call exec().

While exec() is of course cleaner, IMHO races don't matter in this case. If our stat()-like check fails, we could as well imagine that it is exec() that failed (while doing the same checks as our stat()), so it doesn't matter what happens with the tested entry afterwards. If the check succeeds, we simply call posix_spawn() that will perform the same checks again. If any race condition occurs, the problem will be detected by posix_spawn(). 

> Alexey: Do you want to work on a PR to reimplement the "executable_list" and loop used by subprocess with _posixsubproces?

I'm currently focused on researching whether it's possible to use vfork()-like approach instead of fork() on Linux, so if anyone wants to implement the PR you suggested, feel free to do it.

History
Date	User	Action	Args
2019-01-17 22:25:49	izbyshev	set	recipients: + izbyshev, gregory.p.smith, vstinner, serhiy.storchaka, koobs, pablogsal, nanjekyejoannah, kevans
2019-01-17 22:25:48	izbyshev	set	messageid: <1547763948.3.0.708209425596.issue35537@roundup.psfhosted.org>
2019-01-17 22:25:48	izbyshev	link	issue35537 messages
2019-01-17 22:25:48	izbyshev	create