Author vstinner
Recipients gregory.p.smith, izbyshev, nanjekyejoannah, pablogsal, pitrou, serhiy.storchaka, vstinner
Date 2019-01-06.22:46:46
Message-id <1546814807.19.0.387171279635.issue35537@roundup.psfhosted.org>
In-reply-to
Content
Gregory:
> Thanks for all your research and reference links on this!  As a _posixsubprocess maintainer, I am not against either posix_spawn or vfork being used directly in the future when feasible.

Are you against using posix_spawn() in subprocess? Or only in _posixsubprocess?

--

Alexey Izbyshev identified a difference between _posixsubprocess and the posix_spawn() implementation in PR 11242: with posix_spawn(), error handling depends a lot on the libc implementation.
https://github.com/python/cpython/pull/11242#issuecomment-449478778

* On recent glibc, posix_spawn() fails (non-zero return value) and sets errno to ENOENT if the executed program doesn't exist... but not all platforms have this behavior.
* On FreeBSD, if setting posix_spawn() "attributes" or executing posix_spawn() "file actions" fails, posix_spawn() succeeds but the child process exits immediately with exit code 127 without trying to call execv(). If execv() fails, posix_spawn() succeeds, but the child process exits with exit code 127.
* The worst seems to be: "In my test on Ubuntu 14.04, ./python -c "import subprocess; subprocess.call(['/xxx'], close_fds=False, restore_signals=False)" silently returns with zero exit code."
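The three behaviors listed above can be observed with a small probe. This is only a sketch: it assumes Python 3.8 or newer (where os.posix_spawn() exists), and the helper name spawn_and_classify() and its result strings are made up for illustration. Which branch is taken depends on the libc, so no particular output is claimed here.

```python
import errno
import os


def spawn_and_classify(path):
    """Spawn `path` with posix_spawn() and describe how a failure shows up.

    Hypothetical helper, not part of subprocess or PR 11242.
    """
    try:
        pid = os.posix_spawn(path, [path], os.environ)
    except OSError as exc:
        # The libc reported the failure to the parent (recent glibc behavior).
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return "parent-error:" + name
    # posix_spawn() "succeeded": the only information left is the exit status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return "child-exit:%d" % os.WEXITSTATUS(status)
    return "child-signal:%d" % os.WTERMSIG(status)


if __name__ == "__main__":
    print(spawn_and_classify("/xxx"))
```

A result starting with "parent-error:" corresponds to the glibc bullet, while "child-exit:127" corresponds to the FreeBSD bullet.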

execv() can fail for a lot of different reasons. Excerpt from the Linux execve() manual page:
---
       E2BIG  The total number of bytes in the environment (envp) and argument list (argv) is too large.

       EACCES Search permission is denied on a component of the path prefix of filename or the name of a script interpreter.  (See also
              path_resolution(7).)

       EACCES The file or a script interpreter is not a regular file.

       EACCES Execute permission is denied for the file or a script or ELF interpreter.

       EACCES The filesystem is mounted noexec.

       EAGAIN (since Linux 3.1)
              Having  changed  its  real  UID  using one of the set*uid() calls, the caller was—and is now still—above its RLIMIT_NPROC
              resource limit (see setrlimit(2)).  For a more detailed explanation of this error, see NOTES.

       EFAULT filename or one of the pointers in the vectors argv or envp points outside your accessible address space.

       EINVAL An ELF executable had more than one PT_INTERP segment (i.e., tried to name more than one interpreter).

       EIO    An I/O error occurred.

       EISDIR An ELF interpreter was a directory.

       ELIBBAD
              An ELF interpreter was not in a recognized format.

       ELOOP  Too many symbolic links were encountered in resolving filename or the name of a script or ELF interpreter.

       ELOOP  The maximum recursion limit was reached during  recursive  script  interpretation  (see  "Interpreter  scripts",  above).
              Before Linux 3.8, the error produced for this case was ENOEXEC.

       EMFILE The per-process limit on the number of open file descriptors has been reached.

       ENAMETOOLONG
              filename is too long.

       ENFILE The system-wide limit on the total number of open files has been reached.

       ENOENT The  file  filename or a script or ELF interpreter does not exist, or a shared library needed for the file or interpreter
              cannot be found.

       ENOEXEC
              An executable is not in a recognized format, is for the wrong architecture, or has some other format error that means  it
              cannot be executed.

       ENOMEM Insufficient kernel memory was available.

       ENOTDIR
              A component of the path prefix of filename or a script or ELF interpreter is not a directory.

       EPERM  The  filesystem  is  mounted  nosuid, the user is not the superuser, and the file has the set-user-ID or set-group-ID bit
              set.

       EPERM  The process is being traced, the user is not the superuser and the file has the set-user-ID or set-group-ID bit set.

       EPERM  A "capability-dumb" application would not obtain the full set of permitted capabilities granted by the executable  file.
              See capabilities(7).

       ETXTBSY
              The specified executable was open for writing by one or more processes.
---

Simply exiting with exit code 127 (the FreeBSD behavior) doesn't let the parent know whether the program was executed or not.

For example, "git bisect run" depends on the exit code: exit code 127 means "command not found". But with posix_spawn(), exit code 127 can mean something else...

I'm disappointed by the quality of the posix_spawn() implementation on FreeBSD and old glibc...
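To illustrate the ambiguity, here is a small sketch of how subprocess separates the two cases when exec failures are reported to the parent: a program that really ran and exited with 127 gives a return code, while a program that could not be executed raises OSError. The helper name run_and_describe() and its result strings are made up for illustration.

```python
import subprocess
import sys


def run_and_describe(argv):
    """Tell "ran and exited with a code" apart from "could not be executed"."""
    try:
        code = subprocess.call(argv)
    except OSError as exc:
        # exec failed: the error was reported to the parent as an exception.
        return "not executed (%s)" % type(exc).__name__
    return "executed, exit code %d" % code


if __name__ == "__main__":
    # A real program that chooses to exit with 127.
    print(run_and_describe([sys.executable, "-c", "import sys; sys.exit(127)"]))
    # A program that does not exist.
    print(run_and_describe(["/xxx"]))
```

If an exec failure only surfaced as exit code 127, both calls would look the same to the caller.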

--

I'm now confused.

Should we still use posix_spawn() on some platforms? For example, on macOS, posix_spawn() seems to have well-defined error reporting according to its manual page:
https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/posix_spawn.2.html

"""
RETURN VALUES
     If the pid argument is NULL, no pid is returned to the calling process;
     if it is non-NULL, then posix_spawn() and posix_spawnp() functions return
     the process ID of the child process into the pid_t variable pointed to by
     the pid argument and return a 0 on success.  If an error occurs, they
     return a non-zero error code as the function return value, and no child
     process is created.

ERRORS
     The posix_spawn() and posix_spawnp() functions will fail and return to
     the calling process if:

     [EINVAL]           The value specified by file_actions or attrp is
                        invalid.

     [E2BIG]            The number of bytes in the new process's argument list
                        is larger than the system-imposed limit.  This limit
                        is specified by the sysctl(3) MIB variable
                        KERN_ARGMAX.

     [EACCES]           Search permission is denied for a component of the
                        path prefix.

     [EACCES]           The new process file is not an ordinary file.

     [EACCES]           The new process file mode denies execute permission.

     [EACCES]           The new process file is on a filesystem mounted with
                        execution disabled (MNT_NOEXEC in <sys/mount.h>).

     [EFAULT]           The new process file is not as long as indicated by
                        the size values in its header.

     [EFAULT]           Path, argv, or envp point to an illegal address.

     [EIO]              An I/O error occurred while reading from the file
                        system.

     [ELOOP]            Too many symbolic links were encountered in
                        translating the pathname.  This is taken to be
                        indicative of a looping symbolic link.

     [ENAMETOOLONG]     A component of a pathname exceeded {NAME_MAX}
                        characters, or an entire path name exceeded
                        {PATH_MAX} characters.

     [ENOENT]           The new process file does not exist.

     [ENOEXEC]          The new process file has the appropriate access
                        permission, but has an unrecognized format (e.g., an
                        invalid magic number in its header).

     [ENOMEM]           The new process requires more virtual memory than is
                        allowed by the imposed maximum (getrlimit(2)).

     [ENOTDIR]          A component of the path prefix is not a directory.

     [ETXTBSY]          The new process file is a pure procedure (shared text)
                        file that is currently open for writing or reading by
                        some process.
"""

Would it be reasonable to use posix_spawn() but only on platforms where the implementation is known to be "good"?

"Good" would mean that execv() errors and posix_spawn() errors (failures to set attributes or to execute file actions) are properly reported to the parent process.
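As a point of comparison, this is a simplified sketch of the classic fork/exec technique for reporting an exec failure to the parent through a close-on-exec pipe. It is written in pure Python for illustration only and is not the actual _posixsubprocess code; the function name fork_exec_report() is made up.

```python
import errno
import os
import struct


def fork_exec_report(argv):
    """Fork and exec argv[0]; return (pid, 0) on success or (pid, errno) on failure."""
    # os.pipe() returns non-inheritable (close-on-exec) descriptors, so a
    # successful exec closes the write end and the parent reads EOF.
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: try to exec; on failure, send errno to the parent.
        try:
            os.close(read_fd)
            os.execv(argv[0], argv)
        except OSError as exc:
            os.write(write_fd, struct.pack("i", exc.errno))
        finally:
            os._exit(127)
    # Parent: an empty read means that exec succeeded.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        data = pipe.read()
    if not data:
        return pid, 0  # the caller is responsible for waiting on pid
    os.waitpid(pid, 0)  # reap the child that failed to exec
    return pid, struct.unpack("i", data)[0]


if __name__ == "__main__":
    pid, err = fork_exec_report(["/xxx"])
    print(errno.errorcode.get(err, err))
```

With this scheme the parent gets the precise errno rather than only exit code 127.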

For example, use posix_spawn() in subprocess on "recent" glibc (what minimum version?) and on macOS (where it's a syscall)?
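Such a platform check could look roughly like the sketch below. The function names are made up, and the minimum glibc version is deliberately left as a parameter since the right value is still an open question; the (2, 24) passed in the usage lines is only a placeholder, not a recommendation.

```python
import os
import sys


def parse_glibc_version(text):
    """Parse a string such as 'glibc 2.28' into (2, 28); return None otherwise."""
    parts = text.split()
    if len(parts) != 2 or parts[0] != "glibc":
        return None
    try:
        return tuple(int(p) for p in parts[1].split("."))
    except ValueError:
        return None


def posix_spawn_is_trusted(platform_name, libc_text, min_glibc):
    """Return True on macOS, or on glibc at least as new as min_glibc."""
    if platform_name == "darwin":
        return True
    version = parse_glibc_version(libc_text or "")
    return version is not None and version >= min_glibc


def current_libc_text():
    """Return e.g. 'glibc 2.28' on glibc systems, or None if unavailable."""
    try:
        return os.confstr("CS_GNU_LIBC_VERSION")
    except (AttributeError, ValueError, OSError):
        return None


if __name__ == "__main__":
    # (2, 24) is a placeholder threshold for illustration only.
    print(posix_spawn_is_trusted(sys.platform, current_libc_text(), (2, 24)))
```

Any platform not recognized by the check (FreeBSD, musl, old glibc) would keep using the existing _posixsubprocess code path.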