Fatal Python error: Py_Initialize: can't initialize sys standard streams #77030
Sometimes a new Python 3.6.4 process is aborted by the kernel on FreeBSD 11.1, before it loads any of my Python files. Found in syslog:
kernel: pid 22433 (python3.6), uid 2014: exited on signal 6 (core dumped)
I've been able to run ktrace on such a Python process; see the attachment. Around line 940 it shows: "RET fstat -1 errno 9 Bad file descriptor"
ktrace shows that dup(0) succeeded but fstat(0) failed. The symptom is the same as in bpo-30225. Could you check whether any of the following quick tests produces the same error?
python3 -c 'import os, subprocess, sys; r, w = os.pipe(); os.close(w); subprocess.call([sys.executable, "-c", ""], stdin=r)'
python3 -c 'import os, subprocess, sys; r, w = os.pipe(); os.close(r); subprocess.call([sys.executable, "-c", ""], stdin=w)'
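For readability, the two one-liners can also be written as a small script. This is only a sketch of the same experiment; the function name `run_child_with_half_closed_pipe` is made up for illustration. A negative return value from `subprocess.call` means the child was killed by a signal (for example -6 for SIGABRT).

```python
import os
import subprocess
import sys


def run_child_with_half_closed_pipe(close_end):
    """Spawn an empty Python child whose stdin is one end of a pipe,
    after the other end has been closed. Returns the child's exit status."""
    r, w = os.pipe()
    if close_end == "write":
        os.close(w)       # child gets the read end; the writer is gone
        child_stdin = r
    else:
        os.close(r)       # child gets the write end; the reader is gone
        child_stdin = w
    try:
        return subprocess.call([sys.executable, "-c", ""], stdin=child_stdin)
    finally:
        os.close(child_stdin)


if __name__ == "__main__":
    for end in ("write", "read"):
        status = run_child_with_half_closed_pipe(end)
        print(f"closed {end} end: child exit status {status}")
```

On an unaffected system both runs are expected to exit with status 0.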
I've tried your quick tests a few times but couldn't reproduce it immediately. The problem is hard to reproduce anyway: launching Python processes can go well for a long time (many days, with many processes launched every minute) until suddenly all NEW processes get aborted. It seems as if something in the relation to the parent process goes wrong. I've seen it happen with Python as the parent process but also with a plain shell process as the parent.
Thank you for checking. If this issue happens even when Python is run manually from an ordinary shell, fixing it in the same way as in bpo-30225 is probably not what you want: the error message will be gone, but the corresponding std stream will be None (sys.stdin in the case that you ktrace'd). However, if fd 0 really becomes unusable for some reason, there isn't anything Python can do. Given your description and ktrace log, I can't imagine why fd 0 would behave strangely only in Python. I've attached a small C program to check fd 0. Could you compile it and run it in an infinite loop from the shell in an attempt to reproduce this?
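To illustrate what "the std stream will be None" means in practice, here is a minimal POSIX-only sketch. It starts a child interpreter with fd 0 simply closed (not revoked, which is a different condition) and asks the child whether `sys.stdin` is None. The names `CHILD_CODE`, `_close_fd0` and `child_sees_stdin_as_none` are hypothetical, and `preexec_fn` is used only as a convenient way to close the descriptor just before exec.

```python
import os
import subprocess
import sys

# Code run by the child: report whether sys.stdin was set to None at startup.
CHILD_CODE = "import sys; print(sys.stdin is None)"


def _close_fd0():
    """Runs in the child just before exec: make fd 0 invalid."""
    try:
        os.close(0)
    except OSError:
        pass  # fd 0 was already closed in the parent


def child_sees_stdin_as_none():
    """Start a child interpreter without a valid fd 0 and report what it saw."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        preexec_fn=_close_fd0,   # POSIX only
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    print("child saw sys.stdin is None:", child_sees_stdin_as_none())
```

Code that may run in such an environment has to guard its use of the stream, for example with `if sys.stdin is not None:` before reading from it.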
OK, never mind with the test. I've finally got to a FreeBSD box and reproduced the problem. It has to do with the 'revoke' feature of *BSD. When revoke is called on a terminal device (as part of the logout process, for example), all descriptors associated with it are invalidated. They can be dup'ed, but any I/O (including fstat) will fail with EBADF. The attached 'repro.c' demonstrates the same behavior as Python in your ktrace log.
# sleep 5; ./repro >&err.txt &
So it seems that in your case the parent of your Python processes passed a descriptor referring to the terminal as fd 0, and then the terminal got revoked at some point. People have stumbled on that before, for example: https://bitbucket.org/tildeslash/monit/issues/649/init_env-fails-if-open-2-returns-an
As for Python, it seems OK to fix it as in bpo-30225 since the fd is unusable for I/O anyway. I think that we can even drop dup-based validation from is_valid_fd() since there is a corner case for Linux too: if a descriptor opened with O_PATH is inherited as a standard one, dup() will succeed but fstat() will fail in kernels before 3.6. And we do fstat() almost immediately after is_valid_fd() to get blksize, so the dup-based optimization doesn't seem worth the trouble. Victor, do you have an opinion on that?
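The difference between the two checks can be observed from Python as well. The following is a small diagnostic sketch, not the attached repro.c; `probe_fd` and `_attempt` are made-up names. It runs dup() and fstat() on a descriptor independently and reports the errno name for each. On a revoked terminal descriptor, the description above suggests it would report "ok" for dup and "EBADF" for fstat.

```python
import errno
import os


def _attempt(operation):
    """Run operation() and return 'ok' or the symbolic errno name."""
    try:
        operation()
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"


def probe_fd(fd):
    """Check fd with dup() and with fstat(), reporting each result separately."""
    return {
        # dup() only needs the descriptor slot to be open.
        "dup": _attempt(lambda: os.close(os.dup(fd))),
        # fstat() has to query the underlying object.
        "fstat": _attempt(lambda: os.fstat(fd)),
    }


if __name__ == "__main__":
    for fd in (0, 1, 2):
        print(fd, probe_fd(fd))
```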
For POSIX, that is. There is no fstat on Windows, and dup is probably OK there (or, even better, dup2(fd, fd) -- no need to close).
Thanks for all the research! My crashing Python process is started by a shell process which is launched by the FreeBSD daemon tool; this might explain why stdin is no longer valid. But I'm not sure why it can sometimes be solved by restarting the daemon.
Could it be simply because daemon is respawned from a process that does have a valid stdin at the time of respawn? Note that daemon has an option to redirect std streams to /dev/null.
Yes, that could certainly be the case. Thanks!
I have a similar crash with Python 3.7.2 on Linux. Steps to reproduce: send SIGINT while Python initializes. I've built a debug version of Python 3.7.2 and collected a core dump:
(gdb) thread apply all bt
Thread 1 (Thread 0x7f8f5ee67e80 (LWP 13285)):
Aha, the problem is still the is_valid_fd() function:
The function has been fixed on macOS with:
#ifdef __APPLE__
    /* bpo-30225: On macOS Tiger, when stdout is redirected to a pipe
       and the other side of the pipe is closed, dup(1) succeed, whereas
       fstat(1, &st) fails with EBADF. Prefer fstat() over dup() to detect
       such error. */
    struct stat st;
    return (fstat(fd, &st) == 0);
#else
I see two options:
I wrote attached PR 12852 to only use dup() on Linux and Windows.
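The platform split described here can be sketched in Python. This is only an analogue of the idea (dup() on Linux and Windows, fstat() elsewhere), not the C code from PR 12852; the platform test via `sys.platform` is an assumption made for illustration.

```python
import os
import sys


def is_valid_fd(fd):
    """Python analogue of the described check: dup() on Linux and
    Windows, fstat() on every other platform."""
    if sys.platform.startswith("linux") or sys.platform == "win32":
        try:
            fd2 = os.dup(fd)
        except OSError:
            return False
        os.close(fd2)
        return True
    # Elsewhere (macOS, the BSDs, ...), dup() may succeed on a descriptor
    # that is unusable for I/O, so ask fstat() instead.
    try:
        os.fstat(fd)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    for fd in (0, 1, 2):
        print(fd, is_valid_fd(fd))
```

With this split, a revoked terminal descriptor on FreeBSD would be reported as invalid, because the fstat() branch is taken there.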
Alexey Izbyshev: "I think that we can even drop dup-based validation from is_valid_fd() since there is a corner case for Linux too: if a descriptor opened with O_PATH inherited as a standard one, dup() will succeed but fstat() will fail in kernels before 3.6. And we do fstat() almost immediately after is_valid_fd() to get blksize, so the dup-based optimization doesn't seem worth the trouble. Victor, do you have an opinion on that?"
I don't understand this case. I don't know O_PATH, nor how such a special file descriptor can be inherited. Would you mind elaborating?
man open:
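In the following C program, fd 0 is a file descriptor opened with O_PATH: dup(0) and fstat(0) both succeed, which is not surprising, since it's a valid file descriptor.
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Linux value of O_PATH, used when <fcntl.h> does not expose it. */
#ifndef O_PATH
#define O_PATH 010000000
#endif

int main(void)
{
    /* Open the current directory with O_PATH and install it as fd 0. */
    int path_fd = open(".", O_PATH);
    if (path_fd < 0) {
        perror("open");
        return 1;
    }
    if (dup2(path_fd, 0) < 0) {
        perror("dup2");
    }

    /* First check: can fd 0 be duplicated? */
    int fd = dup(0);
    if (fd < 0) {
        perror("dup");
    }
    else {
        fprintf(stderr, "dup ok: %d\n", fd);
        close(fd);
    }

    /* Second check: can fd 0 be stat'ed? */
    struct stat st;
    if (fstat(0, &st) < 0) {
        perror("fstat");
    }
    else {
        printf("fstat ok\n");
    }
    return 0;
}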
In short, Python 2.7 doesn't seem to be affected by fstat/dup issues. Python 2.7 doesn't check whether file descriptors 0, 1 and 2 are valid at startup. Python 2 uses PyFile_FromFile() to create sys.stdin, sys.stdout and sys.stderr, which creates a "file" object. The function calls fstat(fd) but ignores any error: fstat() is only used to fail if the fd is a directory. Python 2.7 doesn't have the is_valid_fd() function.
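The "fstat() is only used to reject directories" logic can be sketched as follows. This is a Python illustration of the behaviour described above, not the actual PyFile_FromFile() code; `refers_to_directory` is a hypothetical name.

```python
import os
import stat


def refers_to_directory(fd):
    """Return True only when fstat() succeeds and reports a directory.
    An fstat() failure is ignored, so a bad fd is not rejected here."""
    try:
        st = os.fstat(fd)
    except OSError:
        return False  # error ignored: the fd is not treated as a directory
    return stat.S_ISDIR(st.st_mode)


if __name__ == "__main__":
    dir_fd = os.open(".", os.O_RDONLY)
    print(refers_to_directory(dir_fd))
    os.close(dir_fd)
    print(refers_to_directory(dir_fd))
```

Under this logic a descriptor whose fstat() fails with EBADF passes through unnoticed, which is consistent with Python 2.7 not aborting at startup.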
Thanks Rudolph Froger for the bug report: the issue is now fixed in the 3.7 and master (future Python 3.8) branches. Sorry for the delay.
--
Alexey Izbyshev wrote PR 5773 to also use fstat() on Linux. I chose to merge my PR 12852, which is more conservative: it keeps dup() on Linux. I'm not sure exactly why, but I recall that the author of the function, Antoine Pitrou, wanted to use dup() on Linux. I'm not convinced by the O_PATH issue on Linux (described above), so I merged my conservative change instead. Later, we can still move to fstat() on Linux as well if someone comes up with a more concrete example against dup().
Thanks all for the fixes!