classification
Title: Crash in the libc fwrite() on SIGPIPE (segfault with os.popen and SIGPIPE)
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: akira, hanno, neologix, r.david.murray, terry.reedy, vstinner
Priority: normal Keywords:

Created on 2014-03-07 18:22 by hanno, last changed 2014-12-11 08:13 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
sigpipe_crash.py hanno, 2014-03-07 18:22 test case sigpipe_crash.py
Messages (11)
msg212897 - (view) Author: Hanno Boeck (hanno) * Date: 2014-03-07 18:22
I experience a segmentation fault with python 2.7 (both 2.7.5 and 2.7.6 tested on Ubuntu and Gentoo) when a large file is piped, the pipe is passed to os.popen and the process sends a SIGPIPE signal.

To create an easy to reproduce testcase grep can be used. See example attached.

To test first create a dummy file containing zeros, around 1 megabyte is enough:
for i in `seq 1 100000`; do echo "0123456789" >> dummy.txt; done

Then pipe it to the script attached like this:
cat dummy.txt | python2 minimal.py

Result is a Segmentation fault. The same code doesn't segfault with python 3.
msg213590 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-03-14 21:19
Your example is ambiguous as to which of the two pipings causes the problem. First you cat a large file into the script, which reads it in its entirety with "data = sys.stdin.read()". If that causes the segfault, then everything that follows is irrelevant. If that works and it is the second piping out that is the problem, then the rigamarole with creating an external file and piping it in is irrelevant. "data = '0123456789' * 10000" would be sufficient.

In 2.6/7, os.popen is deprecated in favor of using subprocess. In 3.x, popen was, I have been told, re-written to use subprocess. So if popen is the problem here, then the fix is to use subprocess explicitly in 2.7.
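
A minimal sketch of the "use subprocess explicitly" suggestion, assuming a child that exits without reading its input. The helper name feed_child is hypothetical, and "python -c pass" stands in for the grep call in the attached test case. Whether this avoids the reported crash on an affected 2.7 build has not been verified here.

```python
import subprocess
import sys


def feed_child(data, argv=None):
    """Send data to a child's stdin via subprocess; return its exit code."""
    if argv is None:
        # Hypothetical stand-in for grep: a child that exits without reading.
        argv = [sys.executable, "-c", "pass"]
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE)
    try:
        # communicate() writes the input, closes stdin and waits for the child.
        proc.communicate(data)
    except (IOError, OSError):
        # Some versions may report the broken pipe here; still reap the child.
        proc.wait()
    return proc.returncode


if __name__ == "__main__":
    # About 1 MB of input, similar in size to dummy.txt from the report.
    print(feed_child(b"0123456789\n" * 100000))
```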
msg217818 - (view) Author: Akira Li (akira) * Date: 2014-05-03 06:50
I can't reproduce it on Ubuntu 12.04 with Python 2.7.3, 2.7.6, 3.2,
tip -- no segfault.

It prints the expected output on both Python 2 and 3:

  (standard input)
  io-error

"(standard input)" is printed by grep due to --files-with-match option

io-error (Broken pipe) is because grep exits as soon as it sees the
first decimal zero due to `--files-with-match 0` args, without waiting
for all input to arrive; therefore the subsequent attempts by the
parent python process to write to grep fail with BrokenPipeError.

You could get the same behaviour using "python -c pass" instead of
grep.

If the input is less than an OS pipe buffer (~64K) then fd.write
succeeds because the system call os.write(pipe, input) succeeds
whether the child process reads its input or not. Introducing a delay
before writing to the child process generates the error reliably even
for small input because the delay allows the child process to
exit. Despite being stdio-based, Python 2 io behaves the same in this
case.
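
The pipe-buffer point can be illustrated with a small sketch, assuming a POSIX system where the pipe buffer holds at least 1024 bytes (the ~64K figure above is typical on Linux). The function name is hypothetical. Nothing ever reads from the pipe, yet the write returns the full length:

```python
import os


def write_without_reader_progress(size=1024):
    """Write `size` bytes into a pipe nobody reads; return bytes written."""
    r, w = os.pipe()
    try:
        # The read end stays open but is never read: the kernel buffers
        # the data, so the write succeeds as long as it fits.
        return os.write(w, b"0" * size)
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(write_without_reader_progress())
```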

SIGPIPE is suppressed in python by default, therefore the error is
generated instead of the process dying on SIGPIPE. If the signal is restored:

  import signal
  signal.signal(signal.SIGPIPE, signal.SIG_DFL) # restore SIGPIPE

then the parent process dies on SIGPIPE (if input is larger than the
OS pipe buffer or if the child process has already exited -- the same
as for BrokenPipeError).
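
A minimal sketch of the default (SIGPIPE ignored) case, assuming the interpreter's startup signal settings have not been changed. The function name is hypothetical. With no reader left on the pipe, the write fails with EPIPE instead of killing the process:

```python
import errno
import os


def write_to_closed_pipe():
    """Write to a pipe whose read end is closed; return the errno raised."""
    r, w = os.pipe()
    os.close(r)  # no reader left, so any write is a "broken pipe"
    try:
        os.write(w, b"0\n")
    except OSError as exc:
        # Reached only because SIGPIPE is ignored; with SIG_DFL restored
        # the process would be terminated by the signal instead.
        return exc.errno
    finally:
        os.close(w)
    return None


if __name__ == "__main__":
    print(write_to_closed_pipe() == errno.EPIPE)
```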

The behaviour is the same ("Broken pipe" for large input) if syscalls
are used directly instead of os.popen:

  #!/usr/bin/env python
  from sys import argv, exit
  from os import close, dup2, execlp, fork, pipe, wait, write
  
  n = int(argv[1]) if len(argv) > 1 else 100000000
  n = (n // 2) * 2 # make it even
  assert n > 1
  
  in_, out = pipe()
  if fork() == 0: # child
      close(out)    # close unused write end of the pipe
      dup2(in_, 0)  # redirect stdin to the pipe
      close(in_)
      execlp('/bin/grep', '/bin/grep', '--files-with-match', '0')
  else: # parent
      close(in_) # close unused read end of the pipe
      while n > 1:
          n -= write(out, b'0\n' * (n // 2)) # write input to the child
      close(out) # no more input
      exit(wait()[1]) # wait for the child to exit
  assert 0

If you meant something else, you could write a more specific test.

For reference:

os.popen() in Python 2: http://hg.python.org/cpython/file/2.7/Modules/posixmodule.c#l4560
os.popen() in Python 3: http://hg.python.org/cpython/file/3.4/Lib/os.py#l928
msg217824 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-05-03 13:13
I can reproduce the crash. It occurs at the line "fd.write(data)". It looks like the crash occurs in the C function fwrite() which doesn't handle EPIPE / SIGPIPE correctly.

Top of the gdb traceback:

#0  0x00000033d0a8968b in __mempcpy_sse2 () from /lib64/libc.so.6
#1  0x00000033d0a79339 in __GI__IO_default_xsputn () from /lib64/libc.so.6
#2  0x00000033d0a77362 in __GI__IO_file_xsputn () from /lib64/libc.so.6
#3  0x00000033d0a6cfad in fwrite () from /lib64/libc.so.6
#4  0x0000000000435cc4 in file_write (f=0x7f46d74a2dc0, 
    args=('0123456789\n0123456789\n0123456789\n...(truncated))
    at Objects/fileobject.c:1852

Last syscalls (strace output):

...
pipe2([3, 4], O_CLOEXEC)                = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f5ce26aca10) = 4711
close(3)                                = 0
fcntl(4, F_SETFD, 0)                    = 0
fstat(4, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
fstat(4, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5ce26ca000
write(4, "0123456789\n0123456789\n0123456789"..., 1097728) = 98304
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=4710, si_uid=1000} ---
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=4711, si_status=0, si_utime=0, si_stime=0} ---
write(4, "89\n0123456789\n0123456789\n0123456"..., 999424) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=4710, si_uid=1000} ---
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x7f5cdbe87000} ---
+++ killed by SIGSEGV (core dumped) +++
msg217827 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-05-03 19:34
> I can reproduce the crash. It occurs at the line "fd.write(data)". It looks like the crash occurs in the C function fwrite() which doesn't handle EPIPE / SIGPIPE correctly.

Wouldn't be the first time.

Note that in Python 3, we don't fopen/fwrite anymore, so Python 3
isn't affected.

I suggest closing as "won't fix".
msg217828 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-05-03 20:26
I was thinking the same thing. This appears to be one of the 2.x bugs that have been fixed in 3.x but not 2.x because backporting the fix might break working code. Is there another sensible fix that would be acceptable in 2.x?
msg217830 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-05-03 21:40
It's segfaulting inside fwrite(), so apart from completely rewriting
the IO layer in 2.x, I don't see one.
msg217909 - (view) Author: Akira Li (akira) * Date: 2014-05-05 07:13
Victor, where can you reproduce it (OS, python version, what C lib)?

I don't receive segfault, only sigpipe (see msg217818 ). Here's gdb backtrace after the signal:

Program received signal SIGPIPE, Broken pipe.
0x00007ffff71e1040 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:82
82      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) backtrace
#0  0x00007ffff71e1040 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007ffff7173883 in _IO_new_file_write (f=0x7e1f10, data=0x7ffff66cb034,
    n=1097728) at fileops.c:1289
#2  0x00007ffff717374a in new_do_write (fp=0x7e1f10,
    data=0x7ffff66cb034 "0123456789\n...01"...,
    to_do=1097728) at fileops.c:543
#3  0x00007ffff71741fe in _IO_new_file_xsputn (n=1100000, data=<optimized out>,
    f=0x7e1f10) at fileops.c:1383
#4  _IO_new_file_xsputn (f=0x7e1f10, data=<optimized out>, n=1100000)
    at fileops.c:1305
#5  0x00007ffff7169cdd in _IO_fwrite (buf=<optimized out>, size=1, count=1100000,
    fp=0x7e1f10) at iofwrite.c:45
#6  0x000000000042c23c in file_write (f=0x7ffff7f16540, args=<optimized out>)
    at Objects/fileobject.c:1851

Note: the line is 1851, not 1852 (as in the latest 2.7 version [1]) as in your traceback.
And the calls inside _IO_new_file_xsputn() are also different.

[1]: http://hg.python.org/cpython/file/b768d41dec0a/Objects/fileobject.c#l1852

python2.7.6 is installed using `pythonz install 2.7.6` command.

Just to make sure that python2.7.6 *can* segfault:

  $ python2.7.6 -c'import ctypes; ctypes.memset(0,0,1)'
  Segmentation fault (core dumped)

core file is not written on my system:

  $ cat /proc/sys/kernel/core_pattern
  |/usr/share/apport/apport %p %s %c

But I can see in the log when a process segfaults e.g.,
the segfault due to memset is logged:

  $ tail -F /var/log/apport.log
  ERROR: apport (pid 8501) ... executable: ~/.pythonz/pythons/CPython-2.7.6/bin/python2.7 \
      (command line "python2.7.6 -cimport\ ctypes;\ ctypes.memset(0,0,1)")


To find out where `fwrite` comes from, I've done:

  $ nm $(which python2.7.6) | grep fwrite
                   U fwrite@@GLIBC_2.2.5

  $ cat $(gcc -print-file-name=libc.so)
  /* GNU ld script
   ... */
  OUTPUT_FORMAT(elf64-x86-64)
  GROUP ( /lib/x86_64-linux-gnu/libc.so.6 ... )

  $ ldd $(which python2.7.6)
          ...
          libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
          ...

  $ /lib/x86_64-linux-gnu/libc.so.6
  GNU C Library (Ubuntu EGLIBC 2.15-0ubuntu10.5) stable release version 2.15, by Roland McGrath et al.
  Copyright (C) 2012 Free Software Foundation, Inc.
  ...
  Compiled by GNU CC version 4.6.3.
  Compiled on a Linux 3.2.50 system on 2013-09-30.
  ...

/usr/bin/python behaves similarly -- it just has a version that is not mentioned by the OP.
msg232291 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-12-07 23:47
Does anyone disagree with closing this as 'Won't fix'?
msg232296 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-12-08 01:26
I agree it should be closed.  "Rewrite the IO system" was done, and it was even backported to 2.x...it just isn't the default there.
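
A minimal sketch of writing through the backported io module rather than the built-in stdio-based file object, assuming a POSIX system. The helper name write_via_io is hypothetical and "python -c pass" stands in for grep. It waits for the child to exit so the broken pipe is reported reliably; it has not been tested on an affected 2.7 build.

```python
import errno
import io
import os
import subprocess
import sys


def write_via_io(data):
    """Write to an exited child's pipe through io.open; return the errno."""
    r, w = os.pipe()
    # Hypothetical stand-in for grep: a child that exits without reading.
    child = subprocess.Popen([sys.executable, "-c", "pass"], stdin=r)
    os.close(r)   # parent keeps only the write end
    child.wait()  # after this, nobody holds the read end
    f = io.open(w, "wb", buffering=0)  # unbuffered io.FileIO, owns fd w
    try:
        f.write(data)
    except (IOError, OSError) as exc:
        return exc.errno
    finally:
        f.close()
    return None


if __name__ == "__main__":
    print(write_via_io(b"0123456789\n" * 100000) == errno.EPIPE)
```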
msg232461 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-12-11 08:13
I added this bug to my list of "Bugs that won’t be fixed in Python 2 anymore":
http://haypo-notes.readthedocs.org/python.html#bugs-in-the-c-stdio-used-by-the-python-i-o
History
Date User Action Args
2014-12-11 08:13:00 vstinner set messages: + msg232461
2014-12-11 08:03:07 vstinner set title: segfailt with os.popen and SIGPIPE -> Crash in the libc fwrite() on SIGPIPE (segfault with os.popen and SIGPIPE)
2014-12-11 00:05:09 terry.reedy set status: open -> closed
resolution: fixed
stage: test needed -> resolved
2014-12-08 01:26:35 r.david.murray set nosy: + r.david.murray
messages: + msg232296
2014-12-07 23:47:52 terry.reedy set messages: + msg232291
2014-05-05 07:13:36 akira set messages: + msg217909
2014-05-03 21:40:49 neologix set messages: + msg217830
2014-05-03 20:26:16 terry.reedy set messages: + msg217828
2014-05-03 19:34:59 neologix set messages: + msg217827
2014-05-03 13:13:59 vstinner set nosy: + vstinner, neologix
messages: + msg217824
2014-05-03 06:50:25 akira set nosy: + akira
messages: + msg217818
2014-03-14 21:19:19 terry.reedy set nosy: + terry.reedy
messages: + msg213590
stage: test needed
2014-03-07 18:22:29 hanno create