classification
Title: Sporadic EINVAL in nonblocking pipe os.read when forked child fails on Mac OS
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.7, Python 2.6
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: Denis.Bilenko, hynek, ned.deily, neologix, ronaldoussoren, schmir, vitaly
Priority: normal Keywords:

Created on 2012-09-09 23:04 by vitaly, last changed 2012-09-19 20:26 by vitaly. This issue is now closed.

Files
File name Uploaded Description Edit
test_pipe_error.py vitaly, 2012-09-09 23:04 Test file for reproducing the issue
test_fork_pipe_error.py vitaly, 2012-09-10 23:56 Iterates in increasing or decreasing order over initial pipe read sizes and accumulates failed size values
test_fork_pipe_error.cpp vitaly, 2012-09-11 07:22 A C-language program to reproduce this issue without Python
Messages (27)
msg170144 - (view) Author: Vitaly (vitaly) Date: 2012-09-09 23:04
Python 2.6.7 on Mac OS 10.7.4 running on MacBookPro laptop.


run attachment as: python test_pipe_error.py

The test() function makes multiple calls to _execute_child() in a loop. Every other call to _execute_child() fails with:
=====
os.read(3, 1048576) self pid:  43953
EINVAL FROM TEST:  Traceback (most recent call last):
  File "test_pipe_error.py", line 127, in test
    _execute_child()
  File "test_pipe_error.py", line 95, in _execute_child
    newData = os.read(errpipe_read, rSize)
OSError: [Errno 22] Invalid argument
=====

_execute_child() is based on subprocess.Popen._execute_child, but with most Popen code removed and errpipe_read configured as os.O_NONBLOCK in the parent process.  Raising of an Exception() instance is used to simulate failure in os.execvp() instead of calling os.execvp().

We initially experienced the EINVAL issue in the parent process' os.read call when os.execvp failed in the child (e.g., errno=2, "file not found"), so I stripped the code to bare minimum in order to reproduce the problem with a smaller code snippet.
msg170193 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-09-10 14:53
Looks like - another - OS-X bug.

What happens if you reduce the size argument when reading from the error pipe? Try setting it to a value like 512 (minimum guaranteed PIPE_BUF):
"""
newData = os.read(errpipe_read, min(512, rSize))
"""

You could also try to put a short sleep before exiting the child process (after pickling the exception) to see what happens if the reader reads before the pipe is closed by the other end (that would be a huge bug, but hey, who knows...).
msg170227 - (view) Author: Vitaly (vitaly) Date: 2012-09-10 20:09
Per Charles-François Natali (neologix), I tried the following:

1. Reduce the initial read size argument (rSize in my code snippet) from 1048576 (1MB, the same amount as used by subprocess.Popen._executeChild) to 1000000: the problem appears to go away (could be just red herring, though); this actually makes me think that it's a Python implementation problem on MacOS - why should an initial read request with 1048576 read size trigger EINVAL every other time (1,3,5,9, etc.), but using the initial size of 1000000 doesn't?
2. "put a short sleep before exiting the child process (after pickling the exception) to see what happens if the reader reads before the pipe is closed by the other end" -- I took that to mean a delay after the "os.write(errpipe_write, pickle.dumps(exc_value))" call in my snippet: I tried 1, 2, and 4-second delays (via time.sleep()).  One time, I saw the Invalid Argument in only one of the 9 iterations, but in all the other runs Invalid Argument showed up as before on iterations: 1, 3, 5, 7, 9

3. put the delay (tried with 1 and 2 seconds) before "os.write(errpipe_write, pickle.dumps(exc_value))": Invalid Argument showed up as before on iterations: 1, 3, 5, 7, 9
msg170238 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-09-10 21:19
> 1. Reduce the initial read size argument (rSize in my code snippet) from 1048576 (1MB, the same amount as used by subprocess.Popen._executeChild) to 1000000: the problem appears to go away (could be just red herring, though); this actually makes me think that it's a Python implementation problem on MacOS - why should an initial read request with 1048576 read size trigger EINVAL every other time (1,3,5,9, etc.), but using the initial size of 1000000 doesn't?

That's not a Python issue. os.read() juste calls the read() syscall,
nothing more. Failing with EBADF could maybe have an explanation if we
endup up calling read() on a closed FD, but nothing can explain
EINVAL.
We've had many cases of strange OS-X bugs, like this one (EINVAL too):
http://bugs.python.org/issue15594

One last thing I'd like to make sure of, is trying to put a delay
before reading the exception from the pipe. There may be cases where
read() returns EINVAL instead of EAGAIN when there's no data in the
pipe.
msg170251 - (view) Author: Vitaly (vitaly) Date: 2012-09-11 00:18
os.read() must also be allocating the read buffer in addition to calling the system read() function.  I just uploaded a new test script "test_fork_pipe_error.py" that iterates in increasing or decreasing order over initial pipe read sizes and accumulates failed size values.  It iterates in multiples of 1024 (1024*1, 1024*2, 1024*3, ... 1024*1024 or in reverse) .  It makes some interesting (possibly also red-herring) discoveries:

1. When iterating the initial read size in increasing order (1024, 2048, 3072, ...) the test doesn't trigger the EINVAL error.

2. When iterating the initial read size in decreasing order, the test triggers EINVAL every 4KB (1024*1024, 1024*1020, 1024*1016, etc.) all the way down to 1024*128.  However, the test doesn't trigger any more EINVAL errors for initial read sizes of 127KB and lower (1024*127, 1024*126, 1024*125, ... 1024*1).

Failed initialReadSize multipliers of 1024: [1024, 1020, 1016, 1012, 1008, 1004, 1000, 996, 992, 988, 984, 980, 976, 972, 968, 964, 960, 956, 952, 948, 944, 940, 936, 932, 928, 924, 920, 916, 912, 908, 904, 900, 896, 892, 888, 884, 880, 876, 872, 868, 864, 860, 856, 852, 848, 844, 840, 836, 832, 828, 824, 820, 816, 812, 808, 804, 800, 796, 792, 788, 784, 780, 776, 772, 768, 764, 760, 756, 752, 748, 744, 740, 736, 732, 728, 724, 720, 716, 712, 708, 704, 700, 696, 692, 688, 684, 680, 676, 672, 668, 664, 660, 656, 652, 648, 644, 640, 636, 632, 628, 624, 620, 616, 612, 608, 604, 600, 596, 592, 588, 584, 580, 576, 572, 568, 564, 560, 556, 552, 548, 544, 540, 536, 532, 528, 524, 520, 516, 512, 508, 504, 500, 496, 492, 488, 484, 480, 476, 472, 468, 464, 460, 456, 452, 448, 444, 440, 436, 432, 428, 424, 420, 416, 412, 408, 404, 400, 396, 392, 388, 384, 380, 376, 372, 368, 364, 360, 356, 352, 348, 344, 340, 336, 332, 328, 324, 320, 316, 312, 308, 304, 300, 296, 292, 288, 284, 280, 276, 272, 268, 264, 260, 256, 252, 248, 244, 240, 236, 232, 228, 224, 220, 216, 212, 208, 204, 200, 196, 192, 188, 184, 180, 176, 172, 168, 164, 160, 156, 152, 148, 144, 140, 136, 132, 128]

The reason I said above that those might be red-herring discoveries is this: if I insert a short time.sleep(0.001) delay before the outer pipe-read loop, the EINVAL errors don't get triggered either.

Still, these remain a mystery: 

1. Why doesn't the test encounter EINVAL in the range 127KB ... 1KB (when iterating initialReadSize in *decreasing* order).  If the pre-read delay is significant, then does it take significantly more time to allocate a 127KB chunk of memory than it does a 128KB chunk?

2. Why doesn't the test encounter EINVAL at all when iterating initialReadSize in *increasing* order?
msg170268 - (view) Author: Vitaly (vitaly) Date: 2012-09-11 07:22
I wrote a C-language program to reproduce this issue on Mac OS without Python.  It reproduces the issue in both increasing and decreasing order of initial read sizes, and it reliably reproduces it for each KB from 128KB to 1024KB ; the Python version reproduced the issue every 4K and only in decreasing order (probably something related to Python's memory management optimizations).

A new caveat: if the read buffer is allocated once before entering the read loop, then we don't get any EINVAL in the entire run; however, if the read buffer is allocated for each read request inside the loop, then we get EINVAL every other time in the range from 128KB and up.

I would like to file this issue with apple/Mac OS. What's the appropriate URL for this?
msg170270 - (view) Author: Vitaly (vitaly) Date: 2012-09-11 07:23
The C-language program for reproducing the same issue is attached as test_fork_pipe_error.cpp
msg170271 - (view) Author: Vitaly (vitaly) Date: 2012-09-11 07:27
I used g++ to compile test_fork_pipe_error.cpp on both Mac OS and on Linux.  EINVAL showed up only on Mac OS.
msg170272 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-09-11 07:32
> The reason I said above that those might be red-herring discoveries is this: if I insert a short time.sleep(0.001) delay before the outer pipe-read loop, the EINVAL errors don't get triggered either.

That's interesting.
So it really seems that under certain conditions, a non-blocking read from an empty pipe could fail with EINVAL instead of EAGAIN. But this is definitely a bug, in such cases read() should return EAGAIN.

> 1. Why doesn't the test encounter EINVAL in the range 127KB ... 1KB (when iterating initialReadSize in *decreasing* order).  If the pre-read delay is significant, then does it take significantly more time to allocate a 127KB chunk of memory than it does a 128KB chunk?
>
> 2. Why doesn't the test encounter EINVAL at all when iterating initialReadSize in *increasing* order?

I'm not sure it's a delay issue in this case. It may have more to do with the aligment of the buffer passed to read(). You can for example imagine that this error would show up only when the buffer is (or is not) aligned on a page boundary (usually 4K). As for the gap between 127KB and 128KB, it could be that we're allocating a new arena (we have a custom memory allocator over malloc()/mmap()), or that we switch between brk() and mmap(), but that's mere speculation, and there's nothing we can (and should) do here.

I'm not sure about what to do with this, though:
- adding a delay is out of question
- retrying on EINVAL like on EAGAIN is not a good idead, since it could mask real bugs

One thing we could do would be to limit the the call to read() to, let's say 64KB per call:
"""
newData = os.read(errpipe_read, min(65536, rSize))
"""

But this would only work here, there are probably other places where this bug could be encountered (and I don't like adding hacks to avoid platform bugs).

> I wrote a C-language program to reproduce this issue on Mac OS without Python.
> I would like to file this issue with apple/Mac OS. What's the appropriate URL for this?
> I used g++ to compile test_fork_pipe_error.cpp on both Mac OS and on Linux.  EINVAL showed up only on Mac OS.

Told you it was an OS-X bug (we've had several of those) ;-)
As for where to report it, I'm making some OS-X enclined devs nosy.

I'm suggest closing this bug as invalid (unless someone considers that we should try to work around it with the above trick).
msg170273 - (view) Author: Vitaly (vitaly) Date: 2012-09-11 07:44
By the way, the existing code in subprocess.Popen (at least on 2.6.7) reads the pipe incorrectly: It doesn't loop to read all the data until EOF -- it only loops over EINTR until it gets a single successful os.read() call.  However, since this is a pipe read (not a real file read), the system doesn't guarantee that the blocking read will read everything up to requested read size or EOF, whichever comes first.  So, the single os.read call could return a partial read, and the subsequent un-pickling of the exception would fail.
msg170275 - (view) Author: Vitaly (vitaly) Date: 2012-09-11 08:18
I filed this issue with apple: Problem ID: 12274650: https://bugreport.apple.com/cgi-bin/WebObjects/RadarWeb.woa/64/wo/VE1RGG9qEL5OS9KdzFSDHw/19.66
msg170276 - (view) Author: Vitaly (vitaly) Date: 2012-09-11 08:19
Apple bug report URL correction: https://bugreport.apple.com/cgi-bin/WebObjects/RadarWeb.woa/64/wo/VE1RGG9qEL5OS9KdzFSDHw/17.66
msg170277 - (view) Author: Vitaly (vitaly) Date: 2012-09-11 08:23
Sorry, I don't know why the URL comes out all messed up.  I can't seem to find the correct syntax for this.
msg170278 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-09-11 08:38
> By the way, the existing code in subprocess.Popen (at least on 2.6.7) reads the pipe incorrectly: It doesn't loop to read all the data until EOF -- it only loops over EINTR until it gets a single successful os.read() call.  However, since this is a pipe read (not a real file read), the system doesn't guarantee that the blocking read will read everything up to requested read size or EOF, whichever comes first.  So, the single os.read call could return a partial read, and the subsequent un-pickling of the exception would fail.

Indeed.
Do you want to open a new issue for that (and provide a patch)?
msg170280 - (view) Author: Vitaly (vitaly) Date: 2012-09-11 09:07
Filed http://bugs.python.org/issue15918 for the incorrect pipe read logic in subprocess.Popen.
msg170285 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2012-09-11 10:28
What's wrong with working around this bug by reading a smaller amount? How much data is there supposed to be?

BTW. URLs for reports in Apple's tracker are fairly useless as bugreports are private (only you and Apple's engineers can see the report). The bug number is useful though.
msg170316 - (view) Author: Vitaly (vitaly) Date: 2012-09-11 15:52
> What's wrong with working around this bug by reading a smaller amount? How much data is there supposed to be?

This makes sense for working around the issue.  Even in the blocking-read case, such as in subprocess.Popen, attempting to read a 1MB chunk of data in a single os.read call is unhelpful anyway - see http://bugs.python.org/issue15918.  Also, the 1MB read forces os.read() to unnecessarily allocate a huge !MB buffer (even if only for a short lifetime)
msg170656 - (view) Author: Vitaly (vitaly) Date: 2012-09-18 16:12
In the work-around, we need to watch out for what 'man 2 read' on Mac OS refers to as "normal file":

==
Upon successful completion, read(), readv(), and pread() return the number of bytes actually read and placed in the buffer.  *The system      guarantees to read the number of bytes requested if the descriptor references a normal file that has that many bytes left before the end-of-file*, but in no other case.
==

I am guessing that fstat() + S_ISREG() can be used to discern "normal files" from other types of file descriptors.
msg170664 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-09-18 19:02
> What's wrong with working around this bug by reading a smaller amount?
> How much data is there supposed to be?

Nothing, except that there are probably other places in the stdlib
where we can get bitten by this bug. Note that this should eventually
be done for another reason, see http://bugs.python.org/issue15918

> In the work-around, we need to watch out for what 'man 2 read' on Mac OS refers to as
> "normal file":

There's nothing special actually.
This precision concerning normal file is OS-X specific, POSIX makes no
such guarantee: you can always get a short read (e.g. because of
signal delivery), so one should always call read() in a loop.
msg170673 - (view) Author: Vitaly (vitaly) Date: 2012-09-18 20:07
> Nothing, except that there are probably other places in the stdlib
> where we can get bitten by this bug. Note that this should eventually
> be done for another reason, see http://bugs.python.org/issue15918

For greatest benefit, I think that the work-around should be implemented in os.read()
msg170705 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2012-09-19 06:27
The workaround should not be implemented in os.read because it is a very thin wrapper around the system call and should stay that way.
msg170707 - (view) Author: Vitaly (vitaly) Date: 2012-09-19 06:55
> The workaround should not be implemented in os.read because it is a very thin wrapper around the system call and should stay that way.

Although this issue was initially filed as "Sporadic EINVAL in nonblocking pipe os.read when forked child fails on Mac OS", the subsequent investigation revealed that this has nothing to do with a forked child failing.  Rather, it's a bug in the read() syscall on Mac OS X.  What would be a more appropriate place to work around this read() syscall bug than os.read?
msg170708 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2012-09-19 07:24
How can you work around it in os.read, without knowing anything about what the file descriptor represents? Just triggering on getting on EINVAL error when calling read might trigger other problems and might even be a valid result for some file descriptors (for example for custom file systems or device files).

And what kind of workaround do you propose?
msg170731 - (view) Author: Vitaly (vitaly) Date: 2012-09-19 17:54
> And what kind of workaround do you propose?

[os.read(fd, buffersize)]

I am thinking about these options:

Option 1:
Fix breakages as they are discovered at higher level (above os.read) as needed in places where usage semantics are known, and address the issue via errata documentation (i.e., "On Mac OS X, don't make individual pipe read requests that would result in os.read() buffersize arg being above 127KB on non-blocking pipes."); should also check if the same issue occurs with sockets (e.g., socket.socketpair() instead of os.pipe()) to make the errata more complete.  This may be perfectly acceptable and how things have worked for a long time.

Option 2:
Implement a work-around in the lowest common denominator wrapper function, so that anything in Python that needs to call read() and could benefit from this work-around, would call that wrapper instead of read().  The read() work-around might go something like this in *pseudocode*:


if running_on_darwin and buffersize > 127KB:
    # fix up buffer size to work around a Mac OS x bug...
    if stat.S_ISFIFO(os.fstat(fd).st_mode):
        buffersize = 127KB

Then do whatever else the read function is supposed to do.
msg170752 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2012-09-19 19:15
Out of curiosity, has anyone checked whether this is also an issue with any of the supported BSD's?  There have been other issues which were detected first on OS X but turned out to be more general BSD vs Linux differences, rather than OS X "bugs".
msg170764 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-09-19 20:12
> Option 1:
> Fix breakages as they are discovered at higher level (above os.read) as needed in places where usage semantics are known, and address the issue via errata documentation (i.e., "On Mac OS X, don't make individual pipe read requests that would result in os.read() buffersize arg being above 127KB on non-blocking pipes."); should also check if the same issue occurs with sockets (e.g., socket.socketpair() instead of os.pipe()) to make the errata more complete.  This may be perfectly acceptable and how things have worked for a long time.

> Implement a work-around in the lowest common denominator wrapper function, so that anything in Python that needs to call read() and could benefit from this work-around, would call that wrapper instead of read().  The read() work-around might go something like this in *pseudocode*:

It's not Python's job to workaround stupid platform bugs, or document
them: that would lead to unmanagable code or unmaintanable
documentation.
This particular issue will get fixed as part of
http://bugs.python.org/issue15918, and I'm -10 against adding an hack
to posix.read().

So I'd suggest closing this, and urge people to complain to the OS-X folks.
msg170766 - (view) Author: Vitaly (vitaly) Date: 2012-09-19 20:25
> It's not Python's job to workaround stupid platform bugs...
> So I'd suggest closing this, and urge people to complain to the OS-X folks

After digesting the postings, I've come around to this point of view as well, so closing as "rejected".  And, as mentioned earlier, I filed the issue on https://bugreport.apple.com as Problem ID: 12274650.
History
Date User Action Args
2012-09-19 20:26:07vitalysetstatus: open -> closed
2012-09-19 20:25:25vitalysetresolution: rejected
messages: + msg170766
2012-09-19 20:12:40neologixsetmessages: + msg170764
2012-09-19 19:15:55ned.deilysetmessages: + msg170752
2012-09-19 17:54:42vitalysetmessages: + msg170731
2012-09-19 07:24:06ronaldoussorensetmessages: + msg170708
2012-09-19 06:55:55vitalysetmessages: + msg170707
2012-09-19 06:27:01ronaldoussorensetmessages: + msg170705
2012-09-18 20:07:26vitalysetmessages: + msg170673
2012-09-18 19:02:08neologixsetmessages: + msg170664
2012-09-18 16:20:47schmirsetnosy: + schmir
2012-09-18 16:12:14vitalysetmessages: + msg170656
2012-09-11 16:03:11vitalysetversions: + Python 2.7
2012-09-11 15:52:51vitalysetmessages: + msg170316
2012-09-11 10:28:45ronaldoussorensetmessages: + msg170285
2012-09-11 09:07:43vitalysetmessages: + msg170280
2012-09-11 08:38:32neologixsetmessages: + msg170278
2012-09-11 08:23:23vitalysetmessages: + msg170277
2012-09-11 08:19:58vitalysetmessages: + msg170276
2012-09-11 08:18:34vitalysetmessages: + msg170275
2012-09-11 07:44:06vitalysetmessages: + msg170273
2012-09-11 07:32:39neologixsettype: crash -> behavior
2012-09-11 07:32:17neologixsetnosy: + ronaldoussoren, ned.deily, hynek
messages: + msg170272
2012-09-11 07:27:53vitalysetmessages: + msg170271
2012-09-11 07:23:18vitalysetmessages: + msg170270
2012-09-11 07:22:28vitalysetfiles: + test_fork_pipe_error.cpp

messages: + msg170268
2012-09-11 00:19:01vitalysetmessages: + msg170251
2012-09-10 23:56:05vitalysetfiles: + test_fork_pipe_error.py
2012-09-10 21:19:43neologixsetmessages: + msg170238
2012-09-10 20:09:54vitalysetmessages: + msg170227
2012-09-10 14:53:06neologixsetnosy: + neologix
messages: + msg170193
2012-09-10 09:25:06Denis.Bilenkosetnosy: + Denis.Bilenko
2012-09-09 23:04:38vitalycreate