Classification
Title: test_io broken on PPC64 Linux
Type:
Stage: resolved
Components: Tests
Versions: Python 3.4, Python 3.3, Python 2.7

Process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: David.Edelsohn, neologix, pitrou, python-dev, spurin
Priority: normal
Keywords: patch

Created on 2013-04-24 19:02 by David.Edelsohn, last changed 2014-09-29 13:12 by spurin. This issue is now closed.

Files
pipe_max_size.patch (uploaded by pitrou, 2013-04-24 21:07)
Messages (21)
msg187725 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2013-04-24 19:02
Unoptimized debug build (configured using --with-pydebug).
gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -m64

test_interrupted_write_retry_buffered (test.test_io.CSignalsTest) ... ERROR
test_interrupted_write_retry_text (test.test_io.CSignalsTest) ... ERROR
test_interrupted_write_retry_buffered (test.test_io.PySignalsTest) ... ERROR
test_interrupted_write_retry_text (test.test_io.PySignalsTest) ... ERROR

======================================================================
ERROR: test_interrupted_write_retry_buffered (test.test_io.CSignalsTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3219, in test_interrupted_write_retry_buffered
    self.check_interrupted_write_retry(b"x", mode="wb")
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3203, in check_interrupted_write_retry
    t.join()
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/threading.py", line 738, in join
    raise RuntimeError("cannot join thread before it is started")
RuntimeError: cannot join thread before it is started

======================================================================
ERROR: test_interrupted_write_retry_text (test.test_io.CSignalsTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3222, in test_interrupted_write_retry_text
    self.check_interrupted_write_retry("x", mode="w", encoding="latin1")
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3203, in check_interrupted_write_retry
    t.join()
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/threading.py", line 738, in join
    raise RuntimeError("cannot join thread before it is started")
RuntimeError: cannot join thread before it is started

======================================================================
ERROR: test_interrupted_write_retry_buffered (test.test_io.PySignalsTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3219, in test_interrupted_write_retry_buffered
    self.check_interrupted_write_retry(b"x", mode="wb")
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3203, in check_interrupted_write_retry
    t.join()
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/threading.py", line 738, in join
    raise RuntimeError("cannot join thread before it is started")
RuntimeError: cannot join thread before it is started

======================================================================
ERROR: test_interrupted_write_retry_text (test.test_io.PySignalsTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3222, in test_interrupted_write_retry_text
    self.check_interrupted_write_retry("x", mode="w", encoding="latin1")
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3203, in check_interrupted_write_retry
    t.join()
  File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/threading.py", line 738, in join
    raise RuntimeError("cannot join thread before it is started")
RuntimeError: cannot join thread before it is started
msg187727 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-04-24 19:16
What does the following say for you?

>>> import fcntl, os
>>> r, w = os.pipe()
>>> fcntl.fcntl(w, 1032)
msg187728 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2013-04-24 19:21
>>> import fcntl, os
>>> r, w = os.pipe()
>>> fcntl.fcntl(w, 1032)
1048576
msg187729 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-04-24 19:24
Ah, right. That number is the pipe buffer size (1032 is F_GETPIPE_SZ).

It's 65536 here, so when the test tries to write 1 million bytes to a pipe, the write blocks as expected (you can read the comments to understand why the test is doing that). But with a 1 MiB buffer size, the write doesn't block and therefore doesn't have to wait for the auxiliary thread to start and read from the pipe buffer.
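For reference, that check can be run as a small standalone sketch (Linux-only; 1032 is the raw value of F_GETPIPE_SZ, used here as a fallback because not every Python version exposes a named constant for it in the fcntl module):

```python
import fcntl
import os

# F_GETPIPE_SZ is Linux-specific; fall back to its raw value (1032)
# if this fcntl module doesn't expose the named constant.
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)

r, w = os.pipe()
try:
    # Current capacity of the pipe buffer, in bytes (often 65536 on
    # Linux, but configurable per system, as this issue shows).
    size = fcntl.fcntl(w, F_GETPIPE_SZ)
    print("pipe buffer size:", size)
finally:
    os.close(r)
    os.close(w)
```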

Something else, what does the following say:

>>> r, w = os.pipe()
>>> fcntl.fcntl(r, 1031, 1000)
msg187731 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2013-04-24 19:26
>>> r, w = os.pipe()
>>> fcntl.fcntl(r, 1031, 1000)
65536
msg187738 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-04-24 20:40
Ok, what does the following patch do for you:

diff --git a/Lib/test/test_io.py b/Lib/test/test_io.py
--- a/Lib/test/test_io.py
+++ b/Lib/test/test_io.py
@@ -3168,7 +3168,7 @@ class SignalsTest(unittest.TestCase):
         select = support.import_module("select")
         # A quantity that exceeds the buffer size of an anonymous pipe's
         # write end.
-        N = 1024 * 1024
+        N = 1024 * 1024 + 1
         r, w = os.pipe()
         fdopen_kwargs["closefd"] = False
         # We need a separate thread to read from the pipe and allow the
@@ -3191,6 +3191,12 @@ class SignalsTest(unittest.TestCase):
         signal.signal(signal.SIGALRM, alarm1)
         try:
             wio = self.io.open(w, **fdopen_kwargs)
+            if sys.platform.startswith('linux') and fcntl is not None:
+                # Issue #17835: try to limit pipe size using F_SETPIPE_SZ
+                pipe_size = fcntl.fcntl(w, 1031, 4096)
+                if pipe_size >= N:
+                    self.skipTest("can't set pipe size to less than %d bytes: %d"
+                                  % (N, pipe_size))
             signal.alarm(1)
             # Expected behaviour:
             # - first raw write() is partial (because of the limited pipe buffer
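The F_SETPIPE_SZ call in the patch above (raw value 1031) can be tried in isolation; a minimal sketch, assuming Linux 2.6.35 or later, where the kernel rounds the requested size up to at least one page (typically 4096 bytes) and returns the size actually granted:

```python
import fcntl
import os

# Linux-specific fcntl commands; fall back to the raw values used in
# the patch if this fcntl module doesn't expose named constants.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)

r, w = os.pipe()
try:
    # Ask for the smallest possible buffer; the kernel grants at least
    # one page and returns the capacity actually set.
    granted = fcntl.fcntl(w, F_SETPIPE_SZ, 4096)
    print("granted:", granted)
    print("reported:", fcntl.fcntl(w, F_GETPIPE_SZ))
finally:
    os.close(r)
    os.close(w)
```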
msg187739 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-04-24 20:45
Why not simply increase the amount of data written instead of limiting
the pipe size?

By the way, there's support.PIPE_MAX_SIZE for that purpose.
msg187740 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-04-24 20:55
> Why not simply increase the amount of data written instead of limiting
> the pipe size?

Hmm, indeed :-)

> By the way, there's support.PIPE_MAX_SIZE for that purpose.

Hardwired to 3 MB. I wonder if it may break soon.
msg187741 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2013-04-24 20:57
The patch limiting the pipe size resolves the test_io failure. Either of the approaches should work.
msg187742 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-04-24 21:06
>> Why not simply increase the amount of data written instead of limiting
>> the pipe size?
>
> Hmm, indeed :-)
>
>> By the way, there's support.PIPE_MAX_SIZE for that purpose.
>
> Hardwired to 3 MB. I wonder if it may break soon.

On Linux, for non-root users, it's limited to 1048576, the default value
of /proc/sys/fs/pipe-max-size (which root can raise).
After a quick look at the kernel code, there's apparently no max value
(it must be a multiple of the page size, though).

I think 3MB should be more than enough, so I suggest updating the
test to use support.PIPE_MAX_SIZE.
If this breaks again, then we could set PIPE_MAX_SIZE dynamically, like:

r, w = os.pipe()
PIPE_MAX_SIZE = 2 * fcntl.fcntl(w, 1032)

But F_GETPIPE_SZ is Linux only, and quite recent (since 2.6.35 apparently).
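A sketch of that dynamic fallback (the function name here is illustrative, not actual code from test.support; 1032 is F_GETPIPE_SZ, available on Linux since 2.6.35, with a fixed large value used where the command is unsupported):

```python
import fcntl
import os

def guess_pipe_max_size(fallback=4 * 1024 * 1024 + 1):
    """Return a write size expected to overflow a pipe buffer.

    Doubles the buffer size reported by F_GETPIPE_SZ; falls back to a
    fixed large value where that fcntl command is unavailable.
    """
    r, w = os.pipe()
    try:
        return 2 * fcntl.fcntl(w, getattr(fcntl, "F_GETPIPE_SZ", 1032))
    except OSError:
        # F_GETPIPE_SZ unsupported (non-Linux or pre-2.6.35 kernel).
        return fallback
    finally:
        os.close(r)
        os.close(w)

print(guess_pipe_max_size())
```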
msg187743 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-04-24 21:07
Ok, here is a new patch updating PIPE_MAX_SIZE with a proper value if possible.
msg187744 - (view) Author: David Edelsohn (David.Edelsohn) * Date: 2013-04-24 21:13
The PIPE_MAX_SIZE patch also fixes the failure.
msg187745 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-04-24 21:20
> Ok, here is a new patch updating PIPE_MAX_SIZE with a proper value if
> possible.

The patch is correct; however, I think it's a bit overkill, especially
since it's Linux-only.
Choosing a fixed large value (3 or 4 MB) just consumes a bit more
memory and should be more than enough.

Anyway, this problem also affects all versions from 2.7, so
PIPE_MAX_SIZE should be backported there.
msg187746 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-04-24 21:27
> The patch is correct, however I think it's a bit overkill, especially
> since it's Linux only.
> Choosing a fixed large value (3 or 4 MB) just consumes a bit more
> memory and should be more than enough.

Ok, so let's say 4MB + 1 then.
msg187747 - (view) Author: Roundup Robot (python-dev) Date: 2013-04-24 21:38
New changeset 4b4ed1e11fd0 by Antoine Pitrou in branch '3.3':
Issue #17835: Fix test_io when the default OS pipe buffer size is larger than one million bytes.
http://hg.python.org/cpython/rev/4b4ed1e11fd0

New changeset de35eae9048a by Antoine Pitrou in branch 'default':
Issue #17835: Fix test_io when the default OS pipe buffer size is larger than one million bytes.
http://hg.python.org/cpython/rev/de35eae9048a

New changeset 09811ecd5df1 by Antoine Pitrou in branch '2.7':
Issue #17835: Fix test_io when the default OS pipe buffer size is larger than one million bytes.
http://hg.python.org/cpython/rev/09811ecd5df1
msg187748 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-04-24 21:39
This should be fixed now. David, please re-open if the failure still occurs.
msg227416 - (view) Author: James Spurin (spurin) Date: 2014-09-24 07:43
I encountered similar issues to those discussed in this issue whilst compiling 3.4.1 on 'Red Hat Enterprise Linux Server release 6.5 (Santiago)'

In particular, the following tests were failing -

[root@lonlx90800 ~]# /local/0/python-3.4.1/bin/python3 /local/0/python-3.4.1/lib/python3.4/test/test_asyncio/test_subprocess.py
F.......F.......
======================================================================
FAIL: test_broken_pipe (__main__.SubprocessFastWatcherTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/local/0/python-3.4.1/lib/python3.4/test/test_asyncio/test_subprocess.py", line 129, in test_broken_pipe
    self.loop.run_until_complete(proc.communicate(large_data))
AssertionError: BrokenPipeError not raised

======================================================================
FAIL: test_broken_pipe (__main__.SubprocessSafeWatcherTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/local/0/python-3.4.1/lib/python3.4/test/test_asyncio/test_subprocess.py", line 129, in test_broken_pipe
    self.loop.run_until_complete(proc.communicate(large_data))
AssertionError: BrokenPipeError not raised

In this case, the issues are being caused by the following kernel parameters that we have for our default build -

#########################
## TIBCO network tuning #
#########################
net.core.rmem_default = 33554432
net.core.wmem_default = 33554432
net.core.rmem_max = 33554432
net.core.wmem_max = 33554432

Raising support.PIPE_MAX_SIZE above 32 MB or temporarily removing these parameters mitigates the issue.  Is there a better way of calculating support.PIPE_MAX_SIZE so that it reflects the actual OS value?
msg227476 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-09-24 17:53
> In this case, the issues are being caused by the following kernel parameters that we have for our default build -
>
> #########################
> ## TIBCO network tuning #
> #########################
> net.core.rmem_default = 33554432
> net.core.wmem_default = 33554432
> net.core.rmem_max = 33554432
> net.core.wmem_max = 33554432
>
> Raising support.PIPE_MAX_SIZE above 32 MB or temporarily removing these parameters mitigates the issue.  Is there a better way of calculating support.PIPE_MAX_SIZE so that it reflects the actual OS value?

What does:

>>> import fcntl, os
>>> r, w = os.pipe()
>>> fcntl.fcntl(w, 1032)

return?

Note that the kernel buffer sizes above are, well, *really huge*.
msg227516 - (view) Author: James Spurin (spurin) Date: 2014-09-25 09:02
fcntl doesn't seem to like the parameter you mentioned -

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.5 (Santiago)
# /local/0/opt/python-3.4.1/bin/python
Python 3.4.1 (default, Sep 24 2014, 12:23:21)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import fcntl, os
>>> r, w = os.pipe()
>>> fcntl.fcntl(w, 1032)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 22] Invalid argument
>>>
msg227567 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-09-25 19:30
Let's try with this instead:

>>> from socket import socket, SO_SNDBUF, SOL_SOCKET
>>> s = socket()
>>> s.getsockopt(SOL_SOCKET, SO_SNDBUF)
msg227796 - (view) Author: James Spurin (spurin) Date: 2014-09-29 13:12
With both the kernel parameters defined and undefined, I get the following output -

# /local/0/opt/python-3.4.1/bin/python
Python 3.4.1 (default, Sep 29 2014, 13:31:39)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from socket import socket, SO_SNDBUF, SOL_SOCKET
>>> s = socket()
>>> s.getsockopt(SOL_SOCKET, SO_SNDBUF)
16384
History
Date                 User            Action  Args
2014-09-29 13:12:33  spurin          set     messages: + msg227796
2014-09-25 19:30:39  neologix        set     messages: + msg227567
2014-09-25 09:02:26  spurin          set     messages: + msg227516
2014-09-24 17:53:22  neologix        set     messages: + msg227476
2014-09-24 07:43:44  spurin          set     nosy: + spurin; messages: + msg227416
2013-04-24 21:39:17  pitrou          set     status: open -> closed; versions: + Python 2.7, Python 3.3; messages: + msg187748; resolution: fixed; stage: resolved
2013-04-24 21:38:31  python-dev      set     nosy: + python-dev; messages: + msg187747
2013-04-24 21:27:14  pitrou          set     messages: + msg187746
2013-04-24 21:20:41  neologix        set     messages: + msg187745
2013-04-24 21:13:49  David.Edelsohn  set     messages: + msg187744
2013-04-24 21:07:40  pitrou          set     files: + pipe_max_size.patch; keywords: + patch; messages: + msg187743
2013-04-24 21:06:14  neologix        set     messages: + msg187742
2013-04-24 20:57:19  David.Edelsohn  set     messages: + msg187741
2013-04-24 20:55:13  pitrou          set     messages: + msg187740
2013-04-24 20:45:40  neologix        set     messages: + msg187739
2013-04-24 20:40:59  pitrou          set     messages: + msg187738
2013-04-24 19:26:26  David.Edelsohn  set     messages: + msg187731
2013-04-24 19:24:58  pitrou          set     nosy: + neologix; messages: + msg187729
2013-04-24 19:21:36  David.Edelsohn  set     messages: + msg187728
2013-04-24 19:16:03  pitrou          set     nosy: + pitrou; messages: + msg187727
2013-04-24 19:02:53  David.Edelsohn  create