
test_io broken on PPC64 Linux #62035

Closed
DavidEdelsohn mannequin opened this issue Apr 24, 2013 · 21 comments
Labels
tests Tests in the Lib/test dir

Comments


DavidEdelsohn mannequin commented Apr 24, 2013

BPO 17835
Nosy @pitrou
Files
  • pipe_max_size.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2013-04-24.21:39:17.662>
    created_at = <Date 2013-04-24.19:02:53.065>
    labels = ['tests']
    title = 'test_io broken on PPC64 Linux'
    updated_at = <Date 2014-09-29.13:12:33.243>
    user = 'https://bugs.python.org/DavidEdelsohn'

    bugs.python.org fields:

    activity = <Date 2014-09-29.13:12:33.243>
    actor = 'spurin'
    assignee = 'none'
    closed = True
    closed_date = <Date 2013-04-24.21:39:17.662>
    closer = 'pitrou'
    components = ['Tests']
    creation = <Date 2013-04-24.19:02:53.065>
    creator = 'David.Edelsohn'
    dependencies = []
    files = ['30008']
    hgrepos = []
    issue_num = 17835
    keywords = ['patch']
    message_count = 21.0
    messages = ['187725', '187727', '187728', '187729', '187731', '187738', '187739', '187740', '187741', '187742', '187743', '187744', '187745', '187746', '187747', '187748', '227416', '227476', '227516', '227567', '227796']
    nosy_count = 5.0
    nosy_names = ['pitrou', 'neologix', 'python-dev', 'David.Edelsohn', 'spurin']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue17835'
    versions = ['Python 2.7', 'Python 3.3', 'Python 3.4']


    DavidEdelsohn mannequin commented Apr 24, 2013

    Unoptimized debug build (configured using --with-pydebug).
    gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -m64

    test_interrupted_write_retry_buffered (test.test_io.CSignalsTest) ... ERROR
    test_interrupted_write_retry_text (test.test_io.CSignalsTest) ... ERROR
    test_interrupted_write_retry_buffered (test.test_io.PySignalsTest) ... ERROR
    test_interrupted_write_retry_text (test.test_io.PySignalsTest) ... ERROR

    ======================================================================
    ERROR: test_interrupted_write_retry_buffered (test.test_io.CSignalsTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3219, in test_interrupted_write_retry_buffered
        self.check_interrupted_write_retry(b"x", mode="wb")
      File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3203, in check_interrupted_write_retry
        t.join()
      File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/threading.py", line 738, in join
        raise RuntimeError("cannot join thread before it is started")
    RuntimeError: cannot join thread before it is started

    ======================================================================
    ERROR: test_interrupted_write_retry_text (test.test_io.CSignalsTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3222, in test_interrupted_write_retry_text
        self.check_interrupted_write_retry("x", mode="w", encoding="latin1")
      File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3203, in check_interrupted_write_retry
        t.join()
      File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/threading.py", line 738, in join
        raise RuntimeError("cannot join thread before it is started")
    RuntimeError: cannot join thread before it is started

    ======================================================================
    ERROR: test_interrupted_write_retry_buffered (test.test_io.PySignalsTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3219, in test_interrupted_write_retry_buffered
        self.check_interrupted_write_retry(b"x", mode="wb")
      File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3203, in check_interrupted_write_retry
        t.join()
      File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/threading.py", line 738, in join
        raise RuntimeError("cannot join thread before it is started")
    RuntimeError: cannot join thread before it is started

    ======================================================================
    ERROR: test_interrupted_write_retry_text (test.test_io.PySignalsTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3222, in test_interrupted_write_retry_text
        self.check_interrupted_write_retry("x", mode="w", encoding="latin1")
      File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_io.py", line 3203, in check_interrupted_write_retry
        t.join()
      File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/threading.py", line 738, in join
        raise RuntimeError("cannot join thread before it is started")
    RuntimeError: cannot join thread before it is started

    DavidEdelsohn (mannequin) added the tests (Tests in the Lib/test dir) label Apr 24, 2013

    pitrou commented Apr 24, 2013

    What does the following say for you?

    >>> import fcntl, os
    >>> r, w = os.pipe()
    >>> fcntl.fcntl(w, 1032)


    DavidEdelsohn mannequin commented Apr 24, 2013

    >>> import fcntl, os
    >>> r, w = os.pipe()
    >>> fcntl.fcntl(w, 1032)
    1048576


    pitrou commented Apr 24, 2013

    Ah, right. That number is the pipe buffer size (1032 is F_GETPIPE_SZ).

    It's 65536 here, so when the test tries to write 1 million bytes on a pipe, the write blocks as expected (you can read the comments to understand why the test is doing that). But with a 1 MiB buffer size, the write doesn't block and therefore doesn't have to wait for the auxiliary thread to start and read from the pipe buffer.
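
    For illustration, a minimal sketch of that check (assuming Linux, where 1032 is F_GETPIPE_SZ, as above):

    import fcntl, os

    F_GETPIPE_SZ = 1032  # Linux-specific fcntl command (kernel >= 2.6.35)

    r, w = os.pipe()
    pipe_size = fcntl.fcntl(w, F_GETPIPE_SZ)  # capacity of the pipe buffer, in bytes
    N = 1024 * 1024                           # amount the test tries to write
    # The test relies on this write blocking; with a 1 MiB (or larger)
    # default pipe buffer the whole payload fits and write() returns at once.
    print("write of %d bytes would block: %s" % (N, N > pipe_size))
    os.close(r)
    os.close(w)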

    Something else, what does the following say:

    >>> r, w = os.pipe()
    >>> fcntl.fcntl(r, 1031, 1000)


    DavidEdelsohn mannequin commented Apr 24, 2013

    >>> r, w = os.pipe()
    >>> fcntl.fcntl(r, 1031, 1000)
    65536


    pitrou commented Apr 24, 2013

    Ok, what does the following patch do for you:

    diff --git a/Lib/test/test_io.py b/Lib/test/test_io.py
    --- a/Lib/test/test_io.py
    +++ b/Lib/test/test_io.py
    @@ -3168,7 +3168,7 @@ class SignalsTest(unittest.TestCase):
             select = support.import_module("select")
             # A quantity that exceeds the buffer size of an anonymous pipe's
             # write end.
    -        N = 1024 * 1024
    +        N = 1024 * 1024 + 1
             r, w = os.pipe()
             fdopen_kwargs["closefd"] = False
             # We need a separate thread to read from the pipe and allow the
    @@ -3191,6 +3191,12 @@ class SignalsTest(unittest.TestCase):
             signal.signal(signal.SIGALRM, alarm1)
             try:
                 wio = self.io.open(w, **fdopen_kwargs)
    +            if sys.platform.startswith('linux') and fcntl is not None:
    +                # Issue python/cpython#62035: try to limit pipe size using F_SETPIPE_SZ
    +                pipe_size = fcntl.fcntl(w, 1031, 4096)
    +                if pipe_size >= N:
    +                    self.skipTest("can't set pipe size to less than %d bytes: %d"
    +                                  % (N, pipe_size))
                 signal.alarm(1)
                 # Expected behaviour:
                 # - first raw write() is partial (because of the limited pipe buffer


    neologix mannequin commented Apr 24, 2013

    Why not simply increase the amount of data written instead of limiting
    the pipe size?

    By the way, there's support.PIPE_MAX_SIZE for that purpose.


    pitrou commented Apr 24, 2013

    > Why not simply increase the amount of data written instead of limiting
    > the pipe size?

    Hmm, indeed :-)

    > By the way, there's support.PIPE_MAX_SIZE for that purpose.

    Hardwired to 3 MB. I wonder if it may break soon.


    DavidEdelsohn mannequin commented Apr 24, 2013

    The patch limiting the pipe size resolves the test_io failure. Either of the approaches should work.


    neologix mannequin commented Apr 24, 2013

    >> Why not simply increase the amount of data written instead of limiting
    >> the pipe size?
    >
    > Hmm, indeed :-)
    >
    >> By the way, there's support.PIPE_MAX_SIZE for that purpose.
    >
    > Hardwired to 3 MB. I wonder if it may break soon.

    On Linux, for non-root users, it's limited to 1048576, and that limit can
    be changed via /proc/sys/fs/pipe-max-size.
    After a quick look at the kernel code, there's apparently no max value
    (it must be a multiple of the page size, though).

    I think 3MB should be more than enough, so I suggest updating the
    test to use support.PIPE_MAX_SIZE.
    If this breaks again, then we could set PIPE_MAX_SIZE dynamically, like:

    r, w = os.pipe()
    PIPE_MAX_SIZE = 2 * fcntl.fcntl(w, 1032)

    But F_GETPIPE_SZ is Linux only, and quite recent (since 2.6.35 apparently).
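
    For illustration, a sketch of that dynamic approach with a fallback to the fixed constant when F_GETPIPE_SZ isn't available (this is not the attached patch, just the idea spelled out):

    import os

    try:
        import fcntl
    except ImportError:
        fcntl = None

    F_GETPIPE_SZ = 1032                          # Linux >= 2.6.35 only
    _FALLBACK_PIPE_MAX_SIZE = 3 * 1024 * 1024    # the currently hardwired value

    def pipe_max_size():
        # Prefer probing the real pipe buffer size; fall back to the fixed
        # constant on non-Linux platforms or kernels where the fcntl fails.
        if fcntl is not None:
            r, w = os.pipe()
            try:
                return 2 * fcntl.fcntl(w, F_GETPIPE_SZ)
            except OSError:
                pass
            finally:
                os.close(r)
                os.close(w)
        return _FALLBACK_PIPE_MAX_SIZE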


    pitrou commented Apr 24, 2013

    Ok, here is a new patch updating PIPE_MAX_SIZE with a proper value if possible.


    DavidEdelsohn mannequin commented Apr 24, 2013

    The PIPE_MAX_SIZE patch also fixes the failure.


    neologix mannequin commented Apr 24, 2013

    > Ok, here is a new patch updating PIPE_MAX_SIZE with a proper value if
    > possible.

    The patch is correct, however I think it's a bit overkill, especially
    since it's Linux only.
    Choosing a fixed large value (3 or 4 MB) just consumes a bit more
    memory and should be more than enough.

    Anyway, this problem also affects all versions from 2.7, so
    PIPE_MAX_SIZE should be backported there.


    pitrou commented Apr 24, 2013

    > The patch is correct, however I think it's a bit overkill, especially
    > since it's Linux only.
    > Choosing a fixed large value (3 or 4 MB) just consumes a bit more
    > memory and should be more than enough.

    Ok, so let's say 4MB + 1 then.
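
    (The resulting constant presumably amounts to something like the following in test.support; this is a sketch of the shape, not a quote from the changesets below.)

    # test.support (sketch): a size expected to exceed any common OS pipe
    # buffer, so that writes of this many bytes to a pipe reliably block.
    PIPE_MAX_SIZE = 4 * 1024 * 1024 + 1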


    python-dev mannequin commented Apr 24, 2013

    New changeset 4b4ed1e11fd0 by Antoine Pitrou in branch '3.3':
    Issue bpo-17835: Fix test_io when the default OS pipe buffer size is larger than one million bytes.
    http://hg.python.org/cpython/rev/4b4ed1e11fd0

    New changeset de35eae9048a by Antoine Pitrou in branch 'default':
    Issue bpo-17835: Fix test_io when the default OS pipe buffer size is larger than one million bytes.
    http://hg.python.org/cpython/rev/de35eae9048a

    New changeset 09811ecd5df1 by Antoine Pitrou in branch '2.7':
    Issue bpo-17835: Fix test_io when the default OS pipe buffer size is larger than one million bytes.
    http://hg.python.org/cpython/rev/09811ecd5df1


    pitrou commented Apr 24, 2013

    This should be fixed now. David, please re-open if the failure still occurs.

    pitrou closed this as completed Apr 24, 2013

    spurin mannequin commented Sep 24, 2014

    I encountered similar issues to those discussed here whilst compiling 3.4.1 on 'Red Hat Enterprise Linux Server release 6.5 (Santiago)'.

    In particular, the following tests were failing -

    [root@lonlx90800 ~]# /local/0/python-3.4.1/bin/python3 /local/0/python-3.4.1/lib/python3.4/test/test_asyncio/test_subprocess.py
    F.......F.......
    ======================================================================
    FAIL: test_broken_pipe (__main__.SubprocessFastWatcherTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/local/0/python-3.4.1/lib/python3.4/test/test_asyncio/test_subprocess.py", line 129, in test_broken_pipe
        self.loop.run_until_complete(proc.communicate(large_data))
    AssertionError: BrokenPipeError not raised

    ======================================================================
    FAIL: test_broken_pipe (__main__.SubprocessSafeWatcherTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/local/0/python-3.4.1/lib/python3.4/test/test_asyncio/test_subprocess.py", line 129, in test_broken_pipe
        self.loop.run_until_complete(proc.communicate(large_data))
    AssertionError: BrokenPipeError not raised

    In this case, the issues are being caused by the following kernel parameters that we have for our default build -

    #########################
    ## TIBCO network tuning #
    #########################
    net.core.rmem_default = 33554432
    net.core.wmem_default = 33554432
    net.core.rmem_max = 33554432
    net.core.wmem_max = 33554432

    Raising support.PIPE_MAX_SIZE above 32MB or temporarily removing these parameters mitigates the issue. Is there a better way of calculating support.PIPE_MAX_SIZE so that it reflects the actual OS value?


    neologix mannequin commented Sep 24, 2014

    > In this case, the issues are being caused by the following kernel parameters that we have for our default build -
    >
    > #########################
    > ## TIBCO network tuning #
    > #########################
    > net.core.rmem_default = 33554432
    > net.core.wmem_default = 33554432
    > net.core.rmem_max = 33554432
    > net.core.wmem_max = 33554432
    >
    > Raising support.PIPE_MAX_SIZE above 32MB or temporarily removing these parameters mitigates the issue. Is there a better way of calculating support.PIPE_MAX_SIZE so that it reflects the actual OS value?

    What does:

    >>> import fcntl, os
    >>> r, w = os.pipe()
    >>> fcntl.fcntl(w, 1032)

    return?

    Note that the kernel buffer sizes above are, well, *really huge*.


    spurin mannequin commented Sep 25, 2014

    fcntl doesn't seem to like the parameter you mentioned -

    # cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 6.5 (Santiago)
    # /local/0/opt/python-3.4.1/bin/python
    Python 3.4.1 (default, Sep 24 2014, 12:23:21)
    [GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import fcntl, os
    >>> r, w = os.pipe()
    >>> fcntl.fcntl(w, 1032)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: [Errno 22] Invalid argument
    >>>


    neologix mannequin commented Sep 25, 2014

    Let's try with this instead:

    >>> from socket import socket, SO_SNDBUF, SOL_SOCKET
    >>> s = socket()
    >>> s.getsockopt(SOL_SOCKET, SO_SNDBUF)


    spurin mannequin commented Sep 29, 2014

    With both the kernel parameters defined and undefined, I get the following output -

    # /local/0/opt/python-3.4.1/bin/python
    Python 3.4.1 (default, Sep 29 2014, 13:31:39)
    [GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from socket import socket, SO_SNDBUF, SOL_SOCKET
    >>> s = socket()
    >>> s.getsockopt(SOL_SOCKET, SO_SNDBUF)
    16384
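
    For reference, a small sketch tying together the probes mentioned in this thread (F_GETPIPE_SZ, /proc/sys/fs/pipe-max-size and SO_SNDBUF); it only illustrates how the relevant OS values could be inspected, and is not code from this issue:

    import os, socket

    try:
        import fcntl
    except ImportError:
        fcntl = None

    F_GETPIPE_SZ = 1032  # Linux >= 2.6.35; raises EINVAL on older kernels

    def probe_buffer_sizes():
        sizes = {}
        # Pipe buffer size via fcntl, where the kernel supports it.
        if fcntl is not None:
            r, w = os.pipe()
            try:
                sizes["pipe"] = fcntl.fcntl(w, F_GETPIPE_SZ)
            except OSError:
                pass
            finally:
                os.close(r)
                os.close(w)
        # System-wide cap on pipe sizes for unprivileged users (Linux only).
        try:
            with open("/proc/sys/fs/pipe-max-size") as f:
                sizes["pipe-max-size"] = int(f.read())
        except OSError:
            pass
        # Send buffer of a freshly created socket.
        s = socket.socket()
        try:
            sizes["so_sndbuf"] = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        finally:
            s.close()
        return sizes

    print(probe_buffer_sizes())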

    ezio-melotti transferred this issue from another repository Apr 10, 2022