test_socket: testCongestion() hangs on my Fedora 28 #78768

Closed
vstinner opened this issue Sep 5, 2018 · 9 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes tests Tests in the Lib/test dir

Comments

@vstinner
Member

vstinner commented Sep 5, 2018

BPO 34587
Nosy @ncoghlan, @vstinner, @encukou, @yan12125, @miss-islington, @tirkarthi
PRs
  • bpo-34587, test_socket: remove RDSTest.testCongestion() #9277
  • [3.7] bpo-34587, test_socket: remove RDSTest.testCongestion() (GH-9277) #9368
  • [3.6] bpo-34587, test_socket: remove RDSTest.testCongestion() (GH-9277) #9369
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-09-17.23:01:53.869>
    created_at = <Date 2018-09-05.15:01:19.877>
    labels = ['3.8', '3.7', 'tests']
    title = 'test_socket: testCongestion() hangs on my Fedora 28'
    updated_at = <Date 2018-09-17.23:01:53.867>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2018-09-17.23:01:53.867>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-09-17.23:01:53.869>
    closer = 'vstinner'
    components = ['Tests']
    creation = <Date 2018-09-05.15:01:19.877>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 34587
    keywords = ['patch']
    message_count = 9.0
    messages = ['324643', '324668', '324674', '325248', '325286', '325576', '325579', '325581', '325594']
    nosy_count = 6.0
    nosy_names = ['ncoghlan', 'vstinner', 'petr.viktorin', 'yan12125', 'miss-islington', 'xtreak']
    pr_nums = ['9277', '9368', '9369']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue34587'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8']

    @vstinner
    Member Author

    vstinner commented Sep 5, 2018

    Hi,

    test_socket started to hang recently on my Fedora 28 laptop. No idea why it started to hang.

    vstinner@apu$ ./python -m test -v test_socket -m testCongestion --timeout=5
    == CPython 3.8.0a0 (heads/master-dirty:39487196c8, Sep 4 2018, 23:08:20) [GCC 8.1.1 20180712 (Red Hat 8.1.1-5)]
    == Linux-4.17.19-200.fc28.x86_64-x86_64-with-glibc2.26 little-endian
    == cwd: /home/vstinner/prog/python/master/build/test_python_29510
    == CPU count: 8
    == encodings: locale=UTF-8, FS=utf-8
    Run tests sequentially
    0:00:00 load avg: 1.34 [1/1] test_socket
    testCongestion (test.test_socket.RDSTest) ... Timeout (0:00:05)!
    Thread 0x00007fccf51b1700 (most recent call first):
    File "/home/vstinner/prog/python/master/Lib/test/test_socket.py", line 2074 in _testCongestion
    File "/home/vstinner/prog/python/master/Lib/test/test_socket.py", line 332 in clientRun

    Thread 0x00007fcd082ee080 (most recent call first):
    File "/home/vstinner/prog/python/master/Lib/threading.py", line 296 in wait
    File "/home/vstinner/prog/python/master/Lib/threading.py", line 552 in wait
    File "/home/vstinner/prog/python/master/Lib/test/test_socket.py", line 2059 in testCongestion
    File "/home/vstinner/prog/python/master/Lib/unittest/case.py", line 610 in run
    File "/home/vstinner/prog/python/master/Lib/unittest/case.py", line 658 in __call__
    File "/home/vstinner/prog/python/master/Lib/unittest/suite.py", line 122 in run
    File "/home/vstinner/prog/python/master/Lib/unittest/suite.py", line 84 in __call__
    File "/home/vstinner/prog/python/master/Lib/unittest/suite.py", line 122 in run
    File "/home/vstinner/prog/python/master/Lib/unittest/suite.py", line 84 in __call__
    File "/home/vstinner/prog/python/master/Lib/unittest/runner.py", line 176 in run
    File "/home/vstinner/prog/python/master/Lib/test/support/__init__.py", line 1900 in _run_suite
    File "/home/vstinner/prog/python/master/Lib/test/support/__init__.py", line 1990 in run_unittest
    File "/home/vstinner/prog/python/master/Lib/test/test_socket.py", line 6032 in test_main
    File "/home/vstinner/prog/python/master/Lib/test/libregrtest/runtest.py", line 179 in runtest_inner
    File "/home/vstinner/prog/python/master/Lib/test/libregrtest/runtest.py", line 140 in runtest
    File "/home/vstinner/prog/python/master/Lib/test/libregrtest/main.py", line 384 in run_tests_sequential
    File "/home/vstinner/prog/python/master/Lib/test/libregrtest/main.py", line 488 in run_tests
    File "/home/vstinner/prog/python/master/Lib/test/libregrtest/main.py", line 566 in _main
    File "/home/vstinner/prog/python/master/Lib/test/libregrtest/main.py", line 531 in main
    File "/home/vstinner/prog/python/master/Lib/test/libregrtest/main.py", line 584 in main
    File "/home/vstinner/prog/python/master/Lib/test/__main__.py", line 2 in <module>
    File "/home/vstinner/prog/python/master/Lib/runpy.py", line 85 in _run_code
    File "/home/vstinner/prog/python/master/Lib/runpy.py", line 193 in _run_module_as_main
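
The `Timeout (0:00:05)!` header followed by one traceback per thread is the format written by the standard library's `faulthandler` watchdog, which, as far as I know, is what regrtest's `--timeout` option arms. Below is a minimal sketch of that mechanism outside regrtest. The helper name `dump_on_hang` and the short delays are made up for illustration, and the never-set `Event` only stands in for the stuck wait in `testCongestion`.

```python
import faulthandler
import tempfile
import threading


def dump_on_hang(timeout=0.2, hang_for=1.0):
    """Arm a faulthandler watchdog, then block past its deadline.

    Returns the text of the traceback dump. Both arguments are
    illustrative values, not anything regrtest uses.
    """
    stuck = threading.Event()  # never set: stands in for the hung wait
    with tempfile.TemporaryFile(mode="w+") as out:
        # After `timeout` seconds, dump the traceback of every thread to `out`.
        faulthandler.dump_traceback_later(timeout, file=out)
        try:
            stuck.wait(hang_for)  # bounded here so the sketch terminates
        finally:
            # Disarm the watchdog before the file is closed.
            faulthandler.cancel_dump_traceback_later()
        out.seek(0)
        return out.read()


if __name__ == "__main__":
    print(dump_on_hang())
```

The dump only reports where each thread is blocked. It does not say why the expected wake-up never arrived, which is the question the rest of this thread addresses.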

    @vstinner vstinner added 3.8 only security fixes tests Tests in the Lib/test dir labels Sep 5, 2018
    @tirkarthi
    Member

    There seems to be a similar report pointing to the same line in the test, also on Fedora 28. Ref: https://bugs.python.org/issue34354

    Thanks

    @vstinner
    Member Author

    vstinner commented Sep 6, 2018

    Linux RDS manual page says:
    https://linux.die.net/man/7/rds

    "The receive queue size limits how much data RDS will put on the receive queue of a socket before marking the socket as congested. When a socket becomes congested, RDS will send a congestion map update to the other participating hosts, who are then expected to stop sending more messages to this port."

    => "other participating hosts (...) are (...) expected to stop sending"

    By design, it seems like the Python unit test is going to fail, so I suggest removing the test.

    I don't think that the role of Python is to check how the kernel handles congestion on local RDS sockets.

    @ncoghlan
    Contributor

    Same problem here. However, checking the test code, it seems that what's happening is that even though the sending socket has been put into non-blocking mode, self.cli.sendto in the _testCongestion helper method invoked by the ThreadableTest base class [1] has *not* thrown OSError, and hence the finally clause setting the event has *not* been triggered, and hence the test is hanging.

    Neither socket.py nor test_socket.py have changed recently though, so it seems to me that this is either a recent Fedora bug (where the socket is blocking when it shouldn't), or else a Fedora change that has uncovered a latent defect in the socket module code.

    [1] https://github.com/python/cpython/blob/master/Lib/test/test_socket.py#L228
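
The pattern described above (send on a non-blocking socket until the kernel refuses, set the event in a `finally` clause, while another thread waits on that event) can be sketched as follows. This is not the code from test_socket.py: it uses `socket.socketpair()` as a stand-in for RDS sockets, and the names `fill_until_congested` and `run_demo`, the chunk size, and the send cap are all assumptions made for illustration. The cap on sends and the timeout on `wait()` are the two guards that keep this sketch from hanging if the send call never raises.

```python
import socket
import threading


def fill_until_congested(sock, done, chunk=b"x" * 1024, max_sends=100_000):
    """Send on a non-blocking socket until the kernel refuses more data.

    Returns the number of successful sends. The event is set in a
    finally clause; with an unbounded loop and a send() that never
    raised, that clause would never run.
    """
    sent = 0
    sock.setblocking(False)
    try:
        for _ in range(max_sends):  # bounded, unlike a bare `while True`
            sock.send(chunk)
            sent += 1
    except BlockingIOError:
        pass  # send buffer is full: the signal the sender relies on
    finally:
        done.set()  # tell the waiting thread that the sender is finished
    return sent


def run_demo(timeout=5.0):
    a, b = socket.socketpair()  # local pair; nothing ever reads from b
    done = threading.Event()
    result = {}

    def worker():
        result["sent"] = fill_until_congested(a, done)

    t = threading.Thread(target=worker)
    t.start()
    signalled = done.wait(timeout)  # bounded wait instead of wait()
    t.join(timeout)
    a.close()
    b.close()
    return signalled, result.get("sent", 0)


if __name__ == "__main__":
    print(run_demo())
```

If `send()` kept succeeding or blocked despite non-blocking mode, as reported here for RDS on Fedora 28, the `finally` clause would not be reached and an unbounded `wait()` in the other thread would never return.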

    @vstinner
    Member Author

    I proposed PR 9277 to remove the test: see the PR for the rationale.

    > Neither socket.py nor test_socket.py have changed recently though, so it seems to me that this is either a recent Fedora bug (where the socket is blocking when it shouldn't), or else a Fedora change that has uncovered a latent defect in the socket module code.

    IMHO it's a change in the implementation of the RDS protocol in Linux, likely in the kernel.

    @vstinner
    Member Author

    New changeset 7484bdf by Victor Stinner in branch 'master':
    bpo-34587, test_socket: remove RDSTest.testCongestion() (GH-9277)
    7484bdf

    @miss-islington
    Contributor

    New changeset b7f58d7 by Miss Islington (bot) in branch '3.7':
    bpo-34587, test_socket: remove RDSTest.testCongestion() (GH-9277)
    b7f58d7

    @miss-islington
    Contributor

    New changeset 68a8f04 by Miss Islington (bot) in branch '3.6':
    bpo-34587, test_socket: remove RDSTest.testCongestion() (GH-9277)
    68a8f04

    @vstinner
    Member Author

    I removed the test from Python 3.6, 3.7 and master.

    @vstinner vstinner added the 3.7 (EOL) end of life label Sep 17, 2018
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022