Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple tests (test_socket, test_multiprocessing_*) fail due to incorrect recvmsg(2) buffer lengths, causing failures on FreeBSD CURRENT #78702

Closed
pablogsal opened this issue Aug 27, 2018 · 25 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error

Comments

@pablogsal
Copy link
Member

BPO 34521
Nosy @vstinner, @ned-deily, @koobs, @pablogsal, @miss-islington
PRs
  • bpo-34521: Fix tests in test_socket to use correctly CMSG_LEN #9594
  • [3.7] bpo-34521: Fix tests in test_socket to use correctly CMSG_LEN (GH-9594) #9597
  • [3.6] bpo-34521: Fix tests in test_socket to use correctly CMSG_LEN (GH-9594) #9598
  • bpo-34521: Fix tests in test_multiprocessing_spawn to use correctly CMSG_SPACE #9613
  • bpo-34521: Fix tests in test_multiprocessing_spawn to use correctly CMSG_SPACE #9613
  • [3.6] bpo-34521: Fix tests in test_multiprocessing_spawn to use correctly CMSG_SPACE (GH-9613) #9618
  • [3.7] bpo-34521: Fix tests in test_multiprocessing_spawn to use correctly CMSG_SPACE (GH-9613) #9619
  • bpo-34521: Add NEWS entry for changes in GH-9613 #9850
  • [3.7] bpo-34521: Add NEWS entry for changes in GH-9613 (GH-9850) #9887
  • [3.6] bpo-34521: Add NEWS entry for changes in GH-9613 (GH-9850) #9888
  • [2.7] bpo-34521: Fix FD transfer in multiprocessing on FreeBSD #15422
  • Files
  • freebsd-current.python.3.x.build.301.stdio.log
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-08-23.13:51:32.657>
    created_at = <Date 2018-08-27.17:12:58.290>
    labels = ['3.7', '3.8', 'type-bug', 'tests']
    title = 'Multiple tests (test_socket, test_multiprocessing_*) fail due to incorrect recvmsg(2) buffer lengths, causing failures on FreeBSD CURRENT'
    updated_at = <Date 2019-08-23.13:51:32.657>
    user = 'https://github.com/pablogsal'

    bugs.python.org fields:

    activity = <Date 2019-08-23.13:51:32.657>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-08-23.13:51:32.657>
    closer = 'vstinner'
    components = ['Tests', 'FreeBSD']
    creation = <Date 2018-08-27.17:12:58.290>
    creator = 'pablogsal'
    dependencies = []
    files = ['47835']
    hgrepos = []
    issue_num = 34521
    keywords = ['patch', 'buildbot']
    message_count = 25.0
    messages = ['324188', '324443', '326537', '326561', '326562', '326569', '326601', '326624', '326627', '326632', '326635', '327623', '327714', '327789', '327814', '328161', '328164', '328171', '330654', '330656', '344554', '350278', '350279', '350280', '350284']
    nosy_count = 5.0
    nosy_names = ['vstinner', 'ned.deily', 'koobs', 'pablogsal', 'miss-islington']
    pr_nums = ['9594', '9597', '9598', '9613', '9613', '9618', '9619', '9850', '9887', '9888', '15422']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue34521'
    versions = ['Python 2.7', 'Python 3.6', 'Python 3.7', 'Python 3.8']

    @pablogsal
    Copy link
    Member Author

    test_socket.RecvmsgIntoSCMRightsStreamTest fails on AMD64 FreeBSD CURRENT Debug 3.x:

    https://buildbot.python.org/all/#/builders/60/builds/248

    ======================================================================
    FAIL: testFDPassSeparateMinSpace (test.test_socket.RecvmsgIntoSCMRightsStreamTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/usr/home/pablo/cpython/Lib/test/test_socket.py", line 2881, in testFDPassSeparateMinSpace
        maxcmsgs=2, ignoreflags=socket.MSG_CTRUNC)
      File "/usr/home/pablo/cpython/Lib/test/test_socket.py", line 2798, in checkRecvmsgFDs
        self.assertEqual(len(fds), numfds)
    AssertionError: 1 != 2

    Ran 18 tests in 0.084s

    FAILED (failures=1)
    test test_socket failed
    test_socket failed

    == Tests result: FAILURE ==

    1 test failed:
    test_socket

    I was able to reproduce the bug in the buildbot but this same test fails in 3.6 and 3.5 in the buildbot so it can be some configuration in the buildbot or some regression in FreeBSD.

    @pablogsal pablogsal added tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error labels Aug 27, 2018
    @pablogsal
    Copy link
    Member Author

    This error happens a lot in the same buildbot in different tests:

    raceback (most recent call last):
    File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
    File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
    File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/test/_test_multiprocessing.py", line 3335, in child_access
    w = conn.recv()
    File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
    File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/connection.py", line 960, in rebuild_connection
    fd = df.detach()
    File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
    File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/reduction.py", line 185, in recv_handle
    return recvfds(s, 1)[0]
    File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/reduction.py", line 161, in recvfds
    len(ancdata))
    RuntimeError: received 0 items of ancdata

    @koobs
    Copy link

    koobs commented Sep 27, 2018

    Copying in my original email to Pablo that elucidated the test failure cause and resolution, including upstream commit/review/issue references for the recvmsg(2) changes IN CURRENT.

    Full buildbot worker log attached

    While I'm here, update list of versions (branches) affected.

    ------------------

    The test suite fails:

    File
    "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/multiprocessing/reduction.py",
    line 164, in recvfds
    raise RuntimeError('received %d items of ancdata' %
    RuntimeError: received 0 items of ancdata

    For these two tests (further up in the log [1]) ...

    test_fd_transfer
    (test.test_multiprocessing_forkserver.WithProcessesTestConnection) ... FAIL
    test_large_fd_transfer
    (test.test_multiprocessing_forkserver.WithProcessesTestConnection) ... FAIL

    Further up we see (in the initial test run) ...

    4 tests failed:
    test_multiprocessing_fork test_multiprocessing_forkserver
    test_multiprocessing_spawn test_socket

    Then further up we see:

    ======================================================================
    FAIL: testFDPassSeparateMinSpace
    (test.test_socket.RecvmsgSCMRightsStreamTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File
    "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_socket.py",
    line 3186, in testFDPassSeparateMinSpace
        self.checkRecvmsgFDs(2,
      File
    "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_socket.py",
    line 3107, in checkRecvmsgFDs
        self.assertEqual(len(fds), numfds)
    AssertionError: 1 != 2

    ======================================================================
    FAIL: testFDPassSeparateMinSpace
    (test.test_socket.RecvmsgIntoSCMRightsStreamTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File
    "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_socket.py",
    line 3186, in testFDPassSeparateMinSpace
        self.checkRecvmsgFDs(2,
      File
    "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_socket.py",
    line 3107, in checkRecvmsgFDs
        self.assertEqual(len(fds), numfds)
    AssertionError: 1 != 2

    Ran 554 tests in 27.636s
    FAILED (failures=2, skipped=57)

    I looked for recvmsg() changes in FreeBSD recently, and found ...

    Commit: https://svnweb.freebsd.org/base?view=revision&revision=337423
    Review: https://reviews.freebsd.org/D16561
    Bug ID: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=131876
    Bug Desc:[socket] FD leak by receiving SCM_RIGHTS by recvmsg with small
    control message buffer

    The commit/bug/review details above point to recvmsg specific changes,
    specifically, but maybe not *just*, for SCM_RIGHTS messages, which (in
    my opinion) is way too similar to the test case specifics not to be related.

    Also, the commit date of Aug 7 16:36:48 2018 seems right around the
    time I updated the machine and the tests began to fail.

    I can't say yet whether or not a bug crept in with the change, *but*,
    the commit does say that it *fixes* bugs, which the python tests, or the
    socket code in Python may be have been relying on.

    It may also be that FreeBSD specific-behaviour isn't being
    handled by the code and/or tests (truncated now where it wasn't before?)

    It could also be the buffer sizes the socket code/tests use.

    ------------------

    @koobs koobs added 3.7 (EOL) end of life 3.8 only security fixes labels Sep 27, 2018
    @koobs koobs changed the title test_socket.RecvmsgIntoSCMRightsStreamTest fails on AMD64 FreeBSD CURRENT Debug 3.x test_socket.Recvmsg[Into]SCMRightsStreamTest tests fail on AMD64 FreeBSD CURRENT Sep 27, 2018
    @vstinner
    Copy link
    Member

    Commit merged into master:

    commit 7291108
    Author: Pablo Galindo <Pablogsal@gmail.com>
    Date: Thu Sep 27 10:25:03 2018 +0100

    Fix tests in test_socket to use correctly CMSG_LEN (GH-9594)
    
    After some failures in AMD64 FreeBSD CURRENT Debug 3.x buildbots
    regarding tests in test_socket that are using
    testFDPassSeparateMinSpace(), FreeBDS revision 337423 was pointed
    out to be the reason the test started to fail.
    
    A close examination of the manpage for cmsg_space(3) reveals that
    the number of file descriptors needs to be taken into account when
    using CMSG_LEN().
    
    This commit fixes tests in test_socket to use correctly CMSG_LEN, taking
    into account the number of FDs.
    

    3.7:

    commit addef07 (upstream/3.7, 3.7)
    Author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
    Date: Thu Sep 27 06:30:47 2018 -0700

    3.6:

    commit fe48b6d (upstream/3.6, 3.6)
    Author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
    Date: Thu Sep 27 06:30:55 2018 -0700

    (the commit message doesn't contain bpo-34521, so bots didn't added comments here about the commits)

    @pablogsal
    Copy link
    Member Author

    It looks like the test_socket is fixed but some related issues are still left in multiprocessing.

    @vstinner
    Copy link
    Member

    Is test_socket of Python 2.7 affected by the issue? If not, I suggest to remove 2.7 from this issue and close it.

    @koobs
    Copy link

    koobs commented Sep 28, 2018

    To clarify affected tests/versions:

    Python 3.x - test_socket test_multiprocessing_spawn
    Python 2.7 - test_multiprocessing_spawn

    All of the issues are related to incorrect recvmsg(2) buffer length use, so I've amended the issue summary to describe the complete root problem.

    The be explicit,

    • test_socket fix requires backporting to all supported 3.x branches
    • test_multiprocessing_* test fixes (via multiprocessing.reduction fix) requires backporting to all supported 3.x branches *and* 2.7

    @koobs koobs changed the title test_socket.Recvmsg[Into]SCMRightsStreamTest tests fail on AMD64 FreeBSD CURRENT Multiple tests (test_socket, test_multiprocessing_*) fail due to incorrect recvmsg(2) buffer lengths, causing failures on FreeBSD CURRENT Sep 28, 2018
    @vstinner
    Copy link
    Member

    New changeset 077061a by Victor Stinner (Pablo Galindo) in branch 'master':
    bpo-34521: Fix tests in test_multiprocessing_spawn to use correctly CMSG_SPACE (GH-9613)
    077061a

    @miss-islington
    Copy link
    Contributor

    New changeset 5d33ee1 by Miss Islington (bot) in branch '3.6':
    bpo-34521: Fix tests in test_multiprocessing_spawn to use correctly CMSG_SPACE (GH-9613)
    5d33ee1

    @vstinner
    Copy link
    Member

    New changeset 007fda4 by Victor Stinner (Miss Islington (bot)) in branch '3.7':
    bpo-34521: Fix tests in test_multiprocessing_spawn to use correctly CMSG_SPACE (GH-9613) (GH-9619)
    007fda4

    @vstinner
    Copy link
    Member

    Pablo: I merged your PR on master, and backport it to 3.6 and 3.7. But I let you handle 2.7, since I expect a conflict :-)

    @ned-deily
    Copy link
    Member

    Shouldn't there be a NEWS entry for this change since the change seems to be to Lib/multiprocessing and not just its tests?

    @pablogsal
    Copy link
    Member Author

    New changeset bd036d3 by Pablo Galindo in branch 'master':
    bpo-34521: Add NEWS entry for changes in #53822 (GH-9850)
    bd036d3

    @miss-islington
    Copy link
    Contributor

    New changeset d991ede by Miss Islington (bot) in branch '3.7':
    bpo-34521: Add NEWS entry for changes in #53822 (GH-9850)
    d991ede

    @miss-islington
    Copy link
    Contributor

    New changeset ae011e0 by Miss Islington (bot) in branch '3.6':
    bpo-34521: Add NEWS entry for changes in #53822 (GH-9850)
    ae011e0

    @ned-deily
    Copy link
    Member

    New changeset d404ffa by Ned Deily (Miss Islington (bot)) in branch '3.6':
    bpo-34521: Add NEWS entry for changes in #53822 (GH-9850)
    d404ffa

    @ned-deily
    Copy link
    Member

    New changeset ce3b5a8 by Ned Deily (Miss Islington (bot)) in branch '3.7':
    bpo-34521: Add NEWS entry for changes in #53822 (GH-9850)
    ce3b5a8

    @ned-deily
    Copy link
    Member

    Thanks, Pablo, I cherry-picked the NEWS entries into 3.7.1 and 3.6.7. So I guess the only remaining question here is what to do about 2.7. I'll also leave that for you!

    @vstinner
    Copy link
    Member

    Is test_socket of Python 2.7 affected by the issue? If not, I suggest to remove 2.7 from this issue and close it.

    Some "FD transfer" tests of test_multiprocessing still fail on AMD64 FreeBSD CURRENT Shared 2.7:

    https://buildbot.python.org/all/#/builders/169/builds/36

    Re-running test 'test_multiprocessing' in verbose mode
    FAIL: test_fd_transfer (test.test_multiprocessing.WithProcessesTestConnection)
    FAIL: test_large_fd_transfer (test.test_multiprocessing.WithProcessesTestConnection)

    Modules/_multiprocessing/multiprocessing.c uses CMSG_SPACE() *and* CMSG_LEN(). Are they used properly? I don't understand the code, but I propose:

    diff --git a/Modules/_multiprocessing/multiprocessing.c b/Modules/_multiprocessing/multiprocessing.c
    index d192a074ba..eecace887e 100644
    --- a/Modules/_multiprocessing/multiprocessing.c
    +++ b/Modules/_multiprocessing/multiprocessing.c
    @@ -167,7 +167,7 @@ multiprocessing_recvfd(PyObject *self, PyObject *args)
         cmsg = CMSG_FIRSTHDR(&msg);
         cmsg->cmsg_level = SOL_SOCKET;
         cmsg->cmsg_type = SCM_RIGHTS;
    -    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    +    cmsg->cmsg_len = CMSG_SPACE(sizeof(int));
         msg.msg_controllen = cmsg->cmsg_len;
     
         Py_BEGIN_ALLOW_THREADS

    @pablogsal
    Copy link
    Member Author

    I did not backport this still because the code is very different and it required more time to understand how it works on 2.7.

    I do not have access currently to koobs buildbot (the IP has changed) so I cannot check if that works :(

    @koobs
    Copy link

    koobs commented Jun 4, 2019

    Access to the buildbot has been restored (a while ago), and this issue being resolved (merge to 2.7) is the only test failing to get 2.7 back to green, which has been failing for months.

    Can we test with Victors proposed patch in msg330654 ? I don't believe the change/fix/merge will be difficult, given the very small change scope of the master commits.

    @vstinner
    Copy link
    Member

    I am able to reproduce test_multiprocessing.test_fd_transfer() failure on the FreeBSD CURRENT buildbot:

    CURRENT-amd64% ./python -m test -m test_fd_transfer test_multiprocessing
    (...)
    Tests result: FAILURE

    I converted my msg330654 patch into PR 15422: I confirm that this PR fix test_fd_transfer() and test_large_fd_transfer() of test_multiprocessing.

    @vstinner
    Copy link
    Member

    Note: I also wrote PR 15421 to remove temporary files created by these 2 tests.

    @vstinner
    Copy link
    Member

    New changeset c364221 by Victor Stinner in branch '2.7':
    bpo-34521: Fix FD transfer in multiprocessing on FreeBSD (GH-15422)
    c364221

    @vstinner
    Copy link
    Member

    "AMD64 FreeBSD CURRENT Shared 2.7 #159 is complete: Success"

    https://buildbot.python.org/all/#/builders/169/builds/159

    The FreeBSD 2.7 buildbot is back to green, I close the issue.

    Thanks everyone was helped to get these issues to be fixed!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants