classification
Title: Multiple tests (test_socket, test_multiprocessing_*) fail due to incorrect recvmsg(2) buffer lengths, causing failures on FreeBSD CURRENT
Type: behavior Stage: resolved
Components: FreeBSD, Tests Versions: Python 3.8, Python 3.7, Python 3.6, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: koobs, miss-islington, ned.deily, pablogsal, vstinner
Priority: normal Keywords: buildbot, patch

Created on 2018-08-27 17:12 by pablogsal, last changed 2019-08-23 13:51 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
freebsd-current.python.3.x.build.301.stdio.log koobs, 2018-09-27 10:01
Pull Requests
URL Status Linked Edit
PR 9594 merged pablogsal, 2018-09-26 21:07
PR 9597 merged miss-islington, 2018-09-27 10:17
PR 9598 merged miss-islington, 2018-09-27 10:17
PR 9613 merged pablogsal, 2018-09-28 00:24
PR 9613 merged pablogsal, 2018-09-28 00:24
PR 9618 merged miss-islington, 2018-09-28 11:13
PR 9619 merged miss-islington, 2018-09-28 11:13
PR 9850 merged pablogsal, 2018-10-13 18:16
PR 9887 merged miss-islington, 2018-10-15 08:47
PR 9888 merged miss-islington, 2018-10-15 08:47
PR 15422 merged vstinner, 2019-08-23 12:47
Messages (25)
msg324188 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-08-27 17:12
test_socket.RecvmsgIntoSCMRightsStreamTest fails on AMD64 FreeBSD CURRENT Debug 3.x:

https://buildbot.python.org/all/#/builders/60/builds/248


======================================================================
FAIL: testFDPassSeparateMinSpace (test.test_socket.RecvmsgIntoSCMRightsStreamTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/pablo/cpython/Lib/test/test_socket.py", line 2881, in testFDPassSeparateMinSpace
    maxcmsgs=2, ignoreflags=socket.MSG_CTRUNC)
  File "/usr/home/pablo/cpython/Lib/test/test_socket.py", line 2798, in checkRecvmsgFDs
    self.assertEqual(len(fds), numfds)
AssertionError: 1 != 2

----------------------------------------------------------------------
Ran 18 tests in 0.084s

FAILED (failures=1)
test test_socket failed
test_socket failed

== Tests result: FAILURE ==

1 test failed:
    test_socket


I was able to reproduce the bug in the buildbot but this same test fails in 3.6 and 3.5 in the buildbot so it can be some configuration in the buildbot or some regression in FreeBSD.
msg324443 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-08-31 21:40
This error happens a lot in the same buildbot in different tests:

raceback (most recent call last):
  File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/test/_test_multiprocessing.py", line 3335, in child_access
    w = conn.recv()
  File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
  File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/connection.py", line 960, in rebuild_connection
    fd = df.detach()
  File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/reduction.py", line 185, in recv_handle
    return recvfds(s, 1)[0]
  File "/usr/home/buildbot/python/3.7.koobs-freebsd-current/build/Lib/multiprocessing/reduction.py", line 161, in recvfds
    len(ancdata))
RuntimeError: received 0 items of ancdata
msg326537 - (view) Author: Kubilay Kocak (koobs) (Python triager) Date: 2018-09-27 10:01
Copying in my original email to Pablo that elucidated the test failure cause and resolution, including upstream commit/review/issue references for the recvmsg(2) changes IN CURRENT.

Full buildbot worker log attached

While I'm here, update list of versions (branches) affected.

------------------

The test suite fails:

  File
"/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/multiprocessing/reduction.py",
line 164, in recvfds
    raise RuntimeError('received %d items of ancdata' %
RuntimeError: received 0 items of ancdata

For these two tests (further up in the log [1]) ...

test_fd_transfer
(test.test_multiprocessing_forkserver.WithProcessesTestConnection) ... FAIL
test_large_fd_transfer
(test.test_multiprocessing_forkserver.WithProcessesTestConnection) ... FAIL

Further up we see (in the initial test run) ...

4 tests failed:
    test_multiprocessing_fork test_multiprocessing_forkserver
    test_multiprocessing_spawn test_socket

Then further up we see:

======================================================================
FAIL: testFDPassSeparateMinSpace
(test.test_socket.RecvmsgSCMRightsStreamTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File
"/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_socket.py",
line 3186, in testFDPassSeparateMinSpace
    self.checkRecvmsgFDs(2,
  File
"/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_socket.py",
line 3107, in checkRecvmsgFDs
    self.assertEqual(len(fds), numfds)
AssertionError: 1 != 2
======================================================================
FAIL: testFDPassSeparateMinSpace
(test.test_socket.RecvmsgIntoSCMRightsStreamTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File
"/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_socket.py",
line 3186, in testFDPassSeparateMinSpace
    self.checkRecvmsgFDs(2,
  File
"/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_socket.py",
line 3107, in checkRecvmsgFDs
    self.assertEqual(len(fds), numfds)
AssertionError: 1 != 2
----------------------------------------------------------------------
Ran 554 tests in 27.636s
FAILED (failures=2, skipped=57)

I looked for recvmsg() changes in FreeBSD recently, and found ...

Commit: https://svnweb.freebsd.org/base?view=revision&revision=337423
Review: https://reviews.freebsd.org/D16561
Bug ID: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=131876
Bug Desc:[socket] FD leak by receiving SCM_RIGHTS by recvmsg with small
control message buffer

The commit/bug/review details above point to recvmsg specific changes,
specifically, but maybe not *just*, for SCM_RIGHTS messages, which (in
my opinion) is way too similar to the test case specifics not to be related.

Also, the commit date of Aug 7 16:36:48 2018  seems right around the
time I updated the machine and the tests began to fail.

I can't say yet whether or not a bug crept in with the change, *but*,
the commit does say that it *fixes* bugs, which the python tests, or the
socket code in Python may be have been relying on.

It may also be that FreeBSD specific-behaviour isn't being
handled by the code and/or tests (truncated now where it wasn't before?)

It could also be the buffer sizes the socket code/tests use.

------------------
msg326561 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-27 13:38
Commit merged into master:

commit 7291108d88ea31d205da4db19d202d6cbffc6d93
Author: Pablo Galindo <Pablogsal@gmail.com>
Date:   Thu Sep 27 10:25:03 2018 +0100

    Fix tests in test_socket to use correctly CMSG_LEN (GH-9594)
    
    After some failures in AMD64 FreeBSD CURRENT Debug 3.x buildbots
    regarding tests in test_socket that are using
    testFDPassSeparateMinSpace(), FreeBDS revision 337423 was pointed
    out to be the reason the test started to fail.
    
    A close examination of the manpage for cmsg_space(3) reveals that
    the number of file descriptors needs to be taken into account when
    using CMSG_LEN().
    
    This commit fixes tests in test_socket to use correctly CMSG_LEN, taking
    into account the number of FDs.

3.7:

commit addef07ca7d7b6971d59c062c3229e91a99e5f5e (upstream/3.7, 3.7)
Author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Date:   Thu Sep 27 06:30:47 2018 -0700

3.6:

commit fe48b6df101aac10dc846fa6fd1a41f877e77025 (upstream/3.6, 3.6)
Author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Date:   Thu Sep 27 06:30:55 2018 -0700

(the commit message doesn't contain bpo-34521, so bots didn't added comments here about the commits)
msg326562 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-09-27 13:53
It looks like the test_socket is fixed but some related issues are still left in multiprocessing.
msg326569 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-27 15:50
Is test_socket of Python 2.7 affected by the issue? If not, I suggest to remove 2.7 from this issue and close it.
msg326601 - (view) Author: Kubilay Kocak (koobs) (Python triager) Date: 2018-09-28 02:59
To clarify affected tests/versions:

Python 3.x - test_socket test_multiprocessing_spawn
Python 2.7 - test_multiprocessing_spawn

All of the issues are related to incorrect recvmsg(2) buffer length use, so I've amended the issue summary to describe the complete root problem.

The be explicit,

- test_socket fix requires backporting to all supported 3.x branches
- test_multiprocessing_* test fixes (via multiprocessing.reduction fix) requires backporting to all supported 3.x branches *and* 2.7
msg326624 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-28 09:51
New changeset 077061a7b24917aaf31057885c69919c5a553c88 by Victor Stinner (Pablo Galindo) in branch 'master':
bpo-34521: Fix tests in test_multiprocessing_spawn to use correctly CMSG_SPACE (GH-9613)
https://github.com/python/cpython/commit/077061a7b24917aaf31057885c69919c5a553c88
msg326627 - (view) Author: miss-islington (miss-islington) Date: 2018-09-28 11:31
New changeset 5d33ee1595767d68800fda13bc68b7b01dba5117 by Miss Islington (bot) in branch '3.6':
bpo-34521: Fix tests in test_multiprocessing_spawn to use correctly CMSG_SPACE (GH-9613)
https://github.com/python/cpython/commit/5d33ee1595767d68800fda13bc68b7b01dba5117
msg326632 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-28 13:08
New changeset 007fda436f707ac95c2fa8f8886efd9e09d5b630 by Victor Stinner (Miss Islington (bot)) in branch '3.7':
bpo-34521: Fix tests in test_multiprocessing_spawn to use correctly CMSG_SPACE (GH-9613) (GH-9619)
https://github.com/python/cpython/commit/007fda436f707ac95c2fa8f8886efd9e09d5b630
msg326635 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-28 13:12
Pablo: I merged your PR on master, and backport it to 3.6 and 3.7. But I let you handle 2.7, since I expect a conflict :-)
msg327623 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-10-13 04:27
Shouldn't there be a NEWS entry for this change since the change seems to be to Lib/multiprocessing and not just its tests?
msg327714 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-10-14 16:41
New changeset bd036d3d15fc1310ccc32a43a3296b8c157ac221 by Pablo Galindo in branch 'master':
bpo-34521: Add NEWS entry for changes in GH-9613 (GH-9850)
https://github.com/python/cpython/commit/bd036d3d15fc1310ccc32a43a3296b8c157ac221
msg327789 - (view) Author: miss-islington (miss-islington) Date: 2018-10-15 21:20
New changeset d991ede16b34399c5ea9bd30e9a5c520bed6bea8 by Miss Islington (bot) in branch '3.7':
bpo-34521: Add NEWS entry for changes in GH-9613 (GH-9850)
https://github.com/python/cpython/commit/d991ede16b34399c5ea9bd30e9a5c520bed6bea8
msg327814 - (view) Author: miss-islington (miss-islington) Date: 2018-10-16 07:31
New changeset ae011e00189d9083dd84c357718264e24fe77314 by Miss Islington (bot) in branch '3.6':
bpo-34521: Add NEWS entry for changes in GH-9613 (GH-9850)
https://github.com/python/cpython/commit/ae011e00189d9083dd84c357718264e24fe77314
msg328161 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-10-20 16:51
New changeset d404ffa8adf252d49731fbd09b740360577274c8 by Ned Deily (Miss Islington (bot)) in branch '3.6':
bpo-34521: Add NEWS entry for changes in GH-9613 (GH-9850)
https://github.com/python/cpython/commit/d404ffa8adf252d49731fbd09b740360577274c8
msg328164 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-10-20 16:57
New changeset ce3b5a80cc6ee4f9aff7b673f457294d7e054349 by Ned Deily (Miss Islington (bot)) in branch '3.7':
bpo-34521: Add NEWS entry for changes in GH-9613 (GH-9850)
https://github.com/python/cpython/commit/ce3b5a80cc6ee4f9aff7b673f457294d7e054349
msg328171 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-10-20 17:13
Thanks, Pablo, I cherry-picked the NEWS entries into 3.7.1 and 3.6.7.  So I guess the only remaining question here is what to do about 2.7.  I'll also leave that for you!
msg330654 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-29 01:14
> Is test_socket of Python 2.7 affected by the issue? If not, I suggest to remove 2.7 from this issue and close it.

Some "FD transfer" tests of test_multiprocessing still fail on AMD64 FreeBSD CURRENT Shared 2.7:

https://buildbot.python.org/all/#/builders/169/builds/36

Re-running test 'test_multiprocessing' in verbose mode
FAIL: test_fd_transfer (test.test_multiprocessing.WithProcessesTestConnection)
FAIL: test_large_fd_transfer (test.test_multiprocessing.WithProcessesTestConnection)

Modules/_multiprocessing/multiprocessing.c uses CMSG_SPACE() *and* CMSG_LEN(). Are they used properly? I don't understand the code, but I propose:

diff --git a/Modules/_multiprocessing/multiprocessing.c b/Modules/_multiprocessing/multiprocessing.c
index d192a074ba..eecace887e 100644
--- a/Modules/_multiprocessing/multiprocessing.c
+++ b/Modules/_multiprocessing/multiprocessing.c
@@ -167,7 +167,7 @@ multiprocessing_recvfd(PyObject *self, PyObject *args)
     cmsg = CMSG_FIRSTHDR(&msg);
     cmsg->cmsg_level = SOL_SOCKET;
     cmsg->cmsg_type = SCM_RIGHTS;
-    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+    cmsg->cmsg_len = CMSG_SPACE(sizeof(int));
     msg.msg_controllen = cmsg->cmsg_len;
 
     Py_BEGIN_ALLOW_THREADS
msg330656 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-11-29 01:18
I did not backport this still because the code is very different and it required more time to understand how it works on 2.7.

I do not have access currently to koobs buildbot (the IP has changed) so I cannot check if that works :(
msg344554 - (view) Author: Kubilay Kocak (koobs) (Python triager) Date: 2019-06-04 12:28
Access to the buildbot has been restored (a while ago), and this issue being resolved (merge to 2.7) is the only test failing to get 2.7 back to green, which has been failing for months.

Can we test with Victors proposed patch in msg330654 ? I don't believe the change/fix/merge will be difficult, given the very small change scope of the master commits.
msg350278 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-08-23 12:50
I am able to reproduce test_multiprocessing.test_fd_transfer() failure on the FreeBSD CURRENT buildbot:

CURRENT-amd64% ./python -m test -m test_fd_transfer test_multiprocessing 
(...)
Tests result: FAILURE

I converted my msg330654 patch into PR 15422: I confirm that this PR fix test_fd_transfer() and test_large_fd_transfer() of test_multiprocessing.
msg350279 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-08-23 12:50
Note: I also wrote PR 15421 to remove temporary files created by these 2 tests.
msg350280 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-08-23 13:00
New changeset c3642219090f2564c1790330cbf0ba31f19dcaf4 by Victor Stinner in branch '2.7':
bpo-34521: Fix FD transfer in multiprocessing on FreeBSD (GH-15422)
https://github.com/python/cpython/commit/c3642219090f2564c1790330cbf0ba31f19dcaf4
msg350284 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-08-23 13:51
"AMD64 FreeBSD CURRENT Shared 2.7 #159 is complete: Success"

https://buildbot.python.org/all/#/builders/169/builds/159

The FreeBSD 2.7 buildbot is back to green, I close the issue.

Thanks everyone was helped to get these issues to be fixed!
History
Date User Action Args
2019-08-23 13:51:32vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg350284

stage: patch review -> resolved
2019-08-23 13:00:41vstinnersetmessages: + msg350280
2019-08-23 12:50:32vstinnersetmessages: + msg350279
2019-08-23 12:50:01vstinnersetmessages: + msg350278
2019-08-23 12:47:51vstinnersetpull_requests: + pull_request15121
2019-06-04 12:28:11koobssetmessages: + msg344554
2018-11-29 01:18:58pablogsalsetmessages: + msg330656
2018-11-29 01:14:17vstinnersetmessages: + msg330654
2018-10-20 17:13:36ned.deilysetmessages: + msg328171
2018-10-20 16:57:26ned.deilysetmessages: + msg328164
2018-10-20 16:51:01ned.deilysetmessages: + msg328161
2018-10-16 07:31:03miss-islingtonsetmessages: + msg327814
2018-10-15 21:20:15miss-islingtonsetmessages: + msg327789
2018-10-15 08:47:52miss-islingtonsetpull_requests: + pull_request9251
2018-10-15 08:47:44miss-islingtonsetpull_requests: + pull_request9250
2018-10-14 16:41:15pablogsalsetmessages: + msg327714
2018-10-13 18:16:59pablogsalsetpull_requests: + pull_request9220
2018-10-13 04:27:04ned.deilysetnosy: + ned.deily
messages: + msg327623
2018-09-28 13:12:16vstinnersetmessages: + msg326635
2018-09-28 13:08:01vstinnersetmessages: + msg326632
2018-09-28 11:31:56miss-islingtonsetnosy: + miss-islington
messages: + msg326627
2018-09-28 11:13:54miss-islingtonsetpull_requests: + pull_request9018
2018-09-28 11:13:46miss-islingtonsetpull_requests: + pull_request9017
2018-09-28 09:51:10vstinnersetmessages: + msg326624
2018-09-28 02:59:51koobssetmessages: + msg326601
title: test_socket.Recvmsg[Into]SCMRightsStreamTest tests fail on AMD64 FreeBSD CURRENT -> Multiple tests (test_socket, test_multiprocessing_*) fail due to incorrect recvmsg(2) buffer lengths, causing failures on FreeBSD CURRENT
2018-09-28 00:24:41pablogsalsetpull_requests: + pull_request9013
2018-09-28 00:24:39pablogsalsetpull_requests: + pull_request9012
2018-09-27 15:50:24vstinnersetmessages: + msg326569
2018-09-27 13:53:15pablogsalsetmessages: + msg326562
2018-09-27 13:38:58vstinnersetnosy: + vstinner
messages: + msg326561
2018-09-27 10:17:14miss-islingtonsetpull_requests: + pull_request8999
2018-09-27 10:17:04miss-islingtonsetstage: backport needed -> patch review
pull_requests: + pull_request8998
2018-09-27 10:01:49koobssetfiles: + freebsd-current.python.3.x.build.301.stdio.log
versions: + Python 2.7, Python 3.6, Python 3.7, Python 3.8
title: test_socket.RecvmsgIntoSCMRightsStreamTest fails on AMD64 FreeBSD CURRENT Debug 3.x -> test_socket.Recvmsg[Into]SCMRightsStreamTest tests fail on AMD64 FreeBSD CURRENT
messages: + msg326537

keywords: + buildbot
stage: patch review -> backport needed
2018-09-26 21:07:37pablogsalsetkeywords: + patch
stage: patch review
pull_requests: + pull_request8992
2018-08-31 21:40:37pablogsalsetmessages: + msg324443
2018-08-27 17:12:58pablogsalcreate