This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: test_socket leaks file descriptors on macOS
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: giampaolo.rodola, pablogsal, remi.lapeyre, vstinner, xtreak
Priority: normal Keywords:

Created on 2019-04-29 08:26 by xtreak, last changed 2022-04-11 14:59 by admin.

Messages (4)
msg341052 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-04-29 08:26
This PR https://github.com/python/cpython/pull/12271 has consistent build failures in test_socket even after merging the master branch.

Sample build failure : https://dev.azure.com/Python/cpython/_build/results?buildId=41411

I tried reproducing this and I can't reproduce it in Ubuntu. Mac OS seems to fail with leaked references which I think is unrelated to the above Azure failure. The Mac issue is open (issue35092 reported by Victor closed as duplicate of issue23828) but it was about warning though running with regrtest seems to leak references in addition to warnings.

Ubuntu build : 

karthi@ubuntu-s-1vcpu-1gb-blr1-01:~/cpython$ ./python -m test -R 3:3 test_socket
Run tests sequentially
0:00:00 load avg: 0.01 [1/1] test_socket
beginning 6 repetitions
123456
......
test_socket passed in 3 min 7 sec

== Tests result: SUCCESS ==

1 test OK.

Total duration: 3 min 7 sec
Tests result: SUCCESS

Mac OS build (Mac OS 10.10.4 (14E46))

➜  cpython git:(master) ./python.exe -m test -R 3:3 test_socket
Run tests sequentially
0:00:00 load avg: 2.00 [1/1] test_socket
beginning 6 repetitions
123456
/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_socket.py:2419: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg(bufsize, *args)
/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_socket.py:2510: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg_into([buf], *args)
./Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_socket.py:2419: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg(bufsize, *args)
/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_socket.py:2510: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg_into([buf], *args)
./Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_socket.py:2419: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg(bufsize, *args)
/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_socket.py:2510: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg_into([buf], *args)
./Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_socket.py:2419: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg(bufsize, *args)
/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_socket.py:2510: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg_into([buf], *args)
./Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_socket.py:2419: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg(bufsize, *args)
/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_socket.py:2510: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg_into([buf], *args)
./Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_socket.py:2419: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg(bufsize, *args)
/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/test/test_socket.py:2510: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg_into([buf], *args)
.
test_socket leaked [20, 20, 20] file descriptors, sum=60
test_socket failed in 2 min 31 sec

== Tests result: FAILURE ==

1 test failed:
    test_socket

Total duration: 2 min 31 sec
Tests result: FAILURE
msg341053 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-04-29 08:33
> issue35092 reported by Victor closed as duplicate of issue23828

Sorry, it was reported by Pablo in issue35092. I haven't seen any old reports of this in search. The other buildbots on Ubuntu run fine so not sure if it's only for Azure and Ubuntu for the PR since master is green on Azure and the buildbots so feel free to close this if needed.
msg341136 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-04-30 05:59
The test failure is reproducible on the PR 12271 merging master branch on Ubuntu with -R 3:3 test_socket. Sorry, I tried it initially on master thinking it was a master branch problem. I don't see C related code in the PR. There was a merge commit [0] after which the failure is consistently reproducible. I tried reverting parts of the PR to narrow down the failure and happened to come across the below statement in the diff where the test object and elapsed test for time are stored to be printed later. Commenting out the call makes tests pass for test_socket and much more baffling is that changing the code to use list instead of tuple causes the error to go away. Also I tried changing append to use extend or just to append test object instead of a tuple of test object and elapsed time which all pass. Instead of using test object in tuple just using (object(), elapsed) as a tuple also causes failure. I guess there is something with tuples here but it doesn't make much sense.

On trying to bisect the merge commit I have narrowed down the below commits where the PR passes with tuple itself and then fails. In between the success and failure commit there were some changes made to hunterleaks and socket.create_server utility was added.

 f66e336f45 (success)
 58721a9030 (success)
 2b00db68554422ec37faba2a80179a0172df6349 (fails)

# Sample failure on PR's HEAD on Ubuntu 

⋊> ~/cpython on pr_12271  ./python -m test --fail-env-changed -R 3:3 test_socket                                                                                                                    05:45:52
Run tests sequentially
0:00:00 load avg: 0.22 [1/1] test_socket
beginning 6 repetitions
123456
Warning -- threading_cleanup() failed to cleanup 0 threads (count: 0, dangling: 5)
Dangling thread: <Thread(Thread-3, stopped 140152613795584)>
Dangling thread: <Thread(Thread-1, stopped 140152613795584)>
Dangling thread: <Thread(Thread-4, stopped 140152613795584)>
Dangling thread: <Thread(Thread-2, stopped 140152613795584)>
Dangling thread: <_MainThread(MainThread, started 140152731227904)>
......
test_socket failed (env changed) in 3 min 6 sec

== Tests result: ENV CHANGED ==

1 test altered the execution environment:
    test_socket

Total duration: 3 min 6 sec
Tests result: ENV CHANGED

The below patch on the PR to use list causes the tests to pass but I am highly confused over how changing to tuple to list fixes this and this might potentially be hiding a bug.

$ git diff
diff --git a/Lib/unittest/result.py b/Lib/unittest/result.py
index 273ca3beef..b8fd714679 100644
--- a/Lib/unittest/result.py
+++ b/Lib/unittest/result.py
@@ -160,7 +160,7 @@ class TestResult(object):

     def addDuration(self, test, elapsed):
         """Called when a test finished to run, regardless of its outcome."""
-        self.collectedDurations.append((test, elapsed))
+        self.collectedDurations.append([test, elapsed])

     def wasSuccessful(self):
         """Tells whether or not this result was a success."""

[0] https://github.com/python/cpython/commit/3c4af91c1e7c02561efa752885011ff642725bd8
msg370908 - (view) Author: Rémi Lapeyre (remi.lapeyre) * Date: 2020-06-07 17:06
I think I stumbled on this too, I think the test that causes some issue is testCmsgTrunc2Int:


./python -m test -R: test_socket -m testCmsgTrunc2Int    
0:00:00 load avg: 1.66 Run tests sequentially
0:00:00 load avg: 1.66 [1/1] test_socket
beginning 9 repetitions
123456789
.........
test_socket leaked [2, 2, 2, 2] file descriptors, sum=8
test_socket failed

== Tests result: FAILURE ==

1 test failed:
    test_socket

Total duration: 2.0 sec
Tests result: FAILURE



There is some file descriptors that are created by newFDs() but they are supposed to be cleaned up, maybe it is some race condition as I see that test_socket create a thread for each test?
History
Date User Action Args
2022-04-11 14:59:14adminsetgithub: 80931
2020-06-07 17:06:50remi.lapeyresetnosy: + remi.lapeyre

messages: + msg370908
versions: + Python 3.9, Python 3.10
2019-04-30 05:59:32xtreaksetmessages: + msg341136
2019-04-29 09:12:33vstinnersetnosy: + pablogsal

title: test_socket failed (env changed) on Azure pipeline -> test_socket leaks file descriptors on macOS
2019-04-29 08:33:34xtreaksetmessages: + msg341053
2019-04-29 08:26:44xtreakcreate