classification
Title: urllib: socket is not closed explicitly
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: nadeem.vawda Nosy List: haypo, nadeem.vawda, orsenthil, python-dev
Priority: normal Keywords: patch

Created on 2011-01-10 22:32 by haypo, last changed 2011-07-25 10:19 by haypo. This issue is now closed.

Files
File name Uploaded Description Edit
ftp_close.patch haypo, 2011-06-17 13:01 review
issue10883.patch nadeem.vawda, 2011-07-02 10:52 review
issue10883-v2.patch nadeem.vawda, 2011-07-07 22:07 review
Messages (17)
msg125944 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-01-10 22:32
test_urllibnet.py and test_urllibnet2.py emit ResourceWarning:
==============
$ ./python Lib/test/test_urllibnet.py 
testURLread (__main__.URLTimeoutTest) ... ok
test_bad_address (__main__.urlopenNetworkTests) ... ok
test_basic (__main__.urlopenNetworkTests) ... ok
test_fileno (__main__.urlopenNetworkTests) ... ok
test_getcode (__main__.urlopenNetworkTests) ... /home/haypo/prog/GIT/py3k/Lib/socket.py:333: ResourceWarning: unclosed <socket.socket object, fd=3, family=2, type=1, proto=6>
  self._sock = None
ok
test_geturl (__main__.urlopenNetworkTests) ... ok
test_info (__main__.urlopenNetworkTests) ... ok
test_readlines (__main__.urlopenNetworkTests) ... ok
test_basic (__main__.urlretrieveNetworkTests) ... /home/haypo/prog/GIT/py3k/Lib/socket.py:333: ResourceWarning: unclosed <socket.socket object, fd=3, family=2, type=2049, proto=6>
  self._sock = None
ok
test_data_header (__main__.urlretrieveNetworkTests) ... /home/haypo/prog/GIT/py3k/Lib/socket.py:333: ResourceWarning: unclosed <socket.socket object, fd=3, family=2, type=2049, proto=6>
  self._sock = None
ok
test_header (__main__.urlretrieveNetworkTests) ... /home/haypo/prog/GIT/py3k/Lib/socket.py:333: ResourceWarning: unclosed <socket.socket object, fd=3, family=2, type=2049, proto=6>
  self._sock = None
ok
test_specified_path (__main__.urlretrieveNetworkTests) ... /home/haypo/prog/GIT/py3k/Lib/socket.py:333: ResourceWarning: unclosed <socket.socket object, fd=3, family=2, type=2049, proto=6>
  self._sock = None
ok

----------------------------------------------------------------------
Ran 12 tests in 17.473s
==============

It looks like these warning are real bugs: the socket is not closed explicitly by urllib.

Nadeem Vawda suggests a first fix:

  diff --git a/Lib/urllib/request.py b/Lib/urllib/request.py
  --- a/Lib/urllib/request.py
  +++ b/Lib/urllib/request.py
  @@ -2151,7 +2151,9 @@
               conn = self.ftp.ntransfercmd(cmd)
           self.busy = 1
           # Pass back both a suitably decorated object and a retrieval length
  -        return (addclosehook(conn[0].makefile('rb'), self.endtransfer), conn[1])
  +        fp = addclosehook(conn[0].makefile('rb'), self.endtransfer)
  +        conn[0].close()
  +        return (fp, conn[1])
       def endtransfer(self):
           if not self.busy:
               return
msg131455 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-03-19 22:58
Would it be possible to commit this partial fix now? It gets rid of 4 of the 8 warnings that I am currently seeing in test_urllib2net.

(As an aside, for anyone reading this who hasn't seen issue11563, test_urllibnet is now warning-free)
msg131461 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-03-19 23:06
I saw the partial fix suggested by the patch, but for some reason I
did not see ResourceWarning being shutup. Let's look at this again.
The warnings are all from the (old) ftp portion of the code..
msg131600 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-03-21 01:38
> I saw the partial fix suggested by the patch, but for some reason I
> did not see ResourceWarning being shutup.

Do you mean that you aren't getting ResourceWarnings in the first place?
Or that you are getting warnings, and the partial fix isn't getting rid
of any of them?

Note: it doesn't get rid of *all* of the warnings for test_urllib2net;
only some of them.

About the remaining warnings, it seems that FTPHandler.ftp_open() creates
an ftpwrapper object to connect to the server, and then the ftpwrapper
doesn't get closed anywhere. It has to stay open until the caller has
finished reading from the connection, so finding the right place to close
it might be a bit tricky.
msg131955 - (view) Author: Roundup Robot (python-dev) Date: 2011-03-24 03:47
New changeset 2e5aff2a9e54 by Senthil Kumaran in branch '3.2':
issue10883 - Silence some ftp related ResourceWarnings in test_urllib2net. Patch by Nadeem Vawda.
http://hg.python.org/cpython/rev/2e5aff2a9e54

New changeset 0937b3618b86 by Senthil Kumaran in branch 'default':
issue10883 - Silence some ftp related ResourceWarnings in test_urllib2net. Patch by Nadeem Vawda
http://hg.python.org/cpython/rev/0937b3618b86
msg138394 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-15 21:57
Is there still something to do in this issue? The initial report is fixed.
msg138400 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-06-15 22:10
Yes, the fix I provided only eliminated some of the warnings. As of fd6446a88fe3, test_urllib2net still leaks 5 sockets.
msg138504 - (view) Author: Roundup Robot (python-dev) Date: 2011-06-17 12:53
New changeset ca18f7f35c28 by Victor Stinner in branch '3.2':
Issue #10883: test_urllib2net closes socket explicitly
http://hg.python.org/cpython/rev/ca18f7f35c28

New changeset 6d38060f290c by Victor Stinner in branch 'default':
(Merge 3.2) Issue #10883: test_urllib2net closes socket explicitly
http://hg.python.org/cpython/rev/6d38060f290c
msg138505 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-06-17 13:01
ftp_close.patch:
 - (in passive mode) FTP.ntransfercmd() closes explicitly the socket on error: the caller has not access to the socket on error
 - OtherNetworkTests of test_urllib2net clears CacheFTPHandler cache: add a CacheFTPHandler.clear_cache() for that (I didn't document this new method because other methods are also not documented)
 - the last change is the worst one (ugly hack): FTPHandler.ftp_open() changes the close callback of the addclosehook object in ftpwrapper.retrfile(), but not in CacheFTPHandler.

I don't like the implementation of the last change:
 - it adds a protected class attribute
 - ftpwrapper.retrfile() requires an explicit close=True argument: it would be better to use a "reference counter" (or something like that), like socket._io_refs
msg139635 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-07-02 10:52
Here's an updated patch implementing reference counting for ftpwrapper.
It changes the semantics of ftpwrapper.close() to postpone actually closing
the connection until all files have also been closed (like socket.close()).
msg139835 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-07-05 09:13
With the patch applied, test_urllib2net fails at test_ftp test case
when a valid and invalid url are presented in sequence. I think test
needs a change or a further look is needed at the patch.
msg139907 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-07-05 21:24
The failure seems to occur sporadically. I'm looking into it.
msg139912 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-07-06 00:20
The problem seems to be that CacheFTPHandler inherits ftp_open() from
FTPHandler - FTPHandler.ftp_open() marks the ftpwrapper object to be closed as
soon as the current transfer is complete. So CacheFTPHandler's cache ends up
full of closed ftpwrappers. I don't have time to put together a solution now,
but I'll work on something over the weekend.

Another thing: CacheFTPHandler.clear_cache() sometimes breaks the cache,
because it fails to clear self.timeout. Is there any reason why the timeouts
need to be in a separate dict from the cached connections themselves? It seems
like a very ugly and error-prone way of organizing things.
msg139996 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2011-07-07 22:07
Updated patch with fixed refcounting mechanism. Also fixes clear_cache() in
CacheFTPWrapper to leave the cache in a consistent state for subsequent use.
msg140981 - (view) Author: Roundup Robot (python-dev) Date: 2011-07-23 12:26
New changeset c741ba9e37ef by Nadeem Vawda in branch '3.2':
Issue #10883: Fix socket leaks in urllib.request.
http://hg.python.org/cpython/rev/c741ba9e37ef

New changeset d68765bd6490 by Nadeem Vawda in branch 'default':
Merge: #10883: Fix socket leaks in urllib.request.
http://hg.python.org/cpython/rev/d68765bd6490
msg140987 - (view) Author: Roundup Robot (python-dev) Date: 2011-07-23 14:02
New changeset dbf1e1a27427 by Nadeem Vawda in branch '2.7':
Issue #10883: Fix socket leaks in urllib.request.
http://hg.python.org/cpython/rev/dbf1e1a27427
msg141084 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-07-25 10:19
Thanks for your patch. ResourceWarning are really useful!
History
Date User Action Args
2011-07-25 10:19:25hayposetmessages: + msg141084
2011-07-23 21:21:56nadeem.vawdasetstatus: open -> closed
resolution: accepted -> fixed
stage: patch review -> resolved
2011-07-23 14:02:02python-devsetmessages: + msg140987
2011-07-23 12:26:12python-devsetmessages: + msg140981
2011-07-07 22:07:45nadeem.vawdasetfiles: + issue10883-v2.patch

messages: + msg139996
2011-07-06 00:20:43nadeem.vawdasetassignee: nadeem.vawda
messages: + msg139912
2011-07-05 21:24:03nadeem.vawdasetmessages: + msg139907
2011-07-05 09:13:41orsenthilsetmessages: + msg139835
2011-07-05 05:04:16nadeem.vawdasetresolution: accepted
stage: patch review
2011-07-02 10:52:12nadeem.vawdasetfiles: + issue10883.patch

messages: + msg139635
2011-06-17 13:01:42hayposetfiles: + ftp_close.patch
keywords: + patch
messages: + msg138505
2011-06-17 12:53:39python-devsetmessages: + msg138504
2011-06-15 22:10:20nadeem.vawdasetmessages: + msg138400
2011-06-15 21:57:24hayposetmessages: + msg138394
versions: - Python 2.6, Python 3.1
2011-03-24 03:47:54python-devsetnosy: + python-dev
messages: + msg131955
2011-03-21 01:38:13nadeem.vawdasetnosy: orsenthil, haypo, nadeem.vawda
messages: + msg131600
2011-03-19 23:06:03orsenthilsetnosy: orsenthil, haypo, nadeem.vawda
messages: + msg131461
2011-03-19 22:58:26nadeem.vawdasetnosy: orsenthil, haypo, nadeem.vawda
messages: + msg131455
2011-01-10 22:32:42haypocreate