This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: httplib.HTTPResponse.read could potentially leave the socket opened forever
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.2, Python 3.3, Python 3.4, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: asvetlov, eranrund, orsenthil, piotr.dobrogost, pitrou, python-dev, slingamn
Priority: normal Keywords: patch

Created on 2012-10-22 11:32 by eranrund, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
httplib-no-content-length-close-sock-fix.patch eranrund, 2012-11-07 01:20
Messages (10)
msg173505 - (view) Author: Eran Rundstein (eranrund) Date: 2012-10-22 11:32
When calling HTTPResponse.read() on a response that is:
a. not chunked
b. contains no content-length header
the underlying socket (referenced by self.fp) will never get closed (through self.close())

The offending code is at the bottom of the read() function:
        s = self.fp.read(amt)
        if self.length is not None:
            self.length -= len(s)
            if not self.length:
                self.close()
        return s
As seen, if self.length is None, even when the server closes the connection (causing self.fp.read to return ''), the socket will not get closed.

btw, this may be the cause of Issue15633 (http://bugs.python.org/issue15633)
msg174037 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2012-10-28 12:18
Please publish patch for the issue.
msg174613 - (view) Author: Eran Rundstein (eranrund) Date: 2012-11-03 12:52
The patch is probably trivial - however I would still like some verification.
Would it be correct to call self.close() when fp.read returns ''? In case self.length is not present, I don't see a way around this anyway. When it is present, and fp.read returns '', how should we go about that? We can either return less data, or raise an exception to indicate that the connection terminated prematurely.

Thanks
msg174807 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-04 14:47
> The patch is probably trivial - however I would still like some
> verification.
> Would it be correct to call self.close() when fp.read returns ''? In
> case self.length is not present, I don't see a way around this anyway.
> When it is present, and fp.read returns '', how should we go about
> that? We can either return less data, or raise an exception to
> indicate that the connection terminated prematurely.

It's probably better to return less data. No need to break user programs
when they download from a slightly misbehaved Web site.

The patch should include some kind of unit test, if possible. See
Lib/test/test_httplib.py.
msg175039 - (view) Author: Eran Rundstein (eranrund) Date: 2012-11-07 01:20
Hello

I have attached a patch that includes a (slightly broken) fix and a test case. Note that there is currently an unresolved issue:
If the user reads the exact amount of bytes the server sent, read() on the socket will never have a chance to return '' and inform the user about the connection termination. This will also happen if the user reads more data than the server is sending. In that case, again, read() will not get called again so we will never know the socket is gone. One possible way to address this is perform the actual read() calls in a loop, attempting to read the exact amount the user specified. If we do this and the user attempts to read more bytes than the server sent, we will properly detect it. If, however, the user reads the exact length, then we still have a problem.

I am not sure what would be the correct way of solving this.
msg177539 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-12-15 17:13
Sorry for the delay. I am not sure I understand your concern here:

> If the user reads the exact amount of bytes the server sent, read() on > the socket will never have a chance to return '' and inform the user
> about the connection termination.

Certainly read(), called once again, would return ''? That's how I understand your unit test anyway.
msg177540 - (view) Author: Eran Rundstein (eranrund) Date: 2012-12-15 17:23
Hm, it's been a while and I'm no longer sure :(
You're right - since there is no length the user will have to call read() until he gets back ''. It's possible I forgot that assumption when speculating about this.
msg177545 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-12-15 18:28
New changeset 2186f7b99c28 by Antoine Pitrou in branch '2.7':
Issue #16298: In HTTPResponse.read(), close the socket when there is no Content-Length and the incoming stream is finished.
http://hg.python.org/cpython/rev/2186f7b99c28

New changeset b47d342c449b by Antoine Pitrou in branch '3.2':
Issue #16298: In HTTPResponse.read(), close the socket when there is no Content-Length and the incoming stream is finished.
http://hg.python.org/cpython/rev/b47d342c449b

New changeset 59358f991c00 by Antoine Pitrou in branch '3.3':
Issue #16298: In HTTPResponse.read(), close the socket when there is no Content-Length and the incoming stream is finished.
http://hg.python.org/cpython/rev/59358f991c00

New changeset 5d6c2c7bc5d4 by Antoine Pitrou in branch 'default':
Issue #16298: In HTTPResponse.read(), close the socket when there is no Content-Length and the incoming stream is finished.
http://hg.python.org/cpython/rev/5d6c2c7bc5d4
msg177546 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-12-15 18:30
Ok, the patch looked fine to me, so I've committed it to 2.7 and adapted it for 3.x. Thank you!
By the way, it's probably easier to produce patches using Mercurial rather than using manual `diff`. You can have a look at the devguide for more information: http://docs.python.org/devguide/
msg177548 - (view) Author: Eran Rundstein (eranrund) Date: 2012-12-15 18:56
My pleasure.
I had no idea about the Mercurial patch, this is the first time I have submitted a Python bug report :)
I'll have a look.

Thanks for merging the fix!
History
Date User Action Args
2022-04-11 14:57:37adminsetgithub: 60502
2012-12-15 18:56:56eranrundsetmessages: + msg177548
2012-12-15 18:30:17pitrousetstatus: open -> closed
resolution: fixed
messages: + msg177546

stage: needs patch -> resolved
2012-12-15 18:28:23python-devsetnosy: + python-dev
messages: + msg177545
2012-12-15 17:23:26eranrundsetmessages: + msg177540
2012-12-15 17:13:50pitrousetmessages: + msg177539
2012-11-13 20:10:15slingamnsetnosy: + slingamn
2012-11-07 01:20:22eranrundsetfiles: + httplib-no-content-length-close-sock-fix.patch
keywords: + patch
messages: + msg175039
2012-11-04 14:47:41pitrousetmessages: + msg174807
2012-11-03 12:52:07eranrundsetmessages: + msg174613
2012-10-28 12:18:05asvetlovsetnosy: + asvetlov
messages: + msg174037
2012-10-22 17:08:33pitrousetnosy: + orsenthil, pitrou
stage: needs patch

versions: + Python 3.2, Python 3.3, Python 3.4
2012-10-22 13:21:40piotr.dobrogostsetnosy: + piotr.dobrogost
2012-10-22 11:32:32eranrundcreate