HTTPResponse may drop buffer holding next response #67566
This is the same issue raised at <https://bugs.python.org/issue4879#msg91597>. Currently, every time a new response is to be received, HTTPConnection passes its raw socket object to HTTPResponse, which calls sock.makefile("rb") and creates a BufferedReader. The BufferedReader is used to parse the header section and read the response body. The problem is that the BufferedReader is closed at the end of reading the response, potentially losing buffered data read from a subsequent response.

Normally no data is lost, because most users read the full response before triggering a new request, and the server waits for a request before sending a response. But if a user pipelined a second request without reading all of the first response, and the server happened to send the end of the first response and the start of the second response in the same packet, the problem could be triggered. I have added a test called test_httplib.ServerTest.testDoubleResponse() which emulates this scenario. The problem also makes it hard to detect misbehaving servers, or to use HTTPConnection to test that a server is behaving correctly.

I am adding a patch which creates the BufferedReader once per connection. This involves changing the API of the HTTPResponse constructor. I think this should be okay because, even though the constructor is documented, the documentation says it is “Not instantiated directly by user”. It did require changing the tests that call the HTTPResponse constructor, though. If absolutely necessary, it may be possible to maintain backwards compatibility by adding a new constructor parameter and carefully juggling how the close() calls work.
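The data-loss scenario described above can be sketched with a socket pair (a hypothetical stand-alone demonstration, not the stdlib's code): once readline() pulls both lines into the BufferedReader's buffer, closing the reader silently discards the second line, and the raw socket has nothing left to give.

```python
import socket

# Sketch (assumed setup, not http.client's code): emulate how closing
# a per-response BufferedReader can swallow bytes belonging to the
# next response when both arrive in one packet.
a, b = socket.socketpair()

# "Server" sends the tail of response 1 and the start of response 2
# in a single packet.
b.sendall(b"end-of-response-1\nstart-of-response-2\n")

reader = a.makefile("rb")   # BufferedReader over the raw socket
first = reader.readline()   # consumes response 1's last line...
# ...but the reader's buffer now also holds b"start-of-response-2\n"
reader.close()              # buffered bytes are silently discarded

a.setblocking(False)
try:
    leftover = a.recv(4096)  # nothing is left on the raw socket
except BlockingIOError:
    leftover = b""

print(first)     # b'end-of-response-1\n'
print(leftover)  # b'' - the second response's bytes are gone
```

Note that closing the makefile() object does not close the underlying socket; the bytes are lost from the abandoned buffer, not from the connection itself.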
http-buffer.v2.patch
I think it's ok to slightly break a non-public API, since it's required to fix an obvious bug.
HTTP pipelining is not supported in general. According to the module docstring, you can actually pipeline a second request if you have read the first response’s headers but not its body:

```pycon
>>> conn = HTTPSConnection("bugs.python.org")
>>> conn.request("GET", "/")
>>> response1 = conn.getresponse()  # Read header section only
>>> conn.request("GET", "/issue23377")
>>> response1.read()[:100]  # Read first body after pipelining second request
b'<!--\n This is the default body that is displayed when people visit the\n tracker. The tag below lists'
>>> response2 = conn.getresponse()
>>> response2.read()[:100]
b'\n<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xht'
```

But I suspect this might deadlock with some servers (that don’t properly support pipelining themselves) if the second request were a large POST or something and filled up the buffers that this mode relies on.
To play devil's advocate here, *is* this really a non-public API? According to documented Python naming conventions, HTTPResponse is part of the public API. Judging from code I've written myself, this change would likely cause a number of breakages. Granted, they would largely be in lower-level testing code, but it would still create a headache. Does something that is only lightly documented as non-public take precedence over well-known and widely used naming conventions? I'd argue no, it shouldn't. That said, I'm not overly enthused about the alternative Martin describes, so at first thought I'm pretty much square on the fence about this.
As Martin pointed out, sadly no. There's currently an internal state machine, possibly a remnant of the initial implementation (just a guess; I haven't done any research to prove it), that prevents pipelining. It's on my list of things to tackle, as it seems a natural progression towards HTTP/2.0 multiplexing, but unfortunately my time is a little shorter than usual at the moment (taking a new job has cut into my spare time a little). I haven't had a chance to review the change in detail yet, but I think the idea is sound. I'll try to get some time over the next few days to dig into it further.
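For context, the kind of state machine being described can be illustrated with a hypothetical miniature (the names and shape here are purely illustrative, not http.client's actual code): a second request is rejected until the previous response has been retrieved, which is what rules out pipelining.

```python
# Illustrative miniature (NOT the stdlib's real state machine) of a
# connection that refuses to send a new request while a response is
# still outstanding.
IDLE, REQUEST_SENT = "Idle", "Request-sent"

class MiniConnection:
    def __init__(self):
        self.state = IDLE

    def request(self, method, path):
        if self.state != IDLE:
            raise RuntimeError("CannotSendRequest: " + self.state)
        self.state = REQUEST_SENT

    def getresponse(self):
        if self.state != REQUEST_SENT:
            raise RuntimeError("ResponseNotReady: " + self.state)
        self.state = IDLE
        return "response"

conn = MiniConnection()
conn.request("GET", "/")
try:
    conn.request("GET", "/again")   # attempted pipelining is rejected
except RuntimeError as exc:
    error = str(exc)

print(error)  # CannotSendRequest: Request-sent
```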
Is this pending a review?
Yes, I would like someone to review my code changes for this; I don’t think anyone has done so yet. But I don’t consider this a very important bug, so there is no hurry.
If nobody else gets to it first, I'll try to review this later today or tomorrow. Apologies for the delay on the review, Martin: a new baby (coming next month), moving, *and* taking on a new job all at the same time seems to limit free time a little ;)
Added comments to Rietveld. |
Thanks for the review. Here is http-buffer.v3.patch.
Just realized that the makefile() call in _tunnel() still lets this bug occur when an HTTPSConnection is used with a proxy via the CONNECT method. So while my patch shouldn’t make things any worse than they already are, it needs more work to fix this case. Maybe we would have to use the new SSL memory BIO support.
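For anyone unfamiliar with the memory-BIO idea mentioned above, here is a minimal sketch of the API (hypothetical usage with a made-up hostname, not the eventual fix): TLS state runs over in-memory buffers, so ciphertext could stay in Python-level buffers rather than being read through a throwaway makefile() object.

```python
import ssl

# Sketch of ssl's memory-BIO support: the TLS engine reads and writes
# ciphertext via MemoryBIO objects instead of a socket.
ctx = ssl.create_default_context()
incoming = ssl.MemoryBIO()   # ciphertext received from the proxy socket
outgoing = ssl.MemoryBIO()   # ciphertext to be sent to the proxy socket
tls = ctx.wrap_bio(incoming, outgoing, server_hostname="example.com")

try:
    tls.do_handshake()       # no peer data yet, so this cannot finish
except ssl.SSLWantReadError:
    pass                     # the ClientHello is now queued in `outgoing`

print(outgoing.pending > 0)  # handshake bytes are ready to forward
```

The application (here, hypothetically, _tunnel()) would shuttle bytes between the BIOs and the real socket, keeping full control over what has been buffered.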
Actually, I had a few free minutes, so I stuck a couple of minor comments in Rietveld. Will have another go as soon as possible.
Minor update based on Demian’s review |