Issue1580738
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2006-10-19 19:06 by djmitche, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
httplib.py.diff | facundobatista, 2007-10-13 02:09 |
Messages (7) | |||
---|---|---|---|
msg30322 - (view) | Author: Dustin J. Mitchell (djmitche) * | Date: 2006-10-19 19:06 | |
I'm building an interface to Amazon's S3, using httplib. It uses a single object for multiple transactions. What's happening is this: HTTP > PUT /unitest-temp-1161039691 HTTP/1.1 HTTP > Date: Mon, 16 Oct 2006 23:01:32 GMT HTTP > Authorization: AWS <<cough>>:KiTWRuq/ 6aay0bI2J5DkE2TAWD0= HTTP > (end headers) HTTP < HTTP/1.1 200 OK HTTP < content-length: 0 HTTP < x-amz-id-2: 40uQn0OCpTiFcX+LqjMuzG6NnufdUk/.. HTTP < server: AmazonS3 HTTP < x-amz-request-id: FF504E8FD1B86F8C HTTP < location: /unitest-temp-1161039691 HTTP < date: Mon, 16 Oct 2006 23:01:33 GMT HTTPConnection.__state before response.read: Idle HTTPConnection.__response: closed? False length: 0 reading response HTTPConnection.__state after response.read: Idle HTTPConnection.__response: closed? False length: 0 ..later in the same connection.. HTTPConnection.__state before putrequest: Idle HTTPConnection.__response: closed? False length: 0 HTTP > DELETE /unitest-temp-1161039691 HTTP/1.1 HTTP > Date: Mon, 16 Oct 2006 23:01:33 GMT HTTP > Authorization: AWS <<cough>>: a5OizuLNwwV7eBUhha0B6rEJ+CQ= HTTP > (end headers) HTTPConnection.__state before getresponse: Request-sent HTTPConnection.__response: closed? False length: 0 File "/usr/lib64/python2.4/httplib.py", line 856, in getresponse raise ResponseNotReady() If the first request does not precede it, the second request is fine. To avoid excessive memory use, I'm calling request.read(16384) repeatedly, instead of just calling request.read(). This seems to be key to the problem -- if I omit the 'amt' argument to read(), then the last line of the first request reads HTTPConnection.__response: closed? True length: 0 and the later call to getresponse() doesn't raise ResponseNotReady. Looking at the source for httplib.HTTPResponse.read, self.close() gets called in the latter (working) case, but not in the former (non-working). It would seem sensible to add 'if self.length == 0: self.close()' to the end of that function (and, in fact, this change makes the whole thing work), but this comment makes me hesitant: # we do not use _safe_read() here because this may be a .will_close # connection, and the user is reading more bytes than will be provided # (for example, reading in 1k chunks) I suspect that either (a) this is a bug or (b) the client is supposed to either call read() with no arguments or calculate the proper inputs to read(amt) based on the Content-Length header. If (b), I would think the docs should be updated to reflect that? Thanks for any assistance. |
|||
msg30323 - (view) | Author: Mark Hammond (mhammond) * | Date: 2006-10-27 02:21 | |
Logged In: YES user_id=14198 The correct answer is indeed (b) - but note that httplib will itself do the content-length magic for you, including the correct handling of 'chunked' encoding. If the .length attribute is not None, then that is exactly how many bytes you should read. If .length is None, then either chunked encoding is used (in which case you can call read() with a fixed size until it returns an empty string), or no content-length was supplied (which can be treated the same as chunked, but the connection will close at the end. Checking ob.will_close can give you some insight into that. Its not reasonable to add 'if self.length==0: self.close()' - it is perfectly valid to have a zero byte response within a keep-alive connection - we don't want to force a new (expensive) connection to the server just because a zero byte response was requested. The HTTP semantics are hard to get your head around, but I believe httplib gets it right, and a ResponseNotReady exception in particular points at an error in the code attempting to use the library. Working with connections that keep alive is tricky - you just jump through hoops to ensure you maintain the state of the httplib object correctly - in general, that means you must *always* consume the entire response (even if it is zero bytes) before attempting to begin a new request. This requirement doesn't exists for connections that close - if you fail to read the entire response it can be thrown away as the next request will happen on a new connection. |
|||
msg30324 - (view) | Author: Dustin J. Mitchell (djmitche) * | Date: 2006-10-27 22:53 | |
Logged In: YES user_id=7446 Excellent -- the first paragraph, where you talk about the .length attribute, makes things quite clear, so I agree that (b) is the correct solution: include the content of that paragraph in the documentation. Thanks! |
|||
msg30325 - (view) | Author: John J Lee (jjlee) | Date: 2007-01-31 23:45 | |
I disagree with mhammond. Mark is correct that closing the connection would be wrong, but self.close() in the function in question does *not* close the connection. Client code is not expected to explicitly call HTTPResponse.read() with an explicit amt argument even when persistent connections are involved. All that's required is (of course) that you have to read to the end of the current response before you get to read the next response on that connection. I think Dustin is more or less right in his diagnosis, but I haven't written a patch yet... |
|||
msg56384 - (view) | Author: Marcos Dione (StyXman) * | Date: 2007-10-13 01:36 | |
with facundo we were tracking this bug down. mhammond is right about that read() should allow 0 size responses, but the bug is that the response is "not closed". HTTPResponse.read() says: if amt is None: # unbounded read if self.length is None: s = self.fp.read() else: s = self._safe_read(self.length) self.length = 0 self.close() # we read everything return s see that if self.close()s, which really closes the fp created with makefile() in the constructor. this does not closes the underlying socket, as you can see trying this short example: import httplib c= httplib.HTTPConnection ('www.python.org', 80) c.request ('GET', '/index.html') a1= c.getresponse () data1= a1.read() c.request ('GET', '/404.html') a2= c.getresponse () data2= a2.read() and run it under strace -e network,file. if the last part of read is changed to this, read(n) works just like read() does: # we do not use _safe_read() here because this may be a .will_close # connection, and the user is reading more bytes than will be provided # (for example, reading in 1k chunks) s = self.fp.read(amt) if self.length is not None: self.length -= len(s) if len(s)==0: self.close () return s |
|||
msg56385 - (view) | Author: Facundo Batista (facundobatista) * | Date: 2007-10-13 02:09 | |
Mark is ok, zero length responses are ok. But that has nothing to do with self.length. self.lenght reaching zero means that everything that needed to be read is already read. If the read() method is called without an argument, it reads everything until it reaches self.lenght, and then it closes self. So, when is called with an argument, is ok to close self if after reading something self.length reaches zero (note that this actually means that something was read, and self.length was lowered down...) If self is closed also in this instance (see the patch), all tests pass ok, and how you use HTTPConnection reading small pieces and not the whole thing, is actually simplified... |
|||
msg56512 - (view) | Author: Facundo Batista (facundobatista) * | Date: 2007-10-18 03:17 | |
Fixed in rev 58530 (also added a test case) |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:20 | admin | set | github: 44147 |
2007-10-18 03:17:02 | facundobatista | set | status: open -> closed resolution: fixed messages: + msg56512 |
2007-10-13 02:09:36 | facundobatista | set | files:
+ httplib.py.diff assignee: facundobatista messages: + msg56385 keywords: + patch versions: + Python 2.5, - Python 2.4 |
2007-10-13 01:36:18 | StyXman | set | nosy:
+ StyXman, facundobatista messages: + msg56384 |
2006-10-19 19:06:16 | djmitche | create |