Issue1486335
Created on 2006-05-11 09:14 by kxroberto, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files | ||
---|---|---|
File name | Uploaded | Description |
python-sf-1486335.patch | gward, 2006-07-24 19:44 | patch (relative to Python 2.3.5) that might help here -- needs more investigation! |
sf1486335-test-hack.patch | gward, 2006-07-26 02:13 | crude hack to test_httplib.py to demonstrate the problem |
Messages (6) | |||
---|---|---|---|
msg28473 - (view) | Author: kxroberto (kxroberto) | Date: 2006-05-11 09:14 | |
This occasionally shows up in a logged trace when an application crashes with a ValueError on a http(s)_response.read() (py2.3.5, yet the relevant httplib code is still the same in current httplib): .... \' File "socket.pyo", line 283, in read\\n\', \' File "httplib.pyo", line 389, in read\\n\', \' File "httplib.pyo", line 426, in _read_chunked\\n\', \'ValueError: invalid literal for int(): \\n\'] ::: It's the line: chunk_left = int(line, 16) Don't know what this line is about. Yet it should be protected, as a http_response.read() should not fail with ValueError, but only with IOError/EnvironmentError or socket.error; otherwise exception handling becomes a random task. -Robert Side note regarding IO exception handling: see also FR #1481036 (IOBaseError): why is socket.error.__bases__ (<class exceptions.Exception at 0x011244E0>,)? |
|||
msg28474 - (view) | Author: Greg Ward (gward) | Date: 2006-07-24 19:38 | |
I'm seeing this with Python 2.3.5 and 2.4.3 hitting a PHP app and getting a large error page. It looks as though the server is incorrectly chunking the response; lwp-request at least gives a better error message than httplib.py: $ GET "http://..." 500 EOF when chunk header expected I'm unclear on precisely what the server is doing wrong. The response looks like this: HTTP/1.1 200 OK Date: Mon, 24 Jul 2006 19:18:47 GMT Server: Apache/2.0.54 (Fedora) X-Powered-By: PHP/4.3.11 Connection: close Transfer-Encoding: chunked Content-Type: text/html; charset=UTF-8 2169\r\n \r\n [...first 0x2169 bytes of response...]\r\n 20b2\r\n [...next 0x20b2 bytes...] [...repeat many times...] 20b2\r\n [...the last 0x20b2 bytes...] \r\n The blank line at EOF appears to be confusing httplib.py: it bombs because int('', 16) raises ValueError. Observation #1: if this is indeed a protocol error (i.e. the server is in the wrong), httplib.py should turn the ValueError into an HTTPException. Perhaps it should define a new exception class for low-level protocol errors (bad chunking). Maybe it should reuse IncompleteRead. Observation #2: gee, my web browser doesn't barf on this response, so why should httplib.py? If there is an error here, it's at EOF, so it's not that big a deal. |
|||
msg28475 - (view) | Author: Greg Ward (gward) | Date: 2006-07-26 02:13 | |
OK, I've been working on this some more and I have a very crude addition to test_httplib.py. I'm going to attach it here and solicit feedback on python-dev: I'm not sure how many kinds of bad response chunking I really want to worry about. |
|||
msg28476 - (view) | Author: John J Lee (jjlee) | Date: 2006-08-08 00:23 | |
I think it's only worth worrying about bad chunking that a) has been observed in the wild (though not necessarily by us) and b) popular browsers can cope with. Greg: """If there is an error here, it's at EOF, so it's not that big a deal.""" That's only if the response will be closed at the end of the current transaction. Quoting from 1411097: """if the connection will not close at the end of the transaction, the behaviour should not change from what's currently in SVN (we should not assume that the chunked response has ended unless we see the proper terminating CRLF).""" Perhaps we don't need to be quite as strict as that, but the point is that otherwise, how do we know the server hasn't already sent that last CRLF, and that it will turn up in three weeks' time?-) If that happens, not sure exactly how httplib will treat the CRLF and possible chunked encoding trailers, but I suspect something bad happens. Perhaps we could just always close the connection in this case? I'm not confident I know yet how best to fix these issues. I just tried reading curl's transfer.c and http_chunks.c. I discovered only that I have to be fully awake to read a 1200 line function :-/ |
|||
msg28477 - (view) | Author: Patrick Altman (altman) | Date: 2007-03-14 15:39 | |
I am attempting to use a HEAD request against Amazon S3 to check whether a file exists and, if it does, to parse the md5 hash from the ETag in the response to verify the contents of the file, so as to save the bandwidth of uploading files when it is not necessary. If the file exists, the HEAD works as expected and I get valid headers back that I can parse, pulling the ETag out of the dictionary using getheader('ETag')[1:-1] (using the slice to trim off the double-quotes in the string). The problem arises when I send a HEAD request and no file exists. As expected, a 404 Not Found response is sent back from Amazon; however, my test scripts seem to hang. I ran python with trace.py and it hangs here: --- modulename: httplib, funcname: _read_chunked httplib.py(536): assert self.chunked != _UNKNOWN httplib.py(537): chunk_left = self.chunk_left httplib.py(538): value = '' httplib.py(542): while True: httplib.py(543): if chunk_left is None: httplib.py(544): line = self.fp.readline() --- modulename: socket, funcname: readline socket.py(321): data = self._rbuf socket.py(322): if size < 0: socket.py(324): if self._rbufsize <= 1: socket.py(326): assert data == "" socket.py(327): buffers = [] socket.py(328): recv = self._sock.recv socket.py(329): while data != "\n": socket.py(330): data = recv(1) It eventually completes with an exception here: File "C:\Python25\lib\httplib.py", line 509, in read return self._read_chunked(amt) File "C:\Python25\lib\httplib.py", line 548, in _read_chunked chunk_left = int(line, 16) ValueError: invalid literal for int() with base 16: '' For reference, ethereal captured the following request and response: HEAD <REMOVED> HTTP/1.1 Host: s3.amazonaws.com Accept-Encoding: identity Date: Tue, 13 Mar 2007 02:54:12 GMT Authorization: AWS <REMOVED> HTTP/1.1 404 Not Found x-amz-request-id: E20B4C0D0C48B2EF x-amz-id-2: <REMOVED> Content-Type: application/xml Transfer-Encoding: chunked Date: Tue, 13 Mar 2007 02:54:16 GMT Server: AmazonS3 |
|||
msg62849 - (view) | Author: Georg Brandl (georg.brandl) * | Date: 2008-02-24 00:03 | |
Fixed for bug #900744. |
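
For readers following the discussion, here is a minimal sketch of the behaviour suggested in Observation #1: a chunk-size line that cannot be parsed is reported as an HTTP-level error instead of a bare ValueError. It is not the patch attached to this issue, nor the fix applied for #900744. It is written for Python 3, where httplib became http.client, and the names `ChunkedProtocolError` and `read_chunked` are hypothetical. It assumes the whole body can be read from a binary file-like object and does not handle the keep-alive concerns raised by jjlee.

```python
import io
from http.client import HTTPException


class ChunkedProtocolError(HTTPException):
    """Hypothetical exception for malformed chunked framing (not a stdlib class)."""


def read_chunked(fp):
    """Decode a chunked body from a binary file-like object (simplified sketch)."""
    parts = []
    while True:
        line = fp.readline()
        # Ignore any chunk extension after ';' and strip the trailing CRLF.
        size_field = line.split(b";", 1)[0].strip()
        try:
            chunk_size = int(size_field, 16)
        except ValueError:
            # An empty or garbage size line is a framing problem, so report
            # it as an HTTP-level error rather than a bare ValueError.
            raise ChunkedProtocolError(
                "bad chunk size line: %r" % (line,)
            ) from None
        if chunk_size < 0:
            raise ChunkedProtocolError("negative chunk size: %r" % (line,))
        if chunk_size == 0:
            break  # last-chunk marker
        data = fp.read(chunk_size)
        if len(data) != chunk_size:
            raise ChunkedProtocolError("chunk data truncated")
        parts.append(data)
        fp.read(2)  # consume the CRLF that follows the chunk data
    # Skip optional trailer lines up to the terminating blank line (or EOF).
    while fp.readline() not in (b"\r\n", b"\n", b""):
        pass
    return b"".join(parts)


good = io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
print(read_chunked(good))  # b'Wikipedia'

# A blank line where the next chunk size should be, as in the capture above.
bad = io.BytesIO(b"4\r\nWiki\r\n\r\n")
try:
    read_chunked(bad)
except ChunkedProtocolError as exc:
    print("protocol error:", exc)
```

Because the hypothetical exception subclasses HTTPException, a caller that already catches HTTPException around read() would also catch the bad-chunking case. Whether to raise or to tolerate the missing last chunk when the connection is about to close is the open question in the thread, and this sketch simply raises.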
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:17 | admin | set | github: 43344 |
2008-02-24 00:03:56 | georg.brandl | set | status: open -> closed nosy: + georg.brandl resolution: duplicate messages: + msg62849 |
2008-01-05 14:00:39 | vila | set | nosy: + vila |
2006-05-11 09:14:42 | kxroberto | create |