Title: urllib2/httplib doesn't reset file position between requests
Type: behavior Stage: test needed
Components: Library (Lib) Versions: Python 2.6
Status: open Resolution:
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: Anthony.Kong, ajaksu2, ggenellina, jjlee, martin.panter, matejcik, orsenthil
Priority: normal Keywords: easy

Created on 2009-01-23 17:07 by matejcik, last changed 2015-04-16 02:13 by martin.panter.

Messages (4)
msg80419 - (view) Author: jan matejek (matejcik) * Date: 2009-01-23 17:06
Since 2.6, httplib supports reading from file-like objects.

Now consider the following situation:
There are two handlers in urllib2: the first is plain HTTP, the second is basic auth.
I want to POST a file to a service, and pass the open file object as
the data parameter to urllib2.urlopen.
The first handler is invoked; it sends the file data, but gets a 401
Unauthorized return code and fails with that.
The second handler in the chain is invoked (at least that's how I understand
urllib2; please correct me if I'm talking rubbish). At that point the
open file is at EOF, so empty data is sent.
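
The failure mode described above can be reproduced without a network. This is a minimal sketch, assuming a hypothetical `fake_send` helper that drains a file-like body in blocks, roughly the way a client would. It is not httplib's actual code, and `io.BytesIO` stands in for the open file:

```python
import io


def fake_send(body, blocksize=8192):
    """Hypothetical stand-in for a client draining a file-like request body."""
    chunks = []
    while True:
        block = body.read(blocksize)
        if not block:
            break
        chunks.append(block)
    return b"".join(chunks)


body = io.BytesIO(b"payload bytes")

first_attempt = fake_send(body)   # drains the body, leaving the position at EOF
second_attempt = fake_send(body)  # same object, never rewound: nothing left to read

body.seek(0)                      # explicit rewind before retrying
third_attempt = fake_send(body)

print(len(first_attempt), len(second_attempt), len(third_attempt))
```

Rewinding with `seek(0)` before the retry restores the payload, which is the reset the issue title asks for.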

Furthermore, the obvious solution "you can't do this through urllib so
go read the file yourself" doesn't apply that well: the file object in
question is actually a mmap.mmap instance.
This code has been in production since Python 2.4. Until file object support
in httplib was introduced, it worked fine, handling the mmap'ed file as
a string. Now it is picked up as read()-able and this problem occurs.
The only workaround to restore pre-2.6 behavior that comes to mind is
building a wrapper class for the mmap object that hides its read() method.
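
A rough sketch of such a wrapper follows, written for Python 3 so that it runs today. The class name `NoReadView` is hypothetical, and an anonymous mapping stands in for the real mmap'ed file. It only shows that `read()` can be hidden while length and slicing still work; whether a given httplib version then accepts the wrapper as a request body is not verified here.

```python
import mmap


class NoReadView(object):
    """Hypothetical wrapper: exposes the mapped bytes but no read() method."""

    def __init__(self, mapped):
        self._mapped = mapped

    def __len__(self):
        return len(self._mapped)

    def __getitem__(self, index):
        # Delegates indexing and slicing to the underlying mapping.
        return self._mapped[index]

    def __bytes__(self):
        # Copies the whole mapping into a bytes object.
        return self._mapped[:]


mapped = mmap.mmap(-1, 16)         # anonymous mapping, no file on disk needed
mapped.write(b"0123456789abcdef")
view = NoReadView(mapped)

print(hasattr(mapped, "read"), hasattr(view, "read"))
```

Because the wrapper has no `read` attribute, a `hasattr(data, "read")` style check would no longer classify it as a file-like object.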
msg80422 - (view) Author: Gabriel Genellina (ggenellina) Date: 2009-01-23 23:28
This happens in other implementations too, not just urllib2.

If the server supports it, the best way is to send an 'Expect: 100-Continue'
header field before attempting to send the actual file.
msg185512 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2013-03-29 19:49
I think this requires triaging to decide whether the feature request is still applicable. Expect 100 is sent by httplib, and the support for this was added a few years ago, much later than this bug was originally raised.
msg241191 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-04-16 02:13
Actually, I do not think any “Expect: 100-continue” headers are explicitly sent by the Python standard library. The Python client does not support waiting for a “100 Continue” response; see Issue 1346874.

There is Issue 23740 opened about fixing or clarifying the various data types accepted by “http.client”.

On the other hand, the documentation for urlopen() says only bytes and iterables are supported. If mmap objects are being treated as file objects by urlopen() that is unexpected, and the documentation or implementation needs fixing there. Also, iterating a mmap() object is different from iterating either the equivalent bytearray() or file object, so there is something weird going on there.
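
The iteration difference can be inspected with a small script. This is a sketch for Python 3, assuming an anonymous mapping and `io.BytesIO` as stand-ins for a real mmap'ed file and file object. The output for the mmap case is deliberately not stated here; run it to see what the interpreter in use yields.

```python
import io
import mmap

data = b"ab\ncd\n"

mapped = mmap.mmap(-1, len(data))  # anonymous mapping, no real file needed
mapped.write(data)
mapped.seek(0)

from_bytearray = list(bytearray(data))  # one integer per byte
from_file = list(io.BytesIO(data))      # one bytes object per line
from_mmap = list(mapped)                # one item per byte; print it to see the item type

print(from_bytearray)
print(from_file)
print(from_mmap)
```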
