classification
Title: httplib gzip support
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.1, Python 2.7
process
Status: closed Resolution: duplicate
Dependencies: Superseder: transparent gzip compression in urllib
View: 1508475
Assigned To: Nosy List: Buck.Golemon, ajaksu2, georg.brandl, martin.panter, mooonz, sonderblade
Priority: normal Keywords: patch

Created on 2005-07-23 18:51 by mooonz, last changed 2015-05-27 22:41 by martin.panter. This issue is now closed.

Files
File name Uploaded Description Edit
httplib.patch mooonz, 2005-07-23 18:54 diff with CVS rev 1.95
Messages (12)
msg48607 - (view) Author: Moonz (mooonz) Date: 2005-07-23 18:51
Add gzip support for httplib. It seems to work
correctly, according to the tests I have done, but some
points should be changed (I am thinking of the putrequest
method, where I didn't change anything except the two
lines of comments that said gzip support is not
included).
msg48608 - (view) Author: Moonz (mooonz) Date: 2005-07-23 18:54
There's no uploaded file!  You have to check the
checkbox labeled "Check to Upload & Attach File"
when you upload a file. In addition, even if you
*did* check this checkbox, a bug in SourceForge
prevents attaching a file when *creating* an issue.

Please try again.

(This is a SourceForge annoyance that we can do
nothing about. :-( )
msg48609 - (view) Author: Moonz (mooonz) Date: 2005-07-23 18:54
It's better with the patch ;)
msg48610 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2005-08-31 22:38
This will need documentation and test suite changes, too.
msg48611 - (view) Author: Björn Lindqvist (sonderblade) Date: 2007-03-12 00:40
I have applied and tested this patch. The code seems to be
correct. Fetching gzipped data from www.mozilla.org works as it
should. 

What the patch does is change Accept-Encoding from identity to
identity,gzip;q=0.9. That hints to the HTTP server that we can handle
gzipped data. Some servers take the hint and send us gzipped data
(www.mozilla.org is such a server). We check whether that is so by
looking at the Content-Encoding header in the HTTP response headers.
If that is set to gzip, the body of the response is gzipped data.
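
As a rough illustration of that negotiation (not the code from the
patch), here is a minimal sketch. The helper name decode_body and the
plain-dict headers are assumptions made for the example; only the
header value is taken from the description above.

```python
import gzip

# Header value described above: gzip is offered, but with a lower
# preference (q=0.9) than the identity encoding.
ACCEPT_ENCODING = "identity,gzip;q=0.9"


def decode_body(headers, body):
    """Return the body, gunzipped when Content-Encoding is gzip.

    `headers` is a plain dict here; this is a hypothetical helper,
    not part of httplib.
    """
    encoding = headers.get("Content-Encoding", "identity").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    # Anything else is passed through untouched in this sketch.
    return body
```

A server that ignores the hint simply omits Content-Encoding, and the
body comes back unchanged.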

What then happens is that we create a hacked GzipFile object that
works on a wrapped version of the HTTPResponse itself. It has to be
hacked, because GzipFile works by seeking to the end of the file
object. That of course does not work for us, because the whole file is
not available. But this hacked version employs some kind of StringIO
trick so that instead of seeking to the end of the file, it seeks to
the end of the data read so far.
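
For comparison, gzip data can also be decoded without any seeking by
feeding chunks to zlib as they arrive. This is a different technique
from the one in the patch, sketched here only to show that random
access is not strictly required; the function name iter_gunzip is made
up for the example.

```python
import gzip
import zlib


def iter_gunzip(chunks):
    """Yield decompressed bytes for an iterable of gzip-compressed chunks."""
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = decoder.decompress(chunk)
        if data:
            yield data
    # Emit whatever is still buffered once the input is exhausted.
    tail = decoder.flush()
    if tail:
        yield tail
```

The chunks can be as small as the network delivers them; the decoder
keeps its own state between calls.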

So HTTPResponse acquires a reference to GzipFile2 which it reads
from. GzipFile2 in turn has a reference to GzipedHTTPIO (the wrapper)
which in turn references the HTTPResponse. The read method in
HTTPResponse invokes the read method on GzipFile2 which invokes the
read of GzipedHTTPIO which invokes the read of HTTPResponse. But
GzipedHTTPIO breaks the potential recursion by specifying raw=True
which means that it wants HTTPResponse to feed it the raw data, that
is, the bytes as received, without decompressing them. A very, very
clever scheme.
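
The shape of that reference cycle can be sketched as follows. The
classes Response and RawReader are hypothetical stand-ins written for
this illustration, not the classes from the patch, and the standard
gzip.GzipFile from current Python 3 is used in place of the hacked
one.

```python
import gzip
import io


class RawReader:
    """Wrapper handed to GzipFile (stand-in for the patch's wrapper)."""

    def __init__(self, response):
        self._response = response

    def read(self, size=-1):
        # raw=True asks for the bytes as received, which is what
        # stops read() from recursing back into the decompressor.
        return self._response.read(size, raw=True)


class Response:
    """Stand-in for HTTPResponse; a BytesIO plays the socket."""

    def __init__(self, body, gzipped):
        self._body = io.BytesIO(body)
        self._gzip = None
        if gzipped:
            self._gzip = gzip.GzipFile(fileobj=RawReader(self), mode="rb")

    def read(self, size=-1, raw=False):
        if raw or self._gzip is None:
            return self._body.read(size)
        # Normal callers get decompressed data via the GzipFile.
        return self._gzip.read(size)
```

Calling Response(gzip.compress(b"payload"), True).read() goes through
GzipFile, then RawReader, then back into Response.read with raw=True.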

I hope this information is useful. It took me way too long to get this
far. Originally I thought that this patch should be rejected because it
is just too damn complicated, but then I saw that the rest of
httplib.py is equally complicated. :)
msg48612 - (view) Author: Moonz (mooonz) Date: 2007-03-13 19:35
Wow, you dug up something I had forgotten, for my own sake :)
Seriously, did I really make this horrible piece of crap?
All my apologies for that...

Although, if anyone is interested in gzip support in httplib, I should have something somewhere on my computer that does the job in a better way (it can't be worse, anyway ;)). It's a complete rewrite of the GzipFile class which doesn't need random access to the data... I'll upload it tomorrow...
msg48613 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2007-03-13 19:37
Note that we already have a patch for that, in #1675951.
msg48614 - (view) Author: Moonz (mooonz) Date: 2007-03-13 19:49
I saw it a few seconds after clicking the "submit" button ;)
OK, please forget my last message and this complete thread.
msg84694 - (view) Author: Daniel Diniz (ajaksu2) (Python triager) Date: 2009-03-30 22:36
Should this be closed in favor of #1675951?
msg84880 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2009-03-31 19:35
Sounds reasonable.
msg209139 - (view) Author: Buck Golemon (Buck.Golemon) Date: 2014-01-25 00:51
I believe this issue is still extant.

The tip httplib client neither sends accept-encoding gzip nor supports content-encoding gzip.

http://hg.python.org/cpython/file/tip/Lib/http/client.py#l1012

There is a diff to httplib in this attached patch, where there was none in #1675951.
msg226398 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2014-09-05 05:23
Agreed, this issue is not a duplicate of the marked “gzip” seek issue, however it _does_ duplicate Issue 1508475.
History
Date User Action Args
2015-05-27 22:41:12 martin.panter set superseder: Performance for small reads and fix seek problem -> transparent gzip compression in urllib
2014-09-05 05:23:47 martin.panter set messages: + msg226398
2014-02-10 12:14:33 martin.panter set nosy: + martin.panter
2014-01-25 00:51:49 Buck.Golemon set nosy: + Buck.Golemon
messages: + msg209139
2009-03-31 19:35:37 georg.brandl set status: open -> closed
resolution: duplicate
superseder: Performance for small reads and fix seek problem
messages: + msg84880
2009-03-30 22:36:54 ajaksu2 set versions: + Python 3.1, Python 2.7, - Python 2.5
nosy: + ajaksu2
messages: + msg84694
type: enhancement
stage: patch review
2005-07-23 18:51:28 mooonz create