classification
Title: Should urllib2.urlopen send an Accept-Encoding header?
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.1, Python 3.2, Python 2.7
process
Status: closed
Resolution: works for me
Dependencies:
Superseder:
Assigned To: orsenthil
Nosy List: dabrahams, demian.brecht, eric.araujo, karlcow, martin.panter, orsenthil
Priority: normal
Keywords:

Created on 2010-05-16 14:47 by dabrahams, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (6)
msg105870 - Author: Dave Abrahams (dabrahams) Date: 2010-05-16 14:47
According to the RFC, the server is allowed to send back any encoding it likes when no Accept-Encoding header is supplied, but all the examples I can find of urllib2.urlopen usage assume they're getting plain text back.  I think it would be better to inject an Accept-Encoding header when none is explicitly supplied so that nobody else trips over this issue.

See http://support.github.com/discussions/site/1510
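
To illustrate the concern above: a client that does not want to assume plain text can check the Content-Encoding response header before using the body. The following is only a minimal Python 3 sketch; decode_body is a hypothetical helper (not part of urllib or urllib2), and it handles only identity, gzip and deflate.

```python
import gzip
import zlib


def decode_body(raw, content_encoding):
    """Return the untransformed body for the given Content-Encoding value.

    Hypothetical helper for illustration; `raw` is the bytes read from the
    response and `content_encoding` is the header value (or None if absent).
    """
    encoding = (content_encoding or "identity").strip().lower()
    if encoding == "identity":
        return raw
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(raw)
    if encoding == "deflate":
        try:
            # "deflate" is normally zlib-wrapped data.
            return zlib.decompress(raw)
        except zlib.error:
            # Some servers send a raw deflate stream without the zlib header.
            return zlib.decompress(raw, -zlib.MAX_WBITS)
    raise ValueError("unsupported Content-Encoding: %r" % content_encoding)
```

With urllib.request this would be used roughly as decode_body(resp.read(), resp.headers.get("Content-Encoding")), where resp is the object returned by urlopen().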
msg105937 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-05-17 20:30
The HTTP specification says that a server may send any encoding if the
client does not specify an Accept-Encoding header. But if 'identity' is
one of the encodings that the server recognizes (?), then it should send
the response as identity, which indicates untransformed content.

I also see that httplib adds 'Accept-Encoding: identity' to the headers
at the request level. I shall see what is missing here, if it
is not being sent for all requests.

BTW, I could not figure out the problem you are facing from the url
mentioned. I specifically do not see any interleaving gzip and no-gzip
request behaviours at different points.
msg105959 - Author: Dave Abrahams (dabrahams) Date: 2010-05-18 10:02
How many tests did you run?  My two tests were minutes apart.  I have the feeling that this has something to do with caching behavior on the server.
msg183573 - Author: karl (karlcow) * Date: 2013-03-06 02:32
What was the content of http://support.github.com/discussions/site/1510?
I can't find it. Is the issue still going on?
msg239926 - Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2015-04-02 15:32
This doesn't seem to be an issue in 3.4+; the following headers are injected in a call to urlopen():

GET / HTTP/1.1
Accept-Encoding: identity
Host: example.com
User-Agent: Python-urllib/3.4
Connection: close

However, this is not the same behaviour in 2.7:

GET / HTTP/1.0
Host: example.com
User-Agent: Python-urllib/1.17

That said, I wouldn't see this as a bug but a feature request, so it should be invalid for 2.7.

Setting this to pending to close unless anyone has any objections or further details.
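
A header dump like the ones above can be reproduced without an external server by pointing the client at a throwaway local server that records what it receives. This is a sketch only, written for Python 3 (urllib.request rather than urllib2); capture_request_headers and RecordingHandler are hypothetical names, and the local address and timeout are arbitrary choices.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def capture_request_headers():
    """Return the request headers (lower-cased names) sent for one GET."""
    seen = {}

    class RecordingHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Record every header the client sent.
            for name, value in self.headers.items():
                seen[name.lower()] = value
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the example quiet

    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), RecordingHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        # An empty ProxyHandler keeps proxy settings from interfering.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url, timeout=5) as resp:
            resp.read()
    finally:
        server.shutdown()
        server.server_close()
    return seen
```

Printing the returned dictionary shows whether an accept-encoding entry is present for the interpreter being tested.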
msg265526 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-14 12:46
I suspect for Demian’s 2.7 experiment, he used the older urllib.urlopen(), rather than urllib2.urlopen() as given in the original description. When I use urllib2.urlopen("http://localhost/"), I see

GET / HTTP/1.1
Accept-Encoding: identity
Host: localhost
Connection: close
User-Agent: Python-urllib/2.7

Even in the urllib (no 2) case, since it is using HTTP 1.0, I suspect not having Accept-Encoding is not such a problem.

The underlying HTTP library has always added “Accept-Encoding: identity” for HTTP 1.1 by default (https://hg.python.org/cpython/annotate/4a3e9871b41b/Lib/httplib.py#l444), so I am closing this.
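
The default described in the last message can be observed without any network access by capturing the bytes the HTTP library would write. A minimal sketch, assuming Python 3 where httplib is named http.client; CapturingConnection and request_bytes are hypothetical names used only for illustration.

```python
import http.client


class CapturingConnection(http.client.HTTPConnection):
    """HTTPConnection that records outgoing bytes instead of using a socket."""

    def __init__(self, host):
        super().__init__(host)
        self.sent = b""

    def send(self, data):
        # Overriding send() means no socket is ever opened.
        self.sent += data


def request_bytes(headers=None):
    """Return the raw request bytes for a GET / with optional extra headers."""
    conn = CapturingConnection("example.com")
    conn.request("GET", "/", headers=headers or {})
    return conn.sent
```

Calling request_bytes() shows the injected "Accept-Encoding: identity" line, while request_bytes({"Accept-Encoding": "gzip"}) shows that an explicitly supplied value is sent instead of the default.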
History
Date  User  Action  Args
2022-04-11 14:57:01  admin  set  github: 52978
2016-05-14 12:46:10  martin.panter  set  status: pending -> closed; title: Should urrllib2.urlopen send an Accept-Encoding header? -> Should urllib2.urlopen send an Accept-Encoding header?; nosy: + martin.panter; messages: + msg265526; resolution: works for me
2015-04-02 15:32:23  demian.brecht  set  status: open -> pending; nosy: + demian.brecht; messages: + msg239926
2013-03-06 02:32:12  karlcow  set  nosy: + karlcow; messages: + msg183573
2010-12-22 07:48:02  eric.araujo  set  nosy: + eric.araujo; versions: - Python 2.6
2010-05-18 10:02:40  dabrahams  set  messages: + msg105959
2010-05-17 20:30:17  orsenthil  set  messages: + msg105937
2010-05-16 18:24:46  pitrou  set  assignee: orsenthil; type: behavior; nosy: + orsenthil; versions: + Python 3.1, Python 2.7, Python 3.2
2010-05-16 14:47:09  dabrahams  create