classification
Title: urllib.request.Request accepts but doesn't check bytes headers
Type: behavior
Stage: test needed
Components: Library (Lib)
Versions: Python 3.7, Python 3.6, Python 3.5
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: ezio.melotti, maciej.szulik, martin.panter, orsenthil
Priority: normal
Keywords:

Created on 2017-03-23 22:03 by ezio.melotti, last changed 2022-04-11 14:58 by admin.

Messages (2)
msg290063 - Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2017-03-23 22:03
urllib.request.Request allows the user to create a request object like:
  req = Request(url, headers={b'Content-Type': b'application/json'})

When calling urlopen(req, data), urllib will check if a 'Content-Type' header is present and fail to recognize b'Content-Type' because it's bytes.
urllib will therefore add the default Content-Type 'application/x-www-form-urlencoded', and the request will then be sent with both Content-Types. This results in difficult-to-debug errors because the server will sometimes pick one and sometimes the other, depending on the order.
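
The duplicate header can be reproduced without sending anything over the network. The following is a minimal sketch, not the handler's actual code path: it repeats the has_header('Content-type') check and then adds the default header by hand, as the handler is described as doing above. The URL is a placeholder, and the snippet is meant to be run without the -b option.

```python
from urllib.request import Request

# Placeholder URL; no request is actually sent in this sketch.
req = Request(
    "http://example.com/",
    headers={b"Content-Type": b"application/json"},
)

# Request.add_header() stores key.capitalize(), so the stored key stays bytes.
stored_keys = list(req.headers)

# A str lookup does not match the bytes key.
found_before = req.has_header("Content-type")

# Assumption: mimic by hand what urllib does when the header looks absent.
if not found_before:
    req.add_unredirected_header(
        "Content-type", "application/x-www-form-urlencoded"
    )

# Both the bytes header and the str default are now queued for sending.
all_headers = req.header_items()
print(stored_keys, found_before)
print(all_headers)
```

After this runs, header_items() holds one Content-type entry keyed by bytes and one keyed by str, which is the pair the server ends up receiving.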

urllib should either reject bytes headers, or check for both bytes and strings.  The docs also don't seem to specify that the headers should be strings.
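
Until urllib does one or the other, a caller can sidestep the problem by decoding the headers before building the request. This is only a sketch of a caller-side workaround, not a proposed patch: text_headers is a hypothetical helper, the URL is a placeholder, and decoding with latin-1 is an assumption about what the caller wants.

```python
from urllib.request import Request


def text_headers(headers):
    """Return a copy of headers with bytes names and values decoded to str.

    Hypothetical helper for illustration; it is not part of urllib.
    """
    def to_text(item):
        # Assumption: latin-1 is an acceptable decoding for header bytes.
        return item.decode("latin-1") if isinstance(item, bytes) else item

    return {to_text(name): to_text(value) for name, value in headers.items()}


raw = {b"Content-Type": b"application/json"}
req = Request("http://example.com/", headers=text_headers(raw))

# With str keys the existing check finds the header, so no default is added.
found = req.has_header("Content-type")
value = req.get_header("Content-type")
print(found, value)
```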
msg290457 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-03-25 01:01
If you enable BytesWarning (python -b) you do get an error:

>>> urlopen(req, data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/urllib/request.py", line 162, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.5/urllib/request.py", line 463, in open
    req = meth(req)
  File "/usr/lib/python3.5/urllib/request.py", line 1171, in do_request_
    if not request.has_header('Content-type'):
  File "/usr/lib/python3.5/urllib/request.py", line 356, in has_header
    return (header_name in self.headers or
BytesWarning: Comparison between bytes and string

I believe the “urllib.request” module is only written with text (str) field names in mind, not byte strings. Same for http.client.HTTPConnection.request(headers=...). But the lower-level HTTPConnection.putheader method has special code to handle byte strings: <http://svn.python.org/view?view=revision&revision=58823>, although this is not documented either.
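
The difference at the lower level can be seen with a short sketch. HTTPConnection does not connect until data is actually sent, so nothing below touches the network. The host name is a placeholder, and _buffer is a private CPython attribute that is read here only to inspect the queued header lines; it is an implementation detail, not a documented API.

```python
from http.client import HTTPConnection

# Placeholder host; the constructor does not open a socket.
conn = HTTPConnection("example.com")

# Start a request; skip the automatic Host and Accept-Encoding headers
# so that only the headers added below end up in the buffer.
conn.putrequest("POST", "/", skip_host=True, skip_accept_encoding=True)

# putheader() accepts both bytes and str for the name and the value.
conn.putheader(b"Content-Type", b"application/json")
conn.putheader("X-Demo", "text")

# Private attribute, used only for inspection in this sketch.
queued = list(conn._buffer)
print(queued)
```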
History
Date                 User           Action  Args
2022-04-11 14:58:44  admin          set     github: 74077
2017-03-25 01:01:46  martin.panter  set     nosy: + martin.panter; messages: + msg290457
2017-03-23 22:17:07  maciej.szulik  set     nosy: + maciej.szulik
2017-03-23 22:03:56  ezio.melotti   create