Message401429
urllib.request.Request internally .capitalize()s header names before adding them, as can be seen here: https://github.com/python/cpython/blob/3.9/Lib/urllib/request.py#L399
Since HTTP header names are case-insensitive, but dict keys are not, this ensures that add_header and add_unredirected_header overwrite an existing header (as documented) even if the name was passed with different capitalisation. However, it also causes two problems:
1. has_header, get_header, and remove_header do not apply this normalisation to their header_name parameter, causing them to fail unexpectedly when the header is passed in the wrong case.
2. Some servers do not comply with the standard and check some headers case-sensitively. If the case they expect is different from the result of .capitalize(), those headers effectively cannot be passed to them via urllib.
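To make the first problem concrete, here is a minimal sketch of the behaviour described above. The URL and the header name X-API-KEY are only placeholders for illustration, and the outputs in the comments are what I would expect on CPython 3.9, where add_header stores key.capitalize(); other versions may differ.

```python
from urllib.request import Request

# Constructing a Request does not open a connection; the URL is a placeholder.
req = Request("http://example.com/")
req.add_header("X-API-KEY", "secret")

# add_header stored the name after .capitalize(), so the original case is gone.
print(req.headers)                   # {'X-api-key': 'secret'} on CPython 3.9

# The lookup methods compare the name as given, without normalising it.
print(req.has_header("X-API-KEY"))   # False
print(req.has_header("X-api-key"))   # True
print(req.get_header("X-API-KEY"))   # None

# remove_header silently does nothing when the case does not match.
req.remove_header("X-API-KEY")
print(req.has_header("X-api-key"))   # still True
```

The same stored name, X-api-key, is also what illustrates the second problem: a server that insists on the exact spelling X-API-KEY never sees that spelling in the Request object.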
These problems were already discussed quite some time ago, and yet they are still present:
https://bugs.python.org/issue2275
https://bugs.python.org/issue12455
Or did I overlook something and there is a good reason why things are this way?
If not, I suggest that add_header and add_unredirected_header store the headers in the case in which they were passed (while preserving the case-insensitive overwriting behaviour) and that has_header, get_header, and remove_header find headers regardless of case.
Here is a possible implementation:
# Helper outside class
# Stops after the first hit since there should be at most one of each header in the dict
def _find_key_insensitive(d, key):
    key = key.lower()
    for key2 in d:
        if key2.lower() == key:
            return key2
    return None  # Unnecessary, but explicit is better than implicit ;-)

# Methods of Request
# The "is not None" checks keep a falsy key such as "" from being mistaken for "not found"
def add_header(self, key, val):
    # useful for something like authentication
    existing_key = _find_key_insensitive(self.headers, key)
    if existing_key is not None:
        self.headers.pop(existing_key)
    self.headers[key] = val

def add_unredirected_header(self, key, val):
    # will not be added to a redirected request
    existing_key = _find_key_insensitive(self.unredirected_hdrs, key)
    if existing_key is not None:
        self.unredirected_hdrs.pop(existing_key)
    self.unredirected_hdrs[key] = val

def has_header(self, header_name):
    return (_find_key_insensitive(self.headers, header_name) is not None or
            _find_key_insensitive(self.unredirected_hdrs, header_name) is not None)

def get_header(self, header_name, default=None):
    key = _find_key_insensitive(self.headers, header_name)
    if key is not None:
        return self.headers[key]
    key = _find_key_insensitive(self.unredirected_hdrs, header_name)
    if key is not None:
        return self.unredirected_hdrs[key]
    return default

def remove_header(self, header_name):
    key = _find_key_insensitive(self.headers, header_name)
    if key is not None:
        self.headers.pop(key)
    key = _find_key_insensitive(self.unredirected_hdrs, header_name)
    if key is not None:
        self.unredirected_hdrs.pop(key)
I’m sorry if it is frowned upon to post code suggestions here like that; I didn’t have the confidence to create a pull request right away.
Date | User | Action | Args
2021-09-09 00:39:40 | emphoeller | set | recipients: + emphoeller
2021-09-09 00:39:40 | emphoeller | set | messageid: <1631147980.88.0.915702969887.issue45145@roundup.psfhosted.org>
2021-09-09 00:39:40 | emphoeller | link | issue45145 messages
2021-09-09 00:39:40 | emphoeller | create |