classification
Title: httplib sets an invalid "Host" request header when requesting an IPv6-format URL.
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder: httplib/http.client HTTPConnection._set_hostport() regression
View: 28539
Assigned To: Nosy List: martin.panter, prudvinit, visionwun, xtreak
Priority: normal Keywords:

Created on 2018-08-27 06:52 by visionwun, last changed 2018-09-04 06:24 by visionwun. This issue is now closed.

Messages (7)
msg324149 - (view) Author: chen wu (visionwun) Date: 2018-08-27 06:52
When I tried to request a URL like "https://[fc00:0a08::2]:35357", I got a 400 response.
The code is:
    import requests
    requests.get("https://[fc00:0a08::2]:35357", verify=False)
And the Apache log shows:
    vhost.c(889): [client fc00:ac1c::9a5:58692] AH00550: Client sent malformed Host header: [[fc00::0a08::2]]:35357
If the user does not set "Host" in the headers, httplib derives it from the URL and sets it.
The parser function is urllib3.util.url.parse_url. When the URL is "https://[fc00:0a08::2]:35357", the host it returns is [fc00:0a08::2].
httplib then sets the host in putrequest, where "[" and "]" are added around [fc00:0a08::2], which is not a valid host format.
The relevant code is:
    # Wrap the IPv6 Host Header with [] (RFC 2732)
    if host_enc.find(':') >= 0:
        host_enc = "[" + host_enc + "]"

Maybe the condition for wrapping the IPv6 Host header with [] is not strict enough?
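
The double wrapping can be reproduced without a network connection. This is a minimal sketch, not httplib's actual code: `wrap_host_header` is a hypothetical helper that copies only the quoted condition.

```python
def wrap_host_header(host_enc):
    """Mirror the quoted check: wrap any host containing ':' in brackets.

    Hypothetical stand-in for the three quoted lines, not httplib itself.
    """
    if host_enc.find(':') >= 0:
        host_enc = "[" + host_enc + "]"
    return host_enc


# A bare IPv6 literal is wrapped once.
bare = wrap_host_header("fc00:0a08::2")

# A host that already carries brackets is wrapped a second time.
doubled = wrap_host_header("[fc00:0a08::2]")

print(bare)     # [fc00:0a08::2]
print(doubled)  # [[fc00:0a08::2]]
```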
msg324190 - (view) Author: Prudvi RajKumar Maddala (prudvinit) * Date: 2018-08-27 18:12
I don't think the urllib3 module is part of the core Python library. It is a third-party library that is usually installed with pip.
msg324210 - (view) Author: chen wu (visionwun) Date: 2018-08-28 02:18
Thanks so much for your reply. 

When httplib.HTTPConnection is initialized with host [fc00:0a08::2] and port 35357, the request itself works normally; only the "Host" value set in the header is wrong. I think the simplest fix is to extend the condition, maybe like this:
    # Wrap the IPv6 Host Header with [] (RFC 2732)
    if host_enc.find(':') >= 0 and host_enc.find(']') < 0:
        host_enc = "[" + host_enc + "]"

Otherwise the rules should be documented, because when the port is not the default, only (host=[aaa:bbb]:123, port=None) and (host=aaa:bbb, port=123) are currently valid for httplib.

Sorry for my poor English. I hope you can understand what I'm saying. :)
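
The proposed condition can be tried in isolation. A minimal sketch, assuming a hypothetical helper named `wrap_host_header_guarded`; it illustrates the proposal only and does not describe what httplib does.

```python
def wrap_host_header_guarded(host_enc):
    """Wrap an IPv6 host in brackets unless it already contains ']'.

    Hypothetical helper applying the condition proposed in this message.
    """
    if host_enc.find(':') >= 0 and host_enc.find(']') < 0:
        host_enc = "[" + host_enc + "]"
    return host_enc


print(wrap_host_header_guarded("fc00:0a08::2"))    # [fc00:0a08::2]
print(wrap_host_header_guarded("[fc00:0a08::2]"))  # [fc00:0a08::2]
```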
msg324220 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2018-08-28 08:15
urllib3 seems to handle this case at https://github.com/urllib3/urllib3/blob/c41fa8c7ed8cb7315195dc15e67958754ea276d5/src/urllib3/util/url.py#L184 .

Test cases : https://github.com/urllib3/urllib3/blob/0f85e05af9ef2ded671a7b47506dfd24b32decf0/test/test_util.py#L80

Thanks
msg324222 - (view) Author: chen wu (visionwun) Date: 2018-08-28 08:39
Yes, I noticed that, but this function also returns the host with '[]'.

    # IPv6
    if url and url[0] == '[':
        host, url = url.split(']', 1)
        host += ']'

If url is [aaa:bbb]:123, then after this step host is [aaa:bbb] and url is ':123'.

When host=[aaa:bbb] is passed to httplib.HTTPConnection, its putrequest function puts 'Host:[[aaa:bbb]]:123' in the headers.
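
The split can be checked on its own. A minimal sketch, assuming a hypothetical helper `split_ipv6_host` that wraps only the quoted branch; the rest of urllib3's parser is not reproduced, and `url` is assumed to be the text after the scheme.

```python
def split_ipv6_host(url):
    """Mirror the quoted IPv6 branch and return (host, remainder).

    Hypothetical helper; host is None when url has no leading '['.
    """
    host = None
    if url and url[0] == '[':
        host, url = url.split(']', 1)
        host += ']'  # the closing bracket is put back, so brackets stay
    return host, url


host, rest = split_ipv6_host("[aaa:bbb]:123")
print(host)  # [aaa:bbb]
print(rest)  # :123
```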
msg324252 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2018-08-28 13:42
This sounds like a duplicate of Issue 28539. My understanding of that report is that Urllib3 half parses the URL by splitting out the port number, but returns a hostname with square brackets intact. Requests then passes the hostname (string with brackets) and port number (integer) to Python’s HTTPConnection constructor.

I think this is a bug in how Requests or “urllib3” is using Python’s HTTPConnection class. Requests should either leave the port number with the hostname in the string, or extract the raw hostname by removing the brackets.
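
The two options can be sketched with the standard library's URL splitter. This assumes Python 3, where `urlsplit` lives in `urllib.parse` (in Python 2.7 it is in the `urlparse` module); it is an illustration, not the code Requests or urllib3 uses.

```python
from urllib.parse import urlsplit

parts = urlsplit("https://[fc00:0a08::2]:35357")

# Option 1: keep host and port together in one string, brackets intact.
combined = parts.netloc

# Option 2: use the raw address (brackets removed) plus an integer port.
raw_host = parts.hostname
port = parts.port

print(combined)  # [fc00:0a08::2]:35357
print(raw_host)  # fc00:0a08::2
print(port)      # 35357
```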
msg324558 - (view) Author: chen wu (visionwun) Date: 2018-09-04 06:24
To fix this, we changed the code of our urllib3 copy: before passing the parameters to httplib, we set Host in the headers if the host is an IPv6 address.

Thanks so much.
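
The patch itself is not shown in this thread. The following is a sketch of one way such a Host value could be built, assuming Python 3 and a hypothetical helper `host_header_value`; the default port of 443 is also an assumption.

```python
import ipaddress


def host_header_value(host, port, default_port=443):
    """Return a Host value with exactly one pair of brackets for IPv6.

    Hypothetical helper; `host` may arrive with or without brackets.
    """
    raw = host[1:-1] if host.startswith("[") and host.endswith("]") else host
    try:
        is_ipv6 = ipaddress.ip_address(raw).version == 6
    except ValueError:
        is_ipv6 = False  # not an IP literal, for example a DNS name
    value = "[" + raw + "]" if is_ipv6 else raw
    if port is not None and port != default_port:
        value += ":" + str(port)
    return value


print(host_header_value("[fc00:0a08::2]", 35357))  # [fc00:0a08::2]:35357
```

The resulting string would then be supplied as an explicit "Host" entry in the request headers.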
History
Date                 User           Action  Args
2018-09-04 06:24:18  visionwun      set     status: pending -> closed; resolution: duplicate -> not a bug; messages: + msg324558; stage: resolved
2018-08-28 13:42:34  martin.panter  set     status: open -> pending; nosy: + martin.panter; messages: + msg324252; superseder: httplib/http.client HTTPConnection._set_hostport() regression; resolution: duplicate
2018-08-28 08:39:17  visionwun      set     messages: + msg324222
2018-08-28 08:15:15  xtreak         set     messages: + msg324220
2018-08-28 02:18:14  visionwun      set     messages: + msg324210
2018-08-27 18:12:43  prudvinit      set     nosy: + prudvinit; messages: + msg324190
2018-08-27 13:10:46  xtreak         set     nosy: + xtreak
2018-08-27 06:52:34  visionwun      create