Author martin.panter
Recipients gregory.p.smith, martin.panter, orange, serhiy.storchaka, vstinner, ware, xiang.zhang, xtreak
Date 2019-04-10.11:28:11
Message-id <1554895691.95.0.502607260147.issue30458@roundup.psfhosted.org>
Content
Gregory, I haven’t tried recent Python code, but I expect the problem with percent decoding is still there. If you did try my example, what results did you see? Be aware that these techniques only work if the OS co-operates and connects to localhost when you give it the longer host string. At the moment I have glibc 2.26 on x86-64 Linux.

In the Python 3 master branch, the percent-encoding should be decoded in “urllib.request.Request._parse”:

def _parse(self):
    ...
    self.host, self.selector = _splithost(rest)
    if self.host:
        self.host = unquote(self.host)

Then in “AbstractHTTPHandler.do_request_” the decoded host string becomes the “Host” header field value, without any encoding:

def do_request_(self, request):
    host = request.host
    ...
    sel_host = host
    ...
    if not request.has_header('Host'):
        request.add_unredirected_header('Host', sel_host)
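
To illustrate the two steps above without touching urllib's internals, here is a minimal sketch. The host string and the "X-Injected" header name are made up for the example, and the header line is built by plain string concatenation, which is an assumption standing in for what the HTTP layer does later. It only shows why an unencoded, percent-decoded host is a problem once it reaches a header line.

```python
from urllib.parse import unquote

# Hypothetical host component as it might appear in a crafted URL.
raw_host = "127.0.0.1%0D%0AX-Injected:%20yes"

# Same call that _parse() applies to the host: %0D%0A becomes CR LF.
decoded = unquote(raw_host)

# Assumption: the value is written into the request as-is.
header_line = "Host: " + decoded + "\r\n"

# Splitting on CRLF shows that one header field has become two.
lines = header_line.split("\r\n")
print(lines)
```

The printed list contains a separate "X-Injected: yes" entry after the "Host" entry, which is the header injection in question.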

Perhaps one fix that covers both my variant and Orange’s original report is to encode the “Host” header field value properly before it is sent. The same might apply to the “http.client” code.
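
As a rough sketch of what “encode properly” could look like, the helper below percent-encodes anything in the host that is not an unreserved character, a colon, or a square bracket. The name encode_host_header and the choice of safe characters are my assumptions for illustration, not existing urllib or http.client code, and non-ASCII host names would need separate IDNA handling that this does not attempt.

```python
from urllib.parse import quote

def encode_host_header(host):
    """Hypothetical helper: percent-encode unsafe characters in a host.

    Letters, digits and "-._~" are always left alone by quote(); ":" and
    "[]" are kept so that ports and IPv6 literals survive unchanged.
    """
    return quote(host, safe=":[]")

# A decoded host carrying CR LF is turned back into a harmless token.
print(encode_host_header("127.0.0.1\r\nX-Injected: yes"))

# Ordinary hosts pass through untouched.
print(encode_host_header("example.com:8080"))
print(encode_host_header("[::1]:80"))
```

An alternative would be to reject such hosts outright with an exception instead of re-encoding them; which behaviour is preferable is an open question here.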