classification
Title: httplib/http.client in method _tunnel used HTTP/1.0 CONNECT method
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: berker.peksag, demian.brecht, martin.panter, r.david.murray, vova
Priority: normal Keywords: patch

Created on 2014-10-23 06:21 by vova, last changed 2017-04-15 02:01 by berker.peksag.

Files
File name Uploaded Description Edit
py2.7.httplib.patch vova, 2014-10-23 06:21 pach for python 2.7 review
issue22708.patch demian.brecht, 2015-04-02 15:14 review
Pull Requests
URL Status Linked Edit
PR 742 open python-dev, 2017-03-20 18:42
Messages (6)
msg229856 - (view) Author: Vova (vova) * Date: 2014-10-23 06:21
At my workplace I have to use corporate Internet proxy server with AD/domain/ntlm authorization. I use local cntlm proxy server to authorize myself on corporate proxy. Programs are send requests to cntlm proxy without any authorization information. Cntlm proxy communicate with corporate proxy and handle all authorization stuff and return response to programs. 

But programs which use httplib, like pip, and want to open https url can't work in my network scheme. Because to open https connection httplib send to cntlm proxy 

"CONNECT encrypted.google.com:443 HTTP/1.0\r\n"

HTTP/1.0 does not assume persistent connection so corporate proxy return http response 407 (need authorization) and close connection. Cntlm proxy detect closed connection and return http response 407 to pip/httplib which can't handle this response or begin ntlm negotiation, throw exception 

ProxyError('Cannot connect to proxy.', error('Tunnel connection failed: 407 Proxy Authentication Required',))  

and close.

So I suggest change HTTP CONNECT method to 

"CONNECT %s:%d HTTP/1.1\r\n"

This change allow cntlm proxy keep alive connection to corporate proxy do all authorization stuff and return proper response. 

And also in header of httplib is stated what it is "HTTP/1.1 client library"
msg229876 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-10-23 15:04
See issue 21224 and issue 9740.  It's not 100% clear to me from reading those issues, but it *sounds* like the support is there, you just have to turn it on.  So I'm closing this issue as not a bug...either it already works (and you just have to set your code to use 1.1) or this is a duplicate of one of those two issues, and you should comment on whichever one is more appropriate.

Your point about PIP is a good one, though...that might be worth a specific issue for 1.1 support in pip, if it doesn't already have a way to switch it on.
msg229915 - (view) Author: Vova (vova) * Date: 2014-10-24 04:39
The issue http://bugs.python.org/issue21224 is about http server implementations. The issue http://bugs.python.org/issue9740 is more relevant for what I talking about, but not exactly.

Look, in this line 

https://hg.python.org/cpython/file/6a2f74811240/Lib/http/client.py#l786

http protocol version is setted and in this line 

https://hg.python.org/cpython/file/6a2f74811240/Lib/http/client.py#l1036

variable used to send http method (GET, PUT etc) and it work for direct connection or proxy with http connections. But if required to use CONNECT method through proxy (usually used for https connection) will be used _tunnel() method from http.client (py3k) or httplib (py2.7)

https://hg.python.org/cpython/file/6a2f74811240/Lib/http/client.py#l871

and here version off http is hardcoded to HTTP/1.0

PIP use urllib3, but urllib3 for actual network working (work with socket and send request) use httplib or http.client. So I think it would be better to make changes in httplib than override _tunnel() method in urllib3.

P.S. I'm not sure about rules how to open/close issues, so I open this issue again. I'm sorry if this causes some inconvenience.
msg239921 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-04-02 15:14
Thanks for the report and follow up Vova. I've come across this line and it puzzled me as well as to why it's hardcoded. Rather than the hardcoded approach in your patch, I've attached a patch that uses _http_vsn_str which is consistent with other areas of the library.

Additionally, I noticed that a failure will occur if the destination address contains characters outside of ASCII range, so I've added support for IDN.

I assume that the patch should be committed to tip and maintenance branches as both issues are bugs (there is IDN support elsewhere in the library).
msg239952 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-04-03 02:50
Patch looks good to me. I’m not an expert on non-ASCII domain names, but enabling HTTP 1.1 should be okay, at least for new versions of Python. CONNECT for HTTP 1.1 is described at <https://tools.ietf.org/html/rfc7231#section-4.3.6>.

Also, see the code review about removing an comment about HTTP 1.1 blessing.
msg241237 - (view) Author: Vova (vova) * Date: 2015-04-16 16:57
I don't know why I hardcoded version of HTTP, it was almost 6 months ago. Maybe because it just POC, or maybe because HTTP 1.0 is not allowed CONNECT method. Thank you for your patch and also for IDN support.
History
Date User Action Args
2017-04-15 02:01:52berker.peksagsetnosy: + berker.peksag

versions: + Python 3.7, - Python 3.4
2017-03-20 18:42:21python-devsetpull_requests: + pull_request655
2015-04-16 16:57:25vovasetmessages: + msg241237
2015-04-03 02:50:10martin.pantersetnosy: + martin.panter
messages: + msg239952
2015-04-02 15:14:57demian.brechtsetfiles: + issue22708.patch
versions: - Python 3.2, Python 3.3
nosy: + demian.brecht

messages: + msg239921

stage: resolved -> patch review
2014-10-24 04:39:32vovasetstatus: closed -> open
resolution: not a bug ->
messages: + msg229915
2014-10-23 15:04:56r.david.murraysetstatus: open -> closed

nosy: + r.david.murray
messages: + msg229876

resolution: not a bug
stage: resolved
2014-10-23 06:21:23vovacreate