classification
Title: urllib2 cannot handle https with proxy requiring auth
Type: Stage:
Components: Library (Lib) Versions: Python 2.7, Python 2.6
process
Status: open Resolution: accepted
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: b.a.scott, dieresys, mbeachy, orsenthil, ronaldoussoren, tsujikawa
Priority: normal Keywords: patch

Created on 2009-11-09 04:38 by tsujikawa, last changed 2011-02-21 11:19 by b.a.scott.

Files
File name Uploaded Description
https_proxy_auth.patch tsujikawa, 2009-11-09 06:56 patch to send Proxy-Authorization header in CONNECT method (https proxy)
urllib2_with_proxy_auth_comparison.py dieresys, 2009-12-23 22:16
2_7_x.patch mbeachy, 2011-02-19 17:23 2.7 maintenance branch patch
monkey_2_6_4.py mbeachy, 2011-02-19 17:25 2.6.4 monkey patch
urllib2_tests.tar.gz b.a.scott, 2011-02-21 11:09 Test code, results and instructions
http_proxy_https.patch b.a.scott, 2011-02-21 11:19 Fix handling of 407 and 401 in urllib2 and httplib
Messages (17)
msg95058 - Author: Tatsuhiro Tsujikawa (tsujikawa) Date: 2009-11-09 04:38
urllib2 cannot handle https through a proxy that requires authentication.

With https_proxy set correctly:

Python 2.6.4 (r264:75706, Oct 29 2009, 15:38:25)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> c=urllib2.urlopen("https://sourceforge.net")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/urllib2.py", line 124, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 389, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 407, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 367, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1154, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib/python2.6/urllib2.py", line 1121, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error Tunnel connection failed: 407 Proxy
Authentication Required>

This is because HTTPConnection::_tunnel() in httplib.py doesn't send
the Proxy-Authorization header.
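
A minimal sketch of the request the proxy expects, written in Python 3 rather than the 2.6 code discussed here. The helper names build_connect_request and basic_credentials are hypothetical, the HTTP/1.0 request line is an assumption, and this is not the httplib implementation; it only shows where the Proxy-Authorization header would sit in a CONNECT request.

```python
import base64


def basic_credentials(user, password):
    """Return the value of a Basic authentication header for user:password."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_connect_request(host, port, headers=None):
    """Build the raw bytes of a CONNECT request for host:port (illustrative)."""
    lines = [f"CONNECT {host}:{port} HTTP/1.0"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("latin-1")


# Placeholder credentials, matching the "user"/"pass" style used in this thread.
request = build_connect_request(
    "sourceforge.net",
    443,
    {"Proxy-Authorization": basic_credentials("user", "pass")},
)
print(request.decode("latin-1"))
```

Without the extra header, the request consists of the CONNECT line alone, which is the situation that leads to the 407 reply shown in the traceback.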
msg95060 - Author: Tatsuhiro Tsujikawa (tsujikawa) Date: 2009-11-09 06:56
I created a patch.
I added an additional argument, 'headers', to the
HTTPConnection::set_tunnel() method. It is a mapping of HTTP headers
to send with the CONNECT method. Since the authorization credential is
already set on the Request object, AbstractHTTPHandler::do_open()
passes the "Proxy-Authorization" header to set_tunnel() if it is found.

It works fine for me.
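
The idea can be sketched against Python 3's http.client, where HTTPConnection.set_tunnel(host, port=None, headers=None) accepts such a mapping. The helper split_tunnel_headers is hypothetical and only approximates what the patch does inside do_open(); the host names and credentials are placeholders.

```python
import http.client


def split_tunnel_headers(request_headers):
    """Separate Proxy-Authorization from the headers meant for the origin server."""
    remaining = dict(request_headers)
    tunnel_headers = {}
    if "Proxy-Authorization" in remaining:
        # The proxy needs this header on the CONNECT request, not on the
        # request that later travels through the tunnel.
        tunnel_headers["Proxy-Authorization"] = remaining.pop("Proxy-Authorization")
    return tunnel_headers, remaining


request_headers = {
    "Proxy-Authorization": "Basic dXNlcjpwYXNz",  # placeholder credentials
    "User-Agent": "example-client",
}
tunnel_headers, origin_headers = split_tunnel_headers(request_headers)

# The connection object targets the proxy; set_tunnel() names the real host.
# Nothing is sent over the network until a request is actually issued.
conn = http.client.HTTPSConnection("proxy-example.com", 3128)
conn.set_tunnel("sourceforge.net", 443, headers=tunnel_headers)
```

Keeping the proxy credentials out of origin_headers matters because those headers are sent to the destination server once the tunnel is established.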
msg95373 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2009-11-17 10:40
The patch looks good to me.

IMHO this should be backported to 2.6 as well.
msg95377 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2009-11-17 10:52
I've tested a backport of the patch to 2.6 (just replace set_tunnel with
_set_tunnel in the patch) and the resulting version of urllib2 can log in to
the proxy (as expected).

Thanks for the patch.
msg96659 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2009-12-20 06:05
Fixed and committed in revision 76908 on the trunk.
msg96660 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2009-12-20 07:22
Fixed through reversions r76908, r76909, r76910, r76911

Thanks for the patch, Tatsuhiro Tsujikawa.
msg96661 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2009-12-20 07:22
meant revisions.
msg96844 - Author: Manuel Muradás (dieresys) Date: 2009-12-23 22:11
Hi! The 2.6 backport is missing an argument in the _set_tunnel definition. It
should be:

    def _set_tunnel(self, host, port=None, headers=None):
msg96845 - Author: Manuel Muradás (dieresys) Date: 2009-12-23 22:16
The patch only fixes the case where you pass the authentication info in the
proxy handler's URL, like:

    proxy_handler = urllib2.ProxyHandler({'https': 'http://user:pass@proxy-example.com:3128/'})

But setting the authentication using a ProxyBasicAuthHandler is still
broken:

    proxy_auth_handler = urllib2.ProxyBasicAuthHandler()
    proxy_auth_handler.add_password('realm', 'proxy-example.com:3128', 'user', 'pass')

In the attached file (urllib2_with_proxy_auth_comparison.py) we've written
a comparison between what works with HTTP and HTTPS.

The problem is that the 407 error never reaches the ProxyBasicAuthHandler
because HTTPConnection._tunnel raises an exception when the HTTP
response status code is not 200.
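
A simplified sketch of that behaviour, in Python 3 syntax. check_tunnel_reply is a hypothetical stand-in and not the real _tunnel code; the error text follows the traceback in the first message. Once the reply is reduced to an exception message, the proxy's challenge headers are no longer available to any handler.

```python
def check_tunnel_reply(status_line):
    """Raise if the proxy's reply to CONNECT is anything other than 200."""
    _version, code, reason = status_line.strip().split(" ", 2)
    if int(code) != 200:
        # Only a text message survives; the response headers (including the
        # Proxy-Authenticate challenge) never reach the caller.
        raise OSError("Tunnel connection failed: %s %s" % (code, reason))


try:
    check_tunnel_reply("HTTP/1.0 407 Proxy Authentication Required")
except OSError as exc:
    error_text = str(exc)
    print(error_text)
```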
msg96846 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2009-12-24 00:53
Thanks for the note, Manuel. Fixed it in revision 77013.
msg100840 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-03-11 09:31
In this ticket, setting the authentication using a ProxyBasicAuthHandler is not yet addressed (this was reported in the last note). Reopening this one to track it.
msg103090 - Author: Mike Beachy (mbeachy) Date: 2010-04-13 23:11
I have worked up a monkey patch for urllib2/httplib for the issue of setting the authentication using a Proxy(Basic|Digest)AuthHandler.

The basic approach was to create a new httplib exception (ProxyTunnelError) and raise that with the http response attached so that the HTTPSHandler can determine when 407 Proxy authentication required is present, and then reroute the urllib2.OpenerDirector to error handling mode.

Unfortunately, there is a backwards compatibility issue: cases where a socket.error was previously being raised now get a ProxyTunnelError. Not that you could do much useful with the socket.error in the first place, but I suppose you could look for '407' in the text. Ugh.

If you think this might prove useful, let me know and I can rework it into a real patch - just let me know what branch/version to base it off of. (My monkey patch is for python 2.6.4.)
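
A rough sketch of the approach described above, in Python 3 syntax. Only the name ProxyTunnelError comes from this message; its attributes, the open_tunnel and open_with_auth_retry helpers, and the tuple standing in for a parsed response are all assumptions and do not reproduce the attached monkey patch.

```python
class ProxyTunnelError(Exception):
    """Hypothetical exception that keeps the proxy's reply to CONNECT."""

    def __init__(self, code, reason, headers):
        super().__init__("Tunnel connection failed: %d %s" % (code, reason))
        self.code = code
        self.reason = reason
        self.headers = headers


def open_tunnel(reply):
    """reply is a (code, reason, headers) tuple standing in for a parsed response."""
    code, reason, headers = reply
    if code != 200:
        raise ProxyTunnelError(code, reason, headers)
    return "tunnel established"


def open_with_auth_retry(reply):
    """Route a 407 into error handling instead of letting it escape as text."""
    try:
        return open_tunnel(reply)
    except ProxyTunnelError as exc:
        if exc.code == 407:
            # A real handler would build Proxy-Authorization from this
            # challenge and retry the CONNECT request.
            challenge = exc.headers.get("Proxy-Authenticate", "")
            return "retry with credentials for: " + challenge
        raise
```

Because the exception carries the status code and headers, the caller no longer has to search the error text for '407'.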
msg128861 - Author: Mike Beachy (mbeachy) Date: 2011-02-19 17:23
I've been in contact w/ Barry Scott offline re: the monkey patch previously mentioned. I'm attaching a 2.7 maintenance branch patch that he has needed to extend, and plans to follow up on.
msg128862 - Author: Mike Beachy (mbeachy) Date: 2011-02-19 17:25
Attached to this comment (can you attach multiple files at once?) is the somewhat moldy 2.6.4 monkey patch, mercilessly ripped from my own code and probably not good for much.
msg128952 - Author: Barry Scott (b.a.scott) Date: 2011-02-21 10:41
The attached patch builds on Mike's work.

The core of the problem is that the Request object
did not know what was going on. This means that it
was not possible for get_authorization() to work
for proxy-auth and www-auth.

I changed Request to know which of the four types of
connection it represents. There are new methods on
Request that return the right information based on
the connection type.

To understand how to make this work I needed to
instrument the code. There is now a set_debuglevel
on the OpenerDirector object that turns on debug in
all the handlers and the director. I have added
more debug messages to help understand this code.

This code now passes the 72 test cases I run. I'll
attach the code I used to test as a follow up to this.
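
The proxy-auth and www-auth cases mentioned above differ in both the status code and the header pair involved. A small sketch of that mapping follows; the names AUTH_HEADERS and auth_header_names are hypothetical and are not taken from the attached patch.

```python
# Status code -> (challenge header sent by the server or proxy,
#                 credentials header sent back by the client)
AUTH_HEADERS = {
    401: ("WWW-Authenticate", "Authorization"),
    407: ("Proxy-Authenticate", "Proxy-Authorization"),
}


def auth_header_names(status_code):
    """Return (challenge header, credentials header) for an auth status code."""
    try:
        return AUTH_HEADERS[status_code]
    except KeyError:
        raise ValueError("not an authentication status: %r" % status_code)
```

A request sent through an authenticating proxy to an authenticating server can meet both codes in turn, so the two header pairs have to be tracked separately.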
msg128953 - Author: Barry Scott (b.a.scott) Date: 2011-02-21 11:09
Attached is the code I used to test these changes.
See the README.txt file for details, including
the results of a test run.
msg128955 - Author: Barry Scott (b.a.scott) Date: 2011-02-21 11:19
I left out some whitespace changes to match the style
of the std lib code. Reposting with whitespace cleanup.
History
Date User Action Args
2011-02-21 11:19:31 b.a.scott set files: - http_proxy_https.patch
nosy: ronaldoussoren, orsenthil, mbeachy, dieresys, tsujikawa, b.a.scott
2011-02-21 11:19:23 b.a.scott set files: + http_proxy_https.patch
nosy: ronaldoussoren, orsenthil, mbeachy, dieresys, tsujikawa, b.a.scott
messages: + msg128955
2011-02-21 11:09:42 b.a.scott set files: + urllib2_tests.tar.gz
nosy: ronaldoussoren, orsenthil, mbeachy, dieresys, tsujikawa, b.a.scott
messages: + msg128953
2011-02-21 10:41:27 b.a.scott set files: + http_proxy_https.patch
nosy: + b.a.scott
messages: + msg128952
2011-02-19 17:25:17 mbeachy set files: + monkey_2_6_4.py
nosy: ronaldoussoren, orsenthil, mbeachy, dieresys, tsujikawa
messages: + msg128862
2011-02-19 17:23:26 mbeachy set files: + 2_7_x.patch
nosy: ronaldoussoren, orsenthil, mbeachy, dieresys, tsujikawa
messages: + msg128861
2010-04-13 23:11:40 mbeachy set nosy: + mbeachy
messages: + msg103090
2010-03-11 09:31:44 orsenthil set status: closed -> open
resolution: fixed -> accepted
messages: + msg100840
2010-02-22 16:13:43 flox link issue7986 superseder
2009-12-24 00:53:30 orsenthil set messages: + msg96846
2009-12-23 22:16:00 dieresys set files: + urllib2_with_proxy_auth_comparison.py
messages: + msg96845
2009-12-23 22:11:04 dieresys set nosy: + dieresys
messages: + msg96844
2009-12-20 07:22:49 orsenthil set messages: + msg96661
2009-12-20 07:22:10 orsenthil set status: open -> closed
messages: + msg96660
2009-12-20 06:06:00 orsenthil set keywords: - needs review
resolution: accepted -> fixed
messages: + msg96659
2009-11-17 10:52:27 ronaldoussoren set messages: + msg95377
2009-11-17 10:40:36 ronaldoussoren set keywords: + needs review
nosy: + ronaldoussoren
messages: + msg95373
2009-11-15 09:21:35 orsenthil set assignee: orsenthil
resolution: accepted
nosy: + orsenthil
2009-11-11 01:06:09 tsujikawa set versions: + Python 2.7
2009-11-09 06:56:13 tsujikawa set files: + https_proxy_auth.patch
keywords: + patch
messages: + msg95060
2009-11-09 04:38:16 tsujikawa create