
classification
Title: urllib2 cannot handle https and BasicAuth via Proxy.
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: masato kawamura, orsenthil
Priority: normal Keywords:

Created on 2013-03-22 11:18 by masato kawamura, last changed 2022-04-11 14:57 by admin.

Messages (2)
msg184958 - Author: masato (masato kawamura) Date: 2013-03-22 11:18
When the urllib2 module is used, HTTPS connections made through a proxy to servers that require Basic authentication cannot be established properly.

Sample code:
import urllib2

def main():
    url = "https://example.com/aplication/rpc"

    # Route both http and https requests through the same proxy.
    proxy_support = urllib2.ProxyHandler({'http': 'http://proxy.server.com:8080/',
                                          'https': 'http://proxy.server.com:8080/'})

    # Credentials for Basic authentication at the origin server (not the proxy).
    pass_mngr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    pass_mngr.add_password(None, url, 'user', 'password')
    auth_handler = urllib2.HTTPBasicAuthHandler(pass_mngr)

    opener = urllib2.build_opener(proxy_support, auth_handler)
    req = urllib2.Request(url)
    f = opener.open(req)  # this call raises the exception
    print f.read()

if __name__ == '__main__':
    main()


"opener.open" method throws an exception.

I found a similar case at http://bugs.python.org/issue7291. However, that issue concerns authentication at the proxy server. In my case, it is the origin server that requires Basic authentication.

When a proxy is used, the origin server address is not placed in the Request-URI of an HTTPS request, because the request is encrypted with SSL as it passes through the proxy.

urllib2 handles the first outgoing request and the incoming response correctly. However, after that first response (a 401 status code), urllib2 sends a wrong request whose Request-URI includes the origin server URL.
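To illustrate the difference, here is a rough sketch of the request lines involved. The helper names are hypothetical, port 443 is assumed as the default HTTPS port, and the HTTP versions shown are illustrative; the exact bytes that urllib2 puts on the wire are not quoted in this report.

```python
def connect_line(tunnel_host, port=443):
    # Request sent to the proxy to open the tunnel (port 443 is an assumption).
    return "CONNECT %s:%d HTTP/1.0" % (tunnel_host, port)


def request_line(method, selector):
    # Request line sent inside the tunnel to the origin server.
    return "%s %s HTTP/1.1" % (method, selector)


tunnel = connect_line("example.com")

# Expected inside the tunnel: only the path appears in the request line.
expected = request_line("GET", "/aplication/rpc")

# What the retry after the 401 looks like according to this report:
# the full origin server URL ends up in the request line.
observed_on_retry = request_line("GET", "https://example.com/aplication/rpc")

print(tunnel)
print(expected)
print(observed_on_retry)
```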

I suggest modifying the set_proxy method of the Request class.

Original code:
def set_proxy(self, host, type):
        if self.type == 'https' and not self._tunnel_host:
            self._tunnel_host = self.host
        else:
            self.type = type
            self.__r_host = self.__original

        self.host = host

Because _tunnel_host is already set by the time the second outgoing request is sent, the condition "self.type == 'https' and not self._tunnel_host" is false. The else branch therefore runs and resets __r_host to the full original URL.
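The effect of calling the original set_proxy twice on the same request can be sketched with a small stand-in class. RequestSketch is hypothetical and keeps only the fields that set_proxy touches; it is not the real urllib2.Request. The sketch also assumes that the retry after the 401 reuses the same request object and passes 'http' as the proxy type.

```python
class RequestSketch(object):
    """Hypothetical stand-in for urllib2.Request (not the real class)."""

    def __init__(self, scheme, host, path):
        self.type = scheme
        self.host = host
        self._tunnel_host = None
        self.original = "%s://%s%s" % (scheme, host, path)  # full URL
        self.selector = path  # plays the role of __r_host

    def set_proxy(self, host, type):
        # Same branching as the original code quoted above.
        if self.type == 'https' and not self._tunnel_host:
            self._tunnel_host = self.host
        else:
            self.type = type
            self.selector = self.original
        self.host = host


req = RequestSketch('https', 'example.com', '/aplication/rpc')

# First pass: the tunnel host is recorded and the selector stays a path.
req.set_proxy('proxy.server.com:8080', 'http')
first = (req._tunnel_host, req.selector)

# Second pass (the retry): _tunnel_host is already set, so the else
# branch runs and the selector becomes the full origin server URL.
req.set_proxy('proxy.server.com:8080', 'http')
second = (req._tunnel_host, req.selector)

print(first)
print(second)
```

In this sketch the selector is a plain path after the first call and the full URL after the second, which matches the wrong request described above.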

Code suggested:
def set_proxy(self, host, type):
        if self.type == 'https' and not self._tunnel_host:
            self._tunnel_host = self.host
        else:
            self.type = type
            if self.type != 'https':
                self.__r_host = self.__original

        self.host = host


regards.
msg228408 - Author: Mark Lawrence (BreamoreBoy) Date: 2014-10-03 22:51
Slipped under the radar?  Note the patch suggested in msg184958.
History
Date                 User              Action  Args
2022-04-11 14:57:43  admin             set     github: 61720
2018-07-11 10:16:06  BreamoreBoy       set     nosy: - BreamoreBoy
2018-07-11 07:37:37  serhiy.storchaka  set     type: crash -> behavior
2014-10-03 22:51:09  BreamoreBoy       set     nosy: + BreamoreBoy; messages: + msg228408
2013-11-10 23:08:13  ned.deily         set     nosy: + orsenthil
2013-03-22 11:18:20  masato kawamura   create