Author leotan
Recipients leotan
Date 2015-02-24.21:57:58
Message-id <1424815080.16.0.743473330916.issue23516@psf.upfronthosting.co.za>
In-reply-to
Content
I was running pip install with the --proxy switch to authenticate to a proxy server with user "user" and password "pass?word" when I noticed that it fails. It seems to fail whenever the password contains certain special characters, e.g., ? and #.

Here's the exception I saw:

  Exception:
  Traceback (most recent call last):
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/basecommand.py", line 232, in main
      status = self.run(options, args)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/commands/install.py", line 339, in run
      requirement_set.prepare_files(finder)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/req/req_set.py", line 333, in prepare_files
      upgrade=self.upgrade,
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/index.py", line 305, in find_requirement
      page = self._get_page(main_index_url, req)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/index.py", line 783, in _get_page
      return HTMLPage.get_page(link, req, session=self.session)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/index.py", line 872, in get_page
      "Cache-Control": "max-age=600",
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/_vendor/requests/sessions.py", line 473, in get
      return self.request('GET', url, **kwargs)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/download.py", line 365, in request
      return super(PipSession, self).request(method, url, *args, **kwargs)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/_vendor/requests/sessions.py", line 461, in request
      resp = self.send(prep, **send_kwargs)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/_vendor/requests/sessions.py", line 573, in send
      r = adapter.send(request, **kwargs)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/_vendor/cachecontrol/adapter.py", line 43, in send
      resp = super(CacheControlAdapter, self).send(request, **kw)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/_vendor/requests/adapters.py", line 337, in send
      conn = self.get_connection(request.url, proxies)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/_vendor/requests/adapters.py", line 245, in get_connection
      proxy_manager = self.proxy_manager_for(proxy)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/_vendor/requests/adapters.py", line 155, in proxy_manager_for
      **proxy_kwargs)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/_vendor/requests/packages/urllib3/poolmanager.py", line 265, in proxy_from_url
      return ProxyManager(proxy_url=url, **kw)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/_vendor/requests/packages/urllib3/poolmanager.py", line 210, in __init__
      proxy = parse_url(proxy_url)
    File "/usr/local/lib/python3.3/site-packages/pip-6.0.8-py3.3.egg/pip/_vendor/requests/packages/urllib3/util/url.py", line 185, in parse_url
      raise LocationParseError(url)
  pip._vendor.requests.packages.urllib3.exceptions.LocationParseError: Failed to parse: user:pass

AFAICT the problem lies in the parse_url() function in url.py, which assumes that neither a ? nor a # can appear between the :// and the next /. This does not hold, because a URL can include a username and a password right there, as in http://user:pass?word@host/path. Here's the offending piece of code:

    if '://' in url:
        scheme, url = url.split('://', 1)

    # Find the earliest Authority Terminator
    # (http://tools.ietf.org/html/rfc3986#section-3.2)
    url, path_, delim = split_first(url, ['/', '?', '#'])


It's funny that this snippet violates precisely the specification cited in that comment (RFC 3986, section 3.2), which clearly states that the authority can contain a userinfo field:

     authority   = [ userinfo "@" ] host [ ":" port ]
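
To make the failure concrete, here is a minimal sketch that reproduces the split outside of pip. The split_first() below is a simplified stand-in written for this illustration, not the actual urllib3 helper, and the final port check is only an assumption about why the parser gives up on "user:pass".

```python
def split_first(s, delims):
    """Simplified stand-in: split s at the earliest of the given delimiters."""
    hits = [(s.find(d), d) for d in delims if d in s]
    if not hits:
        return s, "", None
    idx, delim = min(hits)  # earliest position wins
    return s[:idx], s[idx + 1:], delim


url = "http://user:pass?word@host/path"
scheme, rest = url.split("://", 1)

# The "?" inside the password is found before the "/" that follows the host,
# so the authority is cut short in the middle of the userinfo.
authority, remainder, delim = split_first(rest, ["/", "?", "#"])
print(authority)   # user:pass
print(remainder)   # word@host/path

# What is left looks like "host:port" with a non-numeric port, which would be
# consistent with the "Failed to parse: user:pass" message (assumption).
host, _, port = authority.partition(":")
print(port.isdigit())  # False
```

The truncated authority matches the string shown in the LocationParseError above.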

For some reason, urlencoding the password did not help either; the error message did not change.
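
For reference, this is roughly what the percent-encoded form looks like when built with the standard library. The userinfo grammar in RFC 3986 does not allow a literal ? or #, so they would normally be written as %3F and %23. The host name "host" and port 8080 are placeholders, and this sketch only shows that urllib.parse round-trips the encoded form; it says nothing about whether pip 6.0.8 accepts it, which per the above it apparently did not.

```python
from urllib.parse import quote, unquote, urlsplit

user, password = "user", "pass?word"

# safe="" makes quote() escape every reserved character, including "?" and "#".
proxy_url = "http://{}:{}@host:8080".format(
    quote(user, safe=""), quote(password, safe="")
)
print(proxy_url)  # http://user:pass%3Fword@host:8080

# urlsplit() keeps the userinfo together; the password comes back still
# encoded, so it has to be unquoted by the caller.
parts = urlsplit(proxy_url)
print(parts.hostname, parts.port)  # host 8080
print(unquote(parts.password))     # pass?word
```
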
History
Date                 User    Action  Args
2015-02-24 21:58:00  leotan  set     recipients: + leotan
2015-02-24 21:58:00  leotan  set     messageid: <1424815080.16.0.743473330916.issue23516@psf.upfronthosting.co.za>
2015-02-24 21:58:00  leotan  link    issue23516 messages
2015-02-24 21:57:58  leotan  create