Issue91

Title: setuptools fails with non-ascii urls in pypi
Priority: bug
Status: chatting
Superseder:
Nosy List: pje, srid
Assigned To:
Keywords:

Created on 2009-11-06.14:24:43 by srid, last changed 2009-11-09.23:47:30 by srid.

Messages
msg460 Author: srid Date: 2009-11-09.23:47:29
Using str() on `safe_name` or on the parameters to Requirement.parse did not
resolve this issue for me.

I don't have time to look deeper into this issue, but my guess is that it is
the download URL that has non-ascii chars. The (non-ascii) URL is handled
internally by setuptools, so I cannot find a workaround on my side (except
catching and ignoring UnicodeDecodeError).
msg456 Author: srid Date: 2009-11-06.21:22:54
Ah I see. So using byte strings fixes the problem. 

> Do you have any suggestions for how you'd like to see this fixed?

I am happy to change my code to use bytes, otherwise:

The only solution I can think of is to have Requirement/Distribution (or
whatever code path relies on bytes) explicitly encode unicode strings.
Something like "if type(s) is unicode: s = s.encode('utf-8')". Or at least fail
on receiving unicode strings, especially in functions that accept package
names or requirement strings as parameters.
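
The explicit-encoding idea above could look roughly like the following. This is
only a sketch: `to_bytes` is a hypothetical helper name, not something that
exists in setuptools, and UTF-8 is an assumed encoding. It is written so that it
runs on both Python 2 and Python 3.

```python
# Hypothetical helper, not part of setuptools or pkg_resources.
text_type = type(u'')  # `unicode` on Python 2, `str` on Python 3


def to_bytes(s, encoding='utf-8'):
    """Return s as a byte string, encoding it first if it is a text string."""
    if isinstance(s, text_type):
        return s.encode(encoding)
    if isinstance(s, bytes):
        return s
    # Fail loudly on anything else instead of letting it leak further down.
    raise TypeError('expected a text or byte string, got %r' % type(s))


encoded = to_bytes(u'fl\xfcgelform')
```

Rejecting unexpected types up front corresponds to the "at least fail" variant
of the suggestion.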
msg455 Author: pje Date: 2009-11-06.20:18:32
So, you can only generate this programmatically, using unicode. Wrapping the
safe_name call in a str() call fixes the problem.

It's likely that the reason Distribute doesn't have this problem is that it has
changes for Python 3.  It's probably the case that Requirement and Distribution
objects should force their project names (among other elements) to be byte
strings under Python 2.x.

Do you have any suggestions for how you'd like to see this fixed?
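
As background for the str() fix: to my understanding, safe_name is essentially a
regular-expression substitution that replaces runs of characters other than
ASCII letters, digits and '.' with '-'. The sketch below re-creates that
behaviour under a hypothetical name (`safe_name_sketch`; the pattern is an
assumption about pkg_resources, not copied from it) to show that a unicode
input yields a unicode result.

```python
import re


def safe_name_sketch(name):
    """Approximation of safe_name: collapse disallowed characters to '-'."""
    # re.sub returns the same string type it was given, so a unicode
    # input produces a unicode result on Python 2.
    return re.sub('[^A-Za-z0-9.]+', '-', name)


result = safe_name_sketch(u'fl\xfcgelform')
```

On Python 2 the result is still a unicode object even though it now contains
only ASCII characters, which is why an explicit str() is needed to get a byte
string.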
msg454 Author: srid Date: 2009-11-06.17:48:03
Note that Distribute does not have this bug:

http://paste.pocoo.org/show/149058/
$ e/bin/python bug.py
Using setuptools version 0.6 /home/sridharr/as/pypm/e/lib/python2.6/site-packages/distribute-0.6.8-py2.6.egg/setuptools/__init__.pyc
Download error: ascii /simple/flügelform 10 11 ordinal not in range(128) -- Some packages may not be found!
Download error: ascii /simple/flügelform 10 11 ordinal not in range(128) -- Some packages may not be found!
Couldn't find index page for u'fl-gelform' (maybe misspelled?)
No local packages or download links found for a source distribution of fl-gelform
$
msg453 Author: srid Date: 2009-11-06.17:41:51
The following code reproduces the bug for me (in Python 2.6.4 / 
setuptools-0.6c11)

import pkg_resources
import setuptools
from setuptools.package_index import PackageIndex

assert setuptools.__version__ == '0.6c11'

if __name__ == '__main__':
    pi = PackageIndex()
    # safe_name() is given a unicode string, so the requirement name stays unicode.
    name = pkg_resources.safe_name(u'fl\xfcgelform')
    r = pkg_resources.Requirement.parse(name)
    pi.fetch_distribution(r, '/tmp', source=True)
msg452 Author: pje Date: 2009-11-06.16:28:25
I can't currently reproduce this with flügelform - it has no links at the moment.

It seems like there must be a unicode string somewhere involved here, since
otherwise the addition of two bytestrings wouldn't cause a problem.  Are you by
any chance passing in a unicode string?  I can't think of any place in
setuptools that would introduce unicode into the process otherwise.
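
The failure mode described here can be shown in isolation. In Python 2,
concatenating a unicode string with a byte string implicitly decodes the bytes
as ASCII; the sketch below performs that decoding step explicitly so that it
behaves the same on Python 2 and 3. The path is an assumption: it is the
/simple/flügelform path seen elsewhere in this thread, encoded as UTF-8.

```python
# Assumed redirect target: the index path as UTF-8 bytes.
redirect_path = b'/simple/fl\xc3\xbcgelform'

try:
    # Python 2 does this step implicitly when the other operand is unicode.
    redirect_path.decode('ascii')
except UnicodeDecodeError as exc:
    error = exc

# Offset of the first byte that is not valid ASCII.
failed_at = error.start
```

The offset of the first non-ASCII byte is 10, which matches the "position 10"
in the reported UnicodeDecodeError.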
msg451 Author: srid Date: 2009-11-06.14:35:49
Two packages in pypi contain non-ascii chars.

In [6]: [x for x in pypipackages if not re.match(r'^[A-Za-z0-9 -_]+$', x)]
Out[6]: [u'fl\xfcgelform', u'Manual de Py2Exe en Espa\xf1ol']
msg450 Author: srid Date: 2009-11-06.14:24:42
Calling `fetch_distribution` with req = "fl-gelform" and source = True results 
in the following traceback.

    sdist = pi.fetch_distribution(req, directory, source=True)
  File "/home/sridharr/.local/lib/python2.6/site-packages/setuptools/package_index.py", line 465, in fetch_distribution
    self.find_packages(requirement)
  File "/home/sridharr/.local/lib/python2.6/site-packages/setuptools/package_index.py", line 303, in find_packages
    self.scan_url(self.index_url + requirement.unsafe_name+'/')
  File "/home/sridharr/.local/lib/python2.6/site-packages/setuptools/package_index.py", line 617, in scan_url
    self.process_url(url, True)
  File "/home/sridharr/.local/lib/python2.6/site-packages/setuptools/package_index.py", line 190, in process_url
    f = self.open_url(url, "Download error: %s -- Some packages may not be found!")
  File "/home/sridharr/.local/lib/python2.6/site-packages/setuptools/package_index.py", line 578, in open_url
    return open_with_auth(url)
  File "/home/sridharr/.local/lib/python2.6/site-packages/setuptools/package_index.py", line 717, in open_with_auth
    fp = urllib2.urlopen(request)
  File "/opt/ActivePython-2.6/lib/python2.6/urllib2.py", line 124, in urlopen
    return _opener.open(url, data, timeout)
  File "/opt/ActivePython-2.6/lib/python2.6/urllib2.py", line 395, in open
    response = meth(req, response)
  File "/opt/ActivePython-2.6/lib/python2.6/urllib2.py", line 508, in http_response
    'http', request, response, code, msg, hdrs)
  File "/opt/ActivePython-2.6/lib/python2.6/urllib2.py", line 427, in error
    result = self._call_chain(*args)
  File "/opt/ActivePython-2.6/lib/python2.6/urllib2.py", line 367, in _call_chain
    result = func(*args)
  File "/opt/ActivePython-2.6/lib/python2.6/urllib2.py", line 577, in http_error_302
    newurl = urlparse.urljoin(req.get_full_url(), newurl)
  File "/opt/ActivePython-2.6/lib/python2.6/urlparse.py", line 219, in urljoin
    params, query, fragment))
  File "/opt/ActivePython-2.6/lib/python2.6/urlparse.py", line 184, in urlunparse
    return urlunsplit((scheme, netloc, url, query, fragment))
  File "/opt/ActivePython-2.6/lib/python2.6/urlparse.py", line 190, in urlunsplit
    url = '//' + (netloc or '') + url
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 10: ordinal not in range(128)
History
Date User Action Args
2009-11-09 23:47:30 srid set messages: + msg460
2009-11-06 21:22:55 srid set messages: + msg456
2009-11-06 20:18:32 pje set messages: + msg455
2009-11-06 17:48:04 srid set messages: + msg454
2009-11-06 17:41:51 srid set messages: + msg453
2009-11-06 16:28:25 pje set status: unread -> chatting, nosy: + pje, messages: + msg452
2009-11-06 14:35:53 srid set status: chatting -> unread
2009-11-06 14:35:49 srid set status: unread -> chatting, messages: + msg451
2009-11-06 14:24:43 srid create