I first stumbled across this bug while attempting to install a package using pip's cool editable mode:
$ pip install -e git+git://github.com/appliedsec/pygeoip.git#egg=pygeoip
Obtaining pygeoip from git+git://github.com/appliedsec/pygeoip.git#egg=pygeoip
Cloning git://github.com/appliedsec/pygeoip.git to ./src/pygeoip
Running setup.py egg_info for package pygeoip
Traceback (most recent call last):
File "<string>", line 16, in <module>
File "/home/curtis/python/3.3.3/lib/python3.3/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1098: ordinal not in range(128)
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 16, in <module>
File "/home/curtis/python/3.3.3/lib/python3.3/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1098: ordinal not in range(128)
----------------------------------------
Cleaning up...
Command python setup.py egg_info failed with error code 1 in /home/curtis/python/2013-11-20/src/pygeoip
Storing complete log in /home/curtis/.pip/pip.log
It turns out this is related to a local LANG=C environment: with that setting, python3's open() falls back to ASCII when no encoding is given. If I set LANG=en_US.UTF-8, the problem goes away. But it seems pip/python3 open() should be handling this more intelligently.
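A minimal sketch of the mechanism, assuming that open() without an encoding= argument uses locale.getpreferredencoding(False), which is ASCII under LANG=C on this Python 3.3 build. To keep it reproducible in any locale, the ASCII case is simulated by passing encoding="ascii" explicitly. The sample text and the helper names (write_sample, read_with, demo) are made up for illustration and are not taken from pygeoip or pip.

```python
import locale
import os
import tempfile

# Hypothetical stand-in for a setup.py containing one non-ASCII character.
# U+00E9 encodes to the UTF-8 bytes 0xc3 0xa9, the same lead byte as in the traceback.
SAMPLE = "# -*- coding: utf-8 -*-\nauthor = 'Jos\u00e9'\n"


def write_sample():
    """Write SAMPLE to a temporary file as UTF-8 bytes and return its path."""
    fd, path = tempfile.mkstemp(suffix=".py")
    with os.fdopen(fd, "wb") as f:
        f.write(SAMPLE.encode("utf-8"))
    return path


def read_with(path, encoding):
    """Read the whole file as text using an explicit encoding."""
    with open(path, encoding=encoding) as f:
        return f.read()


def demo():
    """Return (ascii_failed, utf8_text) for the sample file."""
    path = write_sample()
    try:
        try:
            # Simulates what a bare open() does when the locale encoding is ASCII.
            read_with(path, "ascii")
            ascii_failed = False
        except UnicodeDecodeError:
            ascii_failed = True
        utf8_text = read_with(path, "utf-8")
    finally:
        os.remove(path)
    return ascii_failed, utf8_text


if __name__ == "__main__":
    # What a bare open() would use in the current environment.
    print("preferred encoding:", locale.getpreferredencoding(False))
    print(demo())
```

Running it once with LANG=C and once with LANG=en_US.UTF-8 should show the preferred encoding changing, while the explicit encoding="utf-8" read succeeds either way.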
Worse, the file in this case, https://github.com/appliedsec/pygeoip/blob/master/setup.py, already has a source encoding declaration (the coding comment at the top) *declaring* it as utf-8.
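For comparison, the standard library can already read a source file according to its own declaration: tokenize.open() (available since Python 3.2) detects the encoding from a BOM or coding comment and ignores the locale. The sketch below is only an illustration of that call, not what pip does; the latin-1 sample and the read_source name are assumptions chosen so that the declaration, rather than a UTF-8 default, visibly decides the result.

```python
import os
import tempfile
import tokenize

# Hypothetical source text that declares latin-1 rather than utf-8.
SOURCE = "# -*- coding: latin-1 -*-\nname = 'Jos\u00e9'\n"


def read_source(path):
    """Read a Python source file using the encoding it declares for itself."""
    # tokenize.open() inspects the BOM / coding comment to pick the codec.
    with tokenize.open(path) as f:
        return f.read()


def demo():
    """Write SOURCE as latin-1 bytes, then read it back via its declaration."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(SOURCE.encode("latin-1"))
        return read_source(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(repr(demo()))
```

Something along these lines would let the setup.py runner respect whatever encoding the file declares instead of hard-coding one.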
An ugly workaround is to patch pip so that it always reads setup.py as UTF-8:
--- pip.orig/req.py 2013-11-19 15:53:49.000000000 -0800
+++ pip/req.py 2013-11-20 16:37:23.642656132 -0800
@@ -281,7 +281,7 @@ def replacement_run(self):
writer(self, ep.name, os.path.join(self.egg_info,ep.name))
self.find_sources()
egg_info.egg_info.run = replacement_run
-exec(compile(open(__file__).read().replace('\\r\\n', '\\n'), __file__, 'exec'))
+exec(compile(open(__file__,encoding='utf_8').read().replace('\\r\\n', '\\n'), __file__, 'exec'))
"""
def egg_info_data(self, filename):
@@ -687,7 +687,7 @@ exec(compile(open(__file__).read().repla
## FIXME: should we do --install-headers here too?
call_subprocess(
[sys.executable, '-c',
- "import setuptools; __file__=%r; exec(compile(open(__file__).read().replace('\\r\\n', '\\n'), __file__, 'exec'))" % self.setup_py]
+ "import setuptools; __file__=%r; exec(compile(open(__file__,encoding='utf_8').read().replace('\\r\\n', '\\n'), __file__, 'exec'))" % self.setup_py]
+ list(global_options) + ['develop', '--no-deps'] + list(install_options),
cwd=self.source_dir, filter_stdout=self._filter_install,
But that only treats the symptom. Root cause appears to be in python3 as demonstrated by this simple script:
wrong-codec.py:
#!/usr/bin/env python3
from urllib.request import urlretrieve
urlretrieve('https://raw.github.com/appliedsec/pygeoip/master/setup.py', filename='setup.py')
# if LANG=C then locale.py:getpreferredencoding()->'ANSI_X3.4-1968'
foo = open('setup.py')
# bang! ascii_decode() cannot handle the non-ASCII bytes (0xc3 ...) in the file
bar = foo.read()
This does not occur in python2. Is this a bug in pip or in python3?