Issue22888
Created on 2014-11-17 02:03 by gmljosea, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Messages (4) | |||
---|---|---|---|
msg231263 | Author: José Alberto Goncalves (gmljosea) | Date: 2014-11-17 02:03 | |
Summary: Python 3.4's venv works fine on Windows, and pip works fine when installing both pure Python libraries and extension modules. However, when the virtual environment is under a path with non-ASCII characters, installing a package that specifies console_scripts or scripts (like pip or mutagen, respectively) fails with encoding errors. I looked around the Internet for a solution, but the best I could find was Issue #10419, which is over 3 years old and is marked as resolved, and I couldn't find any other open issue about this.

Details of my case: I created a Python 3.4 (32-bit) virtualenv via Python Tools for Visual Studio, on Windows 8.1 (64-bit), in a folder under my home directory (C:\Users\José Alberto\), which happens to contain an accented character, using the latest Python you can download from the homepage. Via PowerShell I activated the virtualenv and tried to execute pip install mutagen (https://pypi.python.org/pypi/mutagen; it is relevant because it specifies scripts in its setup.py).

The installation failed with the following error:

    Downloading/unpacking mutagen
    Running setup.py (path:C:\Users\José Alberto\Documents\podtimizer\env_podtimizer\build\mutagen\setup.py) egg_info for package mutagen
    Installing collected packages: mutagen
    Running setup.py install for mutagen
    Traceback (most recent call last):
      File "C:\Python34\lib\distutils\command\build_scripts.py", line 114, in copy_scripts
        shebang.decode('utf-8')
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 14: invalid continuation byte

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\JosÚ Alberto\Documents\podtimizer\env_podtimizer\build\mutagen\setup.py", line 277, in <module>
        """
      File "C:\Python34\lib\distutils\core.py", line 148, in setup
        dist.run_commands()
      File "C:\Python34\lib\distutils\dist.py", line 955, in run_commands
        self.run_command(cmd)
      File "C:\Python34\lib\distutils\dist.py", line 974, in run_command
        cmd_obj.run()
      File "C:\Users\JosÚ Alberto\Documents\podtimizer\env_podtimizer\lib\site-packages\setuptools-6.0.2-py3.4.egg\setuptools\command\install.py", line 61, in run
      File "C:\Python34\lib\distutils\command\install.py", line 539, in run
        self.run_command('build')
      File "C:\Python34\lib\distutils\cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "C:\Python34\lib\distutils\dist.py", line 974, in run_command
        cmd_obj.run()
      File "C:\Python34\lib\distutils\command\build.py", line 126, in run
        self.run_command(cmd_name)
      File "C:\Python34\lib\distutils\cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "C:\Python34\lib\distutils\dist.py", line 974, in run_command
        cmd_obj.run()
      File "C:\Python34\lib\distutils\command\build_scripts.py", line 50, in run
        self.copy_scripts()
      File "C:\Python34\lib\distutils\command\build_scripts.py", line 118, in copy_scripts
        "from utf-8".format(shebang))
    ValueError: The shebang (b'#!C:\\Users\\Jos\xe9 Alberto\\Documents\\podtimizer\\env_podtimizer\\Scripts\\python.exe\n') is not decodable from utf-8

I looked around the Internet for a solution, but the best I could find was Issue #10419, which is over 3 years old and is marked as closed and resolved. The last comment mentions a fix that was committed to Distribute around that time, with the caveat that entry point script creation would fail if the path contained unencodable characters (which sounds exactly like the problem I'm having). I couldn't find an open issue to follow up on this.

I went to the source of the error, around Lib/distutils/command/build_scripts.py:106. Since this is Windows, os.fsencode() encodes with 'mbcs' (as reported by Python); the code then tries to decode the result back using utf-8, and it blows up:

    >>> import os
    >>> os.fsencode('C:\\Users\\José Alberto\\')
    b'C:\\Users\\Jos\xe9 Alberto\\'
    >>> 'C:\\Users\\José Alberto\\'.encode('utf-8')
    b'C:\\Users\\Jos\xc3\xa9 Alberto\\'

I commented out both try..except blocks after the os.fsencode call and it worked, but commenting out random code whose purpose I don't fully understand doesn't seem like a good strategy.

While testing the above, I found I couldn't finish installing pip successfully in a virtualenv using just the Python installed from python.org. In PowerShell I created several virtualenvs using C:\Python34\python.exe -m venv. The envs were created successfully, but the installation of pip's console_scripts failed silently. I could still run python -m pip and install packages, but the pip.exe files were not created. I removed pip from the environment's site-packages directory and tried to reinstall it via python -m ensurepip, but instead got the following error:

    Installing collected packages: pip
    Cleaning up...
    Removing temporary dir C:\Users\José Alberto\test_env3\build...
    Exception:
    Traceback (most recent call last):
      File "C:\Users\JOSALB~1\AppData\Local\Temp\tmpax00n0z5\pip-1.5.6-py2.py3-none-any.whl\pip\_vendor\distlib\scripts.py", line 124, in _get_shebang
        shebang.decode('utf-8')
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 15: invalid continuation byte

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:\Users\JOSALB~1\AppData\Local\Temp\tmpax00n0z5\pip-1.5.6-py2.py3-none-any.whl\pip\basecommand.py", line 122, in main
        status = self.run(options, args)
      File "C:\Users\JOSALB~1\AppData\Local\Temp\tmpax00n0z5\pip-1.5.6-py2.py3-none-any.whl\pip\commands\install.py", line 283, in run
        requirement_set.install(install_options, global_options, root=options.root_path)
      File "C:\Users\JOSALB~1\AppData\Local\Temp\tmpax00n0z5\pip-1.5.6-py2.py3-none-any.whl\pip\req.py", line 1435, in install
        requirement.install(install_options, global_options, *args, **kwargs)
      File "C:\Users\JOSALB~1\AppData\Local\Temp\tmpax00n0z5\pip-1.5.6-py2.py3-none-any.whl\pip\req.py", line 671, in install
        self.move_wheel_files(self.source_dir, root=root)
      File "C:\Users\JOSALB~1\AppData\Local\Temp\tmpax00n0z5\pip-1.5.6-py2.py3-none-any.whl\pip\req.py", line 901, in move_wheel_files
        pycompile=self.pycompile,
      File "C:\Users\JOSALB~1\AppData\Local\Temp\tmpax00n0z5\pip-1.5.6-py2.py3-none-any.whl\pip\wheel.py", line 325, in move_wheel_files
        generated.extend(maker.make(spec))
      File "C:\Users\JOSALB~1\AppData\Local\Temp\tmpax00n0z5\pip-1.5.6-py2.py3-none-any.whl\pip\_vendor\distlib\scripts.py", line 311, in make
        self._make_script(entry, filenames, options=options)
      File "C:\Users\JOSALB~1\AppData\Local\Temp\tmpax00n0z5\pip-1.5.6-py2.py3-none-any.whl\pip\_vendor\distlib\scripts.py", line 201, in _make_script
        shebang = self._get_shebang('utf-8', options=options)
      File "C:\Users\JOSALB~1\AppData\Local\Temp\tmpax00n0z5\pip-1.5.6-py2.py3-none-any.whl\pip\_vendor\distlib\scripts.py", line 127, in _get_shebang
        'The shebang (%r) is not decodable from utf-8' % shebang)
    ValueError: The shebang (b'#!"C:\\Users\\Jos\xe9 Alberto\\test_env3\\Scripts\\python.exe"\n') is not decodable from utf-8

This is exactly the same issue I was running into with build_scripts, but this time in similar code within ensurepip's pip wheel. I again tried commenting out the utf-8 encoding checks, and although ensurepip now finished successfully, the executables failed with "Couldn't create process". This is as far as I could go with my very limited understanding of encoding issues and pip, so I decided to write this issue. Is it possible to fix this? Is there something I can do to help? |
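The round-trip failure described in this message can be reproduced on any platform with a small sketch. It is not the distutils or distlib source: `check_shebang` is a hypothetical helper that only imitates the reported check, the example path is shortened, and `cp1252` is an assumed stand-in for the Windows 'mbcs' codec (chosen because it encodes "é" as the byte 0xe9 seen in the tracebacks).

```python
# Assumption: cp1252 stands in for the Windows ANSI code page ('mbcs'),
# which is only available as a codec on Windows itself.
ANSI_CODEC = "cp1252"


def check_shebang(executable, fs_codec=ANSI_CODEC):
    """Hypothetical helper imitating the reported check: build the shebang
    with the filesystem codec, then require that it decodes as UTF-8."""
    shebang = b"#!" + executable.encode(fs_codec) + b"\n"
    try:
        shebang.decode("utf-8")
    except UnicodeDecodeError:
        raise ValueError(
            "The shebang ({!r}) is not decodable from utf-8".format(shebang))
    return shebang


# Shortened, illustrative path containing an accented character.
path = "C:\\Users\\Jos\u00e9 Alberto\\env\\Scripts\\python.exe"

try:
    check_shebang(path)
except ValueError as exc:
    print("ANSI-encoded shebang rejected:", exc)

# Encoding with UTF-8 from the start passes the same check.
print(check_shebang(path, fs_codec="utf-8"))
```

The byte 0xe9 starts a multi-byte sequence in UTF-8, so a following ASCII space makes the sequence invalid, which is consistent with the "invalid continuation byte" message in the report.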
|||
msg231267 | Author: Eryk Sun (eryksun) * | Date: 2014-11-17 06:51 | |
On Windows, shouldn't copy_scripts use UTF-8 instead of os.fsencode (MBCS)? The Python launcher executes the shebang line on Windows, and it defaults to UTF-8 if a script doesn't have a BOM. See line 1105 in maybe_handle_shebang: https://hg.python.org/cpython/file/ab2c023a9432/PC/launcher.c#l1064 |
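The suggestion can be illustrated with a rough sketch: if the writer always encodes the shebang as UTF-8, a reader that defaults to UTF-8 when no BOM is present recovers the path intact. Both `make_shebang` and `read_shebang` are hypothetical helpers, and the BOM handling is a simplified assumption rather than a port of the logic in launcher.c.

```python
import codecs

# Simplified BOM table (assumption); UTF-32 BOMs are not handled here.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def make_shebang(executable):
    """Hypothetical writer: always encode the shebang line as UTF-8."""
    return b"#!" + executable.encode("utf-8") + b"\n"


def read_shebang(script):
    """Hypothetical reader: pick the codec from a BOM if one is present,
    otherwise default to UTF-8, and return the command after '#!'."""
    encoding = "utf-8"
    for bom, name in _BOMS:
        if script.startswith(bom):
            script = script[len(bom):]
            encoding = name
            break
    lines = script.decode(encoding).splitlines()
    first = lines[0] if lines else ""
    if not first.startswith("#!"):
        return None
    return first[2:].strip()


# Shortened, illustrative path containing an accented character.
path = "C:\\Users\\Jos\u00e9 Alberto\\env\\Scripts\\python.exe"
script = make_shebang(path) + b"print('hello')\n"
print(read_shebang(script) == path)
```

A script whose shebang was written with the ANSI code page instead would make `read_shebang` raise UnicodeDecodeError, which is the mismatch this comment points at.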
|||
msg231302 | Author: Antoine Pitrou (pitrou) * | Date: 2014-11-17 21:56 | |
> On Windows, shouldn't copy_scripts use UTF-8 instead of os.fsencode
> (MBCS)? The Python launcher executes the shebang line on Windows, and
> it defaults to UTF-8 if a script doesn't have a BOM.

Good catch! It seems you're right. Do you want to provide a patch + tests? |
|||
msg386364 | Author: Steve Dower (steve.dower) * | Date: 2021-02-03 18:21 | |
Distutils is now deprecated (see PEP 632) and all tagged issues are being closed. From now until removal, only release blocking issues will be considered for distutils. If this issue does not relate to distutils, please remove the component and reopen it. If you believe it still requires a fix, most likely the issue should be re-reported at https://github.com/pypa/setuptools |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:58:10 | admin | set | github: 67077 |
2021-02-03 18:21:31 | steve.dower | set | status: open -> closed nosy: + steve.dower messages: + msg386364 resolution: out of date stage: needs patch -> resolved |
2016-09-01 01:40:37 | jayvdb | set | nosy: + jayvdb |
2014-11-17 21:56:39 | pitrou | set | stage: needs patch versions: + Python 3.5 |
2014-11-17 21:56:31 | pitrou | set | nosy: + pitrou messages: + msg231302 |
2014-11-17 06:51:20 | eryksun | set | nosy: + eryksun messages: + msg231267 |
2014-11-17 02:19:34 | ezio.melotti | link | issue22887 superseder |
2014-11-17 02:03:25 | gmljosea | create |