Issue20140
Created on 2014-01-06 10:43 by Jarek.Śmiejczak, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Messages (12) | |||
---|---|---|---|
msg207424 - (view) | Author: Jarek Śmiejczak (Jarek.Śmiejczak) | Date: 2014-01-06 10:43 | |
Full traceback: https://gist.github.com/jarekps/2729ee1917ea372e6642 The error starts in pip, but after investigating the traceback it looks like a Python issue (version 2.7.5). Windows version: 8.1 Enterprise x64 with Polish language pack. Feel free to ask if any additional information is needed. |
|||
msg207425 - (view) | Author: STINNER Victor (vstinner) * | Date: 2014-01-06 10:47 | |
> https://gist.github.com/jarekps/2729ee1917ea372e6642

Copy of the output:

---
C:\Users\Jarosław>pip
Traceback (most recent call last):
  File "c:\python27\Scripts\pip-script.py", line 9, in <module>
    load_entry_point('pip==1.5', 'console_scripts', 'pip')()
  File "c:\python27\lib\site-packages\distribute-0.6.49-py2.7.egg\pkg_resources.py", line 345, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "c:\python27\lib\site-packages\distribute-0.6.49-py2.7.egg\pkg_resources.py", line 2381, in load_entry_point
    return ep.load()
  File "c:\python27\lib\site-packages\distribute-0.6.49-py2.7.egg\pkg_resources.py", line 2087, in load
    entry = __import__(self.module_name, globals(),globals(), ['__name__'])
  File "c:\python27\lib\site-packages\pip\__init__.py", line 11, in <module>
    from pip.vcs import git, mercurial, subversion, bazaar # noqa
  File "c:\python27\lib\site-packages\pip\vcs\subversion.py", line 4, in <module>
    from pip.index import Link
  File "c:\python27\lib\site-packages\pip\index.py", line 16, in <module>
    from pip.wheel import Wheel, wheel_ext, wheel_setuptools_support
  File "c:\python27\lib\site-packages\pip\wheel.py", line 23, in <module>
    from pip._vendor.distlib.scripts import ScriptMaker
  File "c:\python27\lib\site-packages\pip\_vendor\distlib\scripts.py", line 15, in <module>
    from .resources import finder
  File "c:\python27\lib\site-packages\pip\_vendor\distlib\resources.py", line 105, in <module>
    cache = Cache()
  File "c:\python27\lib\site-packages\pip\_vendor\distlib\resources.py", line 40, in __init__
    base = os.path.join(get_cache_base(), 'resource-cache')
  File "c:\python27\lib\ntpath.py", line 108, in join
    path += "\\" + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb3 in position 14: ordinal not in range(128)

---
It looks like a bug in distlib.resources, not in Python. os.path.join() works correctly if all arguments are bytes strings (str type).
It should work if all arguments are Unicode strings containing only ASCII characters. (I don't know if it works if all arguments are Unicode strings.) In your case, it looks like os.path.join() is called with a unicode and a bytes string. |
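The byte and offset reported in the traceback can be checked by hand. Below is a minimal Python 3 sketch; Python 3 no longer performs the implicit conversion, so the ASCII decode is written out explicitly. It assumes the byte-string path is encoded in cp1250, a common ANSI code page on Polish Windows; the report does not state the code page.

```python
# Assumption: the byte-string path uses cp1250 (not stated in the report).
raw_home = "C:\\Users\\Jarosław".encode("cp1250")

try:
    # Python 2 performs this decode implicitly when a str meets a unicode object.
    raw_home.decode("ascii")
except UnicodeDecodeError as exc:
    bad_offset = exc.start          # index of the first undecodable byte
    bad_byte = raw_home[exc.start]  # integer value of that byte

print(hex(bad_byte), bad_offset)
```

Under that assumption the offending byte is 0xb3 at offset 14, which matches the error message above.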
|||
msg207428 - (view) | Author: Vinay Sajip (vinay.sajip) * | Date: 2014-01-06 11:07 | |
It's not failing specifically because of distlib or os.path.join functionality: it's failing because, given a Unicode path C:\Users\Jarosław\..., Python is attempting to decode it using the default, ASCII codec. I'll certainly look at updating distlib to handle this case, but the same problem could bite the user in other areas. |
|||
msg207429 - (view) | Author: Vinay Sajip (vinay.sajip) * | Date: 2014-01-06 12:11 | |
Jarek: I can't easily test this in my environment; perhaps you can help. Could you change, in the file c:\python27\lib\site-packages\pip\_vendor\distlib\resources.py, line 40 from
    base = os.path.join(get_cache_base(), 'resource-cache')
to
    base = os.path.join(get_cache_base(), str('resource-cache'))
to see if that resolves the problem? Currently, 'resource-cache' is a Unicode string (because of "from __future__ import unicode_literals" in the containing module) and that causes Python to try and convert the get_cache_base() result to Unicode using ASCII, which leads to the failure. |
|||
msg207603 - (view) | Author: Jarek Śmiejczak (Jarek.Śmiejczak) | Date: 2014-01-07 21:15 | |
@Vinay.Sajip After making the change you suggested, I'm getting a different error:

---
C:\Users\Jarosław>pip install virtualenv
Downloading/unpacking virtualenv
  Running setup.py (path:c:\users\jarosa~1\appdata\local\temp\pip_build_Jaros│a \virtualenv\setup.py) egg_info for package virtualenv
    warning: no previously-included files matching '*' found under directory 'd cs\_templates'
    warning: no previously-included files matching '*' found under directory 'd cs\_build'
Cleaning up...
Exception:
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\pip\basecommand.py", line 122, in main
    status = self.run(options, args)
  File "c:\python27\lib\site-packages\pip\commands\install.py", line 270, in ru
    requirement_set.prepare_files(finder, force_root_egg_info=self.bundle, bund e=self.bundle)
  File "c:\python27\lib\site-packages\pip\req.py", line 1211, in prepare_files
    req_to_install.assert_source_matches_version()
  File "c:\python27\lib\site-packages\pip\req.py", line 451, in assert_source_m tches_version
    % (display_path(self.source_dir), version, self))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb3 in position 62: ordina not in range(128)

Traceback (most recent call last):
  File "c:\python27\Scripts\pip-script.py", line 9, in <module>
    load_entry_point('pip==1.5', 'console_scripts', 'pip')()
  File "c:\python27\lib\site-packages\pip\__init__.py", line 185, in main
    return command.main(cmd_args)
  File "c:\python27\lib\site-packages\pip\basecommand.py", line 161, in main
    text = '\n'.join(complete_log)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb3 in position 77: ordina not in range(128)

C:\Users\Jarosław>

---
It looks like this needs a few more changes in pip to solve the issue. What's strange: in Windows 8.1, the name of the home directory is the first name saved in your Microsoft profile (if you log in via that profile, of course), so it should be a pretty common issue (I think). Thanks for your fast reaction and support. |
|||
msg219232 - (view) | Author: honglei jiang (jhonglei) | Date: 2014-05-27 17:16 | |
Python: canopy-1.3.0.1715.win-x86_64\
OS: Win8.1 64
>>>directory
'F:\\Flask\\EmberJS\\\xd6\xd0\xce\xc4\\Prj\\static'
>>>os.path.isdir(directory)
True
>>>filename
u'todomvc/architecture-examples/angularjs/index.html'
>>>os.path.join(directory,filename)
Traceback (most recent call last):
  File "c:\Users\honglei\AppData\Local\Enthought\Canopy\User\Lib\site-packages\flask\helpers.py", line 1, in <module>
    # -*- coding: utf-8 -*-
  File "C:\Users\honglei\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.3.0.1715.win-x86_64\Lib\ntpath.py", line 108, in join
    path += "\\" + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd6 in position 17: ordinal not in range(128)
>>>f=os.path.join(directory.decode(sys.getfilesystemencoding()),filename)
>>>os.path.isfile(f)
True |
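The workaround in the session above is to decode the byte-string directory before joining. A rough Python 3 sketch of the same idea follows; `join_as_text` is a hypothetical helper, not part of any library. The session uses sys.getfilesystemencoding(); the sketch takes the encoding as an explicit argument so it runs on any platform. It assumes the directory bytes are GBK, which is consistent with \xd6\xd0\xce\xc4 but is not stated in the report.

```python
import ntpath


def join_as_text(encoding, *parts):
    """Decode byte-string parts with `encoding`, then join (hypothetical helper)."""
    decoded = [p.decode(encoding) if isinstance(p, bytes) else p for p in parts]
    return ntpath.join(*decoded)


# Assumption: the directory bytes are GBK-encoded.
directory = b"F:\\Flask\\EmberJS\\\xd6\xd0\xce\xc4\\Prj\\static"
filename = "todomvc/architecture-examples/angularjs/index.html"

full_path = join_as_text("gbk", directory, filename)
print(full_path)
```

Decoding once at the boundary keeps every later path operation in a single string type, so no implicit ASCII conversion is ever attempted.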
|||
msg227696 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2014-09-27 16:30 | |
This looks to me like a documentation issue. Unfortunately, it is not explicitly documented that os.path.join() shouldn't mix str and unicode components (except ASCII-only str, such as '.'). There is a relevant note in the 3.x documentation. It should be adapted to 2.7. |
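For comparison, Python 3 rejects mixed types outright instead of attempting an implicit decode. A small sketch of that behaviour, using ntpath directly so it runs on any platform; `try_join` is a hypothetical wrapper and the paths are illustrative:

```python
import ntpath


def try_join(*parts):
    """Return the joined path, or the TypeError raised for mixed types."""
    try:
        return ntpath.join(*parts)
    except TypeError as exc:
        return exc


# All text and all bytes both work; mixing the two does not.
all_text = try_join("C:\\Users\\Jarosław", "resource-cache")
all_bytes = try_join(b"C:\\Users\\Jaros\xb3aw", b"resource-cache")
mixed = try_join(b"C:\\Users\\Jaros\xb3aw", "resource-cache")

print(type(mixed).__name__)
```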
|||
msg234010 - (view) | Author: Lin Wei (Lin.Wei) | Date: 2015-01-14 07:07 | |
The patch (http://bugs.python.org/issue9291#msg206938) for #9291 actually helps with this issue, at least for me. By the way, @Serhiy do you mean that the problem is merely documentation, while the implementation is alright? |
|||
msg236077 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-02-15 22:53 | |
Yes, the implementation of os.path is alright. There is a bug in distlib.resources, and the os.path documentation is lacking. |
|||
msg236079 - (view) | Author: Vinay Sajip (vinay.sajip) * | Date: 2015-02-15 23:51 | |
> There is a bug in distlib.resources.

As far as I know, this is no longer the case - a change was made in distlib.resources to get around the problem: https://bitbucket.org/vinay.sajip/distlib/src/471427909ebbba2f4fa9f4cbc34f17bd2d31b8e3/distlib/resources.py?at=default#cl-31 |
|||
msg276134 - (view) | Author: Robert Collins (rbcollins) * | Date: 2016-09-12 23:28 | |
Given two (or more) parameters where one is unicode and one is not, upcasting will occur multiple times in path.join on Windows:
- '\\' is str and will cast up safely in all codecs
- the other str (or bytes) parameter will be upcast using sys.getdefaultencoding(), which is often / usually ASCII on Windows

This will then fail when the str parameter is not valid ASCII. From this we can conclude that this is a failure to use path.join correctly: if all the parameters passed in were unicode, no error would occur, as only '\\' would be getting coerced to unicode.

The interesting question is why there was a str parameter that wasn't valid ASCII; and that lies with path.expanduser(), which is returning a str for the non-ASCII home directory. Changing that to return unicode rather than a no-encoding-specified str when HOME, HOMEPATH, etc. contain non-ASCII characters is a change that would worry me - specifically that we'd encounter code that assumes it is always str, e.g. by calling path.join(expanduser('~fred'), '\xe1\xbd\x84D'), which will then blow up.

Worth noting too is that expanduser(u'~user/\u14ffd') will also blow up in the same way in the same situation - as it ends up decoding the user home path when it concatenates userhome and path[i:].

So, what to do:
- It might be worth testing a patch that changes expanduser to decode the environment variables - I'm not sure whether we'd want the filesystemencoding or the defaultencoding for handling these environment variables. Steve Dower probably knows :).
- Or we say 'sorry, too hard in 2.7' and move on: join *itself* is fine here, given the limits of 2.7. |
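The two coercion steps described above can be imitated in Python 3, where they have to be spelled out. This is only a rough model: `py2_style_concat` and `py2_style_join` are hypothetical helpers, bytes stand in for the Python 2 str type, and the home directory is assumed to be cp1250-encoded.

```python
def py2_style_concat(left, right, default_encoding="ascii"):
    """Roughly mimic Python 2's `left + right` for mixed str/unicode operands.

    Hypothetical helper: bytes stand in for Python 2 str, and
    default_encoding stands in for sys.getdefaultencoding().
    """
    if isinstance(left, bytes) and isinstance(right, str):
        left = left.decode(default_encoding)
    elif isinstance(left, str) and isinstance(right, bytes):
        right = right.decode(default_encoding)
    return left + right


def py2_style_join(path, b):
    """Mimic the concatenation done on line 108 of ntpath.py in the traceback."""
    return py2_style_concat(path, py2_style_concat(b"\\", b))


# Assumption: expanduser() returned cp1250-encoded bytes for the home directory.
home_bytes = "C:\\Users\\Jarosław".encode("cp1250")

# Step 1: the ASCII-only separator is upcast without trouble.
tail = py2_style_concat(b"\\", "resource-cache")

# Step 2: the non-ASCII byte string cannot be upcast with the ASCII codec.
try:
    py2_style_join(home_bytes, "resource-cache")
    mixed_failed = False
except UnicodeDecodeError:
    mixed_failed = True

# All-unicode arguments: only the separator is coerced, so the join succeeds.
all_unicode = py2_style_join("C:\\Users\\Jarosław", "resource-cache")
```

In this model the mixed call fails while the all-unicode call succeeds, which is the distinction the analysis draws between misuse of join and a defect in join itself.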
|||
msg276147 - (view) | Author: Eryk Sun (eryksun) * | Date: 2016-09-13 02:02 | |
> It might be worth testing a patch that changes expanduser to
> decode the environment variables

If expanduser() is passed a unicode path, it can use _winreg.ExpandEnvironmentStrings(u'%USERPROFILE%') instead of decoding os.environ['USERPROFILE']. In 2.7, os.environ is a lossy ANSI encoding of the native Unicode environment block.
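To illustrate what "lossy" means here, the sketch below round-trips two paths through a single ANSI code page in Python 3. It is only an approximation: Windows applies its own conversion rules when building the ANSI environment, whereas this uses Python's codec with "?" replacement. `ansi_round_trip` is a hypothetical helper and cp1250 is an assumed code page.

```python
def ansi_round_trip(text, code_page):
    """Encode to a code page with replacement, then decode back (approximation)."""
    return text.encode(code_page, errors="replace").decode(code_page)


# Characters covered by the code page survive the round trip.
polish = ansi_round_trip("C:\\Users\\Jarosław", "cp1250")

# Characters outside the code page are replaced and cannot be recovered.
chinese = ansi_round_trip("C:\\Users\\中文", "cp1250")

print(polish)
print(chinese)
```

Reading the value from the native Unicode environment, as suggested above, avoids this loss because no code page conversion takes place.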
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:57:56 | admin | set | github: 64339 |
2021-02-25 12:27:52 | eryksun | set | status: open -> closed resolution: out of date stage: needs patch -> resolved |
2016-09-13 02:02:25 | eryksun | set | nosy: + eryksun messages: + msg276147 |
2016-09-12 23:28:56 | rbcollins | set | nosy: + rbcollins, steve.dower messages: + msg276134 |
2015-02-15 23:51:18 | vinay.sajip | set | messages: + msg236079 |
2015-02-15 22:53:08 | serhiy.storchaka | set | messages: + msg236077 |
2015-01-14 07:07:25 | Lin.Wei | set | nosy: + Lin.Wei messages: + msg234010 |
2014-09-27 16:30:33 | serhiy.storchaka | set | assignee: docs@python type: crash -> behavior components: + Documentation, - Windows keywords: + easy nosy: + serhiy.storchaka, docs@python messages: + msg227696 stage: needs patch |
2014-05-27 17:16:23 | jhonglei | set | nosy: + jhonglei messages: + msg219232 |
2014-01-07 21:15:45 | Jarek.Śmiejczak | set | messages: + msg207603 |
2014-01-06 12:11:00 | vinay.sajip | set | messages: + msg207429 |
2014-01-06 11:07:58 | vinay.sajip | set | messages: + msg207428 |
2014-01-06 10:47:29 | vstinner | set | nosy: + vinay.sajip, vstinner, ncoghlan messages: + msg207425 |
2014-01-06 10:43:51 | Jarek.Śmiejczak | create |