
Author amaury.forgeotdarc
Recipients amaury.forgeotdarc, xiaowei.py
Date 2013-03-19.10:54:28
Message-id <1363690469.26.0.459102193798.issue17320@psf.upfronthosting.co.za>
In-reply-to
Content
The string '\xe7\x8e\xb0' is the UTF-8 encoding of u'现' (= u'\u73b0').
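
The claim above is easy to verify. This is a small sketch in Python 3 syntax, which is an assumption on my side since the report itself is Python 2; hence the b'' prefix on the byte string. The variable names are only illustrative.

```python
# The three bytes from the report, as a Python 3 bytes object.
raw = b'\xe7\x8e\xb0'

# Decoding them as UTF-8 yields exactly one character.
text = raw.decode('utf-8')

print(len(text))                    # number of characters after decoding
print(hex(ord(text)))               # code point of that character
print(text.encode('utf-8') == raw)  # encoding again gives the same bytes
```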

But your Windows system uses the cp936 code page to encode file names.
'\xe7\x8e\xb0' is invalid in this code page: '\xe7\x8e' forms a complete two-byte character, but the last byte, '\xb0', is a lead byte with no trail byte after it (an incomplete multibyte sequence), and Windows drops it when converting to a Unicode file name.
The automatic conversion functions in Windows work similarly to this Python code (note the 'ignore' parameter):
>>> '\xe7\x8e\xb0'.decode('cp936', 'ignore').encode('cp936')
'\xe7\x8e'
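
For readers on Python 3, the same round trip can be sketched as follows. It also shows that a strict decode rejects the input, whereas 'ignore' silently drops the dangling byte. This only imitates the conversion with Python's own cp936 codec; it does not call any Windows API, and the variable names are illustrative.

```python
raw = b'\xe7\x8e\xb0'

# Strict decoding refuses the input because of the unpaired final byte.
try:
    raw.decode('cp936')
    strict_ok = True
except UnicodeDecodeError as exc:
    strict_ok = False
    print(exc.reason)

# With 'ignore', the unpaired byte is dropped without any error.
lossy = raw.decode('cp936', 'ignore')
round_trip = lossy.encode('cp936')
print(round_trip)  # b'\xe7\x8e'
```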


'\xe7\x8e\xb0' is an invalid file name on your platform. You should either:

- use cp936 encoding in your application

- much better, use Unicode file names everywhere:
  >>> os.path.abspath('\xe7\x8e\xb0'.decode('utf-8'))
  This returns the expected result.
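
Both options can be sketched in Python 3 syntax (again an assumption, since the report is Python 2). The first transcodes the UTF-8 bytes to cp936 for code that must keep using byte strings. The second decodes once and works with text from then on. The variable names are hypothetical.

```python
import os

utf8_name = b'\xe7\x8e\xb0'

# Decode the UTF-8 bytes to text once, at the boundary of the application.
name = utf8_name.decode('utf-8')

# Option 1: re-encode as cp936 if byte-string file names are unavoidable.
cp936_name = name.encode('cp936')
print(cp936_name.decode('cp936') == name)

# Option 2 (preferred): pass the text name straight to os.path.
full = os.path.abspath(name)
print(os.path.basename(full) == name)
```

With the text name, the last path component survives intact on any platform, because no code-page conversion takes place.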


Python 3 will emit a warning when os.path.abspath() is called with a bytes string.
History
Date                 User                Action  Args
2013-03-19 10:54:29  amaury.forgeotdarc  set     recipients: + amaury.forgeotdarc, xiaowei.py
2013-03-19 10:54:29  amaury.forgeotdarc  set     messageid: <1363690469.26.0.459102193798.issue17320@psf.upfronthosting.co.za>
2013-03-19 10:54:29  amaury.forgeotdarc  link    issue17320 messages
2013-03-19 10:54:28  amaury.forgeotdarc  create