
Author vstinner
Recipients loewis, vstinner
Date 2010-09-10.10:42:57
Message-id <1284115379.54.0.982688803617.issue9820@psf.upfronthosting.co.za>
In-reply-to
Content
In Python 3.2, the mbcs encoding (the default filesystem encoding on Windows) is now strict: it raises an error on unencodable characters and undecodable bytes. But os.listdir(b'.') still replaces unencodable characters with b'?' instead of raising.

Example:

>>> import os
>>> os.mkdir('listdir')
>>> open('listdir\\xxx-\u0363', 'w').close()
>>> filename = os.listdir(b'listdir')[0]
>>> filename
b'xxx-?'
>>> open(filename, 'r').close()
IOError: [Errno 22] Invalid argument: 'xxx-?'

os.listdir(b'listdir') should raise an error, rather than silently skipping the filename or replacing unencodable characters with b'?'.

I think we should list the directory using the wide character API (FindFirstFileW) and, if the directory name is bytes, encode each filename using PyUnicode_EncodeFSDefault(), instead of using the ANSI API (FindFirstFileA).
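
The proposal can be approximated in pure Python to show the intended behaviour. This is only a rough sketch, not the C change itself: the helper names encode_filenames_strict and listdir_bytes_strict are hypothetical, and cp1252 is used below as a stand-in for a restrictive ANSI code page because the mbcs codec only exists on Windows.

```python
import os
import sys


def encode_filenames_strict(names, encoding=None):
    """Encode str filenames to bytes with the 'strict' error handler.

    Hypothetical helper: an unencodable character raises
    UnicodeEncodeError instead of being replaced by b'?'.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    return [name.encode(encoding, "strict") for name in names]


def listdir_bytes_strict(path, encoding=None):
    """List a directory given as bytes via the str API, then encode strictly.

    Hypothetical helper mirroring the idea of listing with the wide
    character API and encoding the results afterwards.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    # Listing with a str path returns str names (no lossy conversion).
    names = os.listdir(path.decode(encoding, "strict"))
    return encode_filenames_strict(names, encoding)


if __name__ == "__main__":
    # The lossy behaviour reported above corresponds to the 'replace' handler.
    print("xxx-\u0363".encode("cp1252", "replace"))
    try:
        encode_filenames_strict(["xxx-\u0363"], "cp1252")
    except UnicodeEncodeError as exc:
        print("strict encoding refused the name:", exc.reason)
```

With this approach the caller gets an exception for a name that cannot be represented, rather than a bytes filename such as b'xxx-?' that no longer refers to any file.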
History
Date                 User      Action  Args
2010-09-10 10:42:59  vstinner  set     recipients: + vstinner, loewis
2010-09-10 10:42:59  vstinner  set     messageid: <1284115379.54.0.982688803617.issue9820@psf.upfronthosting.co.za>
2010-09-10 10:42:58  vstinner  link    issue9820 messages
2010-09-10 10:42:57  vstinner  create