
Author vstinner
Recipients ezio.melotti, flox, ishimoto, loewis, tim.golden, vstinner
Date 2012-07-28.12:49:04
Message-id <1343479745.83.0.410557070565.issue15478@psf.upfronthosting.co.za>
Content
On Windows, if an OS function fails, the filename is a bytes string, and the filename cannot be decoded, Python raises a UnicodeDecodeError instead of an OSError. The problem is that Python decodes the filename to fill the OSError.filename field. See issue #15441 for the initial report.

There are different options to solve this issue:
 - always keep the filename parameter unchanged, so OSError.filename can be a str or a bytes string, depending on the input parameter
 - try to decode the filename from the filesystem encoding, and keep it unchanged if decoding fails: OSError.filename is then a bytes string only when the filename cannot be decoded
 - don't fill OSError.filename (= None) if the filename cannot be decoded
 - use "surrogateescape", "replace" or "backslashreplace" error handler to decode the filename
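
To make the last option concrete, here is a minimal Python sketch of what each error handler produces for a byte string that cannot be decoded. The sample name b"caf\xe9.txt" and the choice of UTF-8 as the encoding are assumptions for illustration only; on Windows the encoding in question would be the ANSI code page. Decoding with "backslashreplace" requires Python 3.5 or later.

```python
# Hypothetical sample name: 0xE9 followed by "." is not valid UTF-8.
RAW_NAME = b"caf\xe9.txt"

def decode_with(raw, errors, encoding="utf-8"):
    """Decode raw with the given error handler; return None if decoding fails."""
    try:
        return raw.decode(encoding, errors)
    except UnicodeDecodeError:
        return None

handlers = ("strict", "surrogateescape", "replace", "backslashreplace")
decoded = {errors: decode_with(RAW_NAME, errors) for errors in handlers}

for errors, text in decoded.items():
    # ascii() keeps the output printable even when text holds a lone surrogate.
    print("%-16s -> %s" % (errors, ascii(text)))
```

Of these, only the "surrogateescape" result can be encoded back to the original bytes. "replace" discards the undecodable byte, and "backslashreplace" gives readable text that is no longer the real name of the file.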

This issue is specific to Windows: on other platforms, the filename is decoded using the "surrogateescape" error handler, so decoding the filename cannot fail.
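
Decoding cannot fail with "surrogateescape" because every undecodable byte is mapped to a lone surrogate code point, which encodes back to the same byte. A small sketch of that round trip, with UTF-8 and the sample name assumed for illustration (the real filesystem encoding depends on the locale):

```python
def round_trip(raw, encoding="utf-8"):
    """Decode raw with surrogateescape, then encode the result back to bytes."""
    text = raw.decode(encoding, "surrogateescape")
    return text, text.encode(encoding, "surrogateescape")

# Hypothetical undecodable name.
raw_name = b"caf\xe9.txt"
text, restored = round_trip(raw_name)

# The invalid byte 0xE9 became the lone surrogate U+DCE9 ...
print(ascii(text))
# ... and encoding restores the exact original bytes.
print(restored == raw_name)
```

On non-Windows platforms, os.fsdecode() and os.fsencode() apply the same handler with the filesystem encoding.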

I don't know if OSError.filename is only used to display more information to the user, or if it is also used to perform another operation on the file (e.g. os.chmod).
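
Both kinds of use are easy to picture. The sketch below, with a hypothetical helper describe_failure(), formats OSError.filename for display and also passes it back to an os function. It is written to accept str, bytes or None, since the type of the attribute is exactly what is being discussed here.

```python
import os
import tempfile

def describe_failure(exc):
    """Return (display text, whether the path is still missing) for an OSError."""
    name = exc.filename  # may be str, bytes or None
    # Display use: ascii() gives printable text for any of the three types.
    shown = "%s: %s" % (ascii(name), exc.strerror)
    # Operational use: os functions accept both str and bytes paths.
    still_missing = name is not None and not os.path.exists(name)
    return shown, still_missing

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical missing file, addressed through a bytes path.
    missing = os.path.join(os.fsencode(tmpdir), b"missing.txt")
    try:
        os.stat(missing)
    except OSError as exc:
        shown, still_missing = describe_failure(exc)
        print(shown, still_missing)
```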

I like the solutions that keep the filename unchanged, because they do not lose information, and the user can decide how to handle the undecodable filename.

I don't like the option of trying to decode the filename and keeping it unchanged if decoding fails, because applications will work in most cases, but "crash" when someone comes along with an unusual code page, a special USB key, or a filename with a non-ASCII character.
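
The fragility of that option can be shown in a few lines. In this sketch the hypothetical decode_or_keep() mimics "decode, or keep unchanged if decoding fails", with UTF-8 assumed as the encoding; report() stands for application code that was only ever tested with decodable names.

```python
def decode_or_keep(raw, encoding="utf-8"):
    """Return str when raw decodes cleanly, otherwise the original bytes."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return raw

def report(filename):
    """Application code that silently assumes filename is a str."""
    return "cannot open " + filename

usual = decode_or_keep(b"report.txt")      # str
unusual = decode_or_keep(b"caf\xe9.txt")   # bytes: 0xE9 is invalid here

print(report(usual))
try:
    print(report(unusual))
except TypeError as exc:
    # str + bytes is a TypeError, seen only with the unusual filename.
    print("TypeError:", exc)
```

The type of the result depends on the data, so the failure only shows up once an undecodable name is encountered.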

So the best option may be to always keep the bytes filename unchanged.

Such a change can no longer be made in Python 3.3: it's too late to test it correctly.
History
Date                 User      Action  Args
2012-07-28 12:49:06  vstinner  set     recipients: + vstinner, loewis, ishimoto, tim.golden, ezio.melotti, flox
2012-07-28 12:49:05  vstinner  set     messageid: <1343479745.83.0.410557070565.issue15478@psf.upfronthosting.co.za>
2012-07-28 12:49:04  vstinner  link    issue15478 messages
2012-07-28 12:49:04  vstinner  create