
Author vstinner
Recipients BreamoreBoy, anthonybaxter, brett.cannon, ezio.melotti, kristjan.jonsson, loewis, nnorwitz, theller, vstinner
Date 2010-08-31.22:43:49
Message-id <1283294631.41.0.497532721908.issue1552880@psf.upfronthosting.co.za>
In-reply-to
Content
The utf-8 codec (in strict mode) rejects surrogates in Python 3, so your patch doesn't support undecodable filenames (filenames decoded using the surrogateescape error handler, which produces surrogate characters). It might be possible if you use surrogateescape everywhere.
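
To illustrate the point above, here is a minimal sketch of how strict utf-8 treats a filename that was decoded with surrogateescape. The byte string `raw` is a made-up example (a Latin-1 encoded name), not something taken from this issue.

```python
# Hypothetical filename bytes: Latin-1 encoded, so not valid UTF-8.
raw = b"caf\xe9.txt"

# surrogateescape maps the undecodable byte 0xE9 to the lone surrogate U+DCE9.
name = raw.decode("utf-8", "surrogateescape")

# The strict utf-8 codec refuses to encode that surrogate.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Using surrogateescape on the way back restores the original bytes.
roundtrip = name.encode("utf-8", "surrogateescape")
```

The round trip only works if the same error handler is used for both decoding and encoding, which is why it would have to be used everywhere.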

Manipulating encoded filenames is not trivial because it can quickly lead to mojibake if the encodings differ (e.g. if sys.path contains a bytes filename, you have to be careful). Using utf-8 means that you have to decode and then re-encode (to the filesystem encoding) a filename before passing it to a system call (e.g. mkdir()). The problem in #8611 is that Python 3 doesn't work if the filesystem encoding is *not* utf-8.
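
A small sketch of the decode/re-encode step described above, and of the mojibake you get when the encodings are mixed up. The helper `reencode_filename` and the sample names are hypothetical, for illustration only; Latin-1 stands in for a non-utf-8 filesystem encoding.

```python
def reencode_filename(raw: bytes, source_encoding: str, fs_encoding: str) -> bytes:
    """Decode raw with source_encoding, then re-encode to fs_encoding.

    Hypothetical helper: surrogateescape keeps undecodable bytes intact.
    """
    text = raw.decode(source_encoding, "surrogateescape")
    return text.encode(fs_encoding, "surrogateescape")

# A filename stored internally as utf-8 bytes.
utf8_name = "café".encode("utf-8")

# Interpreting those bytes with the wrong encoding gives mojibake.
mojibake = utf8_name.decode("latin-1")

# Correct path: decode as utf-8, then encode to the (assumed) filesystem encoding.
latin1_name = reencode_filename(utf8_name, "utf-8", "latin-1")
```

Every call site that passes a filename to the OS would need this extra step, which is the cost of keeping filenames as utf-8 bytes.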

Your solution is attractive because it is short, but I prefer to go directly to the right solution so that we don't patch Python twice: use unicode (with surrogates, PEP 383, for undecodable filenames) everywhere.
History
Date User Action Args
2010-08-31 22:43:51 vstinner set recipients: + vstinner, loewis, nnorwitz, brett.cannon, anthonybaxter, theller, kristjan.jonsson, ezio.melotti, BreamoreBoy
2010-08-31 22:43:51 vstinner set messageid: <1283294631.41.0.497532721908.issue1552880@psf.upfronthosting.co.za>
2010-08-31 22:43:50 vstinner link issue1552880 messages
2010-08-31 22:43:49 vstinner create