Author vstinner
Recipients a.badger, abadger1999, benjamin.peterson, ezio.melotti, lemburg, ncoghlan, pitrou, r.david.murray, vstinner
Date 2013-08-21.00:31:53
Message-id <CAMpsgwbtE_ETOn-W+oc8MVy4eZPJrsnLG90DBC187OCw1oSRDA@mail.gmail.com>
In-reply-to <1377044191.58.0.292561564195.issue18713@psf.upfronthosting.co.za>
Content
On Linux, the locale encoding is usually UTF-8. If a filename cannot
be decoded from UTF-8, the invalid bytes are escaped to lone
surrogates (U+DC80..U+DCFF), as specified by PEP 383. If I create a
UTF-8 text file and try to write such a filename into it, the Python
UTF-8 encoder raises an error.
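
A minimal sketch of this behaviour, assuming a filename whose bytes are Latin-1 rather than UTF-8 (the name `caf\xe9.txt` is made up for illustration). The decoding step imitates what Python does with the surrogateescape error handler; the strict encode is what a UTF-8 text file would use by default:

```python
# Hypothetical filename stored on disk as Latin-1 bytes (not valid UTF-8).
raw = b"caf\xe9.txt"

# PEP 383: undecodable bytes become lone surrogates U+DC80..U+DCFF.
name = raw.decode("utf-8", "surrogateescape")

# The strict UTF-8 encoder refuses lone surrogates.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Only the surrogateescape handler restores the original bytes,
# and those bytes are still not valid UTF-8.
roundtrip = name.encode("utf-8", "surrogateescape")
```

The round trip gives back the original bytes, so a file written this way would contain invalid UTF-8.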

IMO Python must raise an error here, because I want to generate a
valid UTF-8 text file, not a text file that only Python can read, and
only if the locale encoding is UTF-8.

So using the surrogateescape error handler whenever the encoding is
sys.getfilesystemencoding() is *not* a good idea.

What is your use case where you need to display a filename? Is it
displayed in a terminal, written to a file, or shown in a graphical
window? Why not escape the surrogates only when formatting the
filename for display, as GNOME does? See for example:
https://developer.gnome.org/glib/2.34/glib-Character-Set-Conversion.html#g-filename-display-name
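
A rough Python sketch of that idea, not a port of the GLib function: the helper name `filename_display_name` is hypothetical, and UTF-8 is hard-coded here as an assumption instead of calling sys.getfilesystemencoding(). It turns a surrogate-escaped filename into a string that is safe to display or to write into a UTF-8 file, at the cost of losing the original bytes:

```python
def filename_display_name(name: str) -> str:
    """Return a display-only version of a surrogate-escaped filename.

    Hypothetical helper; assumes the filename was decoded from UTF-8
    with the surrogateescape error handler.
    """
    # Recover the original bytes from the escaped surrogates.
    raw = name.encode("utf-8", "surrogateescape")
    # Decode again, replacing each undecodable byte with U+FFFD.
    return raw.decode("utf-8", "replace")


escaped = b"caf\xe9.txt".decode("utf-8", "surrogateescape")
display = filename_display_name(escaped)
encoded = display.encode("utf-8")  # strict encoding now succeeds
```

The result is for display only: it cannot be passed back to open(), so the escaped name has to be kept for file operations.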
History
Date User Action Args
2013-08-21 00:31:54 vstinner set recipients: + vstinner, lemburg, ncoghlan, pitrou, abadger1999, benjamin.peterson, ezio.melotti, a.badger, r.david.murray
2013-08-21 00:31:54 vstinner link issue18713 messages
2013-08-21 00:31:53 vstinner create