Author vstinner
Recipients Sworddragon, a.badger, bkabrda, larry, lemburg, loewis, ncoghlan, pitrou, r.david.murray, serhiy.storchaka, terry.reedy, vstinner
Date 2013-12-10.20:27:49
Message-id <CAMpsgwYnhDJWq4gerqobyr1d4ks+dRqUc=sWD8ndF5Ug36xx_A@mail.gmail.com>
In-reply-to <1386703691.12.0.697867846929.issue19846@psf.upfronthosting.co.za>
Content
2013/12/10 Toshio Kuratomi <report@bugs.python.org>:
> if G_FILENAME_ENCODING:
>     charset = the first charset listed in G_FILENAME_ENCODING
>     if charset == '@locale':
>         charset = charset of user's locale
> elif G_BROKEN_FILENAMES:
>     charset = charset of user's locale
> else:
>     charset = 'UTF-8'

g_get_filename_charsets() returns a list of encodings. In the last
case (else:), it uses ['utf-8', local_encoding] on UNIX. Trying utf-8
first is reliable because the utf-8 decoder is strict: it fails if the
byte string is not valid utf-8, so the second encoding is only used
when utf-8 does not apply.
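
As a rough illustration of the selection logic quoted above, here is a
minimal Python sketch. It is not GLib's code: the function name
filename_charsets and its parameters are hypothetical, and it assumes
G_FILENAME_ENCODING holds a comma-separated list in which '@locale'
stands for the locale charset.

```python
import locale
import os
from typing import List, Mapping, Optional


def filename_charsets(environ: Optional[Mapping[str, str]] = None,
                      locale_encoding: Optional[str] = None) -> List[str]:
    """Return candidate filename encodings, in the order to try them.

    Hypothetical helper mirroring the pseudocode in this message.
    """
    env = os.environ if environ is None else environ
    if locale_encoding is None:
        locale_encoding = locale.getpreferredencoding(False)

    # Assumption: comma-separated list, '@locale' means the locale charset.
    value = env.get("G_FILENAME_ENCODING", "")
    names = [name.strip() for name in value.split(",") if name.strip()]
    if names:
        return [locale_encoding if name == "@locale" else name
                for name in names]

    # G_BROKEN_FILENAMES only needs to be set; its value is not inspected.
    if env.get("G_BROKEN_FILENAMES") is not None:
        return [locale_encoding]

    # Default case: utf-8 first, then the locale encoding.
    return ["utf-8", locale_encoding]
```

Passing environ and locale_encoding explicitly keeps the sketch easy to
try without touching the real environment.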

It would be interesting to test this approach (try utf-8, else use
the locale encoding) in
PyUnicode_DecodeFSDefault/PyUnicode_EncodeFSDefault and
_Py_char2wchar/_Py_wchar2char.
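
To make the idea concrete, here is a minimal Python sketch of the
decoding side only. It is not a patch for the C functions named above:
decode_fs is a hypothetical name, and using the surrogateescape error
handler for the fallback is my assumption, chosen because Python
already uses it for filenames.

```python
import locale
from typing import Optional


def decode_fs(raw: bytes, locale_encoding: Optional[str] = None) -> str:
    """Decode a filename: strict utf-8 first, then the locale encoding.

    Hypothetical helper, not an existing CPython API.
    """
    if locale_encoding is None:
        locale_encoding = locale.getpreferredencoding(False)
    try:
        # The strict utf-8 decoder raises on any invalid byte sequence.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: undecodable bytes are kept as lone surrogates.
        return raw.decode(locale_encoding, "surrogateescape")
```

For example, decode_fs(b"caf\xc3\xa9") takes the utf-8 path, while
decode_fs(b"caf\xe9", "latin-1") falls back to the locale encoding.
Encoding is not symmetric with this scheme: the text alone does not say
which encoding the bytes originally used, so the encoder needs its own
rule.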