
Author a.badger
Recipients Sworddragon, a.badger, bkabrda, larry, lemburg, loewis, ncoghlan, pitrou, r.david.murray, serhiy.storchaka, terry.reedy, vstinner
Date 2013-12-10.19:28:10
Message-id <1386703691.12.0.697867846929.issue19846@psf.upfronthosting.co.za>
Content
Looking at the glib code, it looks like the SO post is closer to the truth.  The API documentation for g_filename_to_utf8() is oversimplified to the point of confusion.  This section of the glib API documentation is closer to what the code is doing: https://developer.gnome.org/glib/stable/glib-Character-Set-Conversion.html#file-name-encodings

* When encoding matters, glib and gtk functions assume that the char* values you pass to them point to strings encoded in UTF-8.
* When a char* is not UTF-8, you are responsible for converting it to UTF-8 before passing it to the glib functions (if encoding matters).
* glib provides g_filename_to_utf8() for the special case of transforming filenames into the encoding that glib expects.  (Presumably because glib and gtk deal with non-UTF-8 unicode filenames more often than with the equivalent environment variables, command line switches, etc.)
* Contrary to the API docs for g_filename_to_utf8(), the function simply returns a copy of the byte string it was passed unless G_FILENAME_ENCODING or G_BROKEN_FILENAMES is set.  If either is set, then either the value of G_FILENAME_ENCODING or the encoding specified in the user's locale may be used to attempt to decode the filename.

@haypo, I'm pretty sure from reading the code for g_get_filename_charsets() that you have the conditionals reversed.  What I'm seeing is:

if G_FILENAME_ENCODING:
    charset = the first charset listed in G_FILENAME_ENCODING
    if charset == '@locale':
        charset = charset of user's locale
elif G_BROKEN_FILENAMES:
    charset = charset of user's locale
else:
    charset = 'UTF-8'
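
For anyone who wants to poke at this, here is the same decision written as a small runnable Python function. It is only a sketch of my reading of the conditionals above, not glib's actual API: the name filename_charset() and its two parameters are made up for illustration, and I'm assuming G_FILENAME_ENCODING is a comma-separated list and that an empty value counts as unset.

```python
import locale
import os


def filename_charset(environ, locale_charset):
    """Return the charset the pseudocode above would pick for filenames.

    Hypothetical helper, not part of glib. `environ` is any mapping of
    environment variables; `locale_charset` is the charset of the user's
    locale, passed in so the function stays easy to test.
    """
    encodings = environ.get("G_FILENAME_ENCODING")
    if encodings:  # assumption: an empty value is treated as unset
        # Assumption: the variable is a comma-separated list; only the
        # first entry decides the filename charset.
        first = encodings.split(",")[0]
        if first == "@locale":
            return locale_charset
        return first
    if "G_BROKEN_FILENAMES" in environ:
        return locale_charset
    return "UTF-8"


if __name__ == "__main__":
    # Ask the same question for the current process environment.
    print(filename_charset(os.environ, locale.getpreferredencoding(False)))
```

Note that G_FILENAME_ENCODING takes precedence here: if both variables are set, G_BROKEN_FILENAMES is never consulted.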