Message205670
I didn't understand Serhiy's "ls" example. I tried:
$ mkdir unicode
$ cd unicode
$ python3 -c 'open("ab\xe9.txt", "w").close()'
$ python3 -c 'open("euro\u20ac.txt", "w").close()'
$ ls
abé.txt euro€.txt
$ LANG=C ls
ab??.txt euro???.txt
Ah yes, I didn't remember that "ls" is aware of the locale encoding.
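The question marks above can be reproduced with a minimal sketch. It assumes the file names are stored on disk as UTF-8 bytes and that, in the C locale, each byte outside printable ASCII is shown as one "?". The helper name `c_locale_display` is hypothetical and only mimics the visible result; it is not how "ls" is actually implemented.

```python
def c_locale_display(name: str) -> str:
    """Mimic how a file name might be displayed in the C locale (illustrative only)."""
    # Assumption: the file system stores the name as UTF-8 bytes.
    raw = name.encode("utf-8")
    # Assumption: every byte outside printable ASCII is shown as a single "?".
    return "".join(chr(b) if 0x20 <= b < 0x7F else "?" for b in raw)


print(c_locale_display("ab\xe9.txt"))      # ab??.txt     (é is 2 bytes in UTF-8)
print(c_locale_display("euro\u20ac.txt"))  # euro???.txt  (€ is 3 bytes in UTF-8)
```

The number of question marks matches the number of UTF-8 bytes per character, not the number of characters, which is consistent with the "LANG=C ls" output shown above.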
printf() and wprintf() behave differently on unencodable/undecodable characters:
http://unicodebook.readthedocs.org/en/latest/programming_languages.html#printf-functions-family
Again, the issue is not specific to Python. So it's time to learn how to configure your locales correctly.
About the "interoperability" point I mentioned in my first message ("This encoding is the best choice for interoperability with other (python2 or non python) programs."): if you work around the annoying ASCII encoding by forcing the UTF-8 encoding, Python may produce data that is incompatible with other applications that follow POSIX and therefore use the ASCII encoding.
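To illustrate the interoperability concern, here is a small sketch, assuming a consumer that strictly decodes its input as ASCII. The function name `ascii_consumer_can_read` is hypothetical and stands in for any such program.

```python
def ascii_consumer_can_read(data: bytes) -> bool:
    """Return True if a strict ASCII-only consumer could decode the data."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


# Output produced by a writer that forces UTF-8 regardless of the locale.
forced_utf8 = "euro\u20ac".encode("utf-8")

print(ascii_consumer_can_read(b"euro"))       # True: pure ASCII is fine
print(ascii_consumer_can_read(forced_utf8))   # False: the bytes for the euro sign are not ASCII
```

Real programs may replace or drop the offending bytes instead of failing, so the exact symptom varies from one application to another.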
Date | User | Action | Args
2013-12-09 10:13:15 | vstinner | set | recipients: + vstinner, lemburg, loewis, terry.reedy, ncoghlan, pitrou, larry, a.badger, r.david.murray, deleted250130, serhiy.storchaka, bkabrda
2013-12-09 10:13:15 | vstinner | set | messageid: <1386583995.79.0.249540871674.issue19846@psf.upfronthosting.co.za>
2013-12-09 10:13:15 | vstinner | link | issue19846 messages
2013-12-09 10:13:15 | vstinner | create |