Message225858
Your clean() function loses information. If a filename consists almost entirely of undecodable characters, it will look like ����.txt, which is not very useful. I would prefer to escape the undecodable bytes. For example, Mac OS X (HFS+ filesystem) uses the %HH format: "\udc80" would be replaced with "%80".
This format is also used in URLs. For example, "a\xe9b.txt" (latin-1, whereas my locale encoding is UTF-8) is displayed as "a�b.txt" in Firefox (when listing a local directory), but Firefox uses the URL "file://.../a%E9b.txt" (uppercase hexadecimal).
In the Gnome file browser (Nautilus), "a\xe9b.txt" (latin-1, whereas my locale encoding is UTF-8) is displayed "a�b.txt (invalid encoding)".
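
To illustrate the %HH idea, here is a minimal sketch. It assumes the filename was decoded with the surrogateescape error handler, so each undecodable byte 0x80..0xFF appears as a lone surrogate U+DC80..U+DCFF. The name escape_undecodable() is hypothetical and is not part of any existing patch.

```python
def escape_undecodable(filename):
    """Replace each surrogate-escaped byte with %HH (uppercase hex).

    Hypothetical helper: assumes `filename` was decoded with
    errors='surrogateescape', which maps an undecodable byte 0xNN
    to the lone surrogate U+DCNN.
    """
    parts = []
    for ch in filename:
        code = ord(ch)
        if 0xDC80 <= code <= 0xDCFF:
            # Recover the original byte value and format it as %HH.
            parts.append('%%%02X' % (code - 0xDC00))
        else:
            parts.append(ch)
    return ''.join(parts)


# A latin-1 encoded name decoded as UTF-8 with surrogateescape.
name = b'a\xe9b.txt'.decode('utf-8', 'surrogateescape')
print(escape_undecodable(name))
```

Unlike replacing with U+FFFD, this keeps the byte value visible. One caveat: a literal "%" in the filename is left untouched here, so the output alone cannot always be mapped back to the original bytes.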
Date | User | Action | Args
2014-08-25 01:16:14 | vstinner | set | recipients: + vstinner, ncoghlan, pitrou, ezio.melotti, Arfrever, r.david.murray, serhiy.storchaka
2014-08-25 01:16:14 | vstinner | set | messageid: <1408929374.02.0.294225577905.issue18814@psf.upfronthosting.co.za>
2014-08-25 01:16:14 | vstinner | link | issue18814 messages
2014-08-25 01:16:13 | vstinner | create |