
Author loewis
Recipients HWJ, amaury.forgeotdarc, benjamin.peterson, dlitz, gvanrossum, loewis, pitrou, vstinner, zegreek
Date 2008-09-28.21:31:28
Message-id <1222637489.65.0.737924576623.issue3187@psf.upfronthosting.co.za>
In-reply-to
Content
I'd like to propose yet another approach: make sure that conversion
according to the file system encoding always succeeds. If an
unconvertible byte is detected, map it to a private-use character.
To reduce the chance of conflict with other people's private-use
characters, we can use some of the plane 15 private-use characters, e.g.
map byte 0xPQ to U+F30PQ (in two-byte Unicode mode, i.e. a narrow build,
this would result in a surrogate pair).
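
The mapping described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not an implementation from this issue: the handler name "pua-escape" and the helpers decode_filename and encode_filename are hypothetical. The reverse direction is written out by hand because an encoding such as UTF-8 can encode plane 15 characters directly, so an encode error handler would never be called for them.

```python
import codecs
import re

PUA_BASE = 0xF3000  # byte 0xPQ maps to U+F30PQ, as proposed above


def _escape_undecodable(exc):
    # Hypothetical error handler: replace each undecodable byte with the
    # corresponding plane 15 private-use character.
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join(chr(PUA_BASE + b) for b in bad), exc.end


# "pua-escape" is a made-up handler name used only in this sketch.
codecs.register_error("pua-escape", _escape_undecodable)

# Matches exactly the 256 characters used as byte escapes.
_ESCAPED = re.compile("([\U000F3000-\U000F30FF])")


def decode_filename(raw, encoding="utf-8"):
    """Decode bytes to str; with this handler, invalid bytes never raise."""
    return raw.decode(encoding, "pua-escape")


def encode_filename(name, encoding="utf-8"):
    """Reverse decode_filename, turning escape characters back into bytes."""
    out = bytearray()
    for i, part in enumerate(_ESCAPED.split(name)):
        if i % 2:  # odd positions hold the captured escape characters
            out.append(ord(part) - PUA_BASE)
        else:
            out += part.encode(encoding)
    return bytes(out)


raw = b"caf\xe9.txt"          # Latin-1 byte, invalid as UTF-8
name = decode_filename(raw)   # 0xE9 becomes U+F30E9
restored = encode_filename(name)
```

Note that a file name that genuinely contains one of these private-use characters would be turned into a raw byte by encode_filename; that is the conflict the choice of plane 15 tries to make unlikely.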

This would make all file names accessible to all text processing
(including glob and friends); UI display would typically either report
an encoding error, or arrange for some replacement glyph to be shown.
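
As a small illustration of the point about text processing and display, here is a sketch using a file name that already contains one escaped byte. The name itself is made up for the example, and the ASCII-only display layer is an assumption standing in for whatever a UI would actually do.

```python
import fnmatch

# Hypothetical name: byte 0xE9 could not be decoded and was mapped to U+F30E9.
name = "caf\U000F30E9.txt"

# Ordinary text processing works, since the name is a normal str.
matches = fnmatch.fnmatch(name, "caf*.txt")
stem, ext = name.rsplit(".", 1)

# A display layer limited to ASCII can substitute a replacement character...
shown = name.encode("ascii", "replace").decode("ascii")

# ...or report an encoding error instead.
try:
    name.encode("ascii")
    display_error = None
except UnicodeEncodeError as exc:
    display_error = exc
```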

Several variations of this approach are possible, in case there is an
objection to a specific detail.
History
Date                 User    Action  Args
2008-09-28 21:31:29  loewis  set     recipients: + loewis, gvanrossum, amaury.forgeotdarc, pitrou, vstinner, benjamin.peterson, HWJ, dlitz, zegreek
2008-09-28 21:31:29  loewis  set     messageid: <1222637489.65.0.737924576623.issue3187@psf.upfronthosting.co.za>
2008-09-28 21:31:29  loewis  link    issue3187 messages
2008-09-28 21:31:28  loewis  create