Author vstinner
Recipients Arfrever, amaury.forgeotdarc, brett.cannon, lemburg, loewis, pitrou, vstinner
Date 2010-09-29.10:48:40
Message-id <1285757323.39.0.76994449268.issue9630@psf.upfronthosting.co.za>
In-reply-to
Content
Forget my previous message; I left out some important points.

> So the only reason why you have to go through
> all those hoops is to
>
> * allow the complete set of Python supported encoding
>   names for the PYTHONFSENCODING
>
> * make sure that the Py_FilesystemDefaultEncoding is set
>   to the actual name of the codec as used by the system

Not only. As I wrote in my first message (msg114191), there are two
other good reasons to keep the current code but redecode filenames:

 * Encoding aliases: the locale encoding is not always written as the
   official Python encoding name, e.g. utf8 vs UTF-8, iso8859-1 vs
   latin_1, etc. We have to be able to load Lib/encodings/aliases.py
   to get the Python codec.

 * Codecs implemented in Python: only ascii, latin1, utf8 and mbcs
   codecs are builtin. All other encodings are implemented in Python. If
   your filesystem encoding is ShiftJIS, you have to load
   Lib/encodings/shift_jis.py to load the codec.
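
A minimal sketch of both points, using only the public codecs API. The
helper name canonical_codec_name is hypothetical (it is not part of
Python), and the sketch runs in a fully started interpreter, so it only
shows the lookups that have to work, not the startup ordering problem:

```python
import codecs
import importlib


def canonical_codec_name(name):
    """Return the name Python's codec registry reports for an encoding name or alias."""
    # codecs.lookup() normalizes the spelling and resolves aliases. For
    # anything that is not a builtin codec, this goes through the
    # "encodings" package, which is written in Python.
    return codecs.lookup(name).name


# Different spellings of the same encoding resolve to the same codec.
for alias in ("utf8", "UTF-8", "iso8859-1", "latin_1", "ShiftJIS"):
    print(alias, "->", canonical_codec_name(alias))

# The Shift JIS codec is exposed through a regular Python module, so it
# can only be used once Python modules can be imported.
shift_jis_module = importlib.import_module("encodings.shift_jis")
print(shift_jis_module.__name__)
```

Both lookups depend on the import machinery, which itself has to decode
filenames.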

For these two reasons, we have to import Python modules before being
able to set the filesystem encoding. So we have to redecode the
filenames that were decoded before the filesystem encoding was set.
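
To illustrate what "redecode" means here, a rough sketch in pure Python.
It is not the actual implementation: the function name redecode, the
example path, and the choice of UTF-8 as the provisional encoding are
assumptions made for illustration. It also assumes the surrogateescape
error handler (PEP 383), so that undecodable bytes survive the round
trip:

```python
def redecode(filename, old_encoding, new_encoding):
    """Undo a provisional decoding and decode again with the real encoding."""
    # surrogateescape keeps undecodable bytes as lone surrogates, so
    # encoding with the old encoding gives back the original bytes.
    raw = filename.encode(old_encoding, "surrogateescape")
    return raw.decode(new_encoding, "surrogateescape")


# Hypothetical path as stored on a Shift JIS filesystem.
raw_path = "/home/ユーザー/lib".encode("shift_jis")

# Early in startup only builtin codecs are usable, so the bytes are
# decoded provisionally (UTF-8 is an assumption for this example).
provisional = raw_path.decode("utf-8", "surrogateescape")

# Once encodings/shift_jis.py can be imported, fix the filename.
final = redecode(provisional, "utf-8", "shift_jis")
print(final)
```

Every filename decoded before the real codec became available (module
paths, code object filenames, and so on) would need the same treatment.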

> the redecoding of the filenames is fragile

We can set up a buildbot installed in a non-ASCII path. Antoine had such a
buildbot, which already helped to find many bugs related to non-ASCII paths.

--

We could choose to support only ascii, latin1, utf8 and mbcs for the
filesystem encoding, but users will complain that we break compatibility
with old systems. Python 3 already "breaks" the language; I don't think
that it is a good idea to become incompatible with old systems as well
just to simplify (too much) the code.

--

Another solution would be to unload all modules, clear all caches,
delete all code objects, etc. after setting the filesystem encoding. But
I think that it is inefficient and nobody wants a slower Python startup.