Message 117605 - Python tracker

Author	lemburg
Recipients	Arfrever, amaury.forgeotdarc, brett.cannon, lemburg, loewis, pitrou, vstinner
Date	2010-09-29.11:45:12
SpamBayes Score	1.2273516e-13
Marked as misclassified	No
Message-id	<4CA326C4.90601@egenix.com>
In-reply-to	<1285757323.39.0.76994449268.issue9630@psf.upfronthosting.co.za>

Content

STINNER Victor wrote:
> 
> STINNER Victor <victor.stinner@haypocalc.com> added the comment:
> 
> Forget my previous message, I forgot important points.
> 
>> So the only reason why you have to go through
>> all those hoops is to
>>
>> * allow the complete set of Python supported encoding
>>   names for the PYTHONFSENCODING
>>
>> * make sure that the Py_FilesystemDefaultEncoding is set
>>   to the actual name of the codec as used by the system
> 
> Not only. As I wrote in my first message (msg114191), there are two
> other good reasons to keep the current code but redecode filenames:
> 
>  * Encoding aliases: locale encoding is not always written as the
>    official Python encoding name. Eg. utf8 vs UTF-8, iso8859-1 vs
>    latin_1, etc. We have to be able to load Lib/encodings/aliases.py to
>    get the Python codec.
> 
>  * Codecs implemented in Python: only ascii, latin1, utf8 and mbcs
>    codecs are builtin. All other encodings are implemented in Python. If
>    your filesystem encoding is ShiftJIS, you have to load
>    Lib/encodings/shift_jis.py to load the codec.
> 
> For these two reasons, we have to import Python modules before being
> able to set the filesystem encoding. So we have to redecode filenames
> after setting the filesystem encodings.
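
To illustrate the alias point in the quoted message, here is a minimal sketch using the standard `codecs.lookup()` call, which resolves a spelling such as utf8 or latin_1 to the name the codec registers under. The helper name `canonical_name` is hypothetical and only serves the example; this is not the interpreter's startup code.

```python
import codecs

def canonical_name(alias):
    """Return the registered codec name for an encoding alias.

    Hypothetical helper for illustration only. codecs.lookup() raises
    LookupError if no codec is registered for the given name.
    """
    return codecs.lookup(alias).name

# Different spellings of the same encoding resolve to one codec name.
for alias in ("utf8", "UTF-8", "iso8859-1", "latin_1"):
    print(alias, "->", canonical_name(alias))
```

Resolving most names this way goes through the `encodings` package, which is the import dependency the quoted message describes.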

No, that's not needed! Please see my earlier message: you can still
do all this at a later time during startup and double-check that
the encoding is indeed valid.

The main point is that you don't need to apply all those checks
before setting the file system encoding in the interpreter.
Early on you just assume that the env vars are set up correctly
and head on into starting up the interpreter.

If the decoding fails during startup due to a wrong encoding of
file or path names, the interpreter will signal this. If everything
imports fine, you can still double-check at the point where the
file system encoding is currently set, e.g. to detect cases where
the encoding was set to ascii, but in reality the interpreter was
just lucky and the file system encoding should be utf-8.
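
The deferred double-check described above could look roughly like the following. This is only a sketch under stated assumptions: the function `late_check`, its arguments, and its status strings are all hypothetical, and it does not reflect how the interpreter actually performs startup. It assumes the encoding name taken early from the environment is passed in as `assumed`, the locale's encoding as `locale_encoding` (taken to be a valid codec name), and some already-seen file names as raw bytes.

```python
import codecs

def late_check(assumed, locale_encoding, raw_names):
    """Validate an early, unchecked file system encoding after startup.

    Hypothetical helper: returns a (status, canonical_name) tuple.
    """
    # 1. Is the assumed name a real codec at all?
    try:
        canonical = codecs.lookup(assumed).name
    except LookupError:
        return ("invalid", None)

    # 2. Do the file names seen so far actually decode with it?
    for raw in raw_names:
        try:
            raw.decode(canonical)
        except UnicodeDecodeError:
            return ("undecodable", canonical)

    # 3. Everything decoded, but perhaps only by luck (e.g. ascii was
    #    assumed while the locale says utf-8).
    if canonical != codecs.lookup(locale_encoding).name:
        return ("mismatch", canonical)

    return ("ok", canonical)
```

For example, `late_check("ascii", "utf-8", [b"/usr/lib/python"])` reports a mismatch even though every name decoded, which is the "just lucky" case mentioned above.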

History
Date	User	Action	Args
2010-09-29 11:45:15	lemburg	set	recipients: + lemburg, loewis, brett.cannon, amaury.forgeotdarc, pitrou, vstinner, Arfrever
2010-09-29 11:45:13	lemburg	link	issue9630 messages
2010-09-29 11:45:13	lemburg	create