Message 404290 - Python tracker

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

Author	lemburg
Recipients	Caolán.McNamara, ezio.melotti, jwilk, lemburg, loewis, serhiy.storchaka, taleinat, vstinner
Date	2021-10-19.11:20:36
Message-id	<725d9a1f-ea72-4cde-0e01-41922a168f27@egenix.com>
In-reply-to	<1634633089.27.0.996674044666.issue19459@roundup.psfhosted.org>

Content
On 19.10.2021 10:44, Serhiy Storchaka wrote:
> 
> Possible solutions (they can be combined):
> 
> 1. Add support for the GEORGIAN-PS charset and all other encodings used in libc (issue22679). The problem is that it is difficult to get the official information about these encodings.

As with all encodings we add: there has to be a real need to support
them natively in Python (as opposed to installing codecs via PyPI)
and we need a definite source for the encoding, e.g. a standards
document from an official body.

IMO, we should not really add more encodings to the stdlib, but instead
point people to e.g. the iconv package:

https://pypi.org/project/python-iconv/
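
For illustration, here is a rough sketch of how a third-party package could
provide a codec without any change to the stdlib, using codecs.register().
The codec name "x-demo-georgian" and its mapping table are made up for this
example. This is not the real GEORGIAN-PS table, and it is not how
python-iconv is implemented.

```python
import codecs

# Hypothetical toy table, NOT the real GEORGIAN-PS mapping:
# bytes 0x00-0x7F decode as ASCII, byte 0x80 decodes to U+10D0
# (GEORGIAN LETTER AN), and all remaining bytes are undefined.
# U+FFFE is the marker the charmap functions treat as "undefined".
_DECODING_TABLE = (
    "".join(chr(i) for i in range(0x80)) + "\u10d0" + "\ufffe" * 0x7F
)
_ENCODING_TABLE = codecs.charmap_build(_DECODING_TABLE)


def _encode(text, errors="strict"):
    return codecs.charmap_encode(text, errors, _ENCODING_TABLE)


def _decode(data, errors="strict"):
    return codecs.charmap_decode(data, errors, _DECODING_TABLE)


def _search(name):
    # The codec registry normalizes names before calling search
    # functions; accept both the hyphen and the underscore spelling.
    if name.replace("-", "_") == "x_demo_georgian":
        return codecs.CodecInfo(
            name="x-demo-georgian", encode=_encode, decode=_decode
        )
    return None


# Registration happens at import time of the (hypothetical) package.
codecs.register(_search)
```

After importing such a module, b"a\x80".decode("x-demo-georgian") works like
any built-in codec. The catch is that the import has to happen before the
codec is needed, which is exactly the problem during interpreter startup.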

Perhaps we ought to make it easier for such packages to provide
additional codecs even during the startup phase, e.g. via a special
env var which points Python to a list of codec packages to load
prior to initializing the I/O encoding... not sure whether this is
possible, though.
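
To make the idea a bit more concrete, a minimal sketch in pure Python
follows. The env var name PYTHON_EXTRA_CODECS and the helper
load_codec_packages() are hypothetical, nothing like this exists in CPython,
and the real startup code would have to do this in C before the I/O
encoding is set up. The sketch only assumes that each listed module
registers its codecs as a side effect of being imported.

```python
import importlib
import os
import sys

# Hypothetical variable name; CPython does not define it.
ENV_VAR = "PYTHON_EXTRA_CODECS"


def load_codec_packages(environ=os.environ):
    """Import every module listed in ENV_VAR (comma-separated).

    Each module is assumed to call codecs.register() on import.
    Returns the list of module names that were imported successfully.
    """
    loaded = []
    for raw_name in environ.get(ENV_VAR, "").split(","):
        name = raw_name.strip()
        if not name:
            continue
        try:
            importlib.import_module(name)
        except ImportError as exc:
            # A broken entry should not prevent the interpreter from starting.
            print(f"warning: cannot load codec package {name!r}: {exc}",
                  file=sys.stderr)
            continue
        loaded.append(name)
    return loaded
```

Whether imports of this kind are safe that early in the startup sequence is
the open question mentioned above.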

> 2. Falls back to utf-8 or ascii+surrogateescape in case of unsupported locale encoding. But typos can slip unnoticed.

I think this would be a more general solution to such cases, provided
the startup logic issues a visible warning about the fallback.
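
A sketch of that fallback logic, written in Python for readability: the
function name resolve_locale_encoding() is hypothetical, and the actual
startup code lives in C, so this only shows the intended behavior (look up
the codec, fall back with a warning if it is unknown).

```python
import codecs
import warnings


def resolve_locale_encoding(name, fallback="utf-8"):
    """Return a usable codec name for the locale encoding `name`.

    If Python has no codec for `name`, emit a visible warning and
    return `fallback` instead of failing.
    """
    try:
        return codecs.lookup(name).name
    except LookupError:
        warnings.warn(
            f"unsupported locale encoding {name!r}; "
            f"falling back to {fallback!r}",
            RuntimeWarning,
            stacklevel=2,
        )
        return fallback
```

It could be fed with the result of locale.getpreferredencoding(False). The
warning is what keeps a typo in the locale settings from slipping through
unnoticed.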

History
Date	User	Action	Args
2021-10-19 11:20:36	lemburg	set	recipients: + lemburg, loewis, vstinner, taleinat, jwilk, ezio.melotti, serhiy.storchaka, Caolán.McNamara
2021-10-19 11:20:36	lemburg	link	issue19459 messages
2021-10-19 11:20:36	lemburg	create