Author lemburg
Recipients Caolán.McNamara, ezio.melotti, jwilk, lemburg, loewis, serhiy.storchaka, taleinat, vstinner
Date 2021-10-19.11:20:36
Message-id <725d9a1f-ea72-4cde-0e01-41922a168f27@egenix.com>
In-reply-to <1634633089.27.0.996674044666.issue19459@roundup.psfhosted.org>
Content
On 19.10.2021 10:44, Serhiy Storchaka wrote:
> 
> Possible solutions (they can be combined):
> 
> 1. Add support for the GEORGIAN-PS charset and all other encodings used in libc (issue22679). The problem is that it is difficult to get the official information about these encodings.

As with all encodings we add: there has to be a real need to support
them natively in Python (as opposed to installing codecs via PyPI)
and we need a definite source for the encoding, e.g. a standards
document from an official body.

IMO, we should not really add more encodings to the stdlib, but instead
point people to e.g. the iconv package:

https://pypi.org/project/python-iconv/

Perhaps we ought to make it easier for such packages to provide
additional codecs even during the startup phase, e.g. via a special
env var which points Python to a list of codec packages to load
prior to initializing the I/O encoding... not sure whether this is
possible, though.
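
A minimal sketch of how such a hook might look, written as ordinary
Python rather than as startup code. The variable name
PYTHON_EXTRA_CODECS and the helper load_codec_packages() are
hypothetical, nothing like them exists in CPython, and the sketch assumes
each listed module registers its codec search function via
codecs.register() when imported:

```python
import importlib
import os

# Hypothetical variable name; CPython does not define or read it.
ENV_VAR = "PYTHON_EXTRA_CODECS"


def load_codec_packages(environ=os.environ):
    """Import every module named in ENV_VAR and return those that loaded.

    Assumption: each module calls codecs.register() as an import side effect.
    """
    loaded = []
    for raw_name in environ.get(ENV_VAR, "").split(","):
        name = raw_name.strip()
        if not name:
            continue
        try:
            importlib.import_module(name)
        except ImportError:
            # A missing codec package must not prevent the interpreter from starting.
            continue
        loaded.append(name)
    return loaded
```

The open question raised above remains: the real startup sequence would
have to run something like this before the I/O encoding is initialized,
and the sketch does not show whether that is feasible.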

> 2. Fall back to utf-8 or ascii+surrogateescape in case of an unsupported locale encoding. But typos can slip through unnoticed.

I think this would be a more general solution to such cases, provided
the startup logic issues a visible warning about the fallback.
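
A rough illustration of the fallback-with-warning idea, again at the
Python level only. The function name resolve_encoding() is hypothetical
and this is not how the interpreter's startup code is written; it only
shows the intended behaviour using codecs.lookup() and warnings.warn():

```python
import codecs
import warnings


def resolve_encoding(name, fallback="utf-8"):
    """Return the canonical codec name for `name`, or `fallback` with a warning."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        # Make the fallback visible so that a typo in the locale does not go unnoticed.
        warnings.warn(
            f"unsupported locale encoding {name!r}; falling back to {fallback!r}",
            RuntimeWarning,
            stacklevel=2,
        )
        return fallback
```

A caller could pass the name reported by the locale, for example
resolve_encoding(locale.getpreferredencoding(False)).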
History
Date User Action Args
2021-10-19 11:20:36  lemburg  set     recipients: + lemburg, loewis, vstinner, taleinat, jwilk, ezio.melotti, serhiy.storchaka, Caolán.McNamara
2021-10-19 11:20:36  lemburg  link    issue19459 messages
2021-10-19 11:20:36  lemburg  create