
Author vstinner
Recipients lemburg, vstinner
Date 2021-03-19.11:11:34
Message-id <1616152294.5.0.168060102826.issue43552@roundup.psfhosted.org>
In-reply-to
Content
> Martin later added locale.getpreferredencoding(), which tries to
> determine the encoding in a different, new way, based on
> nl_langinfo(CODESET). As you mentioned, this intention was broken
> on several platforms by forcing UTF-8 as output.

When I designed and implemented PEP 540 (Python UTF-8 Mode), I tried to leave getpreferredencoding() unchanged. The problem was that I quickly got mojibake, because too many functions call getpreferredencoding(False):

* open() and _pyio.open() -- in Python 3.10, open() now calls the C _Py_GetLocaleEncoding() function instead, which fixes issues during Python shutdown and also avoids issues at startup.
* Many gettext functions
* cgi, to decode the query string from the QUERY_STRING env var or sys.argv[1]
* xml.etree.ElementTree.write(encoding="unicode") in some cases

The Python UTF-8 Mode ignores the locale *on purpose*. But I agree that it's surprising and can lead to confusion. That's what I'm trying to fix here :-)
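
To see the behaviour described above, here is a minimal sketch that runs a child interpreter with UTF-8 Mode switched on or off and reports what getpreferredencoding(False) returns in each case. The helper name preferred_encoding() and the SNIPPET constant are made up for this illustration; the sketch assumes Python 3.7 or newer, where the -X utf8 option exists.

```python
import subprocess
import sys

# Code executed in the child interpreter: print the encoding that
# open(), gettext, cgi, etc. would pick up by default.
SNIPPET = "import locale; print(locale.getpreferredencoding(False))"


def preferred_encoding(utf8_mode: bool) -> str:
    """Return locale.getpreferredencoding(False) as seen by a child
    interpreter started with UTF-8 Mode explicitly enabled or disabled.

    Hypothetical helper, for illustration only.
    """
    flag = "utf8" if utf8_mode else "utf8=0"
    result = subprocess.run(
        [sys.executable, "-X", flag, "-c", SNIPPET],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # With UTF-8 Mode on, the locale is ignored and UTF-8 is reported.
    print("UTF-8 Mode on: ", preferred_encoding(True))
    # With UTF-8 Mode off, the result depends on the current locale.
    print("UTF-8 Mode off:", preferred_encoding(False))
```

With UTF-8 Mode enabled, the reported encoding is UTF-8 whatever the locale is; with it disabled, the value depends on the platform and the LC_CTYPE locale, so the two lines can differ.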