Author eryksun
Recipients docs@python, eryksun, smallbigcake
Date 2021-02-06.06:39:41
Message-id <1612593581.97.0.493609874304.issue43140@roundup.psfhosted.org>
In-reply-to
Content
On most platforms, unless UTF-8 mode is enabled, locale.getpreferredencoding(False) returns the LC_CTYPE encoding of the current locale. For example, in Linux:

    >>> locale.setlocale(locale.LC_CTYPE, 'en_US.UTF-8')
    'en_US.UTF-8'
    >>> locale.getpreferredencoding(False)
    'UTF-8'
    >>> locale.setlocale(locale.LC_CTYPE, 'en_US.iso-88591')
    'en_US.iso-88591'
    >>> locale.getpreferredencoding(False)
    'ISO-8859-1'

If the designers of the io module had wanted the preferred encoding to always be the default encoding from setlocale(LC_CTYPE, ""), they would have used and documented locale.getpreferredencoding(True).

---

In Windows, locale.getpreferredencoding(False) always returns the default encoding from locale.getdefaultlocale(), which is the process active (ANSI) code page. Changing it to track the LC_CTYPE locale would be convenient for applications and scripts running in Windows 10, for which the CRT's POSIX locale implementation has supported UTF-8 since spring of 2018.

The base behavior can't be changed at this point, but a -X option and/or environment variable could enable locale.getpreferredencoding(False) -- i.e. locale._get_locale_encoding() -- to return the current LC_CTYPE encoding in Windows, as it does in POSIX.
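Until something like that exists, an application could approximate it on its own. The following is only a rough sketch, not a proposed implementation: `encoding_from_crt_locale_name` and `current_lc_ctype_encoding` are hypothetical names, and the parsing assumes the CRT reports the LC_CTYPE locale as "name.codeset" (for example "English_United States.1252" or "en_US.UTF-8"). Falling back to the ANSI code page when there is no usable codeset (e.g. the "C" locale) is my assumption, and Python may not ship a codec for every code page number.

```python
import locale
import sys


def encoding_from_crt_locale_name(name, fallback):
    """Hypothetical helper: guess a codec name from a CRT locale string."""
    # "English_United States.1252" -> ("English_United States", ".", "1252")
    _, dot, codeset = name.rpartition(".")
    if not dot:
        # No codeset part, e.g. "C": assume the fallback encoding.
        return fallback
    if codeset.lower().replace("-", "") == "utf8":
        return "utf-8"
    if codeset.isdigit():
        # A Windows code page number; Python names these codecs "cpNNN".
        return "cp" + codeset
    # Anything else (e.g. a symbolic codeset such as "ACP"): assume the fallback.
    return fallback


def current_lc_ctype_encoding():
    """Sketch: encoding of the current LC_CTYPE locale on either platform."""
    preferred = locale.getpreferredencoding(False)
    if sys.platform != "win32":
        # In POSIX this already tracks the current LC_CTYPE locale.
        return preferred
    # Calling setlocale() without a locale argument only queries the setting.
    name = locale.setlocale(locale.LC_CTYPE)
    return encoding_from_crt_locale_name(name, preferred)


if __name__ == "__main__":
    print(current_lc_ctype_encoding())
```

A script could then pass the result explicitly, e.g. `open(path, encoding=current_lc_ctype_encoding())`, instead of relying on the default.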
History
Date                 User     Action  Args
2021-02-06 06:39:42  eryksun  set     recipients: + eryksun, docs@python, smallbigcake
2021-02-06 06:39:41  eryksun  set     messageid: <1612593581.97.0.493609874304.issue43140@roundup.psfhosted.org>
2021-02-06 06:39:41  eryksun  link    issue43140 messages
2021-02-06 06:39:41  eryksun  create