classification
Title: [docs] IO > Text Encoding info outdated
Type: behavior
Stage:
Components: Documentation
Versions: Python 3.10
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: docs@python, eryksun, gilbertson.david, methane
Priority: normal
Keywords:

Created on 2021-12-21 04:42 by gilbertson.david, last changed 2022-04-11 14:59 by admin.

Messages (3)
msg408983 - Author: David Gilbertson (gilbertson.david) Date: 2021-12-21 04:42
On this page: https://docs.python.org/3/library/io.html#text-encoding it says "there is no concrete plan as of yet, Python may change the default text file encoding to UTF-8 in the future".

On this page: https://docs.python.org/3/library/os.html#utf8-mode it says that from 3.7 onwards UTF-8 will be selected by default.

Does that mean that the text in the first section is now outdated, as it was addressed by PEP 540?

I'm a newbie, so apologies if I'm missing something obvious or filing this in the wrong spot.
msg408985 - Author: Inada Naoki (methane) (Python committer) Date: 2021-12-21 06:27
UTF-8 mode is not enabled by default, so the locale encoding is still the default text encoding.
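
To see which case applies on a given machine, something like the following may help. It is only a minimal sketch: the helper name describe_default_text_encoding is made up for illustration, and it assumes Python 3.7 or later, where sys.flags.utf8_mode exists.

```python
import locale
import sys


def describe_default_text_encoding():
    """Return (utf8_mode_enabled, default_encoding) for this interpreter.

    Hypothetical helper, not part of the standard library.
    """
    utf8_mode = bool(sys.flags.utf8_mode)
    # locale.getpreferredencoding(False) reports the locale encoding,
    # or UTF-8 when UTF-8 mode is enabled.
    default_encoding = locale.getpreferredencoding(False)
    return utf8_mode, default_encoding


utf8_mode, default_encoding = describe_default_text_encoding()
print(f"UTF-8 mode enabled: {utf8_mode}")
print(f"default text encoding: {default_encoding}")
```

The printed encoding depends on the platform and locale settings, so no particular output should be expected.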
msg408993 - Author: Eryk Sun (eryksun) (Python triager) Date: 2021-12-21 09:07
The rare circumstance in which UTF-8 mode gets enabled automatically is described in the following paragraph [1]:

    If the PYTHONUTF8 environment variable is not set at all, then the
    interpreter defaults to using the current locale settings, unless the
    current locale is identified as a legacy ASCII-based locale (as
    described for PYTHONCOERCECLOCALE), and locale coercion is either 
    disabled or fails. In such legacy locales, the interpreter will
    default to enabling UTF-8 mode unless explicitly instructed not to do
    so.
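
The effect of setting PYTHONUTF8 explicitly, as opposed to the automatic case quoted above, can be checked with a rough sketch like this one. The helper name child_utf8_mode is hypothetical, and the sketch assumes sys.executable points at a usable interpreter that can be launched as a child process.

```python
import os
import subprocess
import sys


def child_utf8_mode(pythonutf8):
    """Start a child interpreter with PYTHONUTF8 set to the given value
    and return its sys.flags.utf8_mode as an int.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ)  # keep the rest of the environment intact
    env["PYTHONUTF8"] = pythonutf8
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.flags.utf8_mode)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


print("PYTHONUTF8=1 ->", child_utf8_mode("1"))
print("PYTHONUTF8=0 ->", child_utf8_mode("0"))
```

This only covers the explicit settings. Whether UTF-8 mode is enabled when the variable is unset depends on the locale, as the quoted paragraph describes.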

Note that UTF-8 mode is never enabled automatically in Windows. In contrast to POSIX, the locale encoding in Windows is unrelated to the current LC_CTYPE locale. Instead, the locale encoding gets set to the process code page, which is based on the system locale by default and never changes while a process is running. The system locale may be incompatible with the current LC_CTYPE locale, Windows user locale, and preferred UI language (e.g. for text resources such as error messages), so try to explicitly use UTF-8 for text files whenever possible.
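
As a small illustration of the last point, passing encoding="utf-8" explicitly makes the result independent of the locale encoding and of UTF-8 mode. This is a minimal sketch; the file name and sample text are arbitrary.

```python
import os
import tempfile

# Sample text with non-ASCII characters, written with escapes so the
# snippet itself stays ASCII-only.
text = "na\u00efve caf\u00e9 \u2615"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")

    # Explicit encoding: does not depend on the process code page or locale.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    with open(path, "r", encoding="utf-8") as f:
        round_tripped = f.read()

    # Read the raw bytes to confirm what was actually stored on disk.
    with open(path, "rb") as f:
        raw = f.read()

print(round_tripped == text)
print(raw == text.encode("utf-8"))
```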

---
[1] https://docs.python.org/3/library/os.html#utf8-mode
History
Date                 User              Action  Args
2022-04-11 14:59:53  admin             set     github: 90301
2021-12-21 09:07:08  eryksun           set     nosy: + eryksun; messages: + msg408993
2021-12-21 06:27:47  methane           set     nosy: + methane; messages: + msg408985
2021-12-21 04:42:17  gilbertson.david  create