Author vstinner
Recipients methane, vstinner
Date 2021-03-19.14:31:58
Message-id <1616164318.88.0.414287905715.issue43510@roundup.psfhosted.org>
In-reply-to
Content
I see different cases when open() is called with no encoding argument:

(A) User wants to use UTF-8: add encoding="utf-8"

(B) Windows user wants to use the ANSI code page of their computer, for a local file not intended to be shared with other computers: add encoding="mbcs". This makes the code specific to Windows (the "mbcs" alias doesn't exist on Unix).

(C) User wants to use the locale encoding and is fine with the UTF-8 Mode: add encoding=locale.getpreferredencoding(False)

(D) Unix user wants to use the locale encoding but not the UTF-8 Mode: add encoding=get_current_locale_encoding() (function proposed in bpo-43552) or encoding=locale.nl_langinfo(locale.CODESET) (should work on any Python version). I don't know if nl_langinfo(CODESET) is available on Windows.

(E) User has no idea what they are doing and doesn't understand anything about Unicode: please trust us and explicitly specify UTF-8 :-)
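
To make cases (A) to (D) concrete, here is a rough sketch of what each explicit open() call could look like. The scratch file path and the variable names (path, text_a, enc_c, enc_d) are made up for illustration, and the platform checks are my assumption about where each spelling is usable; get_current_locale_encoding() is left out because it is only a proposal.

```python
import locale
import os
import sys
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# (A) UTF-8, regardless of platform, locale, or UTF-8 Mode.
with open(path, "w", encoding="utf-8") as f:
    f.write("caf\u00e9\n")
with open(path, encoding="utf-8") as f:
    text_a = f.read()

# (B) Windows ANSI code page; the "mbcs" alias only exists on Windows.
if sys.platform == "win32":
    with open(path, "w", encoding="mbcs") as f:
        f.write("hello\n")

# (C) Locale encoding, or UTF-8 when the UTF-8 Mode is enabled.
enc_c = locale.getpreferredencoding(False)
with open(path, "w", encoding=enc_c) as f:
    f.write("hello\n")

# (D) Locale encoding, ignoring the UTF-8 Mode.
# The hasattr() guard is there because locale.nl_langinfo() is not
# provided on every platform.
if hasattr(locale, "nl_langinfo"):
    enc_d = locale.nl_langinfo(locale.CODESET)
    with open(path, "w", encoding=enc_d) as f:
        f.write("hello\n")
```

Only (A) produces the same bytes on every machine; the other spellings deliberately depend on the platform or the locale.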

Apart from the encoding="utf-8" case, I understand that there are two main complex cases:

(1) "UTF-8" in the UTF-8 Mode, or the locale encoding
(2) Always use the locale encoding, ignore the UTF-8 Mode
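
A minimal sketch of how these two policies could be written as helpers. The function names are hypothetical. locale.getencoding() was added in Python 3.11, after this message was written, so it is guarded with hasattr(); the final "mbcs" fallback assumes that a platform lacking both functions is Windows.

```python
import locale
import sys


def utf8_mode_or_locale_encoding() -> str:
    """Case (1): "utf-8" in the UTF-8 Mode, otherwise the locale encoding."""
    if sys.flags.utf8_mode:
        return "utf-8"
    return locale.getpreferredencoding(False)


def locale_encoding_ignoring_utf8_mode() -> str:
    """Case (2): always the locale encoding, even in the UTF-8 Mode."""
    if hasattr(locale, "getencoding"):
        # Python 3.11+: like getpreferredencoding(False), but ignores
        # the UTF-8 Mode.
        return locale.getencoding()
    if hasattr(locale, "nl_langinfo"):
        # Older Python on Unix.
        return locale.nl_langinfo(locale.CODESET)
    # Assumption: neither function exists, so this is Windows before 3.11.
    return "mbcs"


print(utf8_mode_or_locale_encoding(), locale_encoding_ignoring_utf8_mode())
```

Running this once normally and once with python -X utf8 shows the difference: the first helper switches to "utf-8", while the second should keep reporting the locale encoding.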

What I don't expect is the current behavior, before PEP 597. Who uses open() without specifying an encoding but always wants to use the locale encoding (case 2)? So this use case is already broken when the UTF-8 Mode is enabled explicitly?
History
Date                 User      Action  Args
2021-03-19 14:31:58  vstinner  set     recipients: + vstinner, methane
2021-03-19 14:31:58  vstinner  set     messageid: <1616164318.88.0.414287905715.issue43510@roundup.psfhosted.org>
2021-03-19 14:31:58  vstinner  link    issue43510 messages
2021-03-19 14:31:58  vstinner  create