Message341955
> FYI, I expect cp65001 will be used more widely in near future,
[...]
> It seems use `SetConsoleOutputCP(65001)` and `SetConsoleCP(65001)`.
Unless PYTHONLEGACYWINDOWSSTDIO is defined, Python 3.6+ doesn't use the console's codepage-based interface (except for low-level os.read and os.write). Console files use the wide-character console API internally and have a "utf-8" encoding. "cp65001" isn't a factor in this context.
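To see which of these paths a given interpreter is on, something like the following rough sketch may help. It uses only the standard library; `describe_stdio` is a hypothetical helper name for illustration, not part of CPython.

```python
import os
import sys


def describe_stdio():
    """Report settings relevant to how Python talks to the console."""
    stdout = sys.stdout
    return {
        "platform": sys.platform,
        # A non-empty value selects the legacy codepage-based console I/O.
        "legacy_stdio": bool(os.environ.get("PYTHONLEGACYWINDOWSSTDIO")),
        # None if stdout is missing or has no encoding attribute.
        "stdout_encoding": getattr(stdout, "encoding", None),
        "stdout_is_tty": bool(
            stdout is not None and hasattr(stdout, "isatty") and stdout.isatty()
        ),
    }


if __name__ == "__main__":
    for key, value in describe_stdio().items():
        print(f"{key}: {value}")
```

On a Windows console without the legacy variable set, the reported stdout encoding should be "utf-8" regardless of the console codepage.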
This issue probably occurs due to the encoding returned by locale.getpreferredencoding(). This calls _locale._getdefaultlocale, which returns a tuple that mixes the user locale with the system ANSI codepage. For example, with ANSI set to UTF-8 (Windows 10):
>>> _locale._getdefaultlocale()
('en_GB', 'cp65001')
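A quick way to inspect this on a particular machine is sketched below. The helper names (`codec_name`, `inspect_preferred_encoding`) are hypothetical, and `_locale._getdefaultlocale` is a private, Windows-only function, so the sketch looks it up defensively. Whether "cp65001" resolves to a codec depends on the Python version and platform, so the lookup returns None rather than assuming it exists.

```python
import codecs
import locale

try:
    import _locale  # private CPython module; contents vary by platform
except ImportError:
    _locale = None


def codec_name(encoding):
    """Return the canonical codec name for `encoding`, or None if unknown."""
    try:
        return codecs.lookup(encoding).name
    except LookupError:
        return None


def inspect_preferred_encoding():
    """Collect the preferred encoding and the codec it resolves to, if any."""
    preferred = locale.getpreferredencoding(False)
    # _getdefaultlocale only exists on Windows builds.
    get_default = getattr(_locale, "_getdefaultlocale", None)
    return {
        "preferred": preferred,
        "codec": codec_name(preferred),
        "default_locale": get_default() if get_default else None,
    }


if __name__ == "__main__":
    for key, value in inspect_preferred_encoding().items():
        print(f"{key}: {value!r}")
```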
The Universal CRT special cases CP_UTF8 (codepage 65001) as "utf8" and accepts "utf-8" as an alias. For example, after setting the ANSI codepage to UTF-8:
>>> locale.setlocale(locale.LC_CTYPE, '')
'English_United Kingdom.utf8'
Python could similarly special case CP_UTF8 as "utf-8" in _locale._getdefaultlocale.
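As a rough illustration of that idea in pure Python (the actual change would belong in the C implementation of `_locale._getdefaultlocale`; both function names below are hypothetical), the special case amounts to a small mapping applied to the encoding half of the tuple:

```python
def normalize_codepage_name(encoding):
    """Map the Windows name 'cp65001' to 'utf-8'; leave other names unchanged."""
    if encoding is not None and encoding.lower() == "cp65001":
        return "utf-8"
    return encoding


def normalize_default_locale(default_locale):
    """Apply the special case to a (language, encoding) tuple."""
    language, encoding = default_locale
    return language, normalize_codepage_name(encoding)


if __name__ == "__main__":
    print(normalize_default_locale(("en_GB", "cp65001")))  # ('en_GB', 'utf-8')
    print(normalize_default_locale(("en_GB", "cp1252")))   # ('en_GB', 'cp1252')
```

Only the 65001 codepage is rewritten here; other "cpXXXX" names pass through untouched, which mirrors how the Universal CRT treats CP_UTF8 as the one special case.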
Date | User | Action | Args
2019-05-09 03:31:49 | eryksun | set | recipients: + eryksun, paul.moore, vstinner, tim.golden, methane, zach.ware, serhiy.storchaka, steve.dower, Paul Monson
2019-05-09 03:31:49 | eryksun | set | messageid: <1557372709.91.0.0173508694847.issue36778@roundup.psfhosted.org>
2019-05-09 03:31:49 | eryksun | link | issue36778 messages
2019-05-09 03:31:49 | eryksun | create |