Issue42846
Created on 2021-01-06 22:38 by neonene, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description |
bug.py | vstinner, 2021-01-07 22:18 |
Pull Requests | |||
---|---|---|---|
URL | Status | Linked |
PR 24157 | merged | vstinner, 2021-01-07 22:45 |
Messages (12) | |||
---|---|---|---|
msg384541 | Author: neonene (neonene) * | Date: 2021-01-06 22:38 | |
After https://github.com/python/cpython/commit/0b858cdd5d114f0890b11b6c4d6559d0ceb468ab (bpo-1635741: Convert _multibytecodec to multi-phase init), on Windows x64/x86 with a Chinese/Japanese/Korean system locale, MultibyteCodec_Check() in multibytecodec.c returns false and PyExc_TypeError follows. This affects some tests and PGO training.

1) python -m test --verbose test_threading

======================================================================
FAIL: test_daemon_threads_fatal_error (test.test_threading.SubinterpThreadingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\cpython-0b858\lib\test\test_threading.py", line 1124, in test_daemon_threads_fatal_error
    self.assertIn("Fatal Python error: Py_EndInterpreter: "
AssertionError: 'Fatal Python error: Py_EndInterpreter: not the last thread' not found in 'TypeError: codec is unexpected type\nFatal Python error: _PyThreadState_Delete: tstate 00000000003FF980 is still current\nPython runtime state: initialized\n\nThread 0x00000710 (most recent call first):\n<no Python frame>\n'

2) python -m test --verbose test_embed

======================================================================
FAIL: test_audit_subinterpreter (test.test_embed.AuditingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\cpython-0b858\lib\test\test_embed.py", line 1433, in test_audit_subinterpreter
    self.run_embedded_interpreter("test_audit_subinterpreter")
  File "C:\cpython-0b858\lib\test\test_embed.py", line 104, in run_embedded_interpreter
    self.assertEqual(p.returncode, returncode,
AssertionError: 3221225477 != 0 : bad returncode 3221225477, stderr is 'TypeError: codec is unexpected type\nFatal Python error: _PyThreadState_Delete: tstate 000000000050CAF0 is still current\nPython runtime state: initialized\n\nThread 0x000009d8 (most recent call first):\n<no Python frame>\n'

======================================================================
FAIL: test_subinterps_different_ids (test.test_embed.EmbeddingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\cpython-0b858\lib\test\test_embed.py", line 169, in test_subinterps_different_ids
    for run in self.run_repeated_init_and_subinterpreters():
  File "C:\cpython-0b858\lib\test\test_embed.py", line 110, in run_repeated_init_and_subinterpreters
    out, err = self.run_embedded_interpreter("test_repeated_init_and_subinterpreters")
  File "C:\cpython-0b858\lib\test\test_embed.py", line 104, in run_embedded_interpreter
    self.assertEqual(p.returncode, returncode,
AssertionError: 3221225477 != 0 : bad returncode 3221225477, stderr is 'TypeError: codec is unexpected type\nFatal Python error: _PyThreadState_Delete: tstate 000000000041C960 is still current\nPython runtime state: initialized\n\nThread 0x00000a40 (most recent call first):\n<no Python frame>\n'

======================================================================
FAIL: test_subinterps_distinct_state (test.test_embed.EmbeddingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\cpython-0b858\lib\test\test_embed.py", line 177, in test_subinterps_distinct_state
    for run in self.run_repeated_init_and_subinterpreters():
  File "C:\cpython-0b858\lib\test\test_embed.py", line 110, in run_repeated_init_and_subinterpreters
    out, err = self.run_embedded_interpreter("test_repeated_init_and_subinterpreters")
  File "C:\cpython-0b858\lib\test\test_embed.py", line 104, in run_embedded_interpreter
    self.assertEqual(p.returncode, returncode,
AssertionError: 3221225477 != 0 : bad returncode 3221225477, stderr is 'TypeError: codec is unexpected type\nFatal Python error: _PyThreadState_Delete: tstate 000000000047C960 is still current\nPython runtime state: initialized\n\nThread 0x00000b34 (most recent call first):\n<no Python frame>\n'

======================================================================
FAIL: test_subinterps_main (test.test_embed.EmbeddingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\cpython-0b858\lib\test\test_embed.py", line 163, in test_subinterps_main
    for run in self.run_repeated_init_and_subinterpreters():
  File "C:\cpython-0b858\lib\test\test_embed.py", line 110, in run_repeated_init_and_subinterpreters
    out, err = self.run_embedded_interpreter("test_repeated_init_and_subinterpreters")
  File "C:\cpython-0b858\lib\test\test_embed.py", line 104, in run_embedded_interpreter
    self.assertEqual(p.returncode, returncode,
AssertionError: 3221225477 != 0 : bad returncode 3221225477, stderr is 'TypeError: codec is unexpected type\nFatal Python error: _PyThreadState_Delete: tstate 000000000032C960 is still current\nPython runtime state: initialized\n\nThread 0x00000bf0 (most recent call first):\n<no Python frame>\n' |
|||
msg384607 | Author: Erlend E. Aasland (erlendaasland) * |
Date: 2021-01-07 21:52 | |
I'm unable to reproduce this on Windows 10 (amd64). What's your exact locale setting? Are you compiling with HEAD at 0b858cdd5d114f0890b11b6c4d6559d0ceb468ab? |
|||
msg384610 | Author: STINNER Victor (vstinner) * |
Date: 2021-01-07 22:18 | |
I can reproduce the issue on Windows configured for the Japanese language (ANSI code page cp932). I also managed to reproduce the bug on Linux with the attached bug.py. |
|||
msg384611 | Author: STINNER Victor (vstinner) * |
Date: 2021-01-07 22:21 | |
It took me a while to understand it: the _multibytecodec module itself is fine. The issue comes from the _codecs_jp module, which uses the legacy module API:

codec = _codecs_jp.getcodec('cp932') |
|||
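For readers who want to poke at the codec object mentioned above, here is a minimal sketch. It assumes a CPython build that ships the private `_codecs_jp` extension module (it is an implementation detail, not a public API), and the sample text is only an illustration. It fetches the raw codec with `getcodec()` and checks that it agrees with the public `codecs` registry.

```python
import codecs
import _codecs_jp  # private CPython extension module named in the message above

# Fetch the raw codec object, as in the snippet from the message.
codec = _codecs_jp.getcodec("cp932")
print(type(codec).__name__)

# Illustrative sample text (an assumption, not taken from the report).
text = "日本語"

# The raw codec's encode() returns a (bytes, characters consumed) pair.
raw_encoded, consumed = codec.encode(text)

# The public route through the codecs registry should give the same bytes.
encoded = codecs.encode(text, "cp932")
assert raw_encoded == encoded
assert consumed == len(text)
assert codecs.decode(encoded, "cp932") == text
```

In the main interpreter this works both before and after the fix; the failure described in this issue only shows up when the codec is obtained inside a subinterpreter.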
msg384613 | Author: Erlend E. Aasland (erlendaasland) * |
Date: 2021-01-07 22:25 | |
It should be sufficient to convert cjkcodecs.h to multi-phase init then? From what I can see, the support modules are stateless, right? |
|||
msg384616 | Author: STINNER Victor (vstinner) * |
Date: 2021-01-07 22:36 | |
I'm working on a fix. |
|||
msg384618 | Author: STINNER Victor (vstinner) * |
Date: 2021-01-07 23:05 | |
Attached PR 24157 should fix the issue.

> FAIL: test_daemon_threads_fatal_error (test.test_threading.SubinterpThreadingTests)

This test runs code in a subinterpreter, which itself runs in a subprocess. The problem is not in the code run in the subinterpreter, but in the creation of sys.stdout in the subprocess.

The test creates a subprocess and redirects its stdout and stderr. In this case, Python doesn't create an _io._WindowsConsoleIO for sys.stdout.buffer.raw, but a regular _io.FileIO object. When the raw I/O is a _WindowsConsoleIO instance, create_stdio() in Python/pylifecycle.c forces the use of the UTF-8 encoding. For FileIO, however, it keeps the locale encoding. If the locale encoding is "cp932", a CJK multibyte codec is used.

In the main interpreter, that's fine. In a subinterpreter, we hit the bug in _codecs_jp, which doesn't use the new multi-phase initialization API. |
|||
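To see which case of the explanation above applies to a given process, one can inspect the layers of sys.stdout directly. The following is a small sketch; `describe_stream` is a hypothetical helper name introduced here for illustration, and it only reports what it finds rather than reproducing the bug.

```python
import sys

def describe_stream(stream):
    """Return (raw I/O class name, text encoding) for a text stream.

    Hypothetical helper: falls back gracefully when the stream has been
    replaced by an object without .buffer or .raw attributes.
    """
    buffer = getattr(stream, "buffer", None)
    raw = getattr(buffer, "raw", buffer)
    return type(raw).__name__, getattr(stream, "encoding", None)

raw_name, encoding = describe_stream(sys.stdout)
# Write to stderr so the report is visible even when stdout is redirected.
print(f"raw={raw_name} encoding={encoding}", file=sys.stderr)
```

Based on the message above, a Windows console should show `_WindowsConsoleIO` with UTF-8, while a redirected stdout should show `FileIO` with the locale encoding.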
msg384619 | Author: STINNER Victor (vstinner) * |
Date: 2021-01-07 23:08 | |
Simpler way to reproduce the issue with t.py script:
---
import test.support
import sys
import _testcapi
print(f"{sys.stdout.encoding=}", file=sys.stderr)
with test.support.SuppressCrashReport():
    _testcapi.run_in_subinterp("pass")
---
By default, UTF-8 is used, everything is fine:
-----
C:\> python t.py
sys.stdout.encoding='utf-8'
-----
Disable _WindowsConsoleIO with PYTHONLEGACYWINDOWSSTDIO env var, we get the issue:
-----
C:\> set PYTHONLEGACYWINDOWSSTDIO=1
C:\> python t.py
Running Debug|x64 interpreter...
sys.stdout.encoding='cp932'
TypeError: codec is unexpected type
Fatal Python error: (...)
-----
Or redirect the output into a program or a file to disable _WindowsConsoleIO to also reproduce the issue:
-----
C:\> python t.py|more
sys.stdout.encoding='cp932'
TypeError: codec is unexpected type
(...)
----- |
|||
msg384620 | Author: STINNER Victor (vstinner) * |
Date: 2021-01-07 23:12 | |
Ah, if you don't want to change the ANSI code page to cp932 (Japanese language) just to reproduce the issue, you can just set the stdio encoding:
-----
C:\> set PYTHONIOENCODING=cp932
C:\> python t.py|more
sys.stdout.encoding='cp932'
TypeError: codec is unexpected type
(...)
----- |
|||
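The PYTHONIOENCODING trick can also be driven from Python instead of a Windows shell. The sketch below is one way to do that under stated assumptions: `child_stdout_encoding` is a hypothetical helper name, and the child only reports its stdout encoding, so it does not need `_testcapi` and does not trigger the crash itself.

```python
import os
import subprocess
import sys

def child_stdout_encoding(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set and return
    the value of sys.stdout.encoding that the child reports."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,  # a pipe, so no console I/O object is involved
        check=True,
    )
    # Encoding names are ASCII, so the output can be decoded as ASCII
    # regardless of which encoding the child used for its stdout.
    return result.stdout.decode("ascii").strip()

print(child_stdout_encoding("cp932"))
```

Because stdout is a pipe here, the child takes the FileIO path described earlier in the thread, which is the same condition the `|more` redirection creates.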
msg384621 | Author: STINNER Victor (vstinner) * |
Date: 2021-01-07 23:15 | |
New changeset 07f2cee93f1b619650403981c455f47bfed8d818 by Victor Stinner in branch 'master': bpo-42846: Convert CJK codec extensions to multiphase init (GH-24157) https://github.com/python/cpython/commit/07f2cee93f1b619650403981c455f47bfed8d818 |
|||
msg384622 | Author: STINNER Victor (vstinner) * |
Date: 2021-01-07 23:18 | |
> 1) python -m test --verbose test_threading
> 2) python -m test --verbose test_embed

I manually ran these two tests with the cp932 ANSI code page: they now pass with my fix. I also added a regression test to test_multibytecodec.py.

Thanks for your quick bug report, neonene! It's now fixed. |
|||
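A check in the spirit of the regression test mentioned above might look like the following. This is only a sketch and not necessarily the test that was committed: the helper name `run_cjk_codec_in_subinterp` and the sample text are assumptions, and it relies on CPython's private `_testcapi` module, which many installed builds do not include (the helper returns None in that case).

```python
import textwrap

try:
    import _testcapi  # CPython's private test helper; often absent from installed builds
except ImportError:
    _testcapi = None

def run_cjk_codec_in_subinterp(encoding="cp932", text="日本語"):
    """Round-trip text through a CJK codec inside a subinterpreter.

    Returns the status reported by _testcapi.run_in_subinterp() (0 on
    success), or None when that helper is unavailable.
    """
    run = getattr(_testcapi, "run_in_subinterp", None)
    if run is None:
        return None
    # Code executed inside the subinterpreter: encode incrementally,
    # decode again, and raise if the round trip does not match.
    code = textwrap.dedent("""
        import codecs
        encoder = codecs.getincrementalencoder(%r)()
        data = encoder.encode(%r)
        if data.decode(%r) != %r:
            raise ValueError("round trip failed")
    """) % (encoding, text, encoding, text)
    return run(code)

print(run_cjk_codec_in_subinterp())
```

On an interpreter affected by this issue, the subinterpreter would be expected to fail with "TypeError: codec is unexpected type" rather than return 0.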
msg384644 | Author: STINNER Victor (vstinner) * |
Date: 2021-01-08 09:30 | |
> bpo-42846: Convert CJK codec extensions to multiphase init (GH-24157)

I added a new test, and the new test spotted a reference leak, likely an existing one: bpo-42866 "test test_multibytecodec: Test_IncrementalEncoder.test_subinterp() leaks references". |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:39 | admin | set | github: 87012 |
2021-01-08 09:30:21 | vstinner | set | messages: + msg384644 |
2021-01-07 23:18:25 | vstinner | set | status: open -> closed resolution: fixed messages: + msg384622 stage: patch review -> resolved |
2021-01-07 23:15:29 | vstinner | set | messages: + msg384621 |
2021-01-07 23:13:00 | vstinner | set | messages: + msg384620 |
2021-01-07 23:08:48 | vstinner | set | messages: + msg384619 |
2021-01-07 23:05:41 | vstinner | set | messages: + msg384618 |
2021-01-07 22:45:55 | vstinner | set | keywords: + patch stage: patch review pull_requests: + pull_request22985 |
2021-01-07 22:36:31 | vstinner | set | messages: + msg384616 |
2021-01-07 22:25:30 | erlendaasland | set | messages: + msg384613 |
2021-01-07 22:21:36 | vstinner | set | messages: + msg384611 |
2021-01-07 22:18:28 | vstinner | set | files: + bug.py messages: + msg384610 |
2021-01-07 21:52:26 | erlendaasland | set | messages: + msg384607 |
2021-01-06 22:38:22 | neonene | create |