This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)
Type: behavior Stage: patch review
Components: Interpreter Core, Subinterpreters Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: M-Reimer, bsteffensmeier, corona10, eric.snow, erlendaasland, graysky, hroncok, jokot3, miss-islington, ndjensen, petr.viktorin, prahal, shihai1991, uckelman, vstinner
Priority: normal Keywords: patch

Created on 2021-12-14 10:57 by graysky, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description
win_py399_crash_reproducer.py bsteffensmeier, 2021-12-15 18:35 Reproducer that intermittently crashes on windows
bug.py vstinner, 2021-12-16 00:40
pyobject_ob_interp.patch vstinner, 2022-01-13 20:24
sqlite3_crash.py prahal, 2022-03-24 16:27 Reproducer that crashes 90% of the time
bug.py_asyncio_cpustressed-crash.log prahal, 2022-03-24 21:16 asyncio bug.py crash while running stress -c `nproc --all`
Pull Requests
URL Status Linked
PR 30423 merged erlendaasland, 2022-01-05 19:29
PR 30453 merged miss-islington, 2022-01-07 14:08
PR 30454 merged miss-islington, 2022-01-07 14:08
PR 30564 closed vstinner, 2022-01-12 23:26
PR 30565 closed vstinner, 2022-01-12 23:26
PR 30566 closed vstinner, 2022-01-12 23:26
PR 30577 closed vstinner, 2022-01-13 17:04
PR 30578 merged miss-islington, 2022-01-13 18:28
PR 30579 closed miss-islington, 2022-01-13 18:28
PR 30580 merged vstinner, 2022-01-13 18:30
Messages (54)
msg408520 - (view) Author: (graysky) Date: 2021-12-14 10:57
Seems as though CPython is broken when working with subinterpreters. The problematic change could be (d0d29655ff) affecting import.c.[1] Reverting this commit and rebuilding Python fixes the issues on my system with some scripts that import sqlite3, for example, the Kodi plugin YouTube[2] and the IMDB Trailers plugin. Others have reported similar breakage with other Python code[3].

Example output when the bug manifests:

ERROR <general>: Traceback (most recent call last):
ERROR <general>:   File "/usr/lib/python3.10/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
ERROR <general>:     
ERROR <general>: register_converter("timestamp", convert_timestamp)
ERROR <general>: 
ERROR <general>: KeyError
ERROR <general>: : 
ERROR <general>: 'timepart_full'
ERROR <general>:                                              
ERROR <general>: Exception ignored deletion of interned string failed
ERROR <general>: :


References:
1. https://github.com/python/cpython/commit/d0d29655ffc43d426ad68542d8de8304f7f1346a
2. https://github.com/anxdpanic/plugin.video.youtube/issues/255
3. https://bbs.archlinux.org/viewtopic.php?id=272121
msg408539 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-12-14 15:36
(related: bpo-44059)

Presumably the problem relates to global state used in different interpreters leading to an inconsistent state in the crashing extension (or its dependencies).

@graysky, do you know if this was a problem before Python 3.8?
msg408540 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-12-14 15:44
Interned strings were broken in GH-20058, see bpo-46006. Maybe that's also the issue here?
msg408541 - (view) Author: (graysky) Date: 2021-12-14 16:03
@Eric - I have not seen this on 3.8 or 3.9.  No data before 3.8.
msg408558 - (view) Author: (graysky) Date: 2021-12-14 20:17
While this is being evaluated, can someone give an opinion about the sanity of simply reverting https://hg.python.org/lookup/d0d29655ff for now in order to use 3.10.1?  Thanks.
msg408568 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-12-14 21:38
That was a fix for GH-17350, which might need to be reverted as well.

Victor, could you take another look at GH-17350? I must admit I (still) don't understand this change; what would break if it was reverted (along with the fixup from bpo-44050)?
msg408633 - (view) Author: Ben Steffensmeier (bsteffensmeier) Date: 2021-12-15 18:35
We have been seeing intermittent crashes on jep that we tracked down to the same change (d0d29655ff).

I have created a sample program using _testcapi that crashes about 50% of the time when run on Windows with Python 3.9.9. We have not been able to reproduce problems on any other OS.

See also: https://github.com/ninia/jep/issues/366
msg408662 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-12-16 00:35
I can sometimes reproduce the crash on Windows with Python 3.9. Call stack (most recent to oldest frames):

* PyObject_GC_UnTrack() - crash on _PyGCHead_SET_NEXT(prev, next) because prev is a dangling pointer (0x1fe64dd5250); Visual Studio is unable to read the memory
* meth_dealloc() -- delete _sre_compile() method object
* (...)
* PyDict_SetItem() -- set "compile" to None
* _PyModule_ClearDict() -- clear the "_sre" module dict
* _PyModule_Clear()
* _PyImport_Cleanup()
* Py_EndInterpreter()
* (...)
* run_in_subinterp()
* (...)
* t_bootstrap()

The crash occurs in meth_dealloc(), when deallocating the _sre_compile() method object stored in _sre module dictionary as the attribute "compile".

The PyGC_Head.prev pointer is a dangling pointer.
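
To illustrate why a stale neighbour pointer is fatal here, below is a toy Python model of unlinking a node from a circular doubly linked list, which is roughly what untracking a GC object involves. It is only a sketch: the GCHead class, the track()/untrack() helpers and the "freed" flag are hypothetical stand-ins, not CPython code, and the way the neighbour gets freed in the demo is an assumption made for illustration, not a statement about the root cause.

```python
# Toy model of a circular doubly linked GC list. All names are hypothetical;
# this is not CPython code.

class GCHead:
    def __init__(self, label):
        self.label = label
        self.prev = self
        self.next = self
        self.freed = False  # stands in for "this memory was released"


def track(list_head, node):
    """Append node at the tail of the circular list rooted at list_head."""
    tail = list_head.prev
    tail.next = node
    node.prev = tail
    node.next = list_head
    list_head.prev = node


def untrack(node):
    """Unlink node. Both neighbours are written to, so both must be alive."""
    prev, nxt = node.prev, node.next
    for neighbour in (prev, nxt):
        if neighbour.freed:
            # In C this would be a write through a dangling pointer.
            raise RuntimeError(f"write to freed GC head: {neighbour.label}")
    prev.next = nxt
    nxt.prev = prev
    node.prev = node.next = node


gc_list = GCHead("list head")
first = GCHead("first")
second = GCHead("second")
track(gc_list, first)
track(gc_list, second)

# Healthy case: unlinking "first" rewires its two neighbours.
untrack(first)
healthy = gc_list.next is second and second.prev is gc_list

# Faulty case (assumed scenario): the neighbour's memory goes away while
# "second" still points at it, then "second" is deallocated.
gc_list.freed = True
try:
    untrack(second)
    outcome = "ok"
except RuntimeError as exc:
    outcome = str(exc)
print(healthy, outcome)
```

In the model the invalid access is reported as an exception; in C there is no such check, so the write goes to whatever memory prev points at.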

On Python 3.9, the "re" module is not imported at startup, but it's imported indirectly by "import importlib.util" via "import typing". On Python 3.10, the re module is no longer imported by "import importlib.util".

The crash is random. Sometimes, I need 3 or 4 tries. Sometimes, it crashes using -X dev. Sometimes, it crashes immediately. When debugging in Visual Studio, the crash seems easier to reproduce.

On Python 3.9, the _sre extension uses the old API: PyModule_Create() with PyModuleDef.m_size = -1.

On Python 3.10, the _sre extension has been converted to the multiphase init API: PyModuleDef_Init() with PyModuleDef.m_size = sizeof(_sremodulestate). Moreover, "import importlib.util" no longer indirectly imports the "re" module.
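
A quick way to check the indirect import on a given build is to ask a fresh interpreter whether "re" ends up in sys.modules after "import importlib.util". This is a minimal sketch: the helper name is made up, and the result depends on the Python version and on what the startup sequence imports, so no expected output is given.

```python
import subprocess
import sys


def imports_re_indirectly(python=sys.executable):
    """Return True if 'import importlib.util' leaves 're' in sys.modules."""
    probe = "import importlib.util, sys; print('re' in sys.modules)"
    # -I isolates the child from the environment and user site directory,
    # -S skips the site module so .pth files cannot import "re" first.
    proc = subprocess.run(
        [python, "-I", "-S", "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout.strip() == "True"


result = imports_re_indirectly()
print(f"Python {sys.version_info.major}.{sys.version_info.minor}: "
      f"re imported indirectly = {result}")
```

Passing the path of another interpreter as the python argument allows comparing, for example, a 3.9 build against a 3.10 build.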
msg408664 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-12-16 00:40
Using attached bug.py, it's possible to trigger the crash on the main branch. I modified the reproducer to use the "_asyncio" extension which still uses the old API PyModule_Create() with PyModuleDef.m_size = -1.
msg408665 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-12-16 00:53
Hum, maybe bug.py exposes a different kind of bug. The _asyncio extension uses non-trivial initialization code which doesn't seem to handle concurrent "import _asyncio" well.
msg409245 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-12-27 23:22
FWIW, I've managed to reproduce once with win_py399_crash_reproducer.py on macOS 12.1 (with very high load average). With bug.py, I can reproduce almost always (more than 90% of the time).
msg409255 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2021-12-28 13:19
I can reproduce the crash on my macOS with main branch version.

Fatal Python error: Segmentation fault

Thread 0x0000700010389000 (most recent call first):
  File "/Users/user/oss/cpython/bug.py", line 16 in doIt
  File "/Users/user/oss/cpython/Lib/threading.py", line 968 in run
  File "/Users/user/oss/cpython/Lib/threading.py", line 1031 in _bootstrap_inner
  File "/Users/user/oss/cpython/Lib/threading.py", line 988 in _bootstrap

Current thread 0x000070000f386000 (most recent call first):
  File "/Users/user/oss/cpython/bug.py", line 16 in doIt
  File "/Users/user/oss/cpython/Lib/threading.py", line 968 in run
  File "/Users/user/oss/cpython/Lib/threading.py", line 1031 in _bootstrap_inner
  File "/Users/user/oss/cpython/Lib/threading.py", line 988 in _bootstrap

Thread 0x000070000e383000 (most recent call first):
  File "/Users/user/oss/cpython/bug.py", line 16 in doIt
  File "/Users/user/oss/cpython/Lib/threading.py", line 968 in run
  File "/Users/user/oss/cpython/Lib/threading.py", line 1031 in _bootstrap_inner
  File "/Users/user/oss/cpython/Lib/threading.py", line 988 in _bootstrap

Thread 0x000070000d380000 (most recent call first):
  File "/Users/user/oss/cpython/bug.py", line 16 in doIt
  File "/Users/user/oss/cpython/Lib/threading.py", line 968 in run
  File "/Users/user/oss/cpython/Lib/threading.py", line 1031 in _bootstrap_inner
  File "/Users/user/oss/cpython/Lib/threading.py", line 988 in _bootstrap

Thread 0x000000010a590e00 (most recent call first):
  File "/Users/user/oss/cpython/Lib/threading.py", line 1125 in _wait_for_tstate_lock
  File "/Users/user/oss/cpython/Lib/threading.py", line 1105 in join
  File "/Users/user/oss/cpython/bug.py", line 23 in func
  File "/Users/user/oss/cpython/bug.py", line 25 in <module>

Extension modules: _testcapi (total: 1)
[1]    9098 segmentation fault  ./python.exe bug.py
msg409461 - (view) Author: Joel Uckelman (uckelman) * Date: 2022-01-01 14:36
I have this happening on Linux with a Flask app after upgrading from Fedora 34 to 35. libpython keeps crashing httpd. 

I see this from journalctl:

      #0  0x00007fd899baa801 PyObject_Malloc (libpython3.10.so.1.0 + 0xf7801)
      #1  0x00007fd899baab47 PyUnicode_New (libpython3.10.so.1.0 + 0xf7b47)
      #2  0x00007fd899bb9aae _PyUnicode_FromUCS1 (libpython3.10.so.1.0 + 0x106aae)
      #3  0x00007fd899bb9323 r_object (libpython3.10.so.1.0 + 0x106323)
      #4  0x00007fd899bb8d46 r_object (libpython3.10.so.1.0 + 0x105d46)
      #5  0x00007fd899bb90b4 r_object (libpython3.10.so.1.0 + 0x1060b4)
      #6  0x00007fd899bb8d65 r_object (libpython3.10.so.1.0 + 0x105d65)
      #7  0x00007fd899bb9088 r_object (libpython3.10.so.1.0 + 0x106088)
      #8  0x00007fd899bb8e33 r_object (libpython3.10.so.1.0 + 0x105e33)
      #9  0x00007fd899bb9088 r_object (libpython3.10.so.1.0 + 0x106088)
      #10 0x00007fd899c35c28 read_object (libpython3.10.so.1.0 + 0x182c28)
      #11 0x00007fd899c48f56 marshal_loads (libpython3.10.so.1.0 + 0x195f56)
      #12 0x00007fd899bc88d7 cfunction_vectorcall_O (libpython3.10.so.1.0 + 0x1158d7)
      #13 0x00007fd899bc0c80 _PyEval_EvalFrameDefault (libpython3.10.so.1.0 + 0x10dc80)
      #14 0x00007fd899bba984 _PyEval_Vector (libpython3.10.so.1.0 + 0x107984)
      #15 0x00007fd899bbccba _PyEval_EvalFrameDefault (libpython3.10.so.1.0 + 0x109cba)
      #16 0x00007fd899bba984 _PyEval_Vector (libpython3.10.so.1.0 + 0x107984)
      #17 0x00007fd899bbbd6d _PyEval_EvalFrameDefault (libpython3.10.so.1.0 + 0x108d6d)
      #18 0x00007fd899bba984 _PyEval_Vector (libpython3.10.so.1.0 + 0x107984)
      #19 0x00007fd899bbbd6d _PyEval_EvalFrameDefault (libpython3.10.so.1.0 + 0x108d6d)
      #20 0x00007fd899bba984 _PyEval_Vector (libpython3.10.so.1.0 + 0x107984)
      #21 0x00007fd899bbbac2 _PyEval_EvalFrameDefault (libpython3.10.so.1.0 + 0x108ac2)
      #22 0x00007fd899bba984 _PyEval_Vector (libpython3.10.so.1.0 + 0x107984)
      #23 0x00007fd899bbbac2 _PyEval_EvalFrameDefault (libpython3.10.so.1.0 + 0x108ac2)
      #24 0x00007fd899bba984 _PyEval_Vector (libpython3.10.so.1.0 + 0x107984)
      #25 0x00007fd899bc8a9e object_vacall (libpython3.10.so.1.0 + 0x115a9e)
      #26 0x00007fd899bd247c _PyObject_CallMethodIdObjArgs (libpython3.10.so.1.0 + 0x11f47c)
      #27 0x00007fd899bd21d7 PyImport_ImportModuleLevelObject (libpython3.10.so.1.0 + 0x11f1d7)
      #28 0x00007fd899bbfc8e _PyEval_EvalFrameDefault (libpython3.10.so.1.0 + 0x10cc8e)
      #29 0x00007fd899bba984 _PyEval_Vector (libpython3.10.so.1.0 + 0x107984)
      #30 0x00007fd899c360d4 PyEval_EvalCode (libpython3.10.so.1.0 + 0x1830d4)
      #31 0x00007fd899c3d091 builtin_exec (libpython3.10.so.1.0 + 0x18a091)
      #32 0x00007fd899bc94b0 cfunction_vectorcall_FASTCALL (libpython3.10.so.1.0 + 0x1164b0)
      #33 0x00007fd899bc2209 _PyEval_EvalFrameDefault (libpython3.10.so.1.0 + 0x10f209)
      #34 0x00007fd899bba984 _PyEval_Vector (libpython3.10.so.1.0 + 0x107984)
      #35 0x00007fd899bc0c80 _PyEval_EvalFrameDefault (libpython3.10.so.1.0 + 0x10dc80)
      #36 0x00007fd899bba984 _PyEval_Vector (libpython3.10.so.1.0 + 0x107984)
      #37 0x00007fd899bbbd6d _PyEval_EvalFrameDefault (libpython3.10.so.1.0 + 0x108d6d)
      #38 0x00007fd899bba984 _PyEval_Vector (libpython3.10.so.1.0 + 0x107984)
      #39 0x00007fd899bbbac2 _PyEval_EvalFrameDefault (libpython3.10.so.1.0 + 0x108ac2)
      #40 0x00007fd899bba984 _PyEval_Vector (libpython3.10.so.1.0 + 0x107984)
      #41 0x00007fd899bbbac2 _PyEval_EvalFrameDefault (libpython3.10.so.1.0 + 0x108ac2)
      #42 0x00007fd899bba984 _PyEval_Vector (libpython3.10.so.1.0 + 0x107984)
      #43 0x00007fd899bc8a9e object_vacall (libpython3.10.so.1.0 + 0x115a9e)
      #44 0x00007fd899bd247c _PyObject_CallMethodIdObjArgs (libpython3.10.so.1.0 + 0x11f47c)
      #45 0x00007fd899bd21d7 PyImport_ImportModuleLevelObject (libpython3.10.so.1.0 + 0x11f1d7)
      #46 0x00007fd899bbfc8e _PyEval_EvalFrameDefault (libpython3.10.so.1.0 + 0x10cc8e)
      #47 0x00007fd899bba984 _PyEval_Vector (libpython3.10.so.1.0 + 0x107984)
      #48 0x00007fd899c360d4 PyEval_EvalCode (libpython3.10.so.1.0 + 0x1830d4)
      #49 0x00007fd899c36006 exec_code_in_module (libpython3.10.so.1.0 + 0x183006)
      #50 0x00007fd899ba33e7 PyImport_ExecCodeModuleObject (libpython3.10.so.1.0 + 0xf03e7)
      #51 0x00007fd899ba3482 PyImport_ExecCodeModuleWithPathnames (libpython3.10.so.1.0 + 0xf0482)
      #52 0x00007fd899e0f542 wsgi_load_source.lto_priv.0 (mod_wsgi_python3.so + 0x17542)
      #53 0x00007fd899e107ed wsgi_execute_script.lto_priv.0 (mod_wsgi_python3.so + 0x187ed)
      #54 0x00007fd899e1b0f6 wsgi_daemon_thread (mod_wsgi_python3.so + 0x230f6)
      #55 0x00007fd89ab52a87 start_thread (libc.so.6 + 0x8da87)
      #56 0x00007fd89abd7640 __clone3 (libc.so.6 + 0x112640)

I see this in /var/log/httpd/ssl_error_log:

[Sat Jan 01 05:17:21.248640 2022] [wsgi:error] [pid 257749:tid 257758] Exception ignored deletion of interned string failed:
[Sat Jan 01 05:17:21.249193 2022] [wsgi:error] [pid 257749:tid 257758] Traceback (most recent call last):
[Sat Jan 01 05:17:21.249222 2022] [wsgi:error] [pid 257749:tid 257758]   File "/usr/lib64/python3.10/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
[Sat Jan 01 05:17:21.249453 2022] [wsgi:error] [pid 257749:tid 257758]     register_converter("timestamp", convert_timestamp)
[Sat Jan 01 05:17:21.249469 2022] [wsgi:error] [pid 257749:tid 257758] KeyError: 'timepart_full'
[Sat Jan 01 05:17:21.249484 2022] [wsgi:error] [pid 257749:tid 257758] Exception ignored deletion of interned string failed:
[Sat Jan 01 05:17:21.249488 2022] [wsgi:error] [pid 257749:tid 257758] Traceback (most recent call last):
[Sat Jan 01 05:17:21.249493 2022] [wsgi:error] [pid 257749:tid 257758]   File "/usr/lib64/python3.10/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
[Sat Jan 01 05:17:21.249572 2022] [wsgi:error] [pid 257749:tid 257758]     register_converter("timestamp", convert_timestamp)
[Sat Jan 01 05:17:21.249582 2022] [wsgi:error] [pid 257749:tid 257758] KeyError: 'timepart'
[Sat Jan 01 05:17:21.249590 2022] [wsgi:error] [pid 257749:tid 257758] Exception ignored deletion of interned string failed:
[Sat Jan 01 05:17:21.249594 2022] [wsgi:error] [pid 257749:tid 257758] Traceback (most recent call last):
[Sat Jan 01 05:17:21.249598 2022] [wsgi:error] [pid 257749:tid 257758]   File "/usr/lib64/python3.10/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
[Sat Jan 01 05:17:21.249667 2022] [wsgi:error] [pid 257749:tid 257758]     register_converter("timestamp", convert_timestamp)
[Sat Jan 01 05:17:21.249676 2022] [wsgi:error] [pid 257749:tid 257758] KeyError: 'datepart'
[Sat Jan 01 05:17:21.249697 2022] [wsgi:error] [pid 257749:tid 257758] Exception ignored deletion of interned string failed:
[Sat Jan 01 05:17:21.249701 2022] [wsgi:error] [pid 257749:tid 257758] Traceback (most recent call last):
[Sat Jan 01 05:17:21.249706 2022] [wsgi:error] [pid 257749:tid 257758]   File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
[Sat Jan 01 05:17:21.249804 2022] [wsgi:error] [pid 257749:tid 257758] KeyError: 'convert_timestamp'
[Sat Jan 01 05:17:21.249813 2022] [wsgi:error] [pid 257749:tid 257758] Exception ignored deletion of interned string failed:
[Sat Jan 01 05:17:21.249817 2022] [wsgi:error] [pid 257749:tid 257758] Traceback (most recent call last):
[Sat Jan 01 05:17:21.249822 2022] [wsgi:error] [pid 257749:tid 257758]   File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
[Sat Jan 01 05:17:21.249889 2022] [wsgi:error] [pid 257749:tid 257758] KeyError: 'convert_date'
[Sat Jan 01 05:17:21.249898 2022] [wsgi:error] [pid 257749:tid 257758] Exception ignored deletion of interned string failed:
[Sat Jan 01 05:17:21.249901 2022] [wsgi:error] [pid 257749:tid 257758] Traceback (most recent call last):
[Sat Jan 01 05:17:21.249906 2022] [wsgi:error] [pid 257749:tid 257758]   File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
[Sat Jan 01 05:17:21.249946 2022] [wsgi:error] [pid 257749:tid 257758] KeyError: 'adapt_datetime'
[Sat Jan 01 05:17:21.249971 2022] [wsgi:error] [pid 257749:tid 257758] Exception ignored deletion of interned string failed:
[Sat Jan 01 05:17:21.249977 2022] [wsgi:error] [pid 257749:tid 257758] Traceback (most recent call last):
[Sat Jan 01 05:17:21.249981 2022] [wsgi:error] [pid 257749:tid 257758]   File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
[Sat Jan 01 05:17:21.250021 2022] [wsgi:error] [pid 257749:tid 257758] KeyError: 'adapt_date'
[Sat Jan 01 05:17:22.058701 2022] [wsgi:error] [pid 249217:tid 249327] [client 31.13.127.13:53220] Truncated or oversized response headers received from daemon process 'https_site': /home/site/dmnes-site/viewer.wsgi

libpython is crashing httpd a few times a minute for me, and I definitely was not seeing this with Fedora 34. I have Python 3.10.1 on F35, and had 3.9.9 on F34.

If there's any further information I can provide, I'd be happy to help.
msg409573 - (view) Author: (graysky) Date: 2022-01-03 10:27
In reply to the first comment here https://bugs.python.org/issue46070#msg408520 which affects several Kodi plugins, it seems that commenting out lines 80-84 in /usr/lib/python3.10/sqlite3/dbapi2.py "fixes" the bug in Python 3.10.1.  I do not know whether that helps diagnose this further.
msg409686 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-04 15:02
Fedora issue: https://bugzilla.redhat.com/show_bug.cgi?id=2034962
msg409772 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-05 15:20
I applied PR 30123 of bpo-46006: "./python bug.py" still crashes. So bpo-46006 is unrelated to this issue.
msg409778 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-05 15:48
bug.py:

* Python 3.7 branch doesn't crash
* Python 3.8 branch does crash

So the regression was introduced in Python 3.8.

git bisect points me to this change:
---
commit 13915a3100608f011b29da2f3716c990f523b631 (refs/bisect/bad)
Author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Date:   Mon Oct 7 09:38:00 2019 -0700

    bpo-36356: Fix memory leak in _asynciomodule.c (GH-16598)
    
    (cherry picked from commit 321def805abc5b7c92c7e90ca90cb2434fdab855)
    
    Co-authored-by: Ben Harper <btharper1221@gmail.com>
---

Before this change, bug.py doesn't crash. With this change, bug.py does crash.
msg409780 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-05 15:57
> bpo-36356: Fix memory leak in _asynciomodule.c (GH-16598)

Python 3.8.0 is the first Python version containing this change. So it looks like a Python 3.8 regression.
msg409795 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2022-01-05 18:52
In 13915a3100608f011b29da2f3716c990f523b631, the init flag is set before we even know if module initialisation was successful. Applying the following patch upon the aforementioned commit (in 3.8) fixes the issue for me on latest Debian:

```
diff --git a/Modules/_asynciomodule.c b/Modules/_asynciomodule.c
index 5261ed3d4c..782138e4e4 100644
--- a/Modules/_asynciomodule.c
+++ b/Modules/_asynciomodule.c
@@ -3250,17 +3250,14 @@ static int
 module_init(void)
 {
     PyObject *module = NULL;
+    if (module_initialized) {
+        return 0;
+    }
 
     asyncio_mod = PyImport_ImportModule("asyncio");
     if (asyncio_mod == NULL) {
         goto fail;
     }
-    if (module_initialized != 0) {
-        return 0;
-    } 
-    else {
-        module_initialized = 1;
-    }
 
     current_tasks = PyDict_New();
     if (current_tasks == NULL) {
@@ -3322,6 +3319,7 @@ module_init(void)
     }
 
     Py_DECREF(module);
+    module_initialized = 1;
     return 0;
 
 fail:
```
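
For readers who do not want to parse the C diff, the ordering problem can be sketched in Python. This is a toy model only: ModuleState, init_flag_first() and init_flag_last() are hypothetical names, and the failing setup step is simulated, it is not what _asynciomodule.c actually does.

```python
# Toy model of an "initialised" guard; names are hypothetical.

class ModuleState:
    def __init__(self):
        self.initialized = False
        self.current_tasks = None


def init_flag_first(state, setup):
    """Ordering before the patch: the flag is set before setup can fail."""
    if state.initialized:
        return 0
    state.initialized = True
    state.current_tasks = setup()  # may raise
    return 0


def init_flag_last(state, setup):
    """Ordering after the patch: the flag is set only once setup succeeded."""
    if state.initialized:
        return 0
    state.current_tasks = setup()  # may raise
    state.initialized = True
    return 0


def make_flaky_setup():
    """Return a setup function that fails on its first call only."""
    calls = {"n": 0}

    def setup():
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("simulated failure during initialisation")
        return {}

    return setup


def run_twice(init):
    state = ModuleState()
    setup = make_flaky_setup()
    try:
        init(state, setup)
    except RuntimeError:
        pass  # first import attempt failed
    init(state, setup)  # second import attempt
    return state


early = run_twice(init_flag_first)
late = run_twice(init_flag_last)
print("flag first:", early.initialized, early.current_tasks)
print("flag last: ", late.initialized, late.current_tasks)
```

With the flag set first, the second attempt reports success although current_tasks was never created; with the flag set last, the second attempt redoes the setup.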
msg409798 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2022-01-05 19:13
> Applying the following patch upon the aforementioned commit (in 3.8) fixes
> the issue for me on latest Debian

By "the issue", I mean bug.py, not win_py399_crash_reproducer.py.

Victor: perhaps we should open a separate issue for the bug.py issue.

Applying the patch from msg409795 onto the 3.9 branch also fixes the segfault from bug.py there.

main does not segfault for me, but I would apply the patch anyway, since the current code is broken.
msg409802 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2022-01-05 20:07
Note, GH-30423 does _not_ fix the win_py399_crash_reproducer.py issue.
msg409962 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-07 14:08
New changeset b127e70a8a682fe869c22ce04c379bd85a00db67 by Erlend Egeberg Aasland in branch 'main':
bpo-46070: Fix asyncio initialisation guard (GH-30423)
https://github.com/python/cpython/commit/b127e70a8a682fe869c22ce04c379bd85a00db67
msg409965 - (view) Author: miss-islington (miss-islington) Date: 2022-01-07 14:35
New changeset 9d18045804f6db8224be14f7a618b77977f90144 by Miss Islington (bot) in branch '3.10':
bpo-46070: Fix asyncio initialisation guard (GH-30423)
https://github.com/python/cpython/commit/9d18045804f6db8224be14f7a618b77977f90144
msg409966 - (view) Author: miss-islington (miss-islington) Date: 2022-01-07 14:36
New changeset 4d2cfd354969590ba8e0af0447fd84f8b5e61952 by Miss Islington (bot) in branch '3.9':
bpo-46070: Fix asyncio initialisation guard (GH-30423)
https://github.com/python/cpython/commit/4d2cfd354969590ba8e0af0447fd84f8b5e61952
msg409970 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-07 14:40
Even with PR 30454, I can still reproduce the crash on Python 3.9 (randomly, it takes a few attempts to reproduce the crash):

vstinner@DESKTOP-DK7VBIL C:\vstinner\python\3.9>python -X dev win_py399_crash_reproducer.py 
Running Debug|x64 interpreter...
exit subinterpreter
exit subinterpreter
exit subinterpreter
Windows fatal exception: access violation

Thread 0x00002230 (most recent call first):
<no Python frame>

Thread 0x00002124 (most recent call first):
  File "C:\vstinner\python\3.9\win_py399_crash_reproducer.py", line 13 in doIt
  File "C:\vstinner\python\3.9\lib\threading.py", line 910 in run
  File "C:\vstinner\python\3.9\lib\threading.py", line 973 in _bootstrap_inner
  File "C:\vstinner\python\3.9\lib\threading.py", line 930 in _bootstrap

Current thread 0x000027f0 (most recent call first):
  File "C:\vstinner\python\3.9\win_py399_crash_reproducer.py", line 13 in doIt
  File "C:\vstinner\python\3.9\lib\threading.py", line 910 in run
  File "C:\vstinner\python\3.9\lib\threading.py", line 973 in _bootstrap_inner
  File "C:\vstinner\python\3.9\lib\threading.py", line 930 in _bootstrap

Thread 0x00001c18 (most recent call first):
  File "C:\vstinner\python\3.9\lib\threading.py", line 312 in wait
  File "C:\vstinner\python\3.9\lib\threading.py", line 574 in wait
  File "C:\vstinner\python\3.9\lib\threading.py", line 897 in start
  File "C:\vstinner\python\3.9\win_py399_crash_reproducer.py", line 19 in <module>
msg409980 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-07 16:07
> The problematic change could be (d0d29655ff) affecting import.c

This change is part of the 3.10 branch. For 3.9, git bisect tells me that it's the following change:

commit 52d9d3b75441ae6038fadead89eac5eecdd34501
Author: Łukasz Langa <lukasz@langa.pl>
Date:   Tue Oct 5 22:30:25 2021 +0200

    [3.9] bpo-44050: Extension modules can share state when they don't support sub-interpreters. (GH-27794) (GH-28741)

    (cherry picked from commit b9bb74871b27d9226df2dd3fce9d42bda8b43c2b)

    Co-authored-by: Hai Shi <shihai1992@gmail.com>


The problem is that this change fixed another bug, well, see: bpo-44050. While a revert should fix win_py399_crash_reproducer.py, it will reintroduce bpo-44050 bug.
msg409983 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-07 16:13
I manually reverted the commit 52d9d3b75441ae6038fadead89eac5eecdd34501 (in my local Python 3.9 checkout): I confirm that the revert fixes the win_py399_crash_reproducer.py crash in Python 3.9.
msg409984 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2022-01-07 16:30
> The problem is that this change fixed another bug, well, see: bpo-44050. While a revert should fix win_py399_crash_reproducer.py, it will reintroduce bpo-44050 bug.

bpo-44050 is an attempt to fix a regression introduced in bpo-38858, perhaps that regression should be reverted as well?
msg409991 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-07 17:45
In the 3.9 branch, the commit 4d2cfd354969590ba8e0af0447fd84f8b5e61952 fixed the _asyncio extension. win_py399_crash_reproducer.py still crashes on Windows in the 3.9 branch. The code can be simplified to:

    code = "import _sre"

Moreover, even if I modify PyInit__sre() to only call PyModule_Create(), it still crashes. I can still reproduce the crash with the following simplified _sre.c code:
---
static PyMethodDef _functions[] = {
    _SRE_COMPILE_METHODDEF
    {NULL, NULL}
};

static struct PyModuleDef sremodule = {
        PyModuleDef_HEAD_INIT,
        "_" SRE_MODULE,
        NULL,
        -1,
        _functions,
        NULL,
        NULL,
        NULL,
        NULL
};

PyMODINIT_FUNC PyInit__sre(void)
{
    return PyModule_Create(&sremodule);
}
---

If _SRE_COMPILE_METHODDEF is removed from _functions, the script no longer crashes.

Is there something specific about method objects? Is it safe to share them between multiple interpreters? See my message msg408662 which gives some details.
msg410010 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-07 19:21
The _sre crash has a complex history in the 3.9 branch:

* (1) 2019-11-20, commit 7247407c35330f3f6292f1d40606b7ba6afd5700: first CRASH! The parent commit (488d02a24142948bfb1fafd19fa48e61fcbbabc5) doesn't crash.
* (2) 2019-11-22, commit 82c83bd907409c287a5bd0d0f4598f2c0538f34d: no crash (fixes or works around the crash)
* (3) 2021-10-05, commit 52d9d3b75441ae6038fadead89eac5eecdd34501: crash again! (somehow reverts the previous fix/workaround for the crash, but fixes another bug)

It seems like the initial regression comes from this change:

commit 7247407c35330f3f6292f1d40606b7ba6afd5700
Author: Victor Stinner <vstinner@python.org>
Date:   Wed Nov 20 12:25:50 2019 +0100

    bpo-36854: Move _PyRuntimeState.gc to PyInterpreterState (GH-17287)
    
    * Rename _PyGC_InitializeRuntime() to _PyGC_InitState()
    * finalize_interp_clear() now also calls _PyGC_Fini() in
      subinterpreters (clear the GC state).
msg410014 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-07 19:36
This bug is hard to reproduce for different reasons:

* It occurs randomly: I need between 1 and 50 attempts to reproduce the bug using win_py399_crash_reproducer.py

* So far, the bug was only reproduced on Windows.

* I failed to reproduce the crash on Linux. I tried PYTHONMALLOC=malloc and PYTHONMALLOC=malloc_debug with and without LD_PRELOAD=/usr/lib64/libjemalloc.so.2 (jemalloc memory allocator).

* The _sre extension has been converted to multi-phase init in Python 3.10. "import _sre" is no longer enough to reproduce the crash on Python 3.10 and newer.
msg410442 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-12 23:31
I prepared 3 pull requests to revert the commit 7247407c35330f3f6292f1d40606b7ba6afd5700:

* PR 30564: main branch
* PR 30565: 3.10 branch
* PR 30566: 3.9 branch

The problem is that the "Check if the ABI has changed" CI job fails in 3.9 and 3.10 branches.

I recently had the issue for a different revert in bpo-46006: I decided to keep the "removed" member, and mark it as "unused". See the commit 72c260cf0c71eb01eb13100b751e9d5007d00b70 in the 3.10 branch:

struct _Py_unicode_state {
(...)

-    PyObject *interned;

+    // Unused member kept for ABI backward compatibility with Python 3.10.0:
+    // see bpo-46006.
+    PyObject *unused_interned;

(...)
}


I can keep the "gc" member in PyInterpreterState, but adding a new "gc" member in the _PyRuntimeState structure also causes the ABI CI check to fail.
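
The effect of keeping an unused member can be sketched with ctypes. The structures below are invented for illustration (three members only, not the real _Py_unicode_state or _PyRuntimeState), so this only shows the general layout rule, not what the ABI check actually compares.

```python
import ctypes

# Hypothetical three-member struct standing in for a released layout.
class Original(ctypes.Structure):
    _fields_ = [
        ("before", ctypes.c_int),
        ("interned", ctypes.c_void_p),
        ("after", ctypes.c_int),
    ]


# Removing the member shifts everything that follows it.
class MemberRemoved(ctypes.Structure):
    _fields_ = [
        ("before", ctypes.c_int),
        ("after", ctypes.c_int),
    ]


# Renaming the member to "unused_..." keeps every offset and the size.
class MemberKeptUnused(ctypes.Structure):
    _fields_ = [
        ("before", ctypes.c_int),
        ("unused_interned", ctypes.c_void_p),
        ("after", ctypes.c_int),
    ]


# Appending a member keeps existing offsets but grows the struct.
class MemberAppended(ctypes.Structure):
    _fields_ = [
        ("before", ctypes.c_int),
        ("interned", ctypes.c_void_p),
        ("after", ctypes.c_int),
        ("gc", ctypes.c_void_p),
    ]


for cls in (Original, MemberRemoved, MemberKeptUnused, MemberAppended):
    print(f"{cls.__name__:17} offset(after)={cls.after.offset:2} "
          f"sizeof={ctypes.sizeof(cls)}")
```

Only the "kept unused" variant is indistinguishable from the original layout. A member added anywhere, even at the end, changes the size of the structure, which is the kind of difference a layout comparison can notice.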
msg410444 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2022-01-12 23:41
> adding a new "gc" member in the _PyRuntimeState structure also causes the ABI CI check to fail.

What if you move it to the end of the struct?
msg410446 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-13 00:01
I modified PR 30565 (3.10) and PR 30566 (3.9) to fix the ABI. I added _PyGC_GetState() which always uses PyInterpreterState.gc of the main interpreter.
msg410447 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-13 00:08
I wrote 3 scripts to reproduce the bug in a more reliable way. So I just have to type "bisect" and it runs the test 12 times.

(1) bisect.bat:
---
@"C:\vstinner\python\3.9\PCbuild\amd64\python_d.exe" bisect.py
---


(2) bisect.py:
---
import subprocess
import os
import sys

BISECT = False

def run(*args):
    print("+ ", ' '.join(args))
    env = dict(os.environ)
    env['PYTHONFAULTHANDLER'] = '1'
    proc = subprocess.run(args, env=env)
    exitcode = proc.returncode
    if exitcode:
        print()
        print(f"COMMAND FAILED: {exitcode}")
        if BISECT:
            print()
            print("type: git bisect bad")
        sys.exit(exitcode)

python = sys.executable
#script = "win_py399_crash_reproducer.py"
script = "bug.py"
nrun = 12
for i in range(1, nrun+1):
    print(f"Run #{i}/{nrun}")
    if i % 2:
        run(python, script)
    else:
        run(python, "-X", "dev", script)

if BISECT:
    print()
    print("Not reproduced")
    print()
    run("git", "checkout", ".")
    run("git", "bisect", "good")
---


(3) win_py399_crash_reproducer.py (import "_sre"):
---
# When this program is run on windows using python 3.9.9 it crashes about 50%
# of the time.

import _testcapi
import threading

code = """
import _sre
print("exit subinterpreter")
"""

def doIt():
    _testcapi.run_in_subinterp(code)

tt=[]

for i in range(16):
    t = threading.Thread(target=doIt)
    t.start()
    tt.append(t)

for t in tt:
    t.join()
print("exit main")
---


Example:
---
vstinner@DESKTOP-DK7VBIL C:\vstinner\python\3.9>bisect
Run #1/12
+  C:\vstinner\python\3.9\PCbuild\amd64\python_d.exe bug.py
Run #2/12
+  C:\vstinner\python\3.9\PCbuild\amd64\python_d.exe -X dev bug.py
Run #3/12
+  C:\vstinner\python\3.9\PCbuild\amd64\python_d.exe bug.py
Windows fatal exception: access violation
(...)
Current thread 0x00000420 (most recent call first):
  File "C:\vstinner\python\3.9\bug.py", line 13 in doIt
  File "C:\vstinner\python\3.9\lib\threading.py", line 910 in run
  File "C:\vstinner\python\3.9\lib\threading.py", line 973 in _bootstrap_inner
  File "C:\vstinner\python\3.9\lib\threading.py", line 930 in _bootstrap
(...)
COMMAND FAILED: 3221225477
---
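For readers unfamiliar with Windows exit codes: the value 3221225477 printed by bisect.py is 0xC0000005, the NTSTATUS code STATUS_ACCESS_VIOLATION, which matches the "access violation" reported by faulthandler. A small helper to decode it could look like this (describe_exit_code is a hypothetical name used only for illustration, not part of bisect.py):

```python
# Decode a Windows process exit code such as the one printed by bisect.py.
# 0xC0000005 is the NTSTATUS value STATUS_ACCESS_VIOLATION.
STATUS_ACCESS_VIOLATION = 0xC0000005


def describe_exit_code(exitcode: int) -> str:
    """Return a short hexadecimal description of a process exit code."""
    # The same code may be reported as unsigned (3221225477) or as a
    # signed 32-bit value (-1073741819); normalize to unsigned first.
    code = exitcode & 0xFFFFFFFF
    if code == STATUS_ACCESS_VIOLATION:
        return f"0x{code:08X} (STATUS_ACCESS_VIOLATION)"
    return f"0x{code:08X}"


print(describe_exit_code(3221225477))  # 0xC0000005 (STATUS_ACCESS_VIOLATION)
```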
msg410493 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-13 15:32
When the crash occurs, the _sre.compile function is not destroyed in the interpreter which created the function.



The crash is related to the _sre.compile method. This method is created in PyInit__sre(), which is called by "import _sre".

On Windows, the _sre module is not imported at startup. So it's imported first in a subinterpreter.

In Python 3.9, the _sre module doesn't use the multiphase initialization API and PyModuleDef.m_size = -1. When the module is imported, _PyImport_FixupExtensionObject() copies the module dictionary into PyModuleDef.m_copy.

In Py_Finalize() and Py_EndInterpreter(), _PyImport_Cleanup() does two things:

* (1) set _sre.__dict__['compile'] to None -> kill the first reference to the function
* (2) call _PyInterpreterState_ClearModules() which does Py_CLEAR(def->m_base.m_copy), clearing the cached copy of the _sre module dict -> kill the second reference

I modified Python to add an "ob_interp" member to PyObject to record the interpreter in which an object is created. I also modified meth_dealloc() to log when the _sre.compile function is deleted.

Extract of the reformatted output to see what's going on:
---
(...)

(1)
fixup: COPY _sre ModuleDef copy: def=00007FFF19209810 interp=000001EC1846F2A0

    (2)
    import: UPDATE(_sre ModuleDef copy): interp=000001EC184AB790

(3)
_PyImport_Cleanup: interp=000001EC1846F2A0
_PyInterpreterState_ClearModules: PY_CLEAR _sre ModuleDef m_copy: def=00007FFF19209810 interp=000001EC1846F2A0

    (4)
    _PyImport_Cleanup: interp=000001EC184AB790
    meth_dealloc(compile): m->ob_interp=000001EC1846F2A0, interp=000001EC184AB790

    Windows fatal exception: access violation
    (...)
---

Steps:

* (1)

  * interpreter #1 (000001EC1846F2A0) creates the _sre.compile function
  * interpreter #1 (000001EC1846F2A0) copies _sre module dict into PyModuleDef.m_copy
  * at this point, _sre.compile should have 2 references

* (2)

  * interpreter #2 (000001EC184AB790) imports _sre: it creates a new module object and copies the function from PyModuleDef.m_copy
  * at this point, _sre.compile should have 3 references

* (3)

  * interpreter #1 exits: Py_EndInterpreter() calls _PyImport_Cleanup()
  * at this point, _sre.compile should have 1 reference

* (4)

  * interpreter #2 exits: Py_EndInterpreter() calls _PyImport_Cleanup()
  * the last reference to _sre.compile is deleted: 0 references
  * meth_dealloc() is called

The first problem is that the function was created in interpreter #1 but deleted in interpreter #2.

The second problem is that the function is tracked by the GC and is part of the GC list of interpreter #1. When interpreter #2 destroys the function, the GC list of interpreter #1 has already been freed: PyGC_Head contains dangling pointers.
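The four steps can be replayed with a toy model. This is only a sketch of the reference counting described above: ToyInterp, ToyFunc and simulate are hypothetical names, and nothing here touches the real CPython structures.

```python
class ToyInterp:
    """Toy stand-in for an interpreter that owns a GC list."""

    def __init__(self, name):
        self.name = name
        self.gc_list = set()    # objects tracked by this interpreter's GC
        self.gc_freed = False   # becomes True once the interpreter has exited


class ToyFunc:
    """Toy stand-in for the _sre.compile C function object."""

    def __init__(self, owner):
        self.owner = owner      # interpreter that created the object
        self.refcnt = 0
        owner.gc_list.add(self)  # tracked by the creator's GC list


def simulate():
    interp1 = ToyInterp("#1")
    interp2 = ToyInterp("#2")
    refcounts = []

    # (1) interpreter #1 imports _sre: module dict + PyModuleDef.m_copy
    func = ToyFunc(interp1)
    func.refcnt += 2
    refcounts.append(func.refcnt)

    # (2) interpreter #2 imports _sre: its module dict is filled from m_copy
    func.refcnt += 1
    refcounts.append(func.refcnt)

    # (3) interpreter #1 exits: its dict entry and m_copy are cleared,
    #     then its GC state is freed while func is still in its list
    func.refcnt -= 2
    interp1.gc_freed = True
    refcounts.append(func.refcnt)

    # (4) interpreter #2 exits: the last reference goes away
    func.refcnt -= 1
    refcounts.append(func.refcnt)
    deallocating_interp = interp2

    wrong_interp = deallocating_interp is not func.owner
    owner_gc_already_freed = func.owner.gc_freed
    return refcounts, wrong_interp, owner_gc_already_freed


print(simulate())  # ([2, 3, 1, 0], True, True)
```

Both flags being true corresponds to the two problems: the object is deallocated by an interpreter that did not create it, and the GC list it is still linked into no longer exists.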
msg410497 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-13 17:18
Oh. I managed to write a simple fix which doesn't require reverting the whole "per-interpreter GC" change: GH-30577.
msg410498 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-13 17:31
This issue has a complex history.

(*) I made the GC state per-interpreter: commit 7247407c35330f3f6292f1d40606b7ba6afd5700 (Nov 20, 2019)

(*) This change triggered a _PyImport_FixupExtensionObject() bug in subinterpreters, which I fixed with commit 82c83bd907409c287a5bd0d0f4598f2c0538f34d (Nov 22, 2019)

(*) My _PyImport_FixupExtensionObject() fix introduced the bpo-44050 regression, which was fixed by commit b9bb74871b27d9226df2dd3fce9d42bda8b43c2b (Oct 5, 2021)

(*) A race condition in the _asyncio extension was identified and fixed by commit b127e70a8a682fe869c22ce04c379bd85a00db67 (Jan 7, 2022)

(*) I identified a race condition introduced by the per-interpreter GC state change: I proposed GH-30577 to fix it.


So far, the GC race condition has only been reproduced on Windows with Python 3.9 and the _sre extension. On Python 3.10 and newer, it's harder to reproduce the crash using stdlib extensions since many of them have been ported to the multi-phase initialization API.

The GC race condition involves dangling pointers and depends on the memory allocator and when GC collections are triggered.

The bug is that a C function object (_sre.compile) is created in an interpreter, tracked by the GC list of this interpreter, and then it is destroyed and untracked in another interpreter. The problem is that the object is untracked after the GC list has been destroyed and so "prev" and "next" objects of the PyGC_Head structure *can* become dangling pointers.

It's unclear to me which objects are the "prev" and "next" of the C function causing the crash (_sre.compile). At least, it seems like it's also used by more than one interpreter: it should *not* be done, see bpo-40533.
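To illustrate why untracking after the GC list is gone is dangerous, and why untracking every object when the GC state is finalized avoids it, here is a toy doubly linked list in the spirit of PyGC_Head. It is a simplified model under my own assumptions (Node, GCList, untrack and demo are hypothetical names), not the code of GH-30577.

```python
class Node:
    """Toy PyGC_Head: an object is tracked when it is linked into a list."""

    def __init__(self, name):
        self.name = name
        self.prev = None
        self.next = None

    @property
    def tracked(self):
        return self.next is not None


class GCList:
    """Toy per-interpreter GC list: circular, with a sentinel head."""

    def __init__(self):
        self.head = Node("<head>")
        self.head.prev = self.head.next = self.head
        self.freed = False

    def track(self, node):
        last = self.head.prev
        last.next = node
        node.prev = last
        node.next = self.head
        self.head.prev = node

    def fini(self, untrack_all):
        if untrack_all:
            # Idea behind the fix: unlink every remaining object first.
            node = self.head.next
            while node is not self.head:
                following = node.next
                node.prev = node.next = None
                node = following
            self.head.prev = self.head.next = self.head
        self.freed = True  # models freeing the interpreter's GC state


def untrack(node):
    """Unlink node; return True if neighbours had to be written to."""
    if not node.tracked:
        return False  # already untracked: nothing to touch
    node.prev.next = node.next
    node.next.prev = node.prev
    node.prev = node.next = None
    return True


def demo(untrack_all):
    gc = GCList()
    func = Node("compile")
    gc.track(func)
    gc.fini(untrack_all)   # the owning interpreter exits
    return untrack(func)   # another interpreter destroys the object later


print(demo(untrack_all=False))  # True: would write through dangling pointers in C
print(demo(untrack_all=True))   # False: the late untrack is a no-op
```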
msg410500 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2022-01-13 18:12
> (*) I made the GC state per-interpreter: commit 7247407c35330f3f6292f1d40606b7ba6afd5700 (Nov 20, 2019)

FYI, this was done by me in an earlier commit which we ended up
reverting.  Later you basically un-reverted that.

> The bug is that a C function object (_sre.compile) is created in an interpreter, tracked by the GC list of this interpreter, and then it is destroyed and untracked in another interpreter.

FWIW, at one point I had a branch that supported sharing read-only
Py_Buffer data.  When the receiving interpreter was done with it I'd
call Py_AddPendingCall() to schedule the cleanup in the "owner"
interpreter.  However, this only worked because I kept track of the
owner.  Adding that pointer to every object wouldn't be feasible but I
suppose there are other things we could do that wouldn't be super
inefficient, like don't worry about it for the main interpreter, use a
hash table (Victor's idea), borrow some of the bits of the PyObject
head to store a flag or even an index into an array (if there are only
a few interpreters), or even make the allocator per-interpreter and
then extrapolate the interpreter based on the object's address.

Regardless, it is still much simpler to make all objects per-interpreter.
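A rough sketch of the "keep track of the owner and schedule the cleanup there" idea, written as a toy model in Python. All names (ToyInterp, OWNERS, register, release, demo) are hypothetical; add_pending_call only imitates the role that Py_AddPendingCall() plays in the description above, and the dict stands in for the hash table idea.

```python
from collections import deque


class ToyInterp:
    """Toy interpreter with a queue of pending calls."""

    def __init__(self, name):
        self.name = name
        self.pending = deque()
        self.cleaned = []   # objects cleaned up by this interpreter

    def add_pending_call(self, func, arg):
        self.pending.append((func, arg))

    def run_pending_calls(self):
        while self.pending:
            func, arg = self.pending.popleft()
            func(self, arg)


# Side table mapping id(obj) to (owner, obj). Keeping obj in the value
# keeps it alive, so the id cannot be reused while it is registered.
OWNERS = {}


def register(obj, owner):
    OWNERS[id(obj)] = (owner, obj)


def cleanup(interp, obj):
    interp.cleaned.append(obj)


def release(obj, current):
    """Called by the interpreter that is done with a shared object."""
    owner, _ = OWNERS.pop(id(obj))
    if owner is current:
        cleanup(owner, obj)                   # same interpreter: clean up now
    else:
        owner.add_pending_call(cleanup, obj)  # defer to the owner


def demo():
    main, sub = ToyInterp("main"), ToyInterp("sub")
    buf = bytearray(b"shared")
    register(buf, main)
    release(buf, sub)                 # sub is done, but does not clean up
    before = list(main.cleaned)
    main.run_pending_calls()          # the owner performs the cleanup
    return before, main.cleaned, sub.cleaned


print(demo())  # ([], [bytearray(b'shared')], [])
```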
msg410505 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-13 18:28
New changeset 1a4d1c1c9b08e75e88aeac90901920938f649832 by Victor Stinner in branch 'main':
bpo-46070: _PyGC_Fini() untracks objects (GH-30577)
https://github.com/python/cpython/commit/1a4d1c1c9b08e75e88aeac90901920938f649832
msg410507 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-13 18:36
I manually tested my fix GH-30580 using:

* (1) attached win_py399_crash_reproducer.py
* (2) https://bugs.python.org/issue46070#msg410447 method

Without my fix, I can easily reproduce the crash with (1) and (2).

With my fix, I can no longer reproduce the crash with (1) or (2).
msg410509 - (view) Author: miss-islington (miss-islington) Date: 2022-01-13 18:50
New changeset e6bb17fe29713368e1fd93d9ac9611017c4f570c by Miss Islington (bot) in branch '3.10':
bpo-46070: _PyGC_Fini() untracks objects (GH-30577)
https://github.com/python/cpython/commit/e6bb17fe29713368e1fd93d9ac9611017c4f570c
msg410510 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-13 19:12
New changeset 52937c26adc35350ca0402070160cf6dc838f359 by Victor Stinner in branch '3.9':
bpo-46070: _PyGC_Fini() untracks objects (GH-30577) (GH-30580)
https://github.com/python/cpython/commit/52937c26adc35350ca0402070160cf6dc838f359
msg410513 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-13 19:45
It would be nice to add some tests.
msg410517 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-13 20:13
Victor:
> (*) I made the GC state per-interpreter: commit 7247407c35330f3f6292f1d40606b7ba6afd5700 (Nov 20, 2019)

Eric Snow:
> FYI, this was done by me in an earlier commit which we ended up
reverting.  Later you basically un-reverted that.

Well, I recall that your change had to be reverted 2 or 3 times because there were many crashes on FreeBSD, and no one understood why it crashed. The root cause was bugs related to the GIL and daemon threads. It took me a while (and multiple commits) to identify and fix all of them:
https://vstinner.github.io/gil-bugfixes-daemon-threads-python39.html

I decided to split your work into smaller changes to better debug these crashes. bpo-36854 contains a few changes, but these changes are based on work that I pushed earlier.

For example, there was a tricky bug related to clearing a Python thread state:
https://github.com/python/cpython/commit/9da7430675ceaeae5abeb9c9f7cd552b71b3a93a

Also, once the GC was made per interpreter, we started to discover more and more tricky reference leaks:
https://vstinner.github.io/subinterpreter-leaks.html

I spent significant time reordering the code of Py_Finalize() and Py_EndInterpreter() to clear objects earlier or in a different order. Recently, I made sure that the free lists can no longer be used after they are cleared. I took some notes at:
https://pythondev.readthedocs.io/finalization.html

One of the hardest fixes was commit 9ad58acbe8b90b4d0f2d2e139e38bb5aa32b7fb6 of bpo-19466. To make this change, I first had to fix a very old bug in PyThreadState_Clear() with commit 5804f878e779712e803be927ca8a6df389d82cdf (bpo-20526).

Well, it was a long journey and it's not done yet :-)
msg410518 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-13 20:24
pyobject_ob_interp.patch: Quick & dirty patch that I wrote to add PyObject.ob_interp, which stores the interpreter in which an object was created.
msg410520 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-13 20:27
I created bpo-46368: "faulthandler: add the ability to dump all interpreters, not only the current interpreter".
msg415954 - (view) Author: Alban Browaeys (prahal) Date: 2022-03-24 16:27
sqlite3_crash.py does not crash on Python 3.9 up to and including 3.9.9, nor on the main branch at commit 12c0012cf97d21bc637056983ede0eaf4c0d9c33. I confirm that it crashes on Python 3.9.10, 3.9.11, and the 3.10 branch at commit 9006b4471cc4d74d0cbf782d841a330b0f46a3d0.
It is fixed in main branch commit 12c0012cf97d21bc637056983ede0eaf4c0d9c33 .

Currently bisecting both 3.9.9 to 3.9.10 and 3.10 to 3.11 main branch for bad to good.

The patches in this bug report are already merged in the 3.10 branch, which crashes.

I cannot reproduce win_py399_crash_reproducer.py which I used as a basis for this test case.
The backtrace is the same as the ones from the crashes of the Kodi addons (for me, the Jellyfin Kodi addon), which is the initial report.
This looks like importing sqlite3 in threads plays badly.

I can reproduce on aarch64 (Odroid C2) LibreElec and builds of cpython on Debian stable x86_64 (the extensive testing of the broken interpreters is done on x86_64 Debian stable bullseye with a cpython clone and running from builddir).
msg415955 - (view) Author: Alban Browaeys (prahal) Date: 2022-03-24 16:28
By "It is fixed in main branch commit 12c0012cf97d21bc637056983ede0eaf4c0d9c33 ." I mean that this commit is good, not that this commit fixes the issue.
msg415963 - (view) Author: Alban Browaeys (prahal) Date: 2022-03-24 17:41
I bisected the 3.9 branch: commit 52937c26adc35350ca0402070160cf6dc838f359, "bpo-46070: _PyGC_Fini() untracks objects (GH-30577) (GH-30580)", is the one that broke 3.9. This change is fine in my main branch builds, but not in the 3.9 branch.

Proceeding with 3.10 branch bisect of first bad commit and redoing main branch first good commit.
msg415964 - (view) Author: Alban Browaeys (prahal) Date: 2022-03-24 18:24
The 3.10 branch is fixed if I revert e6bb17fe29713368e1fd93d9ac9611017c4f570c, "bpo-46070: _PyGC_Fini() untracks objects (GH-30577)". This holds whether I revert it atop the current head 9006b4471cc4d74d0cbf782d841a330b0f46a3d0 or test the commit just before e6bb17fe29713368e1fd93d9ac9611017c4f570c was merged.

Since it made no sense, given the history of this bug report, that this fix broke the branch, I tried v3.10.1, which lacks the fix. Vanilla v3.10.1 is broken too. Also, applying "_PyGC_Fini untracks objects" on top of 3.10.1 does not fix the test case crash.

I am puzzled. Will try to bisect the commit that fixed the testcase in the 3.10 branch before "_PyGC_Fini untracks objects" was merged and after 3.10.1.
msg415970 - (view) Author: Alban Browaeys (prahal) Date: 2022-03-24 19:07
While bisecting main branch I did not only get segfaults but also exceptions, namely:

$ ./python  ../python-crash-kodi/sqlite3_crash.py 
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 81, in register_adapters_and_converters
    register_adapter(datetime.datetime, adapt_datetime)
KeyError: 'isoformat'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
    register_converter("timestamp", convert_timestamp)
KeyError: 'timepart_full'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
    register_converter("timestamp", convert_timestamp)
KeyError: 'day'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
    register_converter("timestamp", convert_timestamp)
KeyError: 'month'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
    register_converter("timestamp", convert_timestamp)
KeyError: 'year'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
    register_converter("timestamp", convert_timestamp)
KeyError: 'timepart'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "/home/prahal/Projects/WIP/libreelec/cpython_bisect/Lib/sqlite3/dbapi2.py", line 83, in register_adapters_and_converters
    register_converter("timestamp", convert_timestamp)
KeyError: 'datepart'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
KeyError: 'convert_timestamp'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
KeyError: 'convert_date'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
KeyError: 'adapt_datetime'
Exception ignored deletion of interned string failed:
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
KeyError: 'adapt_date'


The 3.10 branch bisect pointed to one commit after 3.10.1 that fixes the crash: 72c260cf0c71eb01eb13100b751e9d5007d00b70, "[3.10] bpo-46006: Revert "bpo-40521: Per-interpreter interned strings (GH-20085)" (GH-30422) (GH-30425)", which makes sense given the main branch logs. It is commit 35d6540c904ef07b8602ff014e520603f84b5886 in the main branch.

What remains to be seen is why "bpo-46070: _PyGC_Fini() untracks objects (GH-30577)" looks fine in the main branch (though it has no effect on the import crash) but not in the 3.9 and 3.10 branches.
Note that in the main branch, "bpo-46006: Revert "bpo-40521: Per-interpreter interned strings (GH-20085)" (GH-30422)" was already applied when "bpo-46070: _PyGC_Fini() untracks objects (GH-30577)" went in, so it was also unrelated to the fix for the initial report.
msg415975 - (view) Author: Alban Browaeys (prahal) Date: 2022-03-24 21:03
Bisecting main for bug.py with local Python builds leads me to commit b127e70a8a682fe869c22ce04c379bd85a00db67, "bpo-46070: Fix asyncio initialization guard (GH-30423)", as the one that fixed bug.py most of the time.

At times I can still make bug.py segfault, be it on Python 3.9, 3.10 or the main branch. It is pretty hard (I can run a batch of 200 without an issue), but it seems easier to reproduce with the CPU under stress: then I can get two segfaults in a batch of 50 runs.

Bash:
for i in {1..50}; do ./python  ../python-crash-kodi/bug.py ; done
or sh:
for i in `seq 1 50`; do ./python  ../python-crash-kodi/bug.py ; done

with:
stress -c `nproc --all` at the same time.
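The same batch can be driven from Python when a count of failing runs is wanted instead of eyeballing the output. This is only a sketch: count_failures is a hypothetical helper, and the interpreter and script paths passed to it are whatever was used in the shell loops above.

```python
import subprocess
import sys


def count_failures(cmd, runs=50):
    """Run cmd `runs` times; return a list of (run number, return code)
    for every run that did not exit with status 0."""
    failures = []
    for i in range(1, runs + 1):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX, a negative return code -N means the child was killed
        # by signal N (for example -11 for SIGSEGV).
        if proc.returncode != 0:
            failures.append((i, proc.returncode))
    return failures


# Harmless self-check: a command that always succeeds yields no failures.
print(count_failures([sys.executable, "-c", "pass"], runs=3))  # []
```

For the reproducer, the call would be along the lines of count_failures(["./python", "../python-crash-kodi/bug.py"], runs=50), with stress running in another terminal.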
msg415977 - (view) Author: Alban Browaeys (prahal) Date: 2022-03-24 21:16
I managed to reproduce the bug.py crash with main branch up to commit 12c0012cf97d21bc637056983ede0eaf4c0d9c33 .
History
Date  User  Action  Args
2022-04-11 14:59:53  admin  set  github: 90228
2022-03-24 21:16:20  prahal  set  files: + bug.py_asyncio_cpustressed-crash.log; messages: + msg415977
2022-03-24 21:03:14  prahal  set  messages: + msg415975
2022-03-24 19:07:11  prahal  set  messages: + msg415970
2022-03-24 18:24:20  prahal  set  messages: + msg415964
2022-03-24 17:41:24  prahal  set  messages: + msg415963
2022-03-24 16:28:37  prahal  set  messages: + msg415955
2022-03-24 16:27:15  prahal  set  files: + sqlite3_crash.py; nosy: + prahal; messages: + msg415954
2022-01-21 00:39:17  vstinner  set  messages: - msg411050
2022-01-20 22:21:22  vstinner  set  messages: + msg411050
2022-01-20 20:45:48  jokot3  set  nosy: + jokot3
2022-01-13 20:27:02  vstinner  set  messages: + msg410520
2022-01-13 20:24:19  vstinner  set  files: + pyobject_ob_interp.patch; messages: + msg410518
2022-01-13 20:13:31  vstinner  set  messages: + msg410517
2022-01-13 19:45:10  vstinner  set  messages: + msg410513
2022-01-13 19:12:58  vstinner  set  messages: + msg410510
2022-01-13 18:50:18  miss-islington  set  messages: + msg410509
2022-01-13 18:36:52  vstinner  set  messages: + msg410507
2022-01-13 18:30:47  vstinner  set  pull_requests: + pull_request28779
2022-01-13 18:28:55  miss-islington  set  pull_requests: + pull_request28778
2022-01-13 18:28:51  miss-islington  set  pull_requests: + pull_request28777
2022-01-13 18:28:40  vstinner  set  messages: + msg410505
2022-01-13 18:12:00  eric.snow  set  messages: + msg410500
2022-01-13 17:31:53  vstinner  set  messages: + msg410498
2022-01-13 17:18:29  vstinner  set  messages: + msg410497
2022-01-13 17:04:03  vstinner  set  pull_requests: + pull_request28776
2022-01-13 15:32:44  vstinner  set  messages: + msg410493
2022-01-13 00:08:19  vstinner  set  messages: + msg410447
2022-01-13 00:01:23  vstinner  set  messages: + msg410446
2022-01-12 23:41:19  eric.snow  set  messages: + msg410444
2022-01-12 23:31:33  vstinner  set  messages: + msg410442
2022-01-12 23:26:46  vstinner  set  pull_requests: + pull_request28769
2022-01-12 23:26:43  vstinner  set  pull_requests: + pull_request28768
2022-01-12 23:26:39  vstinner  set  pull_requests: + pull_request28767
2022-01-07 19:36:13  vstinner  set  messages: + msg410014
2022-01-07 19:33:21  vstinner  set  title: [subinterpreters] asyncio crash when importing _asyncio in subinterpreter (Python 3.8 regression) -> [subinterpreters] crash when importing _sre in subinterpreters in parallel (Python 3.9 regression)
2022-01-07 19:21:50  vstinner  set  messages: + msg410010
2022-01-07 17:45:00  vstinner  set  messages: + msg409991
2022-01-07 16:30:19  petr.viktorin  set  messages: + msg409984
2022-01-07 16:13:10  vstinner  set  messages: + msg409983
2022-01-07 16:07:36  vstinner  set  messages: + msg409980
2022-01-07 14:40:24  vstinner  set  messages: + msg409970
2022-01-07 14:36:10  miss-islington  set  messages: + msg409966
2022-01-07 14:35:19  miss-islington  set  messages: + msg409965
2022-01-07 14:08:45  vstinner  set  messages: + msg409962
2022-01-07 14:08:45  miss-islington  set  pull_requests: + pull_request28658
2022-01-07 14:08:41  miss-islington  set  nosy: + miss-islington; pull_requests: + pull_request28657
2022-01-05 20:07:45  erlendaasland  set  messages: + msg409802
2022-01-05 19:29:10  erlendaasland  set  keywords: + patch; stage: test needed -> patch review; pull_requests: + pull_request28628
2022-01-05 19:13:50  erlendaasland  set  messages: + msg409798
2022-01-05 18:52:09  erlendaasland  set  messages: + msg409795
2022-01-05 15:57:15  vstinner  set  messages: + msg409780
2022-01-05 15:55:28  vstinner  set  title: _PyImport_FixupExtensionObject() regression causing a crash in subintepreters -> [subinterpreters] asyncio crash when importing _asyncio in subinterpreter (Python 3.8 regression)
2022-01-05 15:48:54  vstinner  set  messages: + msg409778
2022-01-05 15:20:00  vstinner  set  messages: + msg409772
2022-01-04 15:02:54  vstinner  set  messages: + msg409686
2022-01-03 10:27:26  graysky  set  messages: + msg409573
2022-01-01 14:36:31  uckelman  set  nosy: + uckelman; messages: + msg409461
2021-12-30 18:39:11  M-Reimer  set  nosy: + M-Reimer
2021-12-28 13:19:36  corona10  set  messages: + msg409255
2021-12-28 13:17:05  corona10  set  nosy: + corona10
2021-12-27 23:22:32  erlendaasland  set  nosy: + erlendaasland; messages: + msg409245
2021-12-22 08:20:05  hroncok  set  nosy: + hroncok
2021-12-16 00:53:27  vstinner  set  messages: + msg408665
2021-12-16 00:41:32  vstinner  set  title: broken subinterpreters -> _PyImport_FixupExtensionObject() regression causing a crash in subintepreters
2021-12-16 00:40:46  vstinner  set  files: + bug.py; messages: + msg408664
2021-12-16 00:35:36  vstinner  set  messages: + msg408662
2021-12-15 18:35:10  bsteffensmeier  set  files: + win_py399_crash_reproducer.py; nosy: + bsteffensmeier; messages: + msg408633
2021-12-15 18:27:52  ndjensen  set  nosy: + ndjensen
2021-12-14 21:38:04  petr.viktorin  set  messages: + msg408568
2021-12-14 20:17:18  graysky  set  messages: + msg408558
2021-12-14 16:03:26  graysky  set  messages: + msg408541
2021-12-14 15:44:53  petr.viktorin  set  nosy: + vstinner; messages: + msg408540
2021-12-14 15:36:55  eric.snow  set  type: behavior; components: + Subinterpreters; versions: + Python 3.9, Python 3.11; nosy: + petr.viktorin, eric.snow, shihai1991; messages: + msg408539; stage: test needed
2021-12-14 10:57:50  graysky  create