
Author vstinner
Recipients M-Reimer, bsteffensmeier, corona10, eric.snow, erlendaasland, graysky, hroncok, miss-islington, ndjensen, petr.viktorin, shihai1991, uckelman, vstinner
Date 2022-01-13.17:31:53
Message-id <1642095113.7.0.233625691568.issue46070@roundup.psfhosted.org>
In-reply-to
Content
This issue has a complex history.

(*) I made the GC state per-interpreter: commit 7247407c35330f3f6292f1d40606b7ba6afd5700 (Nov 20, 2019)

(*) This change triggered a _PyImport_FixupExtensionObject() bug in subinterpreters; I fixed it with commit 82c83bd907409c287a5bd0d0f4598f2c0538f34d (Nov 22, 2019)

(*) My _PyImport_FixupExtensionObject() fix introduced the bpo-44050 regression, which was fixed by commit b9bb74871b27d9226df2dd3fce9d42bda8b43c2b (Oct 5, 2021)

(*) A race condition in the _asyncio extension was identified and fixed by commit b127e70a8a682fe869c22ce04c379bd85a00db67 (Jan 7, 2022)

(*) I identified a race condition introduced by the per-interpreter GC state change: I proposed GH-30577 to fix it.


So far, the GC race condition has only been reproduced on Windows with Python 3.9 and the _sre extension. On Python 3.10 and newer, it's harder to reproduce the crash using stdlib extensions since many of them have been ported to the multi-phase initialization API.

The GC race condition involves dangling pointers and depends on the memory allocator and when GC collections are triggered.

The bug is that a C function object (_sre.compile) is created in one interpreter and tracked by the GC list of that interpreter, and is then destroyed and untracked in another interpreter. The problem is that the object is untracked after the GC list has been destroyed, so the "prev" and "next" objects of the PyGC_Head structure *can* become dangling pointers.
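
To illustrate the mechanism described above, here is a toy Python model of a GC list. It is only a sketch, not CPython's implementation: the GCHead and Interp classes, the track(), finalize() and untrack() helpers, and the "freed" flag are all hypothetical names invented for this example. The "freed" flag stands in for memory that has been released, so the model can report which neighbours an unlink operation would write to without actually touching freed memory.

```python
class GCHead:
    """Toy stand-in for PyGC_Head: a node in a circular doubly linked list."""

    def __init__(self, name):
        self.name = name
        self.prev = self
        self.next = self
        self.freed = False  # models memory that has already been released


class Interp:
    """Hypothetical interpreter that owns one GC list (a sentinel node)."""

    def __init__(self, name):
        self.name = name
        self.gc_list = GCHead(f"{name}.gc_list")

    def track(self, node):
        # Link the node just before the sentinel (end of the list).
        head = self.gc_list
        last = head.prev
        last.next = node
        node.prev = last
        node.next = head
        head.prev = node

    def finalize(self):
        # The GC list goes away, but objects still linked into it
        # keep their old prev/next references.
        self.gc_list.freed = True


def untrack(node):
    """Unlink node and return the names of freed neighbours it wrote to."""
    stale = sorted({n.name for n in (node.prev, node.next) if n.freed})
    # Unlinking writes into both neighbours. In C, these writes would go
    # through dangling pointers if the neighbours were already freed.
    node.prev.next = node.next
    node.next.prev = node.prev
    node.prev = node.next = None
    return stale


# Safe order: untrack while the owning list still exists.
interp_b = Interp("B")
other = GCHead("other")
interp_b.track(other)
safe = untrack(other)

# Problematic order: the owning list is destroyed first, then the
# object is untracked later (for example, from another interpreter).
interp_a = Interp("A")
func = GCHead("_sre.compile")
interp_a.track(func)
interp_a.finalize()
stale = untrack(func)

print("safe order touched freed nodes:", safe)
print("late untrack touched freed nodes:", stale)
```

In this model the late untrack reports that it wrote into the destroyed list head, which is the kind of access that becomes a dangling pointer write in C. Whether such a write actually crashes depends on what the allocator has done with that memory in the meantime.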

It's unclear to me what the "prev" and "next" objects of the C function causing the crash (_sre.compile) are. At the very least, it seems like it's also used by more than one interpreter, which should *not* be done; see bpo-40533.