This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: contextvars: hamt_alloc() must initialize h_root and h_count fields
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ned.deily, vstinner, yselivanov
Priority: Keywords: patch

Created on 2018-06-07 23:44 by vstinner, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 7504 merged yselivanov, 2018-06-08 00:19
PR 7505 merged miss-islington, 2018-06-08 00:30
Messages (12)
msg318990 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-07 23:44
test_asyncio started to crash in https://github.com/python/cpython/pull/7487

I debugged the crash with Yury: _PyHAMT_New() triggers indirectly a GC collection at "o->h_root = hamt_node_bitmap_new(0);". Problem: the object that is being created is already tracked by the GC, whereas its h_root field is not set. Then the GC does crash because the field is a random pointer.

hamt_alloc() must initialize h_root and h_count before tracking the object in the GC.

Thinking about the GC when writing an object constructor is hard :-(

Maybe we should add a debug option to trigger the GC more often to make such random bug more likely. contextvars has been implemented a few months ago, and we only spotted the bug a few days before Python 3.7.0 final release...
msg318991 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-08 00:06
One solution to trigger so crash more frequently is to reduce the threshold of the generation 0 of the garbage collector.

Here is a patch to do that when using -X dev: change the default threshold from 700 to 5 for the generation 0.

With this patch, "PYTHONDEVMODE=1 python -m test test_context" does also crash as expected.

diff --git a/Python/pylifecycle.c b/Python/pylifecycle.c
index 958219b744..81218fb964 100644
--- a/Python/pylifecycle.c
+++ b/Python/pylifecycle.c
@@ -622,6 +622,10 @@ _Py_InitializeCore(const _PyCoreConfig *core_config)
         return _Py_INIT_ERR("runtime core already initialized");
     }
 
+    if (core_config->dev_mode) {
+        _PyRuntime.gc.generations[0].threshold = 5;
+    }
+
     /* Py_Finalize leaves _Py_Finalizing set in order to help daemon
      * threads behave a little more gracefully at interpreter shutdown.
      * We clobber it here so the new interpreter can start with a clean
msg318992 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-08 00:11
Don't forget to merge https://github.com/python/cpython/pull/7487 once this bug is fixed. I would like to see https://bugs.python.org/issue33792 in Python 3.7 *if possible* (the feature now seems "required" for the new asyncio.loop() function).
msg318994 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-06-08 00:21
Thanks Victor for debugging this.  I made a PR (which is now trivial) and double checked all other calls to GCTrack in context.c & hamt.c.
msg318997 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-06-08 00:29
New changeset 378c53cc3187dba57c7560ccc2557f516c8a7bc8 by Yury Selivanov in branch 'master':
bpo-33803: Fix a crash in hamt.c (#7504)
https://github.com/python/cpython/commit/378c53cc3187dba57c7560ccc2557f516c8a7bc8
msg319001 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-06-08 00:44
New changeset a971a6fdb111bb62911ccf45aa9fe734e2e7a590 by Yury Selivanov (Miss Islington (bot)) in branch '3.7':
bpo-33803: Fix a crash in hamt.c (GH-7504) (GH-7505)
https://github.com/python/cpython/commit/a971a6fdb111bb62911ccf45aa9fe734e2e7a590
msg319013 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-06-08 03:44
> +    if (core_config->dev_mode) {
> +        _PyRuntime.gc.generations[0].threshold = 5;
> +    }

I'd love to have a flag to turn this on, or maybe we should enable it for -X dev.
msg319026 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-08 06:51
> I'd love to have a flag to turn this on, or maybe we should enable it for -X dev.

Well, there is already a public API to do it manually: gc.set_threshold(5).

I'm not sure about a threshold of 5 for -X dev: that's very very low and so kill performances. -X dev can be slower, not but 10x slower for example. I didn't measure performance.

Such bug is rare, so I'm not sure about putting gc.set_threshold(5) in -X dev.
msg319032 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-08 07:59
FYI the bug was also seen 8 hours ago on a different asyncio PR:
https://github.com/python/cpython/pull/423#issuecomment-395681351
msg319033 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-08 08:01
> FYI the bug was also seen 8 hours ago on a different asyncio PR:

Oops, my message is misleading: the crash is not a regression. I just wanted to notice that a different PR also triggered the crash before the crash has been fixed. I'm just surprised that the bug decided to wake up 4 months after contextvars has been merged. Why yesterday and not previously?
msg354172 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-08 07:09
Notes for myself.

In bpo-38392, I modified PyObject_GC_Track() in debug mode to detect this bug.

For example, this bug can be reproduced with this change:
---
diff --git a/Python/hamt.c b/Python/hamt.c
index 28b4e1ef6c..d7dd555d15 100644
--- a/Python/hamt.c
+++ b/Python/hamt.c
@@ -2478,8 +2478,6 @@ hamt_alloc(void)
     if (o == NULL) {
         return NULL;
     }
-    o->h_count = 0;
-    o->h_root = NULL;
     o->h_weakreflist = NULL;
     PyObject_GC_Track(o);
     return o;
---

Python now detects the bug in debug mode:
---
$ ./python -m test -v test_context
(...)
test_context_copy_1 (test.test_context.ContextTest) ...
Modules/gcmodule.c:1931: visit_validate: Assertion failed: PyObject_GC_Track() object is not valid
Enable tracemalloc to get the memory block allocation traceback

object address  : 0x7f2b17408d70
object refcount : 1
object type     : 0x76bc20
object type name: hamt
object repr     : <hamt object at 0x7f2b17408d70>

Fatal Python error: _PyObject_AssertFailed
Python runtime state: initialized

Current thread 0x00007f2b25590740 (most recent call first):
  File "/home/vstinner/python/master/Lib/test/test_context.py", line 322 in test_context_copy_1
  File "/home/vstinner/python/master/Lib/unittest/case.py", line 616 in _callTestMethod
  File "/home/vstinner/python/master/Lib/unittest/case.py", line 659 in run
  (...)
  File "/home/vstinner/python/master/Lib/runpy.py", line 192 in _run_module_as_main
Aborted (core dumped)
---
msg354199 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2019-10-08 12:47
> In bpo-38392, I modified PyObject_GC_Track() in debug mode to detect this bug.

Awesome!!!
History
Date User Action Args
2022-04-11 14:59:01adminsetgithub: 77984
2019-10-08 12:47:13yselivanovsetmessages: + msg354199
2019-10-08 07:09:47vstinnersetmessages: + msg354172
2018-08-30 11:23:33berker.peksagsettype: enhancement -> crash
2018-08-30 10:26:43ernest ruizsettype: crash -> enhancement
2018-06-08 08:01:01vstinnersetpriority: release blocker ->

messages: + msg319033
2018-06-08 07:59:33vstinnersetmessages: + msg319032
2018-06-08 06:51:36vstinnersetmessages: + msg319026
2018-06-08 03:44:16yselivanovsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2018-06-08 03:44:03yselivanovsetmessages: + msg319013
2018-06-08 00:44:12yselivanovsetmessages: + msg319001
2018-06-08 00:30:12miss-islingtonsetpull_requests: + pull_request7133
2018-06-08 00:29:58yselivanovsetmessages: + msg318997
2018-06-08 00:21:11yselivanovsetmessages: + msg318994
2018-06-08 00:19:45yselivanovsetkeywords: + patch
stage: patch review
pull_requests: + pull_request7132
2018-06-08 00:11:22vstinnersetmessages: + msg318992
2018-06-08 00:06:52vstinnersetmessages: + msg318991
2018-06-07 23:44:54vstinnercreate