Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make struct module PEP-384 compatible #82257

Closed
DinoV opened this issue Sep 9, 2019 · 19 comments
Closed

Make struct module PEP-384 compatible #82257

DinoV opened this issue Sep 9, 2019 · 19 comments
Assignees
Labels
3.9 only security fixes extension-modules C modules in the Modules dir type-feature A feature request or enhancement

Comments

@DinoV
Copy link
Contributor

DinoV commented Sep 9, 2019

BPO 38076
Nosy @Yhg1s, @vstinner, @DinoV, @ericsnowcurrently, @pablogsal, @miss-islington, @eduardo-elizondo
PRs
  • bpo-38076: Make struct module PEP-384 compatible #15805
  • bpo-38076: Fix crash on shutdown caused by PEP-384ification of _struct #18038
  • bpo-38076 Clear the interpreter state only after clearing module globals #18039
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/DinoV'
    closed_at = <Date 2020-03-11.16:40:12.810>
    created_at = <Date 2019-09-09.16:35:08.392>
    labels = ['extension-modules', 'type-feature', '3.9']
    title = 'Make struct module PEP-384 compatible'
    updated_at = <Date 2020-03-11.16:40:12.809>
    user = 'https://github.com/DinoV'

    bugs.python.org fields:

    activity = <Date 2020-03-11.16:40:12.809>
    actor = 'petr.viktorin'
    assignee = 'dino.viehland'
    closed = True
    closed_date = <Date 2020-03-11.16:40:12.810>
    closer = 'petr.viktorin'
    components = ['Extension Modules']
    creation = <Date 2019-09-09.16:35:08.392>
    creator = 'dino.viehland'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 38076
    keywords = ['patch']
    message_count = 19.0
    messages = ['351524', '351608', '360127', '360129', '360132', '360137', '360138', '360206', '360211', '360412', '360413', '360414', '360415', '360416', '360417', '360500', '360648', '360656', '361342']
    nosy_count = 7.0
    nosy_names = ['twouters', 'vstinner', 'dino.viehland', 'eric.snow', 'pablogsal', 'miss-islington', 'eelizondo']
    pr_nums = ['15805', '18038', '18039']
    priority = None
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue38076'
    versions = ['Python 3.9']

    @DinoV
    Copy link
    Contributor Author

    DinoV commented Sep 9, 2019

    Make struct module PEP-384 compatible

    @DinoV DinoV added the 3.9 only security fixes label Sep 9, 2019
    @DinoV DinoV self-assigned this Sep 9, 2019
    @DinoV DinoV added the extension-modules C modules in the Modules dir label Sep 9, 2019
    @Yhg1s
    Copy link
    Member

    Yhg1s commented Sep 10, 2019

    New changeset 4f384af by T. Wouters (Dino Viehland) in branch 'master':
    bpo-38076: Make struct module PEP-384 compatible (bpo-15805)
    4f384af

    @tiran tiran closed this as completed Sep 10, 2019
    @tiran tiran added the type-feature A feature request or enhancement label Sep 10, 2019
    @pablogsal
    Copy link
    Member

    Executing this simple code after this commit segfaults:

    from multiprocessing.pool import Pool
    class A(object):
        def __init__(self):
            self.pool = Pool()
        def __del__(self):
            self.pool.close()
            self.pool.join()
    a = A()

    [1] 28019 segmentation fault ./python.exe ../lel.py

    The reason is that there is a call to PyModule_GetState with the module being NULL:

    • thread Support "bpo-" in Misc/NEWS #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
      • frame #0: 0x00000001000b9737 python.exePyModule_GetState(m=0x0000000000000000) at moduleobject.c:565:10 frame #1: 0x00000001013d5972 _struct.cpython-39d-darwin.sos_pack(self=0x00000001032fae50, args=0x00000001033051f8, nargs=1) at _struct.c:1858:5
        frame Rename README to README.rst and enhance formatting #2: 0x00000001013d4a8a _struct.cpython-39d-darwin.sopack(self=0x00000001013be890, args=0x00000001033051f0, nargs=2) at _struct.c:2165:14 frame #3: 0x00000001000b6e5d python.execfunction_vectorcall_FASTCALL(func=0x00000001013beb30, args=0x00000001033051f0, nargsf=9223372036854775810, kwnames=0x0000000000000000) at methodobject.c:380:24
        frame Update Python Software Foundation Copyright Year. #4: 0x00000001001efa89 python.exe_PyObject_Vectorcall(callable=0x00000001013beb30, args=0x00000001033051f0, nargsf=9223372036854775810, kwnames=0x0000000000000000) at abstract.h:106:21 frame #5: 0x00000001001efbfa python.execall_function(tstate=0x0000000101003ec0, pp_stack=0x00007ffeefbf6e60, oparg=2, kwnames=0x0000000000000000) at ceval.c:4984:13
        frame Add Pycharm's .idea directory to gitignore #6: 0x00000001001ea00c python.exe_PyEval_EvalFrameDefault(f=0x0000000103305050, throwflag=0) at ceval.c:3465:23 frame #7: 0x00000001001d8c87 python.exePyEval_EvalFrameEx(f=0x0000000103305050, throwflag=0) at ceval.c:737:12
        frame Change some mercurial/ hg.python.org references. #8: 0x000000010004374e python.exefunction_code_fastcall(co=0x0000000103165380, args=0x0000000103301400, nargs=2, globals=0x0000000103131d10) at call.c:293:14 frame #9: 0x000000010004317a python.exe_PyFunction_Vectorcall(func=0x000000010324d910, stack=0x00000001033013f0, nargsf=9223372036854775810, kwnames=0x0000000000000000) at call.c:331:20
        frame bpo-29474: Improve documentation for weakref.WeakValueDictionary #10: 0x00000001001efa89 python.exe_PyObject_Vectorcall(callable=0x000000010324d910, args=0x00000001033013f0, nargsf=9223372036854775810, kwnames=0x0000000000000000) at abstract.h:106:21 frame #11: 0x00000001001efbfa python.execall_function(tstate=0x0000000101003ec0, pp_stack=0x00007ffeefbf8950, oparg=2, kwnames=0x0000000000000000) at ceval.c:4984:13
        frame bpo-29524: Add Objects/call.c file #12: 0x00000001001ea09c python.exe_PyEval_EvalFrameDefault(f=0x0000000103301250, throwflag=0) at ceval.c:3482:23 frame #13: 0x00000001001d8c87 python.exePyEval_EvalFrameEx(f=0x0000000103301250, throwflag=0) at ceval.c:737:12
        frame Disable Travis docs job until a fix is found #14: 0x00000001001f126c python.exe_PyEval_EvalCodeWithName(_co=0x0000000103159930, globals=0x0000000103131d10, locals=0x0000000000000000, args=0x00000001032b2f68, argcount=2, kwnames=0x0000000000000000, kwargs=0x00000001032b2f78, kwcount=0, kwstep=1, defs=0x0000000103173f18, defcount=2, kwdefs=0x0000000000000000, closure=0x0000000000000000, name=0x000000010315c2e0, qualname=0x0000000103158430) at ceval.c:4296:14 frame #15: 0x00000001000435d0 python.exe_PyFunction_Vectorcall(func=0x000000010324d230, stack=0x00000001032b2f68, nargsf=9223372036854775810, kwnames=0x0000000000000000) at call.c:356:12
        frame Make Travis docs build more lenient #16: 0x00000001001efa89 python.exe_PyObject_Vectorcall(callable=0x000000010324d230, args=0x00000001032b2f68, nargsf=9223372036854775810, kwnames=0x0000000000000000) at abstract.h:106:21 frame #17: 0x00000001001efbfa python.execall_function(tstate=0x0000000101003ec0, pp_stack=0x00007ffeefbfa610, oparg=2, kwnames=0x0000000000000000) at ceval.c:4984:13
        frame Rename Doc/README.txt to Doc/README.rst #18: 0x00000001001ea09c python.exe_PyEval_EvalFrameDefault(f=0x00000001032b2de0, throwflag=0) at ceval.c:3482:23 frame #19: 0x00000001001d8c87 python.exePyEval_EvalFrameEx(f=0x00000001032b2de0, throwflag=0) at ceval.c:737:12
        frame Update link destination of the Mersenne Twister homepage #20: 0x000000010004374e python.exefunction_code_fastcall(co=0x00000001031d5860, args=0x0000000103302fd8, nargs=2, globals=0x00000001031d1b90) at call.c:293:14 frame #21: 0x000000010004317a python.exe_PyFunction_Vectorcall(func=0x0000000103256b90, stack=0x0000000103302fc8, nargsf=9223372036854775810, kwnames=0x0000000000000000) at call.c:331:20
        frame [backport to 3.6] bpo-29474: Improve documentation for weakref.WeakValueDictionary #22: 0x00000001001efa89 python.exe_PyObject_Vectorcall(callable=0x0000000103256b90, args=0x0000000103302fc8, nargsf=9223372036854775810, kwnames=0x0000000000000000) at abstract.h:106:21 frame #23: 0x00000001001efbfa python.execall_function(tstate=0x0000000101003ec0, pp_stack=0x00007ffeefbfc100, oparg=2, kwnames=0x0000000000000000) at ceval.c:4984:13
        frame bpo-28837: Fix lib2to3 handling of map/zip/filter #24: 0x00000001001ea09c python.exe_PyEval_EvalFrameDefault(f=0x0000000103302e50, throwflag=0) at ceval.c:3482:23 frame #25: 0x00000001001d8c87 python.exePyEval_EvalFrameEx(f=0x0000000103302e50, throwflag=0) at ceval.c:737:12
        frame [backport to 3.5] bpo-29529: Add .travis.yml to 3.5 branch #26: 0x000000010004374e python.exefunction_code_fastcall(co=0x000000010177aee0, args=0x0000000103306380, nargs=1, globals=0x00000001012a97d0) at call.c:293:14 frame #27: 0x000000010004317a python.exe_PyFunction_Vectorcall(func=0x0000000103252410, stack=0x0000000103306378, nargsf=9223372036854775809, kwnames=0x0000000000000000) at call.c:331:20
        frame bpo-28556: Various updates to typing #28: 0x00000001001efa89 python.exe_PyObject_Vectorcall(callable=0x0000000103252410, args=0x0000000103306378, nargsf=9223372036854775809, kwnames=0x0000000000000000) at abstract.h:106:21 frame #29: 0x00000001001efbfa python.execall_function(tstate=0x0000000101003ec0, pp_stack=0x00007ffeefbfdbf0, oparg=1, kwnames=0x0000000000000000) at ceval.c:4984:13
        frame Allow up to a 0.01% drop in coverage #30: 0x00000001001ea09c python.exe_PyEval_EvalFrameDefault(f=0x0000000103306200, throwflag=0) at ceval.c:3482:23 frame #31: 0x00000001001d8c87 python.exePyEval_EvalFrameEx(f=0x0000000103306200, throwflag=0) at ceval.c:737:12
        frame bpo-29576: Improve some deprecations in the importlib #32: 0x000000010004374e python.exefunction_code_fastcall(co=0x0000000101251930, args=0x00007ffeefbfed20, nargs=1, globals=0x0000000101155b30) at call.c:293:14 frame #33: 0x000000010004317a python.exe_PyFunction_Vectorcall(func=0x00000001032535f0, stack=0x00007ffeefbfed18, nargsf=9223372036854775809, kwnames=0x0000000000000000) at call.c:331:20
        frame bpo-29026: Clarify documentation of time.time #34: 0x00000001000ec5b9 python.exe_PyObject_Vectorcall(callable=0x00000001032535f0, args=0x00007ffeefbfed18, nargsf=9223372036854775809, kwnames=0x0000000000000000) at abstract.h:106:21 frame #35: 0x00000001000ec422 python.exe_PyObject_CallOneArg(func=0x00000001032535f0, arg=0x00000001012afc80) at abstract.h:144:12
        frame [backport to 3.5] bpo-28929: Link the documentation to its source file on GitHub #36: 0x00000001000ec36a python.execall_unbound_noarg(unbound=1, func=0x00000001032535f0, self=0x00000001012afc80) at typeobject.c:1462:16 frame #37: 0x00000001000e8220 python.exeslot_tp_finalize(self=0x00000001012afc80) at typeobject.c:6815:15
        frame [backport to 2.7] bpo-28929: Link the documentation to its source file on GitHub #38: 0x00000001002932ff python.exefinalize_garbage(collectable=0x00007ffeefbfee38) at gcmodule.c:866:13 frame #39: 0x000000010028fcd3 python.execollect(state=0x0000000100427890, generation=2, n_collected=0x0000000000000000, n_uncollectable=0x0000000000000000, nofail=1) at gcmodule.c:1098:5
        frame [backport to 3.5] bpo-29438: Fixed use-after-free in key sharing dict #40: 0x000000010028f9a9 python.exe_PyGC_CollectNoFail at gcmodule.c:1851:13 frame #41: 0x000000010022ff1c python.exe_PyImport_Cleanup(tstate=0x0000000101003ec0) at import.c:560:5
        frame [backport to 3.6] Bpo news support 3.6 #42: 0x000000010025298e python.exePy_FinalizeEx at pylifecycle.c:1232:5 frame #43: 0x000000010028d88d python.exePy_RunMain at main.c:648:9
        frame [backport to 2.7] Change documentation's Show Source link to GitHub #44: 0x000000010028dc69 python.exepymain_main(args=0x00007ffeefbff080) at main.c:676:12 frame #45: 0x000000010028dcb7 python.exePy_BytesMain(argc=2, argv=0x00007ffeefbff0f0) at main.c:700:12
        frame bpo-29463: Add docstring field to some AST nodes. #46: 0x0000000100001482 python.exemain(argc=2, argv=0x00007ffeefbff0f0) at python.c:16:12 frame #47: 0x00007fff75bff3d5 libdyld.dylibstart + 1

    I am marking this as a release blocked

    @vstinner
    Copy link
    Member

    I can reproduce the crash on Linux. I interrupt the problem with CTRL+c (why is it blocked? I don't know). Then it does crash.

    First, _PyImport_Cleanup() does clear all modules including _struct.

    Then, _PyImport_Cleanup() calls gc.collect() which finalize_garbage() which calls A.__del__().

    Problem: at this point, the _struct became unusable.

    --

    Thread 1 "python" received signal SIGSEGV, Segmentation fault.
    0x0000000000473f30 in PyModule_GetState (m=0x0) at Objects/moduleobject.c:565
    565	    if (!PyModule_Check(m)) {
    (gdb) py-bt
    Traceback (most recent call first):
      <built-in method pack of module object at remote 0x7fffea99acb0>
      File "/home/vstinner/python/master/Lib/multiprocessing/connection.py", line 400, in _send_bytes
        header = struct.pack("!i", n)
      File "/home/vstinner/python/master/Lib/multiprocessing/connection.py", line 200, in send_bytes
        self._send_bytes(m[offset:offset + size])
      File "/home/vstinner/python/master/Lib/multiprocessing/queues.py", line 368, in put
        self._writer.send_bytes(obj)
      File "/home/vstinner/python/master/Lib/multiprocessing/pool.py", line 649, in close
        self._change_notifier.put(None)
      File "/home/vstinner/python/master/x.py", line 7, in __del__
        self.pool.close()
      Garbage-collecting

    In debug mode, the crash occurs in s_pack() at:

    assert(PyStruct_Check(self));
    

    --

    #define _structmodulestate(o) ((_structmodulestate *)PyModule_GetState(o))
    
    static struct PyModuleDef _structmodule;
    
    #define _structmodulestate_global _structmodulestate(PyState_FindModule(&_structmodule))
    
    #define PyStruct_Check(op) PyObject_TypeCheck(op, (PyTypeObject *)_structmodulestate_global->PyStructType)

    The problem is "_structmodulestate_global->PyStructType": PyState_FindModule(&_structmodule) returns NULL, _structmodulestate() calls PyModule_GetState(NULL) which does crash.

    --

    The question is why the _struct module was cleared whereas there was still a reference to it. Is it part of a reference cycle?

    @DinoV
    Copy link
    Contributor Author

    DinoV commented Jan 16, 2020

    It seems problematic that_PyInterpreterState_ClearModules runs before all instances from a module have been cleared. If PyState_FindModule is no longer able to return module state then there's no way for a module to reliably work at shutdown other than having all instances hold onto the module (or module state) directly from all of their insatances. Certainly that would mimic more closely what happens w/ pure Python instances and modules - the type will hold onto the functions which will hold onto the module global state.

    @DinoV
    Copy link
    Contributor Author

    DinoV commented Jan 16, 2020

    This is a relatively simple repro of the underlying problem:

    import _struct
    
    s = _struct.Struct('i')
    
    class C:
        def __del__(self):
            s.pack(42, 100)
    
    _struct.x = C()

    It's a little bit different in that it is actually causing the module to attempt to throw an exception instead of do a type check.

    @DinoV
    Copy link
    Contributor Author

    DinoV commented Jan 16, 2020

    And here's a variation which doesn't involve any instances from the module:

    import _struct
    
    class C:
        def __init__(self):
            self.pack = _struct.pack
        def __del__(self):
            self.pack('I', -42)
    
    _struct.x = C()

    @DinoV
    Copy link
    Contributor Author

    DinoV commented Jan 17, 2020

    #18038 is a partial fix for this. I think it fixes the crash at shutdown, although I'm still seeing a hang on master on Linux which is different then earlier versions of Python. I seem to have a really bogus stack trace when I attach to it so I'm not quite certain what's going on there.

    @eduardo-elizondo
    Copy link
    Mannequin

    eduardo-elizondo mannequin commented Jan 17, 2020

    Hey all, I've got a fix for this bug and the CI is green: #18039

    TL;DR: The module state have to be cleared at a later time. I explain in detail in the PR.

    Also, I didn't add a new test since there was a test that was already checking for module states in io. I added a short comment on it in the PR as well. Otherwise, lmk and I can create a new test for it.

    @vstinner
    Copy link
    Member

    #18038 is a partial fix for this

    Is it enough to reduce the issue priority from release blocker to normal?

    @pablogsal
    Copy link
    Member

    Is it enough to reduce the issue priority from release blocker to normal?

    The error can still happened in other modules and under similar conditions, no?

    @vstinner
    Copy link
    Member

    The error can still happened in other modules and under similar conditions, no?

    The question was if the next 3.9 *alpha* release must be blocked by this issue. I don't think so. I reduce the priority to normal (not set). If someone disagrees, feel free to raise it again to release blocker ;-)

    @eduardo-elizondo
    Copy link
    Mannequin

    eduardo-elizondo mannequin commented Jan 21, 2020

    The PR that I sent out already fixes the issue. @vstinner, @pablogsal, please take a look again #18039

    That should close this issue, no need to work around the bug priority.

    @vstinner
    Copy link
    Member

    That should close this issue, no need to work around the bug priority.

    I'm concerned by release blocker because 3.9.0a3 version is supposed to be released soon, and usually release blocker do block a release :-)

    @eduardo-elizondo
    Copy link
    Mannequin

    eduardo-elizondo mannequin commented Jan 21, 2020

    I'm concerned by release blocker because 3.9.0a3 version is supposed to be released soon, and usually release blocker do block a release :-)

    Ah! That makes sense!

    In any case, feel free to ping me if you need help on my side to get this PR through (or to remove the release blocker).

    @DinoV
    Copy link
    Contributor Author

    DinoV commented Jan 22, 2020

    With either fix, or with both, on Linux I still see this hang at shutdown. Victor mentioned the fact that he had to hit Ctrl-C on Linux to see this, and I have to do the same thing. Then with the fixes in place the original test case still hangs on shutdown.

    On Python 3.7 (I don't readily have 3.8 available) at least this just runs and completes with no ctrl-C and no crashes. So while either of the fixes may be good to prevent the crashes, there's still probably some underlying issue in multiprocessing. I haven't tested on Mac OS/X.

    It looks like the clearing was originally introduced here: https://bugs.python.org/issue10241 Interestingly there was a similar issue w/ _tkinter, which also used PyType_FromSpec (although it sounds like it was just a ref count issue on the type). Unfortunately there's no associated test cases added to verify the behavior. Antoine and Neil are both now on the PR which removes the collection behavior so hopefully they can chime in on the safety of that fix.

    @DinoV
    Copy link
    Contributor Author

    DinoV commented Jan 24, 2020

    One more data point: Backporting this change to Python 3.6 (I just happened to have it applied there already, so I haven't tried it on 3.7 or 3.8) has no crash and no hangs in multiprocessing on Linux. So something definitely changed in multiproessing which is causing the hang on shutdown, and forces us into this code path where we crash as well.

    @ericsnowcurrently
    Copy link
    Member

    there's still probably some underlying issue in multiprocessing.

    Whoa, I've never heard that before! <wink>

    @miss-islington
    Copy link
    Contributor

    New changeset 4590f72 by Eddie Elizondo in branch 'master':
    bpo-38076 Clear the interpreter state only after clearing module globals (GH-18039)
    4590f72

    @encukou encukou closed this as completed Mar 11, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes extension-modules C modules in the Modules dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    8 participants