classification
Title: Make struct module PEP-384 compatible
Type: enhancement Stage: resolved
Components: Extension Modules Versions: Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: dino.viehland Nosy List: dino.viehland, eelizondo, eric.snow, miss-islington, pablogsal, twouters, vstinner
Priority: Keywords: patch

Created on 2019-09-09 16:35 by dino.viehland, last changed 2020-03-11 16:40 by petr.viktorin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 15805 merged dino.viehland, 2019-09-09 16:35
PR 18038 open dino.viehland, 2020-01-17 18:10
PR 18039 merged eelizondo, 2020-01-17 19:09
Messages (19)
msg351524 - (view) Author: Dino Viehland (dino.viehland) * (Python committer) Date: 2019-09-09 16:35
Make struct module PEP-384 compatible
msg351608 - (view) Author: Thomas Wouters (twouters) * (Python committer) Date: 2019-09-10 10:18
New changeset 4f384af067d05b16a554bfd976934fca9f87a1cf by T. Wouters (Dino Viehland) in branch 'master':
bpo-38076: Make struct module PEP-384 compatible (#15805)
https://github.com/python/cpython/commit/4f384af067d05b16a554bfd976934fca9f87a1cf
msg360127 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-01-16 15:57
Executing this simple code after this commit segfaults:

from multiprocessing.pool import Pool
class A(object):
    def __init__(self):
        self.pool = Pool()
    def __del__(self):
        self.pool.close()
        self.pool.join()
a = A()

[1]    28019 segmentation fault  ./python.exe ../lel.py

The reason is that there is a call to PyModule_GetState with the module being NULL:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x8)
  * frame #0: 0x00000001000b9737 python.exe`PyModule_GetState(m=0x0000000000000000) at moduleobject.c:565:10
    frame #1: 0x00000001013d5972 _struct.cpython-39d-darwin.so`s_pack(self=0x00000001032fae50, args=0x00000001033051f8, nargs=1) at _struct.c:1858:5
    frame #2: 0x00000001013d4a8a _struct.cpython-39d-darwin.so`pack(self=0x00000001013be890, args=0x00000001033051f0, nargs=2) at _struct.c:2165:14
    frame #3: 0x00000001000b6e5d python.exe`cfunction_vectorcall_FASTCALL(func=0x00000001013beb30, args=0x00000001033051f0, nargsf=9223372036854775810, kwnames=0x0000000000000000) at methodobject.c:380:24
    frame #4: 0x00000001001efa89 python.exe`_PyObject_Vectorcall(callable=0x00000001013beb30, args=0x00000001033051f0, nargsf=9223372036854775810, kwnames=0x0000000000000000) at abstract.h:106:21
    frame #5: 0x00000001001efbfa python.exe`call_function(tstate=0x0000000101003ec0, pp_stack=0x00007ffeefbf6e60, oparg=2, kwnames=0x0000000000000000) at ceval.c:4984:13
    frame #6: 0x00000001001ea00c python.exe`_PyEval_EvalFrameDefault(f=0x0000000103305050, throwflag=0) at ceval.c:3465:23
    frame #7: 0x00000001001d8c87 python.exe`PyEval_EvalFrameEx(f=0x0000000103305050, throwflag=0) at ceval.c:737:12
    frame #8: 0x000000010004374e python.exe`function_code_fastcall(co=0x0000000103165380, args=0x0000000103301400, nargs=2, globals=0x0000000103131d10) at call.c:293:14
    frame #9: 0x000000010004317a python.exe`_PyFunction_Vectorcall(func=0x000000010324d910, stack=0x00000001033013f0, nargsf=9223372036854775810, kwnames=0x0000000000000000) at call.c:331:20
    frame #10: 0x00000001001efa89 python.exe`_PyObject_Vectorcall(callable=0x000000010324d910, args=0x00000001033013f0, nargsf=9223372036854775810, kwnames=0x0000000000000000) at abstract.h:106:21
    frame #11: 0x00000001001efbfa python.exe`call_function(tstate=0x0000000101003ec0, pp_stack=0x00007ffeefbf8950, oparg=2, kwnames=0x0000000000000000) at ceval.c:4984:13
    frame #12: 0x00000001001ea09c python.exe`_PyEval_EvalFrameDefault(f=0x0000000103301250, throwflag=0) at ceval.c:3482:23
    frame #13: 0x00000001001d8c87 python.exe`PyEval_EvalFrameEx(f=0x0000000103301250, throwflag=0) at ceval.c:737:12
    frame #14: 0x00000001001f126c python.exe`_PyEval_EvalCodeWithName(_co=0x0000000103159930, globals=0x0000000103131d10, locals=0x0000000000000000, args=0x00000001032b2f68, argcount=2, kwnames=0x0000000000000000, kwargs=0x00000001032b2f78, kwcount=0, kwstep=1, defs=0x0000000103173f18, defcount=2, kwdefs=0x0000000000000000, closure=0x0000000000000000, name=0x000000010315c2e0, qualname=0x0000000103158430) at ceval.c:4296:14
    frame #15: 0x00000001000435d0 python.exe`_PyFunction_Vectorcall(func=0x000000010324d230, stack=0x00000001032b2f68, nargsf=9223372036854775810, kwnames=0x0000000000000000) at call.c:356:12
    frame #16: 0x00000001001efa89 python.exe`_PyObject_Vectorcall(callable=0x000000010324d230, args=0x00000001032b2f68, nargsf=9223372036854775810, kwnames=0x0000000000000000) at abstract.h:106:21
    frame #17: 0x00000001001efbfa python.exe`call_function(tstate=0x0000000101003ec0, pp_stack=0x00007ffeefbfa610, oparg=2, kwnames=0x0000000000000000) at ceval.c:4984:13
    frame #18: 0x00000001001ea09c python.exe`_PyEval_EvalFrameDefault(f=0x00000001032b2de0, throwflag=0) at ceval.c:3482:23
    frame #19: 0x00000001001d8c87 python.exe`PyEval_EvalFrameEx(f=0x00000001032b2de0, throwflag=0) at ceval.c:737:12
    frame #20: 0x000000010004374e python.exe`function_code_fastcall(co=0x00000001031d5860, args=0x0000000103302fd8, nargs=2, globals=0x00000001031d1b90) at call.c:293:14
    frame #21: 0x000000010004317a python.exe`_PyFunction_Vectorcall(func=0x0000000103256b90, stack=0x0000000103302fc8, nargsf=9223372036854775810, kwnames=0x0000000000000000) at call.c:331:20
    frame #22: 0x00000001001efa89 python.exe`_PyObject_Vectorcall(callable=0x0000000103256b90, args=0x0000000103302fc8, nargsf=9223372036854775810, kwnames=0x0000000000000000) at abstract.h:106:21
    frame #23: 0x00000001001efbfa python.exe`call_function(tstate=0x0000000101003ec0, pp_stack=0x00007ffeefbfc100, oparg=2, kwnames=0x0000000000000000) at ceval.c:4984:13
    frame #24: 0x00000001001ea09c python.exe`_PyEval_EvalFrameDefault(f=0x0000000103302e50, throwflag=0) at ceval.c:3482:23
    frame #25: 0x00000001001d8c87 python.exe`PyEval_EvalFrameEx(f=0x0000000103302e50, throwflag=0) at ceval.c:737:12
    frame #26: 0x000000010004374e python.exe`function_code_fastcall(co=0x000000010177aee0, args=0x0000000103306380, nargs=1, globals=0x00000001012a97d0) at call.c:293:14
    frame #27: 0x000000010004317a python.exe`_PyFunction_Vectorcall(func=0x0000000103252410, stack=0x0000000103306378, nargsf=9223372036854775809, kwnames=0x0000000000000000) at call.c:331:20
    frame #28: 0x00000001001efa89 python.exe`_PyObject_Vectorcall(callable=0x0000000103252410, args=0x0000000103306378, nargsf=9223372036854775809, kwnames=0x0000000000000000) at abstract.h:106:21
    frame #29: 0x00000001001efbfa python.exe`call_function(tstate=0x0000000101003ec0, pp_stack=0x00007ffeefbfdbf0, oparg=1, kwnames=0x0000000000000000) at ceval.c:4984:13
    frame #30: 0x00000001001ea09c python.exe`_PyEval_EvalFrameDefault(f=0x0000000103306200, throwflag=0) at ceval.c:3482:23
    frame #31: 0x00000001001d8c87 python.exe`PyEval_EvalFrameEx(f=0x0000000103306200, throwflag=0) at ceval.c:737:12
    frame #32: 0x000000010004374e python.exe`function_code_fastcall(co=0x0000000101251930, args=0x00007ffeefbfed20, nargs=1, globals=0x0000000101155b30) at call.c:293:14
    frame #33: 0x000000010004317a python.exe`_PyFunction_Vectorcall(func=0x00000001032535f0, stack=0x00007ffeefbfed18, nargsf=9223372036854775809, kwnames=0x0000000000000000) at call.c:331:20
    frame #34: 0x00000001000ec5b9 python.exe`_PyObject_Vectorcall(callable=0x00000001032535f0, args=0x00007ffeefbfed18, nargsf=9223372036854775809, kwnames=0x0000000000000000) at abstract.h:106:21
    frame #35: 0x00000001000ec422 python.exe`_PyObject_CallOneArg(func=0x00000001032535f0, arg=0x00000001012afc80) at abstract.h:144:12
    frame #36: 0x00000001000ec36a python.exe`call_unbound_noarg(unbound=1, func=0x00000001032535f0, self=0x00000001012afc80) at typeobject.c:1462:16
    frame #37: 0x00000001000e8220 python.exe`slot_tp_finalize(self=0x00000001012afc80) at typeobject.c:6815:15
    frame #38: 0x00000001002932ff python.exe`finalize_garbage(collectable=0x00007ffeefbfee38) at gcmodule.c:866:13
    frame #39: 0x000000010028fcd3 python.exe`collect(state=0x0000000100427890, generation=2, n_collected=0x0000000000000000, n_uncollectable=0x0000000000000000, nofail=1) at gcmodule.c:1098:5
    frame #40: 0x000000010028f9a9 python.exe`_PyGC_CollectNoFail at gcmodule.c:1851:13
    frame #41: 0x000000010022ff1c python.exe`_PyImport_Cleanup(tstate=0x0000000101003ec0) at import.c:560:5
    frame #42: 0x000000010025298e python.exe`Py_FinalizeEx at pylifecycle.c:1232:5
    frame #43: 0x000000010028d88d python.exe`Py_RunMain at main.c:648:9
    frame #44: 0x000000010028dc69 python.exe`pymain_main(args=0x00007ffeefbff080) at main.c:676:12
    frame #45: 0x000000010028dcb7 python.exe`Py_BytesMain(argc=2, argv=0x00007ffeefbff0f0) at main.c:700:12
    frame #46: 0x0000000100001482 python.exe`main(argc=2, argv=0x00007ffeefbff0f0) at python.c:16:12
    frame #47: 0x00007fff75bff3d5 libdyld.dylib`start + 1


I am marking this as a release blocked
msg360129 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-16 16:47
I can reproduce the crash on Linux. I interrupt the problem with CTRL+c (why is it blocked? I don't know). Then it does crash.

First, _PyImport_Cleanup() does clear all modules including _struct.

Then, _PyImport_Cleanup() calls gc.collect() which finalize_garbage() which calls A.__del__().

Problem: at this point, the _struct became unusable.

--

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x0000000000473f30 in PyModule_GetState (m=0x0) at Objects/moduleobject.c:565
565	    if (!PyModule_Check(m)) {
(gdb) py-bt
Traceback (most recent call first):
  <built-in method pack of module object at remote 0x7fffea99acb0>
  File "/home/vstinner/python/master/Lib/multiprocessing/connection.py", line 400, in _send_bytes
    header = struct.pack("!i", n)
  File "/home/vstinner/python/master/Lib/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/vstinner/python/master/Lib/multiprocessing/queues.py", line 368, in put
    self._writer.send_bytes(obj)
  File "/home/vstinner/python/master/Lib/multiprocessing/pool.py", line 649, in close
    self._change_notifier.put(None)
  File "/home/vstinner/python/master/x.py", line 7, in __del__
    self.pool.close()
  Garbage-collecting

In debug mode, the crash occurs in s_pack() at:

    assert(PyStruct_Check(self));

--

#define _structmodulestate(o) ((_structmodulestate *)PyModule_GetState(o))

static struct PyModuleDef _structmodule;

#define _structmodulestate_global _structmodulestate(PyState_FindModule(&_structmodule))

#define PyStruct_Check(op) PyObject_TypeCheck(op, (PyTypeObject *)_structmodulestate_global->PyStructType)

The problem is "_structmodulestate_global->PyStructType": PyState_FindModule(&_structmodule) returns NULL, _structmodulestate() calls PyModule_GetState(NULL) which does crash.

--

The question is why the _struct module was cleared whereas there was still a reference to it. Is it part of a reference cycle?
msg360132 - (view) Author: Dino Viehland (dino.viehland) * (Python committer) Date: 2020-01-16 17:25
It seems problematic that_PyInterpreterState_ClearModules runs before all instances from a module have been cleared.  If PyState_FindModule is no longer able to return module state then there's no way for a module to reliably work at shutdown other than having all instances hold onto the module (or module state) directly from all of their insatances.  Certainly that would mimic more closely what happens w/ pure Python instances and modules - the type will hold onto the functions which will hold onto the module global state.
msg360137 - (view) Author: Dino Viehland (dino.viehland) * (Python committer) Date: 2020-01-16 19:40
This is a relatively simple repro of the underlying problem:

import _struct

s = _struct.Struct('i')

class C:
    def __del__(self):
        s.pack(42, 100)

_struct.x = C()

It's a little bit different in that it is actually causing the module to attempt to throw an exception instead of do a type check.
msg360138 - (view) Author: Dino Viehland (dino.viehland) * (Python committer) Date: 2020-01-16 19:47
And here's a variation which doesn't involve any instances from the module:

import _struct

class C:
    def __init__(self):
        self.pack = _struct.pack
    def __del__(self):
        self.pack('I', -42)

_struct.x = C()
msg360206 - (view) Author: Dino Viehland (dino.viehland) * (Python committer) Date: 2020-01-17 18:12
https://github.com/python/cpython/pull/18038 is a partial fix for this.  I think it fixes the crash at shutdown, although I'm still seeing a hang on master on Linux which is different then earlier versions of Python.  I seem to have a really bogus stack trace when I attach to it so I'm not quite certain what's going on there.
msg360211 - (view) Author: Eddie Elizondo (eelizondo) * Date: 2020-01-17 19:59
Hey all, I've got a fix for this bug and the CI is green: https://github.com/python/cpython/pull/18039

TL;DR: The module state have to be cleared at a later time. I explain in detail in the PR.

Also, I didn't add a new test since there was a test that was already checking for module states in `io`. I added a short comment on it in the PR as well. Otherwise, lmk and I can create a new test for it.
msg360412 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-21 17:26
> https://github.com/python/cpython/pull/18038 is a partial fix for this

Is it enough to reduce the issue priority from release blocker to normal?
msg360413 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-01-21 17:34
> Is it enough to reduce the issue priority from release blocker to normal?

The error can still happened in other modules and under similar conditions, no?
msg360414 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-21 17:36
> The error can still happened in other modules and under similar conditions, no?

The question was if the next 3.9 *alpha* release must be blocked by this issue. I don't think so. I reduce the priority to normal (not set). If someone disagrees, feel free to raise it again to release blocker ;-)
msg360415 - (view) Author: Eddie Elizondo (eelizondo) * Date: 2020-01-21 17:39
The PR that I sent out already fixes the issue. @vstinner, @pablogsal, please take a look again https://github.com/python/cpython/pull/18039

That should close this issue, no need to work around the bug priority.
msg360416 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-21 17:42
> That should close this issue, no need to work around the bug priority.

I'm concerned by release blocker because 3.9.0a3 version is supposed to be released soon, and usually release blocker do block a release :-)
msg360417 - (view) Author: Eddie Elizondo (eelizondo) * Date: 2020-01-21 17:46
> I'm concerned by release blocker because 3.9.0a3 version is supposed to be released soon, and usually release blocker do block a release :-)

Ah! That makes sense!

In any case, feel free to ping me if you need help on my side to get this PR through (or to remove the release blocker).
msg360500 - (view) Author: Dino Viehland (dino.viehland) * (Python committer) Date: 2020-01-22 19:29
With either fix, or with both, on Linux I still see this hang at shutdown.  Victor mentioned the fact that he had to hit Ctrl-C on Linux to see this, and I have to do the same thing. Then with the fixes in place the original test case still hangs on shutdown.  

On Python 3.7 (I don't readily have 3.8 available) at least this just runs and completes with no ctrl-C and no crashes.  So while either of the fixes may be good to prevent the crashes, there's still probably some underlying issue in multiprocessing.  I haven't tested on Mac OS/X.

It looks like the clearing was originally introduced here: https://bugs.python.org/issue10241  Interestingly there was a similar issue w/ _tkinter, which also used PyType_FromSpec (although it sounds like it was just a ref count issue on the type).  Unfortunately there's no associated test cases added to verify the behavior.  Antoine and Neil are both now on the PR which removes the collection behavior so hopefully they can chime in on the safety of that fix.
msg360648 - (view) Author: Dino Viehland (dino.viehland) * (Python committer) Date: 2020-01-24 20:00
One more data point: Backporting this change to Python 3.6 (I just happened to have it applied there already, so I haven't tried it on 3.7 or 3.8) has no crash and no hangs in multiprocessing on Linux.  So something definitely changed in multiproessing which is causing the hang on shutdown, and forces us into this code path where we crash as well.
msg360656 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2020-01-24 20:57
> there's still probably some underlying issue in multiprocessing.

Whoa, I've never heard that before!  <wink>
msg361342 - (view) Author: miss-islington (miss-islington) Date: 2020-02-04 10:29
New changeset 4590f72259ecbcea66e12a28af14d867255d2e66 by Eddie Elizondo in branch 'master':
bpo-38076 Clear the interpreter state only after clearing module globals (GH-18039)
https://github.com/python/cpython/commit/4590f72259ecbcea66e12a28af14d867255d2e66
History
Date User Action Args
2020-03-11 16:40:12petr.viktorinsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2020-02-04 10:29:49miss-islingtonsetnosy: + miss-islington
messages: + msg361342
2020-01-24 20:57:56eric.snowsetmessages: + msg360656
2020-01-24 20:00:05dino.viehlandsetmessages: + msg360648
2020-01-22 19:29:56dino.viehlandsetmessages: + msg360500
2020-01-21 17:46:44eelizondosetmessages: + msg360417
2020-01-21 17:42:03vstinnersetmessages: + msg360416
2020-01-21 17:39:28eelizondosetmessages: + msg360415
2020-01-21 17:36:19vstinnersetpriority: release blocker ->

messages: + msg360414
2020-01-21 17:34:41pablogsalsetmessages: + msg360413
2020-01-21 17:26:23vstinnersetmessages: + msg360412
2020-01-17 19:59:52eelizondosetmessages: + msg360211
2020-01-17 19:09:31eelizondosetpull_requests: + pull_request17438
2020-01-17 18:12:09dino.viehlandsetmessages: + msg360206
2020-01-17 18:10:27dino.viehlandsetstage: resolved -> patch review
pull_requests: + pull_request17437
2020-01-16 19:47:40dino.viehlandsetmessages: + msg360138
2020-01-16 19:40:37dino.viehlandsetmessages: + msg360137
2020-01-16 17:25:23dino.viehlandsetnosy: + dino.viehland
messages: + msg360132
2020-01-16 16:54:03dino.viehlandsetnosy: + eelizondo, - dino.viehland
2020-01-16 16:47:12vstinnersetnosy: + vstinner
messages: + msg360129
2020-01-16 15:57:08pablogsalsetstatus: closed -> open
priority: normal -> release blocker

nosy: + pablogsal
messages: + msg360127

resolution: fixed -> (no value)
2019-09-10 11:21:23christian.heimessetstatus: open -> closed
type: enhancement
resolution: fixed
stage: patch review -> resolved
2019-09-10 10:18:40twouterssetnosy: + twouters
messages: + msg351608
2019-09-09 16:35:56dino.viehlandsetkeywords: + patch
stage: patch review
pull_requests: + pull_request15456
2019-09-09 16:35:08dino.viehlandcreate