Title: PyStructSequence_NewType is broken; makes GC type without setting Py_TPFLAGS_HEAPTYPE
Attachment: testnewtype.c (C source for test module), uploaded by josh.r, 2016-11-16 00:00
Author: Josh Rosenberg (josh.r) Date: 2016-11-16 00:00
I could be missing something, but it looks like PyStructSequence_NewType is guaranteed to be broken. Specifically, it allocates the type object with PyType_GenericAlloc (passing PyType_Type as the type), and PyType_Type declares that its instances are garbage collected, so the memory is allocated with _PyObject_GC_Malloc and added to GC tracking just before PyType_GenericAlloc returns.

The problem is that PyStructSequence_InitType2 copies from a template struct which sets Py_TPFLAGS_DEFAULT. So even though the new struct sequence type is GC-allocated and GC-managed, it never gets Py_TPFLAGS_HEAPTYPE, which means that when the GC tries to traverse it, type's type_traverse aborts with:

Fatal Python error: type_traverse() called for non-heap type 'NameOfStructSequence'
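
To illustrate the invariant being violated, here is a minimal Python sketch (not part of the attached test code). It assumes the flag values from CPython's Include/object.h (Py_TPFLAGS_HEAPTYPE = 1 << 9, Py_TPFLAGS_HAVE_GC = 1 << 14), and describe is a hypothetical helper introduced only for this comparison. On a healthy type object, being GC-managed and carrying HEAPTYPE go together; the type built by PyStructSequence_NewType ends up GC-allocated without the flag.

```python
import gc

# Flag values as defined in CPython's Include/object.h (assumed stable).
Py_TPFLAGS_HEAPTYPE = 1 << 9
Py_TPFLAGS_HAVE_GC = 1 << 14


class HeapClass:
    """A class created at runtime, i.e. a heap type."""


def describe(tp):
    """Hypothetical helper: report the heap flag and GC tracking of a type."""
    return {
        "heaptype": bool(tp.__flags__ & Py_TPFLAGS_HEAPTYPE),
        "gc_tracked": gc.is_tracked(tp),
    }


# type itself declares that its instances (type objects) may be GC objects.
type_instances_are_gc = bool(type.__flags__ & Py_TPFLAGS_HAVE_GC)

static_info = describe(int)        # static type: no heap flag, not tracked
heap_info = describe(HeapClass)    # heap type: heap flag set, tracked

print(type_instances_are_gc, static_info, heap_info)
```

Neither int nor HeapClass shows the broken combination; the point is only that the two properties normally agree, which is what type_traverse relies on.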

It's possible I'm missing something here, so I've attached simple test code for others to confirm (it omits most error checking for simplicity/readability).

Just compile the extension module, then run (with the module in the working directory):

    python -c "import testnewtype; Foo = testnewtype.makeseq('Foo', ['x', 'y'])"

There is a commented-out line in the test code that explicitly sets the HEAPTYPE flag after type construction (no idea if that's supposed to be supported). Uncommenting it seems to fix the crash, though if retroactively flagging a type as HEAPTYPE is unsupported, something else may break.

I can't find any actual use of PyStructSequence_NewType in the CPython code base, which probably explains why this hasn't been seen. Odds are, most extensions using struct sequences call InitType, not NewType. I only ran into this because I was experimenting with a struct sequence based replacement for collections.namedtuple (to end the startup time objections to using namedtuple in the built-in modules, e.g. #28638).
Author: Josh Rosenberg (josh.r) Date: 2016-11-16 00:14
Note: Uncommenting the line that forces Py_TPFLAGS_HEAPTYPE isn't enough. It looks like the PyHeapTypeObject fields aren't initialized properly, which causes segfaults if you access, for example, __name__/__qualname__ (or print the type's repr, which implicitly accesses the same fields):

    python -c "import testnewtype; Foo = testnewtype.makeseq('Foo', ['x', 'y']); print(Foo.__name__)"

The type behaves properly otherwise (you can make instances, access values on them), but crashing on repr is probably poor form. :-)
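
For contrast, this is roughly how the names behave on a properly initialized heap type versus a static one. It is a minimal sketch in pure Python, under the assumption that a class created with type() goes through the full heap-type setup; Foo and the replacement names are purely illustrative.

```python
# A class created at runtime is a heap type: its __name__ and __qualname__
# live in the PyHeapTypeObject part of the object and can be reassigned.
Foo = type("Foo", (tuple,), {})
Foo.__name__ = "Bar"
Foo.__qualname__ = "pkg.Bar"

# A static type such as int derives its name from tp_name and has no
# per-type storage for it, so assignment is rejected with TypeError.
try:
    int.__name__ = "integer"
    static_rename_error = None
except TypeError as exc:
    static_rename_error = exc

print(Foo.__name__, Foo.__qualname__, type(static_rename_error).__name__)
```

A type that is flagged HEAPTYPE but never had those fields filled in sits between the two cases, which is consistent with the crash on __name__ above.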
Author: Josh Rosenberg (josh.r) Date: 2016-11-16 02:07
On further checking, it looks like there is a lot of work needed to initialize heap types (see PyType_FromSpecWithBases) that PyStructSequence_InitType2 doesn't do, because it thinks it's working on a static type. I think the solution here is to decouple PyStructSequence_NewType from PyStructSequence_InitType2. Alternatively, to minimize code duplication, both could call a third internal function that accepts additional flags (e.g. to make the type a HEAPTYPE, a BASETYPE, or both) and performs the extra work those flags require. InitType2 clearly expects a static type, and by definition NewType produces a heap/dynamic type.
Author: Petr Viktorin (petr.viktorin) Date: 2018-11-14 10:01
Should be fixed in PR9665 (bpo-34784), thanks to Eddie Elizondo.
