Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The danger of PyType_FromSpec() #71166

Closed
serhiy-storchaka opened this issue May 8, 2016 · 14 comments
Closed

The danger of PyType_FromSpec() #71166

serhiy-storchaka opened this issue May 8, 2016 · 14 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes docs Documentation in the Doc dir type-security A security issue

Comments

@serhiy-storchaka
Copy link
Member

BPO 26979
Nosy @loewis, @nascheme, @ncoghlan, @encukou, @serhiy-storchaka, @ctismer, @Dormouse759

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2016-05-08.18:24:38.790>
labels = ['type-security', '3.7', '3.8', 'docs']
title = 'The danger of PyType_FromSpec()'
updated_at = <Date 2018-10-26.14:37:17.269>
user = 'https://github.com/serhiy-storchaka'

bugs.python.org fields:

activity = <Date 2018-10-26.14:37:17.269>
actor = 'Christian.Tismer'
assignee = 'docs@python'
closed = False
closed_date = None
closer = None
components = ['Documentation']
creation = <Date 2016-05-08.18:24:38.790>
creator = 'serhiy.storchaka'
dependencies = []
files = []
hgrepos = []
issue_num = 26979
keywords = []
message_count = 13.0
messages = ['265152', '318493', '325061', '325070', '325149', '325307', '325321', '325360', '325427', '325516', '325523', '325526', '328554']
nosy_count = 8.0
nosy_names = ['loewis', 'nascheme', 'ncoghlan', 'petr.viktorin', 'docs@python', 'serhiy.storchaka', 'Christian.Tismer', 'Dormouse759']
pr_nums = []
priority = 'high'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'security'
url = 'https://bugs.python.org/issue26979'
versions = ['Python 3.6', 'Python 3.7', 'Python 3.8']

@serhiy-storchaka
Copy link
Member Author

There is a catch when convert static types to dynamic types using PyType_FromSpec(). tp_new is not inherited for static types whose base class is 'object' (see a comment in inherit_special() in Objects/typeobject.c:4569 for explanation). Such types left not callable and not pickleable unless tp_new is explicitly specified. But since PyType_FromSpec() creates heap types, tp_new is inherited from the base 'object' class. Converted types unexpectedly becomes callable and pickleable. Since object.__new__ wouldn't insure the invariants that the extension type's own factory function ensures, instantiated object can be in inconsistent state. Using it can lead to a crash. We already fallen two time in this catch with _tkinter (bpo-15721) and curses.panel (bpo-14936) modules (fixed in bpo-23815).

We should document this catch and a way to avoid it (explicitly set tp_new to NULL for the result of PyType_FromSpec()). Both the C API documentation and PEP-384 should be updated. Would be great if standard examples include a case of non-callable heap type.

As option we can introduce more handy way to set tp_new to NULL by adding {Py_tp_new, NULL} to slots array (currently this doesn't have any effect). But this is new feature.

@serhiy-storchaka serhiy-storchaka added docs Documentation in the Doc dir type-security A security issue labels May 8, 2016
@terryjreedy terryjreedy added 3.7 (EOL) end of life 3.8 only security fixes labels Jun 2, 2018
@ctismer
Copy link
Contributor

ctismer commented Jun 2, 2018

There is another catch that belongs in the same category.

There is line 2841 in typeobject.c

    if (type->tp_dealloc == NULL) {
        /* It's a heap type, so needs the heap types' dealloc.
           subtype_dealloc will call the base type's tp_dealloc, if
           necessary. */
        type->tp_dealloc = subtype_dealloc;
    }

When converting the PySide types, it became a real problem. Types with an
existing tp_dealloc behaved normal, but those with tp_dealloc=NULL
gave segfaults. I ended up supplying a default dummy_dealloc()
that is just there to occupy this slot. Then it worked.

And another source of incompatibility:

PyType_FromSpec() enforces a tp_name field to be dotted, to compute a module
name. Very annoying incompatibility that could be avoided.

A small but remarkable glitch:
Most fields of the PyType_Spec/PyType_Slot combo are copied, but some are not.
So expect funny behaviour when creating types with dynamically generated
specs without strdup. That should IMHO be changed.

@encukou
Copy link
Member

encukou commented Sep 11, 2018

I don't think PEP-384 should be updated. It documents the feature as it was added, and shouldn't be used as documentation now.
Instead, Python documentation should be improved to cover all of PEP-389.
I submitted a pull request for that (GH-9154).

For this issue, the fix should be updating C API documentation and, for Pyhon 3.8+, making {*, NULL} slots override the inherited value.

I think the PyType_Spec/PyType_Slot copying should have its own issue.

@encukou
Copy link
Member

encukou commented Sep 11, 2018

Christian, do you have a specific example of the default tp_dealloc doing the wrong thing?

@ctismer
Copy link
Contributor

ctismer commented Sep 12, 2018

Hi Petr,

yes I have that what generated the wrong thing, but it is inside
a specific PySide repository in a big project. Before I try to extract
and simulate that, let me just show it.

All types which had been no heaptypes before were already quite complicated
types which had their own deallocators before, or a 0 if they did not
need one, or better "should not have one".

When switching to the new interface, those deallocators which were 0 were
turned into the default deallocator, which crashed most of the time.

By replacing it by a dummy function fixed the problem. Example:

static PyType_Slot PySideSignalMetaType_slots[] = {
{Py_tp_methods, (void *)Signal_methods},
{Py_tp_base, (void *)&PyType_Type},
{Py_tp_free, (void *)PyObject_GC_Del},
{Py_tp_dealloc, (void *)SbkDummyDealloc},
{0, 0}
};
static PyType_Spec PySideSignalMetaType_spec = {
"PySide2.QtCore.MetaSignal",
0,
// sizeof(PyHeapTypeObject) is filled in by PyType_FromSpecWithBases
// which calls PyType_Ready which calls inherit_special.
0,
Py_TPFLAGS_DEFAULT,
PySideSignalMetaType_slots,
};

You can find the checkins here. This branch contains everything relevant in small steps.

pyside/pyside2-setup@2f0658b

The first working transformation was in the commit
"PEP-384-8-HT-4: Successful Restart using PyType_FromSpec".

I have yet no idea how to extract a minimum example that produces this problem.

Summary:
In a way, without being heaptypes, the old types were dynamic, too, and inserting
something that was not there before was a bad idea.

(sorry, there is more...)

@encukou
Copy link
Member

encukou commented Sep 13, 2018

Thanks! I think that explains enough of the issue.

Converting static types to heap ones is just one way you can use PyType_Spec. Another is writing new types, which should work much like Python classes.

So I don't think we should change the default, but rather document the use case. I'd rather update the docs after #53400 is merged, so this is blocked now.

I'll also try making an explicit {Py_tp_dealloc, NULL} override the inherited value, as per Serhiy's suggestion. But it'll only go to 3.8, so it won't help PySide now.

@ctismer
Copy link
Contributor

ctismer commented Sep 14, 2018

Petr, Fine! PySide's problems are solved for now, no worries.
But as mentioned, there is more.

Problem 1)
----------

There is another thing that gives problems:

When creating types which have fields "behind" the normal type fields,
PySide had explicit structs that contained PyHeaptypeObject explicitly,
plus the extra fields.

We changed that into a normal PyType declaration (opaque) and a pointer
expression that dynamically computed the offset.

For that, we needed sizeof(PyType_Type). Ok, that could be solved by Python
code, but this "trick" is not so obvious and should be documented, or
a variable should be provided that gives the heap type size, dynamically.

Problem 2)
----------

Harder is this:

For some operations, we needed access to fields of normal types, for instance the
tp_new field of PyType_Type. That is not possible, although it would be easy:

PyType_GetSlot forbids access to normal types, rigorously.
But concerning the "normal" type fields, those which are not indirections,
it would cost nothing to allow this for normal types.

This uses a property of types which has not explicitly been used:

  • All types have the same basic structure *

That means, we can allow access to all "tp_" fields with simply changing a check
in PyType_GetSlot.

I'm not sure if that information should be made official. May be it should be hidden
and PyType_GetSlot should be made more complicated. But the proposed change comes
at no cost. And the prefixes like "nb_" or "sq_" are still visible, so I don't
think there was an intent to make type structures completely opaque?

For PySide, it was crucial to use that information to compute the offset of
certain fields of normal types, which is a bit of a PEP-384 breach.
We wrote code that verifies that these assumptions are all valid, but I would be
happy to remove that and revert to only using PyType_GetSlot.

Please let me know if I can help with something!

Cheers - Chris

@encukou
Copy link
Member

encukou commented Sep 14, 2018

  1. If I understand correctly, this problem could be solved by per-class C storage? Something like PyModuleDef.m_size / PyModule_GetState, but for classes?

  2. Making PyType_GetSlot work for static types would not be trivial.
    All types do have the same *basic* structure (the tp_* fields), but the nb_/sq_/etc. fields don't need to be in the same order, and might not exist at all.
    Even making it work for the tp_* fields only would bake in some assumptions I do want to get rid of.
    I'd like to understand your use case better. Do you have a specific example here?

@ctismer
Copy link
Contributor

ctismer commented Sep 15, 2018

  1. If I understand correctly, this problem could be solved by
    per-class C storage? Something like PyModuleDef.m_size /
    PyModule_GetState, but for classes?

No. To obtain sizeof(PyType_Type), we simply used the Python code

type.__basicsize__

It is a simple trick, but probably very few people know this path
to get that info. At least, for me it took way too long ;-)

  1. Making PyType_GetSlot work for static types would not be trivial.
    All types do have the same *basic* structure (the tp_* fields), but
    the nb_/sq_/etc. fields don't need to be in the same order, and
    might not exist at all. Even making it work for the tp_* fields only
    would bake in some assumptions I do want to get rid of.

That's my question. Do you want to hide the fact that some fields
are direct and others are indirect? I don't think that will ever be
changed before Python 4.0, so why put this burden on every user, then?

I would simply give access to tp_* for all types and that's it.

I'd like to
understand your use case better. Do you have a specific example
here?

Yes. I need access to PyType_Type's tp_new field.

PyType_Type is a very central structure in Python, and it really hurts
to hide every attribute away.

You can see the use of this field here:
https://github.com/pyside/pyside2-setup/blob/5.11/sources/shiboken2/libshiboken/basewrapper.cpp#L319

// The meta type creates a new type when the Python programmer extends a wrapped C++ class.
newfunc type_new = reinterpret_cast<newfunc>(PyType_Type.tp_new);
SbkObjectType *newType = reinterpret_cast<SbkObjectType*>(type_new(metatype, args, kwds));
if (!newType)
    return 0;

Not being able to use this field led to many weird workaround attempts,
which did not really work. We tried to call this function from Python code,
but then there are several checks which make it impossible to use.

We could maybe generate a heaptype clone of PyType_Type and grab the function
from there. But this is quite obscure and cannot be the recommended way to
get at the tp_new function?

Maybe there also is a way to avoid the use of this function all together.
But we did not want to re-implement a central function of a huge project
that we only understood good enough to change it a bit.

Maybe we approach it again. At that time, it was not even clear that we
would succeed with the limited API. Now that we know, we can try more.

@encukou
Copy link
Member

encukou commented Sep 17, 2018

Ah! It seems you don't need access to all tp_* slots of any type here, but just to PyType.tp_new – only one specific type, and only one specific slot.
Specifically, you need a way to create class with a metaclass, from C.

Is that right? Or was this only an example of a larger problem?

(Sorry for drawing the discussion out! Dropping the checks in PyType_GetSlot would be a simple change technically, but it doesn't match my mental model of what the stable ABI should be. I'm hoping to get a better solution for your use case, and others like it.)

To obtain sizeof(PyType_Type), we simply used the Python code
type.__basicsize__
It is a simple trick, but probably very few people know this path
to get that info. At least, for me it took way too long ;-)

Well, basicsize might be exposed through Python, but it's still not part of the limited API. Which is fine – all that means is you might need to change the extension for some future version of Python. I'm sure your tests will tell you when the time comes, and I hope we'll have a better solution then!
BTW, I'm honestly very impressed how far PySide got with the limited API!

@ctismer
Copy link
Contributor

ctismer commented Sep 17, 2018

... Specifically, you need a way to create class with a metaclass, from C.

Is that right? Or was this only an example of a larger problem?

Yes, you are exactly right. I needed access to very few fields.
In particular:

PyTypeObject.tp_init
PyTypeObject.tp_basicsize    (workaround using Python)

<a_few_types>.tp_dict

The latter would not be necessary for PEP-384, but for my __signature__
extension, but that is another story :)

<basicsize>

Well, basicsize might be exposed through Python, but it's still not
part of the limited API. Which is fine – all that means is you might
need to change the extension for some future version of Python.

Oh? I thought it is automatically allowed to use if it is exposed
by Python. I thought that the central property of basicsize would be
that it is no longer a constant, but that a size is always there.

Sure, we could have worked without the size and create a more complex
structure that does not know anything about types. I actually started
with that approach. But in the end I thought it would not hurt to assume
that there is a (variable) size, as long as we are below Python 4.0 .

I'm sure your tests will tell you when the time comes, and I hope we'll
have a better solution then!

Yes, I had a very hard time to convince myself that I _may_ use some
assumptions, as long as I always prove that these assumptions are right.
But in a way, I feel guilty and would prefer to go without any trickery.

BTW, I'm honestly very impressed how far PySide got with the limited API!

Thank you! I was hoping to get ready after 2 months, when I realized that
all types needed to be changed. So I pulled teeth, and it took >5 months.
Btw., I don't understand how the PyQt5 guy(s) solved this. Maybe his
structure is way cleaner and simpler than the PySide mess? I'll ask him.

Ciao -- Chris

@ctismer
Copy link
Contributor

ctismer commented Sep 17, 2018

If somebody is interested to see how the transition to the
Limited API was done, here is the documentation that I was
forced to write :)

https://github.com/pyside/pyside2-setup/blob/5.11/sources/shiboken2/libshiboken/pep384impl_doc.rst

@ctismer
Copy link
Contributor

ctismer commented Oct 26, 2018

The default of PyType_FromSpec for tp_dealloc is wrong!
-------------------------------------------------------

After a long struggle with crashes and leaks, the situation
was finally clarified:

When a type is converted from a static type to a heaptype
via PyType_FromSpec, and the tp_dealloc field is NULL,
then subtype_dealloc is inserted.

This is not correct.
The reasoning in the code is that a heaptype has subtype_dealloc.
But before the conversion, the type was static, and for
static types the function object_dealloc was inserted.

In the given type, the only change is to become a heaptype.
The by default inserted object_dealloc should not be changed,
since the existing code was written with object_dealloc in
mind and not the consequences of replacing it with subtype_dealloc.

Before this solution, I used a dummy function to prevend
subpype_dealloc being inserted, but that caused leaks.

It was hard to understand that the default for a static type
is object_dealloc. After that, it was easy to fix that:

The now correctly working workaround is to explicitly insert
an object_dealloc function whenever the default for tp_dealloc
is NULL.

Again, in order to use this fix, it is necessary to break the
Limited API, because in order to write an object_dealloc function
(it is not public) you need access to type objects.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@vstinner
Copy link
Member

vstinner commented Oct 6, 2023

I close the issue.

But since PyType_FromSpec() creates heap types, tp_new is inherited from the base 'object' class. Converted types unexpectedly becomes callable and pickleable.

Issued fixed by adding Py_TPFLAGS_DISALLOW_INSTANTIATION to Python 3.10.

When a type is converted from a static type to a heaptype via PyType_FromSpec, and the tp_dealloc field is NULL, then subtype_dealloc is inserted.

Since Python 3.8, the dealloc function must now decrement a reference to its type: https://docs.python.org/dev/c-api/typeobj.html#c.PyTypeObject.tp_dealloc

Having a NULL tp_dealloc may be treated as an error.


@ctismer: Your feedback is very valuable. Please open issues for remaining issues which are not fixed in Python 3.13.

@vstinner vstinner closed this as completed Oct 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.7 (EOL) end of life 3.8 only security fixes docs Documentation in the Doc dir type-security A security issue
Projects
None yet
Development

No branches or pull requests

5 participants