
classification
Title: What does the existence of a struct in a header file imply about the C-API
Stage: resolved

process
Status: closed
Resolution: postponed
Nosy List: Mark.Shannon, gvanrossum, petr.viktorin, ronaldoussoren, vstinner
Priority: normal

Created on 2021-01-28 11:12 by Mark.Shannon, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (6)
msg385851 - Author: Mark Shannon (Mark.Shannon) (Python committer) Date: 2021-01-28 11:12
Given the lack of explicit documentation on this subject, and differing opinions among core developers, I thought it would be good to discuss how the existence of a struct in a header file constrains the C-API.

Original PR provoking this discussion: https://github.com/python/cpython/pull/24298

Suppose a header file, e.g. funcobject.h, contains a struct, e.g. PyFunctionObject, to what extent is that struct part of the API?


If a struct is present in a header file, there are three options (that make sense to me) for what that means in terms of the API.

1. That the struct is not part of the API and may be freely changed or deleted.
2. That the struct is produced, or initialized, by an API function, which implies that existing fields will continue to exist, but they can be reordered or added to.
3. That the struct is consumed by an API function, which implies that the struct must keep its exact shape, only adding fields if flags are present in the pre-existing fields to indicate the use of the extension.

PyTypeObject is an example of (3).
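
A minimal sketch of what case (3) asks of a struct, loosely modelled on how a flags field can gate later additions. Everything here (`DemoSpec`, `DEMO_HAVE_FINALIZE`, `demo_get_finalizer`, `demo_noop`) is a hypothetical name made up for illustration, not part of the CPython API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit: set only by callers that know about the extension. */
#define DEMO_HAVE_FINALIZE (1u << 0)

typedef void (*demo_finalizer)(void *);

/* Hypothetical struct consumed by an API function. `flags` and `name` are the
   original layout; `finalize` was appended in a later release. */
typedef struct {
    unsigned int flags;
    const char *name;
    demo_finalizer finalize;   /* meaningful only if DEMO_HAVE_FINALIZE is set */
} DemoSpec;

/* A do-nothing finalizer, used only to have a non-NULL value to store. */
static void demo_noop(void *p) { (void)p; }

/* Consumer: trusts the extension field only when the caller opted in. */
static demo_finalizer demo_get_finalizer(const DemoSpec *spec)
{
    assert(spec != NULL);
    if (spec->flags & DEMO_HAVE_FINALIZE) {
        return spec->finalize;
    }
    return NULL;   /* flag clear: ignore whatever the field happens to hold */
}
```

The point of the flag is that the consumer never interprets the appended field for a caller that was written before the field existed.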

We should be able to infer which of the above cases applies, if not explicitly documented, for any struct.

Using PyFunctionObject in funcobject.h as an example:

There is no API function or macro that directly produces or consumes the struct, which would imply case 1. But, the struct is documented as the struct for Python functions, and `PyFunction_Check()` exists, which strongly implies that the following code is OK:

if (PyFunction_Check(obj)) {
    PyFunctionObject *func = (PyFunctionObject *)obj;
    ...
}

which therefore implies that (2) applies.
(3) does not apply as there is no API that takes a PyFunctionObject struct as a parameter.
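
The check-then-cast pattern can be sketched without Python.h. The types below (`DemoObject`, `DemoType`, `DemoFunction`) are hypothetical stand-ins, not the real CPython definitions; the only assumption carried over is that the object header is the first member of the concrete struct:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a type object and an object header. */
typedef struct { const char *name; } DemoType;
typedef struct { const DemoType *type; } DemoObject;

/* Hypothetical stand-in for a function object: the header comes first,
   so a pointer to the header of such an object can be cast back. */
typedef struct {
    DemoObject base;
    const char *qualname;
} DemoFunction;

static const DemoType DemoFunction_Type = { "function" };

/* Analogue of a *_Check() macro: a type test on the header. */
#define DemoFunction_Check(op) \
    (((const DemoObject *)(op))->type == &DemoFunction_Type)

/* The downcast is performed only after the check has succeeded. */
static const char *demo_qualname(const DemoObject *obj)
{
    assert(obj != NULL);
    if (DemoFunction_Check(obj)) {
        const DemoFunction *func = (const DemoFunction *)obj;
        return func->qualname;   /* field access by name: survives reordering */
    }
    return NULL;
}
```

Once a public check exists, third-party code can reach every named field this way, which is why the existing fields are hard to remove.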

Similar logic can be applied to other parts of the API.


Rather than go through this tortuous analysis for all headers, it might be better to document which structs are part of the API.
msg385854 - Author: Petr Viktorin (petr.viktorin) (Python committer) Date: 2021-01-28 12:38
When we're talking just about API, not the stable ABI (which contains only a few structs), reordering and additions should be fair game.
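
A rough illustration of the API/ABI difference, using two hypothetical layouts (`FuncV1`, `FuncV2`) of the same struct across releases. Source that accesses a field by name recompiles cleanly against either one, while the offset baked into already-compiled code changes:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "old release" layout. */
typedef struct {
    long refcount;
    void *code;
    void *globals;
} FuncV1;

/* Hypothetical "new release" layout: fields reordered and one added. */
typedef struct {
    long refcount;
    void *globals;
    void *builtins;   /* new field */
    void *code;
} FuncV2;

/* API level: the same source text works with both layouts after a recompile. */
#define FUNC_GET_CODE(f) ((f)->code)

/* ABI level: the byte offset of `code` is not the same in the two layouts,
   so a binary built against FuncV1 would read the wrong field of a FuncV2. */
static_assert(offsetof(FuncV1, code) != offsetof(FuncV2, code),
              "reordering changes the binary layout");
```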


> Rather than go through this tortuous analysis for all headers, it might be better to document which structs are part of the API.

Yup. I'm trying to do this for stable ABI/limited API, and it's unfortunately taking quite a long time.
msg386853 - Author: Ronald Oussoren (ronaldoussoren) (Python committer) Date: 2021-02-12 10:12
To channel Victor: Another thing to look into is to introduce accessors for struct fields in category 1 and 2 so that the struct can be made private in the future.
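
As a sketch of the accessor idea, assuming a hypothetical `DemoFunc` type and `DemoFunc_GetName()` function (neither exists in CPython): the public header would carry only the opaque declaration and the accessor prototype, and the struct body would live in the implementation.

```c
#include <assert.h>
#include <stddef.h>

/* --- what a public header would expose --- */
typedef struct DemoFunc DemoFunc;                 /* opaque: no layout visible */
const char *DemoFunc_GetName(const DemoFunc *f);  /* accessor replaces f->name */

/* --- private to the implementation --- */
struct DemoFunc {
    long refcount;
    const char *name;
};

const char *DemoFunc_GetName(const DemoFunc *f)
{
    assert(f != NULL);
    return f->name;
}
```

With callers going through the accessor, the private struct can be reordered or reshaped without breaking them.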

The difference between category 1 and 2 is sadly not very clear cut. Anything defined in public headers could be used by third-party code.

In this particular instance there is no documentation for the fields of the struct, which may indicate that the struct is private. However, this struct is used by code outside of the stdlib and there are currently no accessors that can replace this usage.
msg386856 - Author: Mark Shannon (Mark.Shannon) (Python committer) Date: 2021-02-12 10:39
Why do you think the distinction between category 1 and 2 is not clear?

If the struct is not produced, or initialized, by an API function, then it cannot be accessed in the first place. If you can't access the struct in the first place, then you can't access its fields.
msg387806 - Author: Ronald Oussoren (ronaldoussoren) (Python committer) Date: 2021-02-28 09:49
Sorry about the slow response.

I misread your initial message; the distinction between (1) and (2) in your list is clear.

Regarding (3): New fields can be added while maintaining API (but not ABI) compatibility as long as the default value is the default value for static initialisers (assuming the usual way we initialise structs in CPython). That is, adding a new field to PyTypeObject is OK, as long as it is at the end and defaults to NULL or 0.
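
This relies on a C rule: members left out of an initialiser list are zero-initialised. A small sketch with a hypothetical `DemoTypeSpec` struct and `new_slot` field (not the real PyTypeObject):

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *name;
    int basicsize;
    void (*dealloc)(void *);
    void *new_slot;   /* hypothetical field appended in a later release */
} DemoTypeSpec;

/* Written against the old three-field layout and merely recompiled:
   the initialiser does not mention new_slot, so C sets it to NULL. */
static DemoTypeSpec legacy_type = {
    "legacy",
    16,
    NULL,
};

/* Consumer: a NULL slot means "not provided", so old code keeps working. */
static int demo_has_new_slot(const DemoTypeSpec *t)
{
    assert(t != NULL);
    return t->new_slot != NULL;
}
```

The same source would break if the new field were inserted in the middle, since positional initialisers would then fill the wrong members.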

Especially with category 1 it is not entirely clear which structs are in that category. Is PyLongObject in this category? The struct is not documented, but has a name that seems to indicate that it is public. Likewise for PyTupleObject, where the shape of the struct is used by documented APIs but the shape of the struct itself is not documented.

BTW, for my own code I do directly access structs where necessary, even if they aren't documented. I fully expect that this will require adjustments for new Python releases (such as when the unicode representation changed).
msg392904 - Author: Mark Shannon (Mark.Shannon) (Python committer) Date: 2021-05-04 14:42
Thanks for the responses.
Probably nothing to do for now.
History
Date                 User            Action  Args
2022-04-11 14:59:40  admin           set     github: 87220
2021-05-04 14:42:14  Mark.Shannon    set     status: open -> closed; resolution: postponed; messages: + msg392904; stage: resolved
2021-02-28 09:49:58  ronaldoussoren  set     messages: + msg387806
2021-02-12 10:39:02  Mark.Shannon    set     messages: + msg386856
2021-02-12 10:12:26  ronaldoussoren  set     nosy: + ronaldoussoren; messages: + msg386853
2021-01-28 12:38:51  petr.viktorin   set     messages: + msg385854
2021-01-28 11:12:05  Mark.Shannon    create