This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: [C-API] Convert obvious unsafe macros to static inline functions
Type: Stage: resolved
Components: C API Versions: Python 3.10
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: corona10, erlendaasland, pablogsal, rhettinger, vstinner
Priority: normal Keywords: patch

Created on 2021-03-15 13:54 by erlendaasland, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
macros-that-reuse-args.txt erlendaasland, 2021-03-16 13:29
Pull Requests
URL Status Linked Edit
PR 24875 closed erlendaasland, 2021-03-15 14:36
Messages (11)
msg388739 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-03-15 13:54
Convert macros to static inline functions if...:
- the macro contains a clear pitfall (for example "duplication of side effects")
- the fix is trivial
- the macro is not used as an l-value (for example Py_TYPE())


See also:
- https://gcc.gnu.org/onlinedocs/cpp/Macro-Pitfalls.html
- bpo-43181
- bpo-39573
- bpo-40170
msg388740 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-15 14:12
For me, one of the my worst issue is when a macro evalutes an argument twice.

Example:
---
#include <stdio.h>

#define DOUBLE(value) ((value) + (value))

int main()
{
    int x = 1;
    // expanded to: ((++x) + (++x))
    int y = DOUBLE(++x);
    printf("x=%i y=%i\n", x, y);
    return 0;
}
---

Surprising output:
---
$ gcc x.c -o x; ./x
x=3 y=6
---

Expected output:
---
x=2 y=4
---

Fixed code using a static inline function:
---
#include <stdio.h>

static inline int DOUBLE(int value) {
    return value + value;
}

int main()
{
    int x = 1;
    // expanded to: DOUBLE(++x)
    int y = DOUBLE(++x);
    printf("x=%i y=%i\n", x, y);
    return 0;
}
---

Output:
---
$ gcc x.c -o x; ./x
x=2 y=4
---
msg388741 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-15 14:17
I would add that we should pay attention to:

* not introducing a backward incompatible C API change by mistake
* not introducing new compiler warnings.

A common source of warning is that macros have no type for their arguments or return value, whereas static inline functions are strictly types. A common pattern in the Python header file is to use _PyObject_CAST() or _PyObject_CONST_CAST(). Example:

#define _PyObject_CAST(op) ((PyObject*)(op))
#define _PyObject_CAST_CONST(op) ((const PyObject*)(op))

static inline void _Py_INCREF(PyObject *op) { (...) }
#define Py_INCREF(op) _Py_INCREF(_PyObject_CAST(op))

For macros of the *internal* C API, it's ok to becom more strict about types. For example, in the *public* C API, I replaced:

#define PyObject_INIT(op, typeobj) \
    ( Py_TYPE(op) = (typeobj), _Py_NewReference((PyObject *)(op)), (op) )

with:

PyAPI_FUNC(PyObject *) PyObject_Init(PyObject *, PyTypeObject *);
#define PyObject_INIT(op, typeobj) PyObject_Init(_PyObject_CAST(op), (typeobj))

But for the internal C API, I added:

static inline void _PyObject_Init(PyObject *op, PyTypeObject *typeobj) { ... }

which doesn't cast its arguments.
msg388742 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-03-15 14:28
Should we try to split all macros like that, if possible?
msg388839 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-03-16 13:27
Under Include/, including subdirectories, there are 88 macros that reuse arguments.
msg388870 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-03-16 19:25
PyUnicode_WRITE, PyUnicode_READ, and PyUnicode_READ_CHAR all reuse their arguments, and there's a lot of occurrences (58 AFAICS) where they are called with the increment operator applied to the index argument. For example:

Objects/unicodeobject.c:            PyUnicode_WRITE(kind, data, pos++, ch);
Modules/_io/textio.c:               c = PyUnicode_READ(kind, in_str, i++);
Objects/stringlib/unicode_format.h: c = PyUnicode_READ_CHAR(self->str.str, self->index++);


Would a single PR for all three of them be acceptable, Victor?
msg388893 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-03-16 23:34
Be careful about performance.  We know for sure that macros will inline.  In contrast, inline functions might or might not inline (there are rules for when it can be done).  When applied across source files, inlining often only occurs with an LTO build — any other build can't inline across object files.  Historically, we've never had to depend on LTO, so that would be a regression for some users.  We've already have a performance bug that needed to be reverted because of too aggressive conversion of macros to inline functions.  

A problem with these blanket PRs is that they second guess the original developer who may have made a conscious and informed decision to use a macro.  If you really think that developer was wrong, they should be included in the conversation and given the chance to opt in or out.  Otherwise, we're mechanically undoing previous intelligent efforts.

Ideally, there needs to be a more nuanced view than "macros are bad, inline functions are good".
msg388899 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-03-17 00:00
I agree we should be careful here. There are several things to consider:

* Macros transformed into function may or may not get inlined so there is a risk of affecting performance if we just transform them in bulk.

* Because the macro is defined as 'static inline' in the header file, then it *can* be inlined everywhere if the compiled decides to do it. It will be one copy symbol in the text segment on every compilation unit. Notice that 'inline' has a special meaning in C99 and it doesn't only mean "inline this function". It has visibility implications (see section 6.7.4 of the C99 standard).

* There is risk to introducing a backwards-incompatible ABI change by mistake as Victor mentions.

* This can affect also the link patterns of extension modules because these symbols may be visible in the resulting binaries when they were macro-level-inlined before. I cannot see any problem that can happen because of this but we should carefully consider it.

In short, there is the value of converting these macros for extra type safety, but we need to at the very least be very careful and if we decide to go forward this needs as many reviews as possible so we don't miss anything. I would recommend maybe start a thread about this in python-dev.
msg388910 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-03-17 07:52
Pablo:
> I agree we should be careful here. There are several things to consider:

Yes, and thanks for your thorough reply and insights. I do not suggest converting macros blindly, and I have no intention of doing so. Apart from the three PyUnicode_*() cases mentioned in msg388870, I've not found any misuse of macros that reuse their arguments in the CPython code base.

Perhaps using a meta-issue is the wrong approach. I guess a PEP is too much? I'll start a thread on dpo (or on the list).

Thanks.
msg389078 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-03-19 12:07
FYI, thread started on https://discuss.python.org/t/what-to-do-with-unsafe-macros/7771?u=erlendaasland
msg389153 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-03-20 10:32
Based on the feedback, I'm closing this. The PyUnicode_* macros, or the callers, can be fixed in a separate issue.
History
Date User Action Args
2022-04-11 14:59:42adminsetgithub: 87668
2021-03-20 10:32:47erlendaaslandsetstatus: open -> closed
resolution: wont fix
messages: + msg389153

stage: patch review -> resolved
2021-03-19 12:07:02erlendaaslandsetmessages: + msg389078
2021-03-17 07:52:54erlendaaslandsetmessages: + msg388910
2021-03-17 00:00:23pablogsalsetmessages: + msg388899
2021-03-16 23:34:25rhettingersetnosy: + rhettinger, pablogsal
messages: + msg388893
2021-03-16 19:25:49erlendaaslandsetmessages: + msg388870
2021-03-16 13:29:44erlendaaslandsetfiles: - macros-that-reuse-args.txt
2021-03-16 13:29:38erlendaaslandsetfiles: + macros-that-reuse-args.txt
2021-03-16 13:27:02erlendaaslandsetfiles: + macros-that-reuse-args.txt

messages: + msg388839
2021-03-15 14:36:23erlendaaslandsetkeywords: + patch
stage: patch review
pull_requests: + pull_request23637
2021-03-15 14:28:00erlendaaslandsetmessages: + msg388742
2021-03-15 14:17:22vstinnersetmessages: + msg388741
2021-03-15 14:12:30vstinnersetmessages: + msg388740
2021-03-15 13:54:44erlendaaslandcreate