New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider using __forceinline and __attribute__((always_inline)) on static inline functions (Py_INCREF, Py_TYPE) for debug builds #89257
Comments
I converted many C macros to static inline functions to reduce the risk of programming bugs (macros pitfalls), limit variable scopes to the function, have a more readable function, and benefit of the function name even when it's inlined in debuggers and profilers. When Py_INCREF() macro was converted to a static inline function, using __attribute__((always_inline)) was considered, but the idea was rejected. See bpo-35059. I'm now trying to convert the Py_TYPE() macro to a static inline function. The problem is that by default, MSC disables inlining and test_exceptions does crash with a stack overflow, since my PR 28128 increases the usage of the stack memory: see bpo-44348. For the specific case of CPython built by MSC, we can increase the stack size, or change compiler optimizations to enable inlining. But the problem is wider than just CPython built by MSC in debug mode. Third party C extensions built by distutils may get the same issue. Building CPython on other platforms on debug mode with all compiler optimizations disabled (ex: gcc -O0) can also have the same issue. I propose to reconsider the usage __forceinline (MSC) and __attribute__((always_inline)) (GCC, clang) on the most important static inline functions, like Py_INCREF() and Py_TYPE(), to avoid this issue. |
3 similar comments
I like the idea of Py_ALWAYS_INLINE personally if we have to do that. |
Sadly, Py_ALWAYS_INLINE does *not* prevent test_exceptions to crash with my PR 28128 (convert Py_TYPE macro to a static inline function). Even if the Py_TYPE() static inline function is inlined, the stack memory still increases. MSC produces inefficient machine code. It allocates useless variables on the stack which requires more stack memory. I propose a different approach: bpo-45115 "Windows: enable compiler optimizations when building Python in debug mode". |
Using __forceinline didn't fix test_exceptions: see https://bugs.python.org/issue44348#msg401185 Since I created the issue to fix bpo-44348 and it doesn't work, I close the issue. Moreover, always inlining can have a negative impact on release builds. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: