classification
Title: Consider using __forceinline and __attribute__((always_inline)) on static inline functions (Py_INCREF, Py_TYPE) for debug builds
Type: Stage: resolved
Components: Interpreter Core Versions: Python 3.11
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: corona10, erlendaasland, kj, pablogsal, vstinner
Priority: normal Keywords: patch

Created on 2021-09-03 13:26 by vstinner, last changed 2021-09-07 13:45 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 28140 merged vstinner, 2021-09-03 13:53
PR 28141 closed vstinner, 2021-09-03 14:54
Messages (8)
msg400990 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-03 13:26
I converted many C macros to static inline functions to reduce the risk of programming bugs (macros pitfalls), limit variable scopes to the function, have a more readable function, and benefit of the function name even when it's inlined in debuggers and profilers.

When Py_INCREF() macro was converted to a static inline function, using __attribute__((always_inline)) was considered, but the idea was rejected. See bpo-35059.

I'm now trying to convert the Py_TYPE() macro to a static inline function. The problem is that by default, MSC disables inlining and test_exceptions does crash with a stack overflow, since my PR 28128 increases the usage of the stack memory: see bpo-44348.

For the specific case of CPython built by MSC, we can increase the stack size, or change compiler optimizations to enable inlining. But the problem is wider than just CPython built by MSC in debug mode. Third party C extensions built by distutils may get the same issue. Building CPython on other platforms on debug mode with all compiler optimizations disabled (ex: gcc -O0) can also have the same issue.

I propose to reconsider the usage __forceinline (MSC) and __attribute__((always_inline)) (GCC, clang) on the most important static inline functions, like Py_INCREF() and Py_TYPE(), to avoid this issue.
msg400994 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-03 14:45
New changeset 7974c30b9fd84fa56ea1515ed2c08b38edf1a383 by Victor Stinner in branch 'main':
bpo-45094: Add Py_NO_INLINE macro (GH-28140)
https://github.com/python/cpython/commit/7974c30b9fd84fa56ea1515ed2c08b38edf1a383
msg400995 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-03 14:45
New changeset 7974c30b9fd84fa56ea1515ed2c08b38edf1a383 by Victor Stinner in branch 'main':
bpo-45094: Add Py_NO_INLINE macro (GH-28140)
https://github.com/python/cpython/commit/7974c30b9fd84fa56ea1515ed2c08b38edf1a383
msg400996 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-03 14:45
New changeset 7974c30b9fd84fa56ea1515ed2c08b38edf1a383 by Victor Stinner in branch 'main':
bpo-45094: Add Py_NO_INLINE macro (GH-28140)
https://github.com/python/cpython/commit/7974c30b9fd84fa56ea1515ed2c08b38edf1a383
msg400997 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-03 14:45
New changeset 7974c30b9fd84fa56ea1515ed2c08b38edf1a383 by Victor Stinner in branch 'main':
bpo-45094: Add Py_NO_INLINE macro (GH-28140)
https://github.com/python/cpython/commit/7974c30b9fd84fa56ea1515ed2c08b38edf1a383
msg400999 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2021-09-03 15:43
I like the idea of Py_ALWAYS_INLINE personally if we have to do that.
msg401144 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-06 15:31
Sadly, Py_ALWAYS_INLINE does *not* prevent test_exceptions to crash with my PR 28128 (convert Py_TYPE macro to a static inline function). Even if the Py_TYPE() static inline function is inlined, the stack memory still increases. MSC produces inefficient machine code. It allocates useless variables on the stack which requires more stack memory.

I propose a different approach: bpo-45115 "Windows: enable compiler optimizations when building Python in debug mode".
msg401272 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-07 13:45
Using __forceinline didn't fix test_exceptions: see https://bugs.python.org/issue44348#msg401185

Since I created the issue to fix bpo-44348 and it doesn't work, I close the issue.

Moreover, always inlining can have a negative impact on release builds.
History
Date User Action Args
2021-09-07 13:45:12vstinnersetstatus: open -> closed
resolution: rejected
messages: + msg401272

stage: patch review -> resolved
2021-09-06 15:31:14vstinnersetmessages: + msg401144
2021-09-03 15:43:14corona10setmessages: + msg400999
2021-09-03 14:54:30vstinnersetpull_requests: + pull_request26580
2021-09-03 14:45:16vstinnersetmessages: + msg400997
2021-09-03 14:45:09vstinnersetmessages: + msg400996
2021-09-03 14:45:09vstinnersetmessages: + msg400995
2021-09-03 14:45:09vstinnersetmessages: + msg400994
2021-09-03 13:53:06vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request26579
2021-09-03 13:37:05kjsetnosy: + kj
2021-09-03 13:27:09vstinnersetnosy: + corona10, pablogsal, erlendaasland
2021-09-03 13:26:54vstinnersettitle: Consider using __forceinline and __attribute__((always_inline)) for debug builds -> Consider using __forceinline and __attribute__((always_inline)) on static inline functions (Py_INCREF, Py_TYPE) for debug builds
2021-09-03 13:26:39vstinnercreate