
Author neonene
Recipients Mark.Shannon, neonene, paul.moore, rhettinger, steve.dower, tim.golden, vstinner, zach.ware
Date 2021-09-07.18:13:04
@vstinner: __forceinline suggestion

Since PR25244 (mentioned above), link.exe seems to get stuck on python310.dll.
Before the PR, linking took about 10x longer with __forceinline functions than without them.
I can confirm this with _Py_DECREF() and _Py_XDECREF() and one training job (the more functions are forced inline and the more jobs are used, the slower the link).
Have you tried __forceinline with PGO?

> I don't understand how to read the table.

The overhead field is the output of a pyperf command, not a subtraction (the numbers happen to match only by luck).

e.g. 3.10rc1x86 PGO:
     PGO      : pyperf compare_to 3.10a7 left
     patched  : pyperf compare_to 3.10a7 right
     overhead : pyperf compare_to right  left 
     1.15x slower (slower 52, faster  4, not significant  2)
     1.13x slower (slower 50, faster  4, not significant  4)
     1.02x slower (slower 29, faster 14, not significant 15)
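
To show why the subtraction only matches by luck, here is a rough sketch. It assumes the headline "N.NNx slower" figure behaves like a geometric mean of per-benchmark time ratios; that is my assumption for illustration, not something confirmed by the pyperf output above. The function names and the per-benchmark timings are hypothetical.

```python
from math import isclose
from statistics import geometric_mean


def overhead_by_ratio(left, right):
    # Slowdown of "left" relative to "right", both measured against one baseline.
    return left / right


def overhead_by_subtraction(left, right):
    # The naive reading of the table: subtract the two slowdown factors.
    return 1.0 + (left - right)


# Figures from the table above: both readings round to the same value.
print(round(overhead_by_ratio(1.15, 1.13), 2))
print(round(overhead_by_subtraction(1.15, 1.13), 2))

# Hypothetical larger slowdowns: the two readings no longer agree.
print(round(overhead_by_ratio(2.0, 1.5), 2))
print(round(overhead_by_subtraction(2.0, 1.5), 2))

# Hypothetical per-benchmark timings (seconds), same benchmarks in each run.
base = [1.0, 2.0, 4.0]
left = [1.2, 2.3, 4.6]
right = [1.1, 2.4, 4.2]

# Under the geometric-mean assumption, comparing left to right directly
# equals the quotient of their slowdowns against the common baseline.
direct = geometric_mean([l / r for l, r in zip(left, right)])
via_base = (geometric_mean([l / b for l, b in zip(left, base)])
            / geometric_mean([r / b for r, b in zip(right, base)]))
print(isclose(direct, via_base))
```

When both factors are close to 1.0, the quotient and the difference agree to two decimals, which is what happens in the table. For larger slowdowns they drift apart, so the overhead column should be read as its own comparison.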

> I'm not sure if PGO builds are reproducible,

MSVC does not produce the same code from build to build. Inlining (all or nothing) might be quite a special case in the hottest section.
I suspect the profiler works poorly only for _PyEval_EvalFrameDefault(), including its branch and alignment optimization.
So the macro I posted, or the inlining, is just for measurement, not a solution.