Message361419
I wrote a quick & dirty local patch to define _Py_NewReference() and _Py_Dealloc() as inline functions again, in object.h before Py_DECREF(). I failed to see a clear win in terms of performance.
Microbenchmark:
./python -m pyperf timeit --duplicate=4096 'o=object(); o=None' -l 512
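For readers without pyperf installed, the same measurement can be roughly approximated with the standard library. This is only a sketch: the helper name time_alloc_dealloc is made up for illustration, and plain timeit does not do pyperf's calibration, warmups or multi-process runs, so its numbers are noisier and not directly comparable to the results in this message.

```python
import timeit


def time_alloc_dealloc(duplicate=4096, loops=512, repeat=5):
    """Return the best time, in ns, for one 'o=object(); o=None' statement.

    Hypothetical helper: 'duplicate' and 'loops' mirror the pyperf options
    --duplicate and -l used above, but this is not how pyperf is implemented.
    """
    # Repeat the statement many times in the loop body so that the cost of
    # the timing loop itself is spread over many statements.
    stmt = "\n".join(["o=object(); o=None"] * duplicate)
    timings = timeit.repeat(stmt, number=loops, repeat=repeat)
    # Take the fastest run and divide by the number of statements executed.
    return min(timings) / (loops * duplicate) * 1e9


if __name__ == "__main__":
    print(f"{time_alloc_dealloc():.1f} ns per statement")
```

Run it once with each build of ./python and compare the two values by hand.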
Result if _Py_Dealloc() is inlined again:
Mean +- std dev: [opaque] 69.3 ns +- 1.5 ns -> [dealloc] 67.5 ns +- 1.5 ns: 1.03x faster (-3%)
Result if _Py_Dealloc() and _Py_NewReference() are inlined again:
Mean +- std dev: [opaque] 69.3 ns +- 1.5 ns -> [dealloc_newref] 66.1 ns +- 1.3 ns: 1.05x faster (-5%)
It's a matter of 3.2 nanoseconds. Honestly, I don't think it's worth bothering with that. I expect far more significant speedups from more advanced optimizations like using a tracing GC or tagged pointers, and these optimizations require hiding implementation details better.
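The ratios and the 3.2 ns figure can be re-derived from the means reported above. A small sketch, where the function name compare is only illustrative and the std dev values are ignored:

```python
# Mean timings in nanoseconds, copied from the pyperf results above.
opaque_ns = 69.3          # _Py_Dealloc() and _Py_NewReference() as regular functions
dealloc_ns = 67.5         # _Py_Dealloc() inlined again
dealloc_newref_ns = 66.1  # both inlined again


def compare(base_ns, new_ns):
    """Return (speedup factor, relative change in percent, ns saved)."""
    speedup = base_ns / new_ns
    change_pct = (new_ns - base_ns) / base_ns * 100
    saved_ns = base_ns - new_ns
    return speedup, change_pct, saved_ns


for name, new_ns in [("dealloc", dealloc_ns), ("dealloc_newref", dealloc_newref_ns)]:
    speedup, change_pct, saved_ns = compare(opaque_ns, new_ns)
    print(f"{name}: {speedup:.2f}x faster ({change_pct:.0f}%), {saved_ns:.1f} ns saved")
```

Both differences are only about one to two times the reported std dev of 1.3 to 1.5 ns.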
_Py_Dealloc() was converted to a regular function by mistake when I moved code to cpython/object.h, and nobody noticed.
For all these reasons, I wrote PR 18361 to remove the unused _Py_Dealloc() macro, rather than trying to inline it again.
Date | User | Action | Args
2020-02-05 10:51:32 | vstinner | set | recipients: + vstinner
2020-02-05 10:51:32 | vstinner | set | messageid: <1580899892.23.0.136566154569.issue39543@roundup.psfhosted.org>
2020-02-05 10:51:32 | vstinner | link | issue39543 messages
2020-02-05 10:51:31 | vstinner | create |