Message335864
Well … yes.
The exception fields are performance critical, and we try hard to make them visible to the C compiler so that swapping around exception state eats up as little CPU time as possible.
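As a rough illustration of why visible fields matter here, the sketch below swaps a saved exception state with the current one using plain struct access. The names `mock_exc_state` and `mock_exc_swap` are hypothetical stand-ins, not the real CPython thread-state layout or any Cython helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the exception-related fields of a thread state.
   This is NOT the real CPython struct layout. */
typedef struct {
    void *exc_type;
    void *exc_value;
    void *exc_traceback;
} mock_exc_state;

/* Swap the current exception state with a saved one using direct field
   access. Because the struct is visible, the compiler can inline this
   into a handful of loads and stores with no function call involved. */
static inline void mock_exc_swap(mock_exc_state *current, mock_exc_state *saved)
{
    assert(current != NULL && saved != NULL);
    mock_exc_state tmp = *current;
    *current = *saved;
    *saved = tmp;
}
```

If the fields were hidden behind accessor functions, each of these reads and writes would presumably turn into a call that the compiler cannot see through.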
You could argue that profiling and tracing are less critical, but every nanosecond saved while not tracing a function makes the rest of the program faster, so I'd argue that tracing is performance critical, too. Profiling definitely is, because it should have as little impact on the code profile as possible. There is a huge difference between having the CPU pre-fetch a pointer and look at the value, and calling into a C function and guessing what the result might be.
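The difference can be sketched as follows. Everything here is hypothetical: `mock_thread_state`, its `use_tracing` flag, and the helper names are stand-ins chosen for illustration and do not reproduce CPython's or Cython's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread state with a flag that is nonzero while a
   trace or profile function is installed. */
typedef struct {
    int use_tracing;
} mock_thread_state;

/* Fast path: the struct is visible, so the check is a single field
   read that the compiler can inline into the caller. */
static inline int mock_is_tracing_inline(const mock_thread_state *ts)
{
    assert(ts != NULL);
    return ts->use_tracing != 0;
}

/* Opaque variant: same answer, but in a real build this would live
   behind a library boundary, so every check would cost a call. */
int mock_is_tracing_opaque(const mock_thread_state *ts)
{
    return ts->use_tracing != 0;
}

typedef long (*mock_func)(long);

/* Sample callee used by the call helper below. */
static long mock_double(long x)
{
    return 2 * x;
}

/* Inlined call helper: when no tracer is installed, the only
   overhead is the flag check before the actual call. */
static long mock_call(mock_thread_state *ts, mock_func f, long arg,
                      int *traced_calls)
{
    if (mock_is_tracing_inline(ts)) {
        ++*traced_calls;   /* a real helper would invoke the tracer here */
    }
    return f(arg);
}
```

In the common untraced case, `mock_call` adds one predictable branch on a value that is likely already in cache, which is the property the paragraph above argues for.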
The trashcan is only used during deallocation, so … well, I guess it could be replaced by a different API, but that's a bit tricky due to the bracket nature of the current macros.
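To show what "bracket nature" means in practice, here is a simplified sketch of a begin/end macro pair that wraps a block of code. `MOCK_GUARD_BEGIN` and `MOCK_GUARD_END` are hypothetical and much simpler than the real trashcan macros: this version just skips the body when the depth limit is reached, whereas the trashcan defers the deallocation instead.

```c
#include <assert.h>
#include <stddef.h>

/* Opening half: starts a block and a conditional that stay open
   until the matching END macro closes them. */
#define MOCK_GUARD_BEGIN(depth_ptr, limit)          \
    do {                                            \
        int *mock_depth_ = (depth_ptr);             \
        if (*mock_depth_ < (limit)) {               \
            ++*mock_depth_;

/* Closing half: restores the depth and closes both scopes. */
#define MOCK_GUARD_END                              \
            --*mock_depth_;                         \
        }                                           \
    } while (0)

/* The guarded code sits textually between the two macros. */
static int mock_run_guarded(int *depth, int limit, int *counter)
{
    assert(depth != NULL && counter != NULL);
    int ran = 0;
    MOCK_GUARD_BEGIN(depth, limit)
        ++*counter;
        ran = 1;
    MOCK_GUARD_END;
    return ran;
}
```

Because the opening macro wraps control flow around the user's code, a plain pair of function calls cannot reproduce it directly, which is presumably what makes a replacement API tricky.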
I also just noticed that "Py_EnterRecursiveCall" and "Py_LeaveRecursiveCall" are on your list. We use them in our inlined call helper functions, which mostly duplicate CPython functionality. Looking at these macros now, I find it a bit annoying that they call "PyThreadState_GET()" directly, rather than accepting a thread state as input. Looking up the current thread state is a non-local, atomic operation that can be surprisingly costly, and I've invested quite a bit of work into reducing these lookups in Cython. Although it's probably not too bad around a call into an external function…
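A minimal sketch of the two designs, assuming a hypothetical `mock_rc_tstate` and lookup function; none of these names are real CPython API, and the lookup counter exists only to make the cost visible.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_RECURSION_LIMIT 1000

/* Hypothetical thread state that only tracks the recursion depth. */
typedef struct {
    int recursion_depth;
} mock_rc_tstate;

static mock_rc_tstate mock_global_tstate;
static int mock_lookup_count = 0;

/* Stand-in for the thread-state lookup; counts how often it runs. */
static mock_rc_tstate *mock_get_tstate(void)
{
    ++mock_lookup_count;
    return &mock_global_tstate;
}

/* Variant B: the caller passes in a thread state it already holds. */
static inline int mock_enter_call_ts(mock_rc_tstate *ts)
{
    if (ts->recursion_depth >= MOCK_RECURSION_LIMIT) {
        return -1;   /* a real implementation would raise an error */
    }
    ++ts->recursion_depth;
    return 0;
}

static inline void mock_leave_call_ts(mock_rc_tstate *ts)
{
    assert(ts->recursion_depth > 0);
    --ts->recursion_depth;
}

/* Variant A: each function performs its own lookup. */
static int mock_enter_call(void)
{
    return mock_enter_call_ts(mock_get_tstate());
}

static void mock_leave_call(void)
{
    mock_leave_call_ts(mock_get_tstate());
}
```

With variant A, an enter/leave pair costs two lookups. With variant B, a call helper that already holds the thread state pays for one lookup and can reuse it for other checks as well.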
So, yeah, we do care about the thread state being readily available. :)
Could you explain what benefit you are expecting from hiding the thread state?
Date | User | Action | Args
2019-02-18 20:17:00 | scoder | set | recipients: + scoder, ncoghlan, vstinner, eric.snow
2019-02-18 20:17:00 | scoder | set | messageid: <1550521020.37.0.920240699316.issue35949@roundup.psfhosted.org>
2019-02-18 20:17:00 | scoder | link | issue35949 messages
2019-02-18 20:17:00 | scoder | create |