This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Move PyThreadState into Include/internal/pycore_pystate.h
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.8
process
Status: closed Resolution: duplicate
Dependencies: Superseder: [C API] Make the PyThreadState structure opaque (move it to the internal C API)
View: 39947
Assigned To: Nosy List: eric.snow, ncoghlan, scoder, vstinner
Priority: normal Keywords:

Created on 2019-02-09 02:00 by eric.snow, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (9)
msg335122 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-02-09 02:00
(also see issue #35886)

In November Victor created the Include/cpython directory and moved a decent amount of public (but not limited) API there.  This included the PyThreadState struct.  I'd like to move it into the "internal" headers since it is somewhat coupled to the internal runtime implementation.  The alternative is extra complexity.  I ran into this while working on another issue.

There are 2 problems, however, with PyThreadState which make a move potentially challenging.  In fact, they could be (but probably aren't) cause for moving it back to Include/pystate.h.

1. the docs say: "The only public data member is PyInterpreterState *interp, which points to this thread’s interpreter state."
2. the struct has 6 fields that are exposed (likely unintentionally) in "stable" header files via the following macros:

  # Include/object.h
  Py_TRASHCAN_SAFE_BEGIN
  Py_TRASHCAN_SAFE_END
  Include.ceval.h
  Py_EnterRecursiveCall
  Py_LeaveRecursiveCall
  _Py_MakeRecCheck
  Py_ALLOW_RECURSION
  Py_END_ALLOW_RECURSION

I'm not sure how that factors into the stable ABI (PyThreadState wasn't ever guarded by Py_LIMITED_API).  However, keep in mind that Victor already moved PyThreadState out of the "stable" headers in November.

I'm fine with sorting out the situation with the macros if that's okay to do.  Likewise the promised field ("interp") should be solvable one way or another (e.g. remove it or use a private struct).

...or we could do nothing or (if the change in Novemver is a problem) move it back to Include/pystate.h.
msg335465 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-13 17:06
Same ratione than for PyInterpreterState: https://bugs.python.org/issue35886#msg335464

Except that I expect that a few more projects rely on PyThreadState fields. Maybe not. It's hard to guess :-(

I mean that I'm ok-ish with the change but it should be carefully prepared and announced.
msg335677 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-02-16 09:48
From Cython's point of view, the important fields in PyThreadState are the tracing/profiling and exception related ones. We're not using anything else. Users can explicitly opt out of the access to the exception fields by defining a C macro, at the expense of a bit of performance. I doubt that anyone is really doing that, though, because, why would they?

I'm not sure if we could avoid direct field access for profiling and tracing at all. You can look at the code, it's a whole bunch of macros that mimic what CPython does in ceval:

https://github.com/cython/cython/blob/master/Cython/Utility/Profile.c

I should note that both features are not enabled by default. Users have to explicitly enable profiling support via a directive, and even doubly opt in to tracing by enabling a compiler directive and a C macro. So production code will often not rely on these fields, but developers would want to use them and some might keep profiling support enabled also in their production code. My guess is that tracing is most often used for test coverage analysis.
msg335678 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-02-16 10:00
Oh, and I forgot the new trashcan support. Cython will also start to use that in its next release, so that adds the trashcan related attributes to the list.

https://github.com/cython/cython/pull/2842/files
msg335859 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-02-18 19:31
@Stefan, is it a problem for Cython if the relevant fields are exposed via C-API functions rather than directly on PyThreadState?
msg335864 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-02-18 20:17
Well … yes.

The exception fields are performance critical, and we try hard to make them visible to the C compiler so that swapping around exception state eats up as little CPU time as possible.

You could argue that profiling and tracing are less critical, but any nanosecond that is avoided while not tracing a function adds up to making the rest of the program faster, so I'd argue that that's performance critical, too. Profiling definitely is, because it should have as little impact on the code profile as possible. There is a huge difference between having the CPU pre-fetch a pointer and looking at the value, compared to calling into a C function and guessing what the result might be.

The trashcan is only used during deallocation, so … well, I guess it could be replaced by a different API, but that's a bit tricky due to the bracket nature of the current macros.

I also just noticed that "Py_EnterRecursiveCall" and "Py_LeaveRecursiveCall" are on your list. We use them in our inlined call helper functions, which mostly duplicate CPython functionality. Looking at these macros now, I find it a bit annoying that they call "PyThreadState_GET()" directly, rather than accepting one as input. Looking up the current thread-state is a non-local, atomic operation that can be surprisingly costly, and I've invested quite some work into reducing these lookups in Cython. Although it's probably not too bad around a call into an external function…

So, yeah, we do care about the thread state being readily available. :)

Could you explain what benefit you are expecting from hiding the thread state?
msg335865 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-02-18 20:33
Thanks for clarifying. :)

On Mon, Feb 18, 2019 at 1:17 PM Stefan Behnel <report@bugs.python.org> wrote:
> The exception fields are performance critical, and we try hard to make them visible to the C compiler so that swapping around exception state eats up as little CPU time as possible.
>
> You could argue that profiling and tracing are less critical, but any nanosecond that is avoided while not tracing a function adds up to making the rest of the program faster, so I'd argue that that's performance critical, too. Profiling definitely is, because it should have as little impact on the code profile as possible. There is a huge difference between having the CPU pre-fetch a pointer and looking at the value, compared to calling into a C function and guessing what the result might be.

Yeah, I had a feeling that was the case. :)  There might be decent
approaches to avoiding a performance hit on this, but I'm not sure
it's worth the extra complexity.  And I'm not terribly motivated to
make the effort personally. :)

> The trashcan is only used during deallocation, so … well, I guess it could be replaced by a different API, but that's a bit tricky due to the bracket nature of the current macros.

Yep, definitely tricky.

> I also just noticed that "Py_EnterRecursiveCall" and "Py_LeaveRecursiveCall" are on your list. We use them in our inlined call helper functions, which mostly duplicate CPython functionality. Looking at these macros now, I find it a bit annoying that they call "PyThreadState_GET()" directly, rather than accepting one as input. Looking up the current thread-state is a non-local, atomic operation that can be surprisingly costly, and I've invested quite some work into reducing these lookups in Cython. Although it's probably not too bad around a call into an external function…

Hmm, perhaps that's something we could address separately?

> Could you explain what benefit you are expecting from hiding the thread state?

This isn't so much about the thread state specifically or about
necessarily hiding *all* the thread state.  Rather it's about a
general effort to reduce the amount of private API we are exposing
publicly.  In this case I was looking at PyInterpreterState (#35886)
and opening a similar issue at the same time for PyThreadState made
sense.

Given your response it's clear that usage of PyThreadState doesn't
match the documentation very well.  That would be worth addressing one
way or another, so I'm glad we're having this conversation. :)
msg372317 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-25 09:25
> Py_TRASHCAN_SAFE_BEGIN and Py_TRASHCAN_SAFE_END

I excluded the TRASHCAN API from the limited C API:

commit 0fa4f43db086ac3459811cca4ec5201ffbee694a
Author: Victor Stinner <vstinner@python.org>
Date:   Wed Feb 5 12:23:27 2020 +0100

    bpo-39542: Exclude trashcan from the limited C API (GH-18362)
    
    Exclude trashcan mechanism from the limited C API: it requires access to
    PyTypeObject and PyThreadState structure fields, whereas these structures
    are opaque in the limited C API.
    
    The trashcan mechanism never worked with the limited C API. Move it
    from object.h to cpython/object.h.

Then I modified these macros to hide implementation details:

commit 38965ec5411da60d312b59be281f3510d58e0cf1
Author: Victor Stinner <vstinner@python.org>
Date:   Fri Mar 13 16:51:52 2020 +0100

    bpo-39947: Hide implementation detail of trashcan macros (GH-18971)
    
    Py_TRASHCAN_BEGIN_CONDITION and Py_TRASHCAN_END macro no longer
    access PyThreadState attributes, but call new private
    _PyTrash_begin() and _PyTrash_end() functions which hide
    implementation details.

> Py_EnterRecursiveCall, Py_LeaveRecursiveCall, _Py_MakeRecCheck

Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() are now function calls. I move the "inline" implementation the internal C API.

commit 224481a8c988fca12f488544edd2f01c0af2a91d
Author: Victor Stinner <vstinner@python.org>
Date:   Fri Mar 13 10:19:38 2020 +0100

    bpo-39947: Move Py_EnterRecursiveCall() to internal C API (GH-18972)
    
    Move the static inline function flavor of Py_EnterRecursiveCall() and
    Py_LeaveRecursiveCall() to the internal C API: they access
    PyThreadState attributes. The limited C API provides regular
    functions which hide implementation details.


> Py_ALLOW_RECURSION, Py_END_ALLOW_RECURSION

Oh right, these macros should be modified to not access PyThreadState.recursion_critical member directly.
msg372318 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-25 09:26
I mark this issue as a duplicate of bpo-39947 "[C API] Make the PyThreadState structure opaque (move it to the internal C API)" since I already pushed multiple changes there. But this issue contains interesting technical information!
History
Date User Action Args
2022-04-11 14:59:11adminsetgithub: 80130
2020-06-25 09:26:12vstinnersetstatus: open -> closed
superseder: [C API] Make the PyThreadState structure opaque (move it to the internal C API)
messages: + msg372318

resolution: duplicate
stage: resolved
2020-06-25 09:25:43vstinnersetmessages: + msg372317
2019-02-18 20:33:23eric.snowsetmessages: + msg335865
2019-02-18 20:17:00scodersetmessages: + msg335864
2019-02-18 19:31:38eric.snowsetmessages: + msg335859
2019-02-16 10:00:46scodersetmessages: + msg335678
2019-02-16 09:48:09scodersetnosy: + scoder
messages: + msg335677
2019-02-13 17:06:53vstinnersetmessages: + msg335465
2019-02-09 02:00:19eric.snowcreate