
classification
Title: The DISPATCH() macro is not as efficient as it could be (move PyThreadState.use_tracing)
Type: performance Stage: patch review
Components: C API, Interpreter Core Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Mark.Shannon, Xtrem532, erlendaasland, gvanrossum, hroncok, jpe, lukasz.langa, miss-islington, pablogsal, petr.viktorin, rhettinger, scoder, vstinner
Priority: Keywords: patch

Created on 2021-04-07 09:35 by Mark.Shannon, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 25244 merged Mark.Shannon, 2021-04-07 10:16
PR 25276 merged Mark.Shannon, 2021-04-08 11:14
PR 28474 closed Mark.Shannon, 2021-09-20 11:53
PR 28498 closed pablogsal, 2021-09-21 17:50
PR 28527 merged vstinner, 2021-09-23 09:02
PR 28529 merged miss-islington, 2021-09-23 14:38
PR 28542 merged vstinner, 2021-09-24 09:27
PR 28723 merged Mark.Shannon, 2021-10-04 14:48
PR 29032 merged vstinner, 2021-10-18 14:43
Messages (46)
msg390410 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-04-07 09:35
The DISPATCH() macro has two failings.

1. Its check for tracing involves too much pointer chasing.

2. The logic assumes that computed gotos are the "fast path", which makes switch dispatch, and therefore Python on Windows, unnecessarily slow.
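As a rough illustration of the first point, here is a minimal sketch contrasting a check that follows a chain of pointers with one that reads a single flag from a small structure the eval loop already holds. All names (`mock_ceval`, `mock_interp`, `mock_cframe`, `mock_tstate`, `tracing_via_chain`, `tracing_via_cframe`) are hypothetical stand-ins and are not CPython's real definitions or layout.

```c
#include <assert.h>

/* Hypothetical stand-ins for interpreter structures; not CPython's layout. */
typedef struct { int tracing_possible; } mock_ceval;
typedef struct { mock_ceval ceval; } mock_interp;
typedef struct { int use_tracing; } mock_cframe;
typedef struct {
    mock_interp *interp;
    mock_cframe *cframe;
} mock_tstate;

/* Slower shape: every check loads ts->interp first, then the flag inside it. */
int tracing_via_chain(const mock_tstate *ts)
{
    return ts->interp->ceval.tracing_possible != 0;
}

/* Faster shape: the flag sits in a small frame the caller already has at hand. */
int tracing_via_cframe(const mock_cframe *cf)
{
    return cf->use_tracing != 0;
}
```

Both functions answer the same question; the second simply needs fewer dependent loads per dispatch.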
msg390522 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-04-08 10:22
New changeset 28d28e053db6b69d91c2dfd579207cd8ccbc39e7 by Mark Shannon in branch 'master':
bpo-43760: Streamline dispatch sequence for machines without computed gotos. (GH-25244)
https://github.com/python/cpython/commit/28d28e053db6b69d91c2dfd579207cd8ccbc39e7
msg390951 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-04-13 10:08
New changeset 9e7b2076fb4380987ad0262c4c0ca900b06475ad by Mark Shannon in branch 'master':
bpo-43760: Speed up check for tracing in interpreter dispatch (#25276)
https://github.com/python/cpython/commit/9e7b2076fb4380987ad0262c4c0ca900b06475ad
msg393379 - (view) Author: Miro Hrončok (hroncok) * Date: 2021-05-10 11:48
I am afraid the "Speed up check for tracing in interpreter dispatch" brought some backwards incompatible changes:

yappi/_yappi.c:1261:9: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘use_tracing’; did you mean ‘tracing’?
 1261 |     ts->use_tracing = 1;
      |         ^~~~~~~~~~~
      |         tracing

This is not mentioned in https://docs.python.org/3.10/whatsnew/3.10.html and I haven't noticed the use_tracing member being deprecated. I am confused. Should this have happened?
msg393384 - (view) Author: Miro Hrončok (hroncok) * Date: 2021-05-10 13:03
Fedora packages affected (that we know of now):

greenlet: https://bugzilla.redhat.com/show_bug.cgi?id=1957784
dipy: https://bugzilla.redhat.com/show_bug.cgi?id=1958203
yappi: https://bugzilla.redhat.com/show_bug.cgi?id=1958896
smartcols: https://bugzilla.redhat.com/show_bug.cgi?id=1958938
msg393385 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-05-10 13:06
The code at yappi/_yappi.c:1261 sets an undocumented field on a CPython internal data structure.

What did you believe that was supposed to do? use_tracing is not documented anywhere.

We could add the field back and ignore it, but I doubt that would help you much.
msg393389 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-05-10 13:28
If there is no C-API function that supports your needs, feel free to suggest one.
msg393390 - (view) Author: Miro Hrončok (hroncok) * Date: 2021-05-10 13:33
Disclaimer: I have not written the code, nor do I understand what it is trying to achieve. I merely collect the data and report the problems to the package maintainers.

It just seems to me that a non-underscored (and hence public) member variable on a non-underscored (and hence public) structure should not suddenly go missing. Although, I am not familiar with the rules that define what part of the API falls under https://www.python.org/dev/peps/pep-0497/
msg393401 - (view) Author: Miro Hrončok (hroncok) * Date: 2021-05-10 14:21
scikit-learn: https://bugzilla.redhat.com/show_bug.cgi?id=1958976

gcc: sklearn/cluster/_k_means_fast.c
In file included from /usr/lib64/python3.10/site-packages/numpy/core/include/numpy/ndarraytypes.h:1944,
                 from /usr/lib64/python3.10/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                 from /usr/lib64/python3.10/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                 from sklearn/cluster/_k_means_fast.c:635:
/usr/lib64/python3.10/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
   17 | #warning "Using deprecated NumPy API, disable it with " \
      |  ^~~~~~~
sklearn/cluster/_k_means_fast.c: In function ‘__Pyx_call_return_trace_func’:
sklearn/cluster/_k_means_fast.c:1596:15: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘use_tracing’; did you mean ‘tracing’?
 1596 |       tstate->use_tracing = 0;
      |               ^~~~~~~~~~~
      |               tracing
sklearn/cluster/_k_means_fast.c:1602:15: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘use_tracing’; did you mean ‘tracing’?
 1602 |       tstate->use_tracing = 1;
      |               ^~~~~~~~~~~
      |               tracing



The usage comes from https://github.com/cython/cython/blob/master/Cython/Utility/Profile.c
msg393403 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-05-10 14:43
PEP 0497 is rejected; the active one is PEP 387, which says "backwards incompatibility" means preexisting code ceases to comparatively function after a change.
So, this does look like a backwards-incompatible change.

Unfortunately, not all of the C API is documented, so unless it's explicitly marked private, people will use it :(
msg393404 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-05-10 14:57
But what does "use it" mean?
What does setting `tstate->use_tracing = 1` do?
There is no documented behavior, so how do we know what assumptions people are making about what happens when they set some field to 1?

As I said, we could keep the field and ignore it, but that seems worse.
msg393407 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-05-10 15:06
I don't think the PEP meant to restrict individual struct member such as this.  For example, we were able to switch from byte code to word code without violating the intended rules.  Consider asking Brett and Benjamin for clarification.  I would think that if a new function were introduced to provide a reliable way to determine whether tracing was enabled, that would suffice for external packages to have a minimally disruptive migration path.
msg393410 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-10 15:43
I understand that some projects manually call the profile and/or trace functions, and temporarily set use_tracing 0 while calling these functions.

Some projects restore use_tracing to the correct value (they compute the effective value), while others simply set use_tracing to 1.

I see 3 use cases:

* disable tracing temporarily (set use_tracing to 0)
* reenable tracing (compute use_tracing to the correct value)
* check if tracing is used (get use_tracing)

We can add 3 functions:

* PyThreadState_DisableTracing()
* PyThreadState_EnableTracing()
* PyThreadState_GetTracing()

PyThreadState_EnableTracing(tstate) would do something like:

    tstate->cframe->use_tracing = (tstate->c_tracefunc || tstate->c_profilefunc);

If we added these functions, I can then add an implementation for Python 3.9 and older to my https://github.com/pythoncapi/pythoncapi_compat project for backward compatibility.

The problem is that some projects also increase temporarily ts->tracing. Since I would like to make PyThreadState opaque, I would prefer to hide this access behind a function call as well. Maybe we need an API to call profile and/or trace functions?
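A minimal sketch of what the three proposed functions could look like, assuming the formula given above for re-enabling. The struct and function names (`mock_tstate`, `mock_cframe`, `mock_disable_tracing`, `mock_enable_tracing`, `mock_get_tracing`, `mock_hook`) are hypothetical and model only the fields mentioned in this message; this is not CPython code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the relevant fields; not CPython's real structs. */
typedef int (*mock_tracefunc)(void *obj, int what);
typedef struct { int use_tracing; } mock_cframe;
typedef struct {
    mock_cframe *cframe;
    mock_tracefunc c_tracefunc;
    mock_tracefunc c_profilefunc;
} mock_tstate;

/* Disable tracing temporarily, whatever hooks are installed. */
void mock_disable_tracing(mock_tstate *ts)
{
    ts->cframe->use_tracing = 0;
}

/* Re-enable tracing: recompute the flag from the installed hooks. */
void mock_enable_tracing(mock_tstate *ts)
{
    ts->cframe->use_tracing =
        (ts->c_tracefunc != NULL || ts->c_profilefunc != NULL);
}

/* Report whether tracing is currently in use. */
int mock_get_tracing(const mock_tstate *ts)
{
    return ts->cframe->use_tracing;
}

/* A do-nothing hook, only for demonstration. */
int mock_hook(void *obj, int what)
{
    (void)obj;
    (void)what;
    return 0;
}
```

Note that "enable" here never forces the flag to 1: with no hook installed it leaves tracing off, which is the difference from the projects that simply write `use_tracing = 1`.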

--

According to the bugzilla compiler errors:

> greenlet: https://bugzilla.redhat.com/show_bug.cgi?id=1957784

It has already been fixed:

* https://github.com/python-greenlet/greenlet/commit/6c5f0963eb00eeb1cfb337c6edbd3efaf7d8eacc
* https://github.com/python-greenlet/greenlet/commit/352b974447bb489cb2778e07c3832d0cc60e0e4a

It uses:

* "tstate->use_tracing = 0;"
* "tstate->use_tracing = (tstate->tracing <= 0 && (...)"

> dipy: https://bugzilla.redhat.com/show_bug.cgi?id=1958203

It uses:

* "tstate->use_tracing = 0;"
* "tstate->use_tracing = 1;"
* "tstate->use_tracing = (tstate->c_profilefunc || (...)"
* "return tstate->use_tracing && retval;"
* "if (tstate->use_tracing) {"

> yappi: https://bugzilla.redhat.com/show_bug.cgi?id=1958896

* "ts->use_tracing = 1;"
* "ts->use_tracing = 0;"

> smartcols: https://bugzilla.redhat.com/show_bug.cgi?id=1958938

It uses "tstate->use_tracing = 0;".

> scikit-learn: https://bugzilla.redhat.com/show_bug.cgi?id=1958976

It uses:

* "tstate->use_tracing = 0;"
* "tstate->use_tracing = 1;"

> The usage comes from https://github.com/cython/cython/blob/master/Cython/Utility/Profile.c

Simplified code:

--------------
static int __Pyx_TraceSetupAndCall(...)
{
    ...
    tstate->tracing++;
    tstate->use_tracing = 0;

    if (tstate->c_tracefunc)
        retval = tstate->c_tracefunc(tstate->c_traceobj, *frame, PyTrace_CALL, NULL) == 0;
    if (retval && tstate->c_profilefunc)
        retval = tstate->c_profilefunc(tstate->c_profileobj, *frame, PyTrace_CALL, NULL) == 0;

    tstate->use_tracing = (tstate->c_profilefunc ||
                           (CYTHON_TRACE && tstate->c_tracefunc));
    tstate->tracing--;
    ...
}

  int __Pyx_use_tracing = 0;

  #define __Pyx_TraceCall(funcname, srcfile, firstlineno, nogil, goto_error)             \
  if (nogil) {                                                                           \
      if (CYTHON_TRACE_NOGIL) {                                                          \
          PyThreadState *tstate;                                                         \
          PyGILState_STATE state = PyGILState_Ensure();                                  \
          tstate = __Pyx_PyThreadState_Current;                                          \
          if (unlikely(tstate->use_tracing) && !tstate->tracing &&                       \
                  (tstate->c_profilefunc || (CYTHON_TRACE && tstate->c_tracefunc))) {    \
              __Pyx_use_tracing = __Pyx_TraceSetupAndCall(&$frame_code_cname, &$frame_cname, tstate, funcname, srcfile, firstlineno);  \
          }                                                                              \
          PyGILState_Release(state);                                                     \
          if (unlikely(__Pyx_use_tracing < 0)) goto_error;                               \
      }                                                                                  \
  } else {                                                                               \
      PyThreadState* tstate = PyThreadState_GET();                                       \
      if (unlikely(tstate->use_tracing) && !tstate->tracing &&                           \
              (tstate->c_profilefunc || (CYTHON_TRACE && tstate->c_tracefunc))) {        \
          __Pyx_use_tracing = __Pyx_TraceSetupAndCall(&$frame_code_cname, &$frame_cname, tstate, funcname, srcfile, firstlineno);  \
          if (unlikely(__Pyx_use_tracing < 0)) goto_error;                               \
      }                                                                                  \
  }
--------------
msg393425 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-05-10 18:07
+1 for Victor's suggestions.  It provides a reasonable way forward without locking in eval-loop implementation details that weren't intended to be public and frozen in time.
msg393459 - (view) Author: Miro Hrončok (hroncok) * Date: 2021-05-11 14:04
A Cython issue report: https://github.com/cython/cython/issues/4153
msg393466 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2021-05-11 18:16
For the same reason that motivated this ticket, I think the functions should be inline functions. They should also take the current thread-state as argument, because that's probably known on the caller side already.

I guess a macro would be fine, too. :)

Cython previously used "use_tracing" directly because it needs to implement the exact same tracing/profiling behaviour as CPython, regardless of who called a Cython implemented function (Cython or CPython).

Naming nit: Get/Is/UsesTracing?

Also, given that a common use case seems to be "make sure tracing is disabled, do something, enable tracing if it was enabled", I think DisableTracing() should return the previous state.
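To make the last suggestion concrete, here is a small sketch of a disable call that returns the previous state, so the caller can restore exactly what was there before. The names (`mock_cframe`, `mock_disable_tracing_save`, `mock_restore_tracing`, `mock_call_untraced`, `mock_observe`) are hypothetical and the struct models only the one flag.

```c
#include <assert.h>

/* Hypothetical model holding only the flag under discussion. */
typedef struct { int use_tracing; } mock_cframe;

/* Clear the flag and hand back what it was, so the caller can restore it. */
int mock_disable_tracing_save(mock_cframe *cf)
{
    int previous = cf->use_tracing;
    cf->use_tracing = 0;
    return previous;
}

/* Put back a value previously returned by mock_disable_tracing_save(). */
void mock_restore_tracing(mock_cframe *cf, int previous)
{
    cf->use_tracing = previous;
}

/* Run `work` with tracing off, then restore the caller's state. */
int mock_call_untraced(mock_cframe *cf, int (*work)(const mock_cframe *))
{
    int previous = mock_disable_tracing_save(cf);
    int result = work(cf);
    mock_restore_tracing(cf, previous);
    return result;
}

/* Demo callback: reports the flag value it observed while running. */
int mock_observe(const mock_cframe *cf)
{
    return cf->use_tracing;
}
```

Returning the old value keeps the "disable, do something, restore" sequence to two calls and avoids a separate query.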
msg393666 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2021-05-14 15:00
I just noticed that new C-API functions are probably useless for Cython, since I think it will have to maintain the CFrame stack so as to enable "use_tracing" not for the (Python) caller but for the current (Cython) function. This then means that we own the current CFrame as well as its "use_tracing" field and don't need any help from CPython in order to change the state.

I'm not sure if this is any different for other users of the "use_tracing" field.
msg402150 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-19 10:04
The commit 28d28e053db6b69d91c2dfd579207cd8ccbc39e7 caused a performance regression on Windows which is currently blocking the Python 3.10.0 final release: bpo-45116.

Moreover, this issue introduced an incompatible C API change which is not documented in What's New in Python 3.10, and it doesn't provide any solution for projects broken by these changes. So far, the following projects are known to be broken by the change:

* Cython
* greenlet (fixed)
* dipy
* yappi
* smartcols

Would it be possible to:

* Bare minimum: document the change in What's New in Python 3.10?

* Provide a solution for broken projects? If possible, a solution that works on old and new Python versions. Maybe compatibility functions can be added to https://github.com/pythoncapi/pythoncapi_compat if needed.

* Maybe revert the change in Python 3.10 since a full solution may require additional work.

By the way, I'm also disappointed that nothing was done to enhance the situation for 4 months (since the first known projects were reported here in May).

I raise the priority to release blocker to make more people aware of the situation.
msg402162 - (view) Author: Miro Hrončok (hroncok) * Date: 2021-09-19 19:10
Also Numba is broken: https://bugzilla.redhat.com/show_bug.cgi?id=2005686
msg402216 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-20 09:03
> A Cython issue report: https://github.com/cython/cython/issues/4153

Cython 0.29.24 was released on July 13, 2021 with a fix (2 commits):

* https://github.com/cython/cython/commit/be3b178296975b976107f41a6383291701e0297f
* https://github.com/cython/cython/commit/8d177f4aa51a663e8c51de3210ccb329d7629d36

The Cython master branch was fixed as well: see https://github.com/cython/cython/issues/4153
msg402225 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-09-20 11:05
IMO those failures are bugs in the projects listed not in CPython.


Relying on the exact meaning, or even the existence of an undocumented field of a C struct is not, nor ever has been, safe.
The user of the field is assuming a meaning that is not known to the developers of the library/application, so there is no way for them to preserve that meaning.
This is not specific to CPython, but applies to any C API.

The code in the examples given above is using `tstate->use_tracing` assuming that its meaning is to determine whether tracing is turned on or not.
However, it has no such meaning. It is an internal performance optimization to avoid checking both `tstate->c_tracefunc` and `tstate->c_profilefunc`.
It is `tstate->tracing` that determines whether tracing is active or not.

I propose adding back `tstate->use_tracing` as a copy of `tstate->cframe->use_tracing`. Any writes to `tstate->use_tracing` will be ignored, but any code that also sets `tstate->tracing`, as the Cython code does, will work correctly. Any code that reads `tstate->use_tracing` will work correctly.

I'm minded to prefix all the names of all fields in all C structs that happen to be included by Python.h with "if_you_use_this_then_we_will_break_your_code_in_some_way_that_will_ruin_your_reputation_as_a_library_developer__maybe_not_tomorrow_maybe_not_next_year_but_someday"
Although that might prove tricky with a 80 character line limit :)

My attempts to avoid this happening again next year, and the year after that, and...
https://bugs.python.org/issue45247
https://github.com/cython/cython/issues/4382
msg402231 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2021-09-20 13:29
> The code in the examples given above is using `tstate->use_tracing` assuming that its meaning is to determine whether tracing is turned on or not.

No, actually not. It is using the field in the same way as CPython, simply because most of this code was originally copied from CPython, and we also copied the optimisation of avoiding to check the other fields (for the obvious reason of being an optimisation).


> I propose adding back `tstate->use_tracing` as a copy of `tstate->cframe->use_tracing`.

Cython 0.29.24 has already been adapted to this change and will use the new field in CPython 3.10b1+.


> Any code that reads `tstate->use_tracing` will work correctly.

Any code that reads and /writes/ the field would probably also continue to work correctly, which is what older Cython versions did.


> if_you_use_this_then_we_will_break_your_code_in_some_way_that_will_ruin_your_reputation_as_a_library_developer…

The thing is, new APIs can only be added to new CPython releases. Supporting features in older CPython versions (currently 2.7+) means that we always *have to* use the existing fields, and can only switch to new APIs by duplicating code based on a PY_VERSION_HEX preprocessor check. Even if a new low-latency profiling API was added in CPython 3.11, we'd have to wait until there is at least an alpha release that has it before enabling this code switch.
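A sketch of the kind of PY_VERSION_HEX switch described here, assuming the 3.10b1 cutoff mentioned earlier in this message (0x030A00B1 encodes 3.10.0 beta 1). The macro name `MOCK_USE_TRACING` and the mock structs are hypothetical; real code would include Python.h and use `PyThreadState`, and the fallback definition of `PY_VERSION_HEX` exists only so the sketch compiles on its own.

```c
#include <assert.h>

/* Fallback so this sketch compiles without Python.h; with the real header,
   PY_VERSION_HEX is already defined and this branch is skipped. */
#ifndef PY_VERSION_HEX
#  define PY_VERSION_HEX 0x030A00F0  /* pretend we build against 3.10.0 final */
#endif

/* Hypothetical stand-ins carrying both layouts discussed in this issue. */
typedef struct { int use_tracing; } mock_cframe;
typedef struct {
    int use_tracing;      /* where the flag lived in 3.9 and older */
    mock_cframe *cframe;  /* where it lives from 3.10b1 on */
} mock_tstate;

/* One access macro, two expansions, selected at compile time. */
#if PY_VERSION_HEX >= 0x030A00B1
#  define MOCK_USE_TRACING(ts) ((ts)->cframe->use_tracing)
#else
#  define MOCK_USE_TRACING(ts) ((ts)->use_tracing)
#endif
```

The rest of the code then reads and writes `MOCK_USE_TRACING(ts)` and never names the field directly, which confines the version dependence to one place.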

And if the new API proves to be slower, we may end up keeping the old code around and adding a C compile-time configuration option for users to enable (or disable) its use. Cython has lots of those these days, mostly to support the different C-API capabilities of different Python implementations, e.g. to take advantage of the PyLong or PyUnicode internals if available, and use generic C-API calls if not.
msg402234 - (view) Author: John Ehresman (jpe) * Date: 2021-09-20 14:36
Is adding the field back an option at this point? It would mean that extensions compiled against the release candidates may not be binary compatible with the final release

My take is that use_tracing is an implementation and version dependent field, and that binary compatibility will be maintained for a specific release (e.g. 3.10) but that there's no assurance that it will be there in the next release -- though these things tend not to change. I also regard generated cython code as only being valid for the releases that a specific cython version supports.

Code and API's change slowly, but eventually they do change.
msg402235 - (view) Author: Miro Hrončok (hroncok) * Date: 2021-09-20 14:38
> It would mean that extensions compiled against the release candidates may not be binary compatible with the final release

If that's true, I definitely argue not to do that. We've told everybody it won't happen.
msg402358 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-09-21 20:30
From PR 28498:

@vstinner @ambv The ABI is not broken, the only thing that this PR changes is the size of the struct. All the offsets to the members are the same and therefore will be valid in any compiled code.

Any compiled wheels will still work. Look at the ABI report:

  [C]'function void PyEval_AcquireThread(PyThreadState*)' at ceval.c:447:1 has some indirect sub-type changes:
    parameter 1 of type 'PyThreadState*' has sub-type changes:
      in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
        underlying type 'struct _ts' at pystate.h:62:1 changed:
          type size changed from 2240 to 2304 (in bits)
          1 data member insertion:
            'int _ts::use_tracing', at offset 2240 (in bits) at pystate.h:151:1
          1 data member changes (2 filtered):
           type of 'PyInterpreterState* _ts::interp' changed:
             in pointed to type 'typedef PyInterpreterState' at pystate.h:22:1:
               underlying type 'struct _is' at pycore_interp.h:220:1 changed:
                 type size hasn't changed
                 1 data member changes (3 filtered):
                  type of '_PyFrameEvalFunction _is::eval_frame' changed:
                    underlying type 'PyObject* (PyThreadState*, PyFrameObject*, int)*' changed:
                      in pointed to type 'function type PyObject* (PyThreadState*, PyFrameObject*, int)':
                        parameter 1 of type 'PyThreadState*' has sub-type changes:
                          in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
                            underlying type 'struct _ts' at pystate.h:62:1 changed:
                              type size changed from 2240 to 2304 (in bits)
                              1 data member insertion:
                                'int _ts::use_tracing', at offset 2240 (in bits) at pystate.h:151:1
                              1 data member changes (2 filtered):
                               type of '_ts* _ts::next' changed:
                                 in pointed to type 'struct _ts' at pystate.h:62:1:
                                   type size changed from 2240 to 2304 (in bits)
                                   1 data member insertion:
                                     'int _ts::use_tracing', at offset 2240 (in bits) at pystate.h:151:1
                                   no data member changes (2 filtered);




  [C]'function PyThreadState* PyGILState_GetThisThreadState()' at pystate.c:1455:1 has some indirect sub-type changes:
    return type changed:
      in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
        underlying type 'struct _ts' at pystate.h:62:1 changed:
          type size changed from 2240 to 2304 (in bits)
          1 data member insertion:
            'int _ts::use_tracing', at offset 2240 (in bits) at pystate.h:151:1
          no data member changes (4 filtered);

  [C]'function int _PyErr_CheckSignalsTstate(PyThreadState*)' at signalmodule.c:1767:1 has some indirect sub-type changes:
    parameter 1 of type 'PyThreadState*' has sub-type changes:
      in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
        underlying type 'struct _ts' at pystate.h:62:1 changed:
          type size changed from 2240 to 2304 (in bits)
          1 data member insertion:
            'int _ts::use_tracing', at offset 2240 (in bits) at pystate.h:151:1
          no data member changes (3 filtered);

  [C]'function void _PyErr_Clear(PyThreadState*)' at errors.c:453:1 has some indirect sub-type changes:
    parameter 1 of type 'PyThreadState*' has sub-type changes:
      in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
        underlying type 'struct _ts' at pystate.h:62:1 changed:
          type size changed from 2240 to 2304 (in bits)
          1 data member insertion:
            'int _ts::use_tracing', at offset 2240 (in bits) at pystate.h:151:1
          no data member changes (3 filtered);
As you can see, the leaves of the change is only type size changed from 2240 to 2304. As the member is added
msg402360 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-09-21 20:35
Also, just to clarify, I also opened PR 28498 to discuss the possibility of going ahead, I still don't want to move on without consensus.
msg402362 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-09-21 20:39
Also, I personally think there is absolutely no guarantee that Cython code generated for 3.9 should work for 3.10, and the thread state is a private structure that has undocumented fields and is not part of the stable API nor the limited API, so tstate->use_tracing disappearing is totally within the guarantees between Python versions.
msg402415 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-09-22 10:05
I discussed this particular instance with the Steering Council and the conclusion was that this field (use_tracing) is considered an implementation detail and therefore its removal is justified, so we won't be restoring it.

I'm therefore closing PR 28498.

Notice that this decision only affects this particular issue and should not be generalized to other fields or structures. We will try to determine and open a discussion in the future about what is considered public/private in these ambiguous cases and what can users expect regarding stability and backwards compatibility.
msg402416 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-09-22 10:06
I'm removing the release blocker as per above, feel free to close if there is nothing else to discuss or act on here.
msg402426 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-09-22 12:24
> The ABI is not broken, the only thing that this PR changes is the size of the struct. All the offsets to the members are the same and therefore will be valid in any compiled code.

I'll just note that a change in struct size does technically break ABI, since *arrays* of PyThreadState will break.

So the size shouldn't be changed in RCs or point releases. (But since it's not part of stable ABI, it was OK to change it for 3.10.)
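The distinction can be checked with `offsetof` and `sizeof`. In this sketch the two structs are hypothetical "before" and "after" layouts with one member appended at the end; they are not the real `PyThreadState`, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layouts: the new one appends a member at the end. */
struct mock_ts_old { void *next; void *interp; };
struct mock_ts_new { void *next; void *interp; int use_tracing; };

/* Offsets of the pre-existing members are unchanged by the append... */
int mock_offsets_match(void)
{
    return offsetof(struct mock_ts_old, next) == offsetof(struct mock_ts_new, next)
        && offsetof(struct mock_ts_old, interp) == offsetof(struct mock_ts_new, interp);
}

/* ...but the struct itself grew. */
int mock_size_grew(void)
{
    return sizeof(struct mock_ts_new) > sizeof(struct mock_ts_old);
}

/* Byte offset of array element i, as seen by code built against each header. */
size_t mock_elem_offset_old(size_t i) { return i * sizeof(struct mock_ts_old); }
size_t mock_elem_offset_new(size_t i) { return i * sizeof(struct mock_ts_new); }
```

Access through a pointer to a single struct only depends on member offsets, so it keeps working; indexing into an array depends on the element size, so code built against the old header would compute the wrong address for every element after the first.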

> We will try to determine and open a discussion in the future about what is considered public/private in these ambiguous cases and what can users expect regarding stability and backwards compatibility.

Please keep me in the loop; I'm working on summarizing my understanding of this (in a form that can be added to the docs if approved).
msg402432 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-09-22 13:19
> I'll just note that a change in struct size does technically break ABI, since *arrays* of PyThreadState will break.

Not that it matters now because we are not proceeding, but just to clarify why I deemed this acceptable: arrays of PyThreadState are extremely unlikely in extensions because we pass it by pointer and it is always manipulated by pointer. To place it in an array you either need to create one or copy one into an array, and I cannot see what the point would be, because the fields are mainly pointers that would become useless as the interpreter will not update anything.
msg402434 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-09-22 13:22
Also, I checked the DWARF tree of all existing wheels for 3.10 on PyPI (there aren't many) and none had anything that uses the size of the struct.
msg402481 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-23 09:04
I created PR 28527 to document PyThreadState.use_tracing removal and explain how to port existing code to Python 3.10.
msg402496 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-23 14:38
New changeset f4ccb79d52ee726d58bbb038ea98b4deec52001d by Victor Stinner in branch 'main':
bpo-43760: Document PyThreadState.use_tracing removal (GH-28527)
https://github.com/python/cpython/commit/f4ccb79d52ee726d58bbb038ea98b4deec52001d
msg402525 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-09-23 20:40
New changeset 55576893b31452ba739e01189424cd62cf11ed20 by Miss Islington (bot) in branch '3.10':
bpo-43760: Document PyThreadState.use_tracing removal (GH-28527) (GH-28529)
https://github.com/python/cpython/commit/55576893b31452ba739e01189424cd62cf11ed20
msg402594 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-24 23:12
Analysis of use_tracing usage in 3rd party code.

I see two main ways to add C API functions covering these use cases:

* Provide high-level functions like "call a trace function" (disable tracing, call trace function, reenable tracing, increment/decrement tstate->tracing)
* Provide low-level functions just to control use_tracing: make the PyThreadState structure opaque, but still make the assumption that it is possible to temporarily disable tracing and profiling (in practice, it's implemented as use_tracing=0).



(*) greenlet

greenlet temporarily disables tracing in g_calltrace(), and then restores it, to call a "tracing" function:
---
    tstate->tracing++;
    TSTATE_USE_TRACING(tstate) = 0;
    retval = PyObject_CallFunction(tracefunc, "O(OO)", event, origin, target);
    tstate->tracing--;
    TSTATE_USE_TRACING(tstate) =
        (tstate->tracing <= 0 &&
         ((tstate->c_tracefunc != NULL) || (tstate->c_profilefunc != NULL)));
---

It also saves and then restores use_tracing value:
---
ts__g_switchstack_use_tracing = tstate->cframe->use_tracing;
(...)
tstate->cframe->use_tracing = ts__g_switchstack_use_tracing;
---

=> it can use PyThreadState_IsTracing(), PyThreadState_DisableTracing() and PyThreadState_ResetTracing().

These functions don't handle "tstate->tracing++;" and "tstate->tracing--;" which is also used by greenlet.

greenlet also saves and restores tstate->cframe:
https://github.com/python-greenlet/greenlet/blob/master/src/greenlet/greenlet.c
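A sketch of the "high-level" alternative for this pattern: a pair of helpers that keep the `tracing` counter and the `use_tracing` flag in step, wrapped around a hook call. The recomputation follows the expression in the greenlet snippet above; every name here (`mock_tstate`, `mock_enter_tracing`, `mock_leave_tracing`, `mock_call_hook`, `mock_probe`) is hypothetical and the structs model only the fields involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the fields used by the greenlet snippet. */
typedef int (*mock_hook)(void *arg);
typedef struct { int use_tracing; } mock_cframe;
typedef struct {
    mock_cframe *cframe;
    int tracing;            /* nesting depth of active trace calls */
    mock_hook c_tracefunc;
    mock_hook c_profilefunc;
} mock_tstate;

/* Suspend tracing: bump the depth counter and clear the flag. */
void mock_enter_tracing(mock_tstate *ts)
{
    ts->tracing++;
    ts->cframe->use_tracing = 0;
}

/* Resume tracing: drop the depth counter and recompute the flag. */
void mock_leave_tracing(mock_tstate *ts)
{
    ts->tracing--;
    ts->cframe->use_tracing =
        (ts->tracing <= 0 &&
         (ts->c_tracefunc != NULL || ts->c_profilefunc != NULL));
}

/* Call `hook` with tracing suspended, restoring state afterwards. */
int mock_call_hook(mock_tstate *ts, mock_hook hook, void *arg)
{
    mock_enter_tracing(ts);
    int result = hook(arg);
    mock_leave_tracing(ts);
    return result;
}

/* Demo hook: encodes the flag and depth it sees as flag * 10 + depth. */
int mock_probe(void *arg)
{
    mock_tstate *ts = arg;
    return ts->cframe->use_tracing * 10 + ts->tracing;
}
```

Pairing the two updates in one helper removes the chance of changing the flag without the counter, which is the gap noted for the low-level functions.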


(*) dipy

Code generated by Cython.


(*) smartcols

Code generated by Cython.


(*) yappi

yappi is a Python profiler.

yappi sets use_tracing to 1 when it sets its profile function: "ts->c_profilefunc = _yapp_callback;".

It sets use_tracing to 0 when it clears the profile function: "ts->c_profilefunc = NULL;". That's wrong, it ignores the trace function.

PyEval_SetProfile() cannot be used because yappi works on a PyThreadState (ts).

Code: https://github.com/sumerc/yappi/blob/master/yappi/_yappi.c

It can use PyThreadState_DisableTracing() and PyThreadState_ResetTracing(). Maybe a PyThreadState_SetProfile(tstate, func) function would fit better yappi's use case.
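The problem with tying the flag to the profile function alone can be shown on a small model. This is a sketch with hypothetical names (`mock_tstate`, `mock_set_profile_naive`, `mock_set_profile_recompute`, `mock_noop`); it is neither yappi's code nor a CPython API, and it keeps the flag directly on the mock thread state for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one flag and the two hooks that should drive it. */
typedef int (*mock_hook)(void *arg);
typedef struct {
    int use_tracing;
    mock_hook c_tracefunc;
    mock_hook c_profilefunc;
} mock_tstate;

/* Pattern described above: the flag follows the profile hook only. */
void mock_set_profile_naive(mock_tstate *ts, mock_hook func)
{
    ts->c_profilefunc = func;
    ts->use_tracing = (func != NULL);
}

/* Alternative: recompute the flag from both hooks after the change. */
void mock_set_profile_recompute(mock_tstate *ts, mock_hook func)
{
    ts->c_profilefunc = func;
    ts->use_tracing =
        (ts->c_tracefunc != NULL || ts->c_profilefunc != NULL);
}

/* A do-nothing hook, only for demonstration. */
int mock_noop(void *arg)
{
    (void)arg;
    return 0;
}
```

With a trace function still installed, clearing the profile function through the naive version switches tracing off entirely, while the recomputing version leaves it on.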


(*) Cython

Cython defines 2 compatibility functions:

* __Pyx_IsTracing(tstate, check_tracing, check_funcs): it can check c_profilefunc and c_tracefunc
* __Pyx_SetTracing(tstate, enable)

Code: https://github.com/cython/cython/blob/0.29.x/Cython/Utility/Profile.c

The code is quite complicated. In short, it checks if tracing and/or profiling is enabled. If it's enabled, it disables temporarily tracing (use_tracing=0) while calling trace and profile functions.

=> it requires PyThreadState_IsTracing(), PyThreadState_DisableTracing() and PyThreadState_ResetTracing().
msg402603 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-09-25 04:00
Ah, I think the docs need to be clarified a bit. Here's what I was missing:

The key thing to know here is that there are *three* state variables: `c_tracefunc`, `c_profilefunc` (on the thread state), and `use_tracing` (on the C frame).

Normally `use_tracing` is initialized to false if both functions are NULL, and true otherwise (if at least one of the functions is set).

*Disabling* means setting `use_tracing` to false regardless. *Resetting* means setting `use_tracing` to the value computed above.

There's also a fourth variable, `tstate->tracing`, which indicates whether a tracing function is active (i.e., it has been called and hasn't exited yet). This can be incremented and decremented. But none of the proposed APIs affect it.
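
To make the relationship between the four variables concrete, here is a minimal model of it. This is a sketch under the assumptions stated in this message only; ModelState and the model_* functions are hypothetical names, and in CPython the fields are split between the thread state and the C frame rather than living in one struct.

```c
#include <stddef.h>

typedef void (*model_func)(void);

typedef struct {
    model_func c_tracefunc;    /* on the thread state */
    model_func c_profilefunc;  /* on the thread state */
    int use_tracing;           /* on the C frame */
    int tracing;               /* > 0 while a tracing function is active */
} ModelState;

/* The "computed" value: true if at least one function is set. */
static int model_computed_flag(const ModelState *s)
{
    return s->c_tracefunc != NULL || s->c_profilefunc != NULL;
}

/* Disabling: force the flag to false regardless of the functions. */
static void model_disable(ModelState *s)
{
    s->use_tracing = 0;
}

/* Resetting: restore the flag to the computed value. */
static void model_reset(ModelState *s)
{
    s->use_tracing = model_computed_flag(s);
}

/* Placeholder used only to have a non-NULL function pointer. */
static void model_dummy(void)
{
}
```

Neither model_disable() nor model_reset() touches the tracing counter, which matches the statement that none of the proposed APIs affect it.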

Would it be reasonable to just put these APIs in pythoncapi_compat, instead of in the stdlib? (It would be yet one more selling point for people to start using that. :-)
msg403166 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-10-04 19:18
New changeset 78184fa6b0e6129203673e425718e08f5edb6e2e by Pablo Galindo (Miss Islington (bot)) in branch '3.10':
bpo-43760: Document PyThreadState.use_tracing removal (GH-28527) (GH-28529)
https://github.com/python/cpython/commit/78184fa6b0e6129203673e425718e08f5edb6e2e
msg403217 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-10-05 10:01
New changeset bd627eb7ed08a891dd1356756feb1ce2600358e4 by Mark Shannon in branch 'main':
bpo-43760: Check for tracing using 'bitwise or' instead of branch in dispatch. (GH-28723)
https://github.com/python/cpython/commit/bd627eb7ed08a891dd1356756feb1ce2600358e4
msg404020 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-10-15 14:06
New changeset 547d26aa08aa5e4ec6e4f8a5587b30b39064a5ba by Victor Stinner in branch 'main':
bpo-43760: Add PyThreadState_EnterTracing() (GH-28542)
https://github.com/python/cpython/commit/547d26aa08aa5e4ec6e4f8a5587b30b39064a5ba
msg404024 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-10-15 14:31
> bpo-43760: Add PyThreadState_EnterTracing() (GH-28542)

I created changes to use it:

* pythoncapi_compat: https://github.com/pythoncapi/pythoncapi_compat/commit/10fde24739cab4547e9c27c31c8804a25e23e8a0
* Cython: https://github.com/cython/cython/pull/4411
* greenlet: https://github.com/python-greenlet/greenlet/pull/267
msg404025 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-10-15 14:38
The PyThreadState.cframe.use_tracing format changed again: the value is now set to 0 or 255.
https://github.com/python/cpython/commit/bd627eb7ed08a891dd1356756feb1ce2600358e4
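
A minimal sketch of why a 0-or-255 flag allows a 'bitwise or' instead of a branch. It assumes that opcodes fit in one byte and that index 255 is reserved for a tracing handler; the names (MODEL_DO_TRACING, model_dispatch_index) are illustrative and are not taken from the commit.

```c
#include <stdint.h>

/* Assumed: dispatch slot reserved for the tracing handler. */
enum { MODEL_DO_TRACING = 255 };

/* use_tracing must be 0 or 255. OR-ing it into the opcode leaves the
 * opcode unchanged when it is 0 and turns every opcode into 255 when
 * it is 255, so no conditional jump is needed. */
static uint8_t model_dispatch_index(uint8_t opcode, uint8_t use_tracing)
{
    return (uint8_t)(opcode | use_tracing);
}

/* Branching equivalent, shown only for comparison. */
static uint8_t model_dispatch_index_branch(uint8_t opcode, uint8_t use_tracing)
{
    return use_tracing ? MODEL_DO_TRACING : opcode;
}
```

Both functions return the same index for every opcode as long as the flag is exactly 0 or 255, which is why a value of 1 would no longer work with this scheme.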
msg404173 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-10-18 10:20
> * Cython: https://github.com/cython/cython/pull/4411

Merged:

* 0.29.x: https://github.com/cython/cython/commit/cbddad23e30ea6d31e0178a4c623f1f6d75452c3
* master: https://github.com/cython/cython/commit/4df1103bd30143ce022b07f98a2f62678d417e92
msg404197 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-10-18 16:40
New changeset 034f607906de6e375cc9a98cc3b09f6d56f8be10 by Victor Stinner in branch 'main':
bpo-43760: Rename _PyThreadState_DisableTracing() (GH-29032)
https://github.com/python/cpython/commit/034f607906de6e375cc9a98cc3b09f6d56f8be10
msg404600 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-10-21 13:23
I created https://github.com/python/cpython/pull/29121 to add PyThreadState_SetProfile() and PyThreadState_SetTrace() functions.
msg405968 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-11-08 17:16
greenlet now uses PyThreadState_EnterTracing() and PyThreadState_LeaveTracing() rather than accessing use_tracing directly:

https://github.com/python-greenlet/greenlet/commit/9b49da5c7e4808bd61b992e40f5b5243bfa9be6f

On Python 3.10, it implements these functions with:
---

// bpo-43760 added PyThreadState_EnterTracing() to Python 3.11.0a2
#if PY_VERSION_HEX < 0x030B00A2 && !defined(PYPY_VERSION)
static inline void PyThreadState_EnterTracing(PyThreadState *tstate)
{
    tstate->tracing++;
#if PY_VERSION_HEX >= 0x030A00A1
    tstate->cframe->use_tracing = 0;
#else
    tstate->use_tracing = 0;
#endif
}
#endif

// bpo-43760 added PyThreadState_LeaveTracing() to Python 3.11.0a2
#if PY_VERSION_HEX < 0x030B00A2 && !defined(PYPY_VERSION)
static inline void PyThreadState_LeaveTracing(PyThreadState *tstate)
{
    tstate->tracing--;
    int use_tracing = (tstate->c_tracefunc != NULL
                       || tstate->c_profilefunc != NULL);
#if PY_VERSION_HEX >= 0x030A00A1
    tstate->cframe->use_tracing = use_tracing;
#else
    tstate->use_tracing = use_tracing;
#endif
}
#endif
---

This code was copied from my https://github.com/pythoncapi/pythoncapi_compat project. (I wrote the greenlet change.)
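
For readers unfamiliar with the PY_VERSION_HEX constants used in the guards above, here is a small decoder. It is a sketch that assumes the layout documented for PY_VERSION_HEX (8 bits each for major, minor and micro, then 4 bits for the release level and 4 bits for the serial); ModelVersion and model_decode_version_hex are hypothetical helper names, not part of the C API.

```c
#include <stdint.h>

typedef struct {
    unsigned major;
    unsigned minor;
    unsigned micro;
    unsigned level;   /* 0xA alpha, 0xB beta, 0xC release candidate, 0xF final */
    unsigned serial;
} ModelVersion;

/* Splits a PY_VERSION_HEX-style value into its five fields. */
static ModelVersion model_decode_version_hex(uint32_t hex)
{
    ModelVersion v;
    v.major  = (hex >> 24) & 0xFF;
    v.minor  = (hex >> 16) & 0xFF;
    v.micro  = (hex >> 8) & 0xFF;
    v.level  = (hex >> 4) & 0xF;
    v.serial = hex & 0xF;
    return v;
}
```

Decoding 0x030B00A2 gives 3.11.0 alpha 2, and 0x030A00A1 gives 3.10.0 alpha 1, so the fallback definitions are compiled only for versions before 3.11.0a2, and cframe->use_tracing is used from 3.10.0a1 onwards.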
History
Date User Action Args
2022-04-11 14:59:43 admin set github: 87926
2021-11-08 17:16:37 vstinner set messages: + msg405968
2021-10-21 13:23:06 vstinner set messages: + msg404600
2021-10-18 16:40:51 vstinner set messages: + msg404197
2021-10-18 14:43:05 vstinner set pull_requests: + pull_request27304
2021-10-18 10:20:51 vstinner set messages: + msg404173
2021-10-15 14:38:14 vstinner set messages: + msg404025
2021-10-15 14:31:49 vstinner set messages: + msg404024
2021-10-15 14:06:39 vstinner set messages: + msg404020
2021-10-05 10:01:16 Mark.Shannon set messages: + msg403217
2021-10-04 19:18:43 pablogsal set messages: + msg403166
2021-10-04 14:48:05 Mark.Shannon set pull_requests: + pull_request27070
2021-09-25 04:00:35 gvanrossum set messages: + msg402603
2021-09-24 23:12:48 vstinner set messages: + msg402594
2021-09-24 09:27:01 vstinner set pull_requests: + pull_request26926
2021-09-23 20:40:19 lukasz.langa set nosy: + lukasz.langa
    messages: + msg402525
2021-09-23 14:38:51 miss-islington set nosy: + miss-islington
    pull_requests: + pull_request26919
2021-09-23 14:38:38 vstinner set messages: + msg402496
2021-09-23 09:04:17 vstinner set messages: + msg402481
2021-09-23 09:02:49 vstinner set pull_requests: + pull_request26918
2021-09-22 13:22:30 pablogsal set messages: + msg402434
2021-09-22 13:19:57 pablogsal set messages: + msg402432
2021-09-22 12:24:33 petr.viktorin set messages: + msg402426
2021-09-22 10:06:31 pablogsal set priority: release blocker ->
    messages: + msg402416
2021-09-22 10:05:13 pablogsal set messages: + msg402415
2021-09-21 20:39:08 pablogsal set messages: + msg402362
2021-09-21 20:35:38 pablogsal set messages: + msg402360
2021-09-21 20:30:27 pablogsal set messages: + msg402358
2021-09-21 17:50:26 pablogsal set pull_requests: + pull_request26893
2021-09-20 19:59:48 gvanrossum set nosy: + gvanrossum
2021-09-20 14:38:01 hroncok set messages: + msg402235
2021-09-20 14:36:31 jpe set nosy: + jpe
    messages: + msg402234
2021-09-20 13:29:25 scoder set messages: + msg402231
2021-09-20 12:13:48 Xtrem532 set nosy: + Xtrem532
2021-09-20 11:53:48 Mark.Shannon set pull_requests: + pull_request26872
2021-09-20 11:05:13 Mark.Shannon set messages: + msg402225
2021-09-20 09:22:33 erlendaasland set nosy: + erlendaasland
2021-09-20 09:03:42 vstinner set messages: + msg402216
2021-09-19 19:10:40 hroncok set messages: + msg402162
2021-09-19 10:04:44 vstinner set priority: normal -> release blocker
    nosy: + pablogsal
    messages: + msg402150
2021-05-14 16:04:07 scoder set messages: - msg393667
2021-05-14 15:09:02 scoder set messages: + msg393667
2021-05-14 15:00:49 scoder set type: performance
    messages: + msg393666
2021-05-11 18:16:10 scoder set nosy: + scoder
    messages: + msg393466
2021-05-11 14:04:57 hroncok set messages: + msg393459
2021-05-10 18:07:10 rhettinger set messages: + msg393425
2021-05-10 15:44:19 vstinner set components: + Interpreter Core, C API
2021-05-10 15:44:12 vstinner set title: The DISPATCH() macro is not as efficient as it could be. -> The DISPATCH() macro is not as efficient as it could be (move PyThreadState.use_tracing)
    versions: + Python 3.10
2021-05-10 15:43:58 vstinner set messages: + msg393410
2021-05-10 15:06:27 rhettinger set nosy: + rhettinger
    messages: + msg393407
2021-05-10 14:57:43 Mark.Shannon set messages: + msg393404
2021-05-10 14:43:12 petr.viktorin set messages: + msg393403
2021-05-10 14:21:32 hroncok set messages: + msg393401
2021-05-10 13:33:12 hroncok set messages: + msg393390
2021-05-10 13:28:14 Mark.Shannon set messages: + msg393389
2021-05-10 13:06:45 Mark.Shannon set messages: + msg393385
2021-05-10 13:03:30 hroncok set messages: + msg393384
2021-05-10 11:48:57 hroncok set nosy: + vstinner, petr.viktorin, hroncok
    messages: + msg393379
2021-04-13 10:08:31 Mark.Shannon set messages: + msg390951
2021-04-08 11:14:23 Mark.Shannon set pull_requests: + pull_request24013
2021-04-08 10:22:59 Mark.Shannon set messages: + msg390522
2021-04-07 10:16:34 Mark.Shannon set keywords: + patch
    stage: patch review
    pull_requests: + pull_request23984
2021-04-07 09:35:49 Mark.Shannon create