
Author vstinner
Recipients seberg, vstinner
Date 2020-12-28.14:21:30
Message-id <1609165291.44.0.792855424994.issue40522@roundup.psfhosted.org>
Content
There are many ways to get the current interpreter (interp) and the current Python thread state (tstate).

Public C API, opaque function calls:

* PyInterpreterState_Get() (new in Python 3.9)
* PyThreadState_Get()

Internal C API, static inline functions:

* _PyInterpreterState_GET()
* _PyThreadState_GET()

There are so many variants that I wrote notes for myself:
https://pythondev.readthedocs.io/pystate.html

This issue is about optimizing _PyInterpreterState_GET() and _PyThreadState_GET(), which are supposed to be the most efficient implementations.

Currently, _PyInterpreterState_GET() is implemented as _PyThreadState_GET()->interp, and _PyThreadState_GET() is implemented as:

    _Py_atomic_load_relaxed(&_PyRuntime.gilstate.tstate_current)
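
The two accessors described above can be modelled outside CPython with C11 atomics. This is a simplified sketch, not CPython's source: every `mock_` type and function is hypothetical, and only the overall shape follows this message (a global runtime struct holding the current thread state, read with a relaxed atomic load, and the interpreter reached through `tstate->interp`).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for PyInterpreterState and PyThreadState. */
typedef struct mock_interp { int id; } mock_interp;
typedef struct mock_tstate { mock_interp *interp; } mock_tstate;

/* Hypothetical stand-in for _PyRuntime.gilstate. */
static struct {
    atomic_uintptr_t tstate_current;
} mock_runtime;

/* Models _PyThreadState_GET(): one relaxed atomic load of a global. */
static inline mock_tstate *
mock_thread_state_get(void)
{
    return (mock_tstate *)atomic_load_explicit(&mock_runtime.tstate_current,
                                               memory_order_relaxed);
}

/* Models _PyInterpreterState_GET(): the same load plus one dereference. */
static inline mock_interp *
mock_interp_state_get(void)
{
    return mock_thread_state_get()->interp;
}

/* Helper for experiments: publish a thread state as the current one.
 * A NULL current thread state is outside what this sketch models. */
static inline void
mock_thread_state_set(mock_tstate *tstate)
{
    assert(tstate != NULL);
    atomic_store_explicit(&mock_runtime.tstate_current, (uintptr_t)tstate,
                          memory_order_relaxed);
}
```

On x86-64 a relaxed load of a pointer-sized value compiles to a plain `mov`, so in this model the thread state costs one memory read and the interpreter costs two.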

--

To find the machine code of _PyInterpreterState_GET(), I read the assembly code of PyInterpreterState_Get() (not fully representative, since it adds a tstate==NULL test) and of PyTuple_New(), which now needs to get the current interpreter:

    static struct _Py_tuple_state *
    get_tuple_state(void)
    {
        PyInterpreterState *interp = _PyInterpreterState_GET();
        return &interp->tuple;
    }

To find the _PyThreadState_GET() machine code, I read the PyThreadState_Get() assembly code.

I looked at the x86-64 machine code generated by GCC 10.2.1 on Fedora 33 with -O3 (no LTO and no PGO, which should not be relevant here).

_PyThreadState_GET():

   mov    rax,QWORD PTR [rip+0x2292b1]   # 0x743118 <_PyRuntime+568>

_PyInterpreterState_GET():

   mov    rax,QWORD PTR [rip+0x22a7dd]   # 0x743118 <_PyRuntime+568>
   mov    rax,QWORD PTR [rax+0x10]
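
The second `mov` reads a pointer stored 0x10 bytes into the thread state. That matches a struct whose first two members are pointers and whose third member is `interp`. The sketch below checks that offset on a hypothetical `mock_tstate`; the member order (two link pointers, then the interpreter pointer) is an assumption about the layout at the time, not something stated in this message.

```c
#include <assert.h>
#include <stddef.h>

typedef struct mock_interp mock_interp;   /* hypothetical, opaque */

/* Hypothetical layout: two link pointers, then the interpreter pointer. */
typedef struct mock_tstate {
    struct mock_tstate *prev;
    struct mock_tstate *next;
    mock_interp *interp;
} mock_tstate;

/* Compile-time check: interp sits right after two pointers
 * (0x10 bytes on a platform with 8-byte pointers such as x86-64). */
static_assert(offsetof(mock_tstate, interp) == 2 * sizeof(void *),
              "interp is expected to follow two pointer-sized members");

/* Byte offset that a load of tstate->interp adds to the base pointer. */
static size_t
mock_interp_offset(void)
{
    return offsetof(mock_tstate, interp);
}
```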

By default, Python is built with -fPIC: the _PyRuntime variable does not have a fixed address.

    $ objdump -t ./python|grep '\<_PyRuntime\>'
    0000000000742ee0 g     O .bss   00000000000002a0              _PyRuntime

The "[rip+0x2292b1] # 0x743118 <_PyRuntime+568>" indirection is needed by PIC.
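
The addresses in the listings can be cross-checked with a little arithmetic. A RIP-relative operand is resolved as the address of the next instruction plus the 32-bit displacement. The sketch below uses only numbers copied from the objdump and disassembly output above; the helper names are hypothetical, and the instruction addresses it recovers are derived values, not ones shown in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the objdump and disassembly output above. */
#define PYRUNTIME_ADDR  UINT64_C(0x742ee0)  /* symbol address of _PyRuntime */
#define PYRUNTIME_SIZE  UINT64_C(0x2a0)     /* symbol size (672 bytes) */
#define FIELD_OFFSET    UINT64_C(568)       /* from <_PyRuntime+568> */
#define FIELD_ADDR      UINT64_C(0x743118)  /* address in the asm comment */

/* RIP-relative operand: target = address of next instruction + disp32. */
static uint64_t
rip_relative_target(uint64_t next_insn_addr, int32_t disp)
{
    return next_insn_addr + (uint64_t)(int64_t)disp;
}

/* Inverse: recover the next-instruction address from target and disp32. */
static uint64_t
next_insn_addr_for(uint64_t target, int32_t disp)
{
    return target - (uint64_t)(int64_t)disp;
}

/* The loaded field lies inside _PyRuntime at the offset objdump implies. */
static void
check_listing(void)
{
    assert(PYRUNTIME_ADDR + FIELD_OFFSET == FIELD_ADDR);
    assert(FIELD_OFFSET < PYRUNTIME_SIZE);
}
```

Both displacements (0x2292b1 and 0x22a7dd) resolve to the same target, 0x743118, because the two loads sit at different places in the text segment: `next_insn_addr_for()` puts them at 0x519e67 and 0x51893b respectively.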