classification
Title: Expose `_PyRuntime` through a section name
Type: enhancement Stage: patch review
Components: Interpreter Core, macOS, Windows Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Maxime Belanger, eric.snow, ncoghlan, ned.deily, paul.moore, ronaldoussoren, steve.dower, tim.golden, vstinner, zach.ware
Priority: normal Keywords: patch

Created on 2017-12-11 23:24 by Maxime Belanger, last changed 2019-09-04 01:46 by Maxime Belanger.

Pull Requests
URL Status Linked Edit
PR 4802 open maxbelanger, 2017-12-11 23:40
Messages (5)
msg308079 - (view) Author: Maxime Belanger (Maxime Belanger) Date: 2017-12-11 23:24
We've recently been toying with more sophisticated crash reporting machinery for our Desktop Python application (which we deploy on macOS, Windows and Linux).

To assist in debugging, we are storing `_PyRuntime` in a predictable location our crash reporter can later look up without needing the symbol table (this can grow complicated across multiple platforms).

Taking a page from `crashpad`'s book (https://chromium.googlesource.com/crashpad/crashpad/+/master/client/crashpad_info.cc), we've patched `pylifecycle.c` to store the `_PyRuntime` struct in a section of the same name. After a crash, the tool reads this section to annotate each report with enough information to reconstruct the Python stack frames in each thread (as applicable).
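
As an illustration of the general technique (not the actual patch in PR 4802), here is a minimal sketch of placing a global in a named section with the common toolchains. The struct `demo_runtime_info`, the section name `DemoRt`, the macro `DEMO_SECTION` and the helper `demo_runtime_address` are all hypothetical names chosen for this example; only the compiler attributes and pragmas are real.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real _PyRuntime struct. */
typedef struct {
    uint32_t magic;    /* arbitrary marker value for this example */
    uint32_t version;  /* layout version an external tool could check */
    void *payload;     /* placeholder for the actual runtime state */
} demo_runtime_info;

#if defined(_MSC_VER)
/* MSVC: declare the section first, then allocate the variable in it. */
#pragma section("DemoRt", read, write)
#define DEMO_SECTION __declspec(allocate("DemoRt"))
#elif defined(__APPLE__)
/* Mach-O expects "segment,section". */
#define DEMO_SECTION __attribute__((section("__DATA,DemoRt"), used))
#else
/* ELF targets (e.g. Linux) with GCC or Clang. */
#define DEMO_SECTION __attribute__((section("DemoRt"), used))
#endif

/* The variable now lives in a section that a tool can find by name,
   without consulting the symbol table. */
DEMO_SECTION demo_runtime_info demo_runtime = { 0x50795254u, 1u, NULL };

/* In-process accessor, included only so the example can be exercised. */
demo_runtime_info *demo_runtime_address(void) {
    return &demo_runtime;
}
```

An out-of-process tool would locate the section by parsing the module's headers (ELF, Mach-O or PE) rather than by calling a function like `demo_runtime_address`.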

We're contributing our patch here in the hopes this can be helpful to others.
msg308149 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2017-12-12 18:42
Note that in the long term we are considering support for embedding multiple runtimes in a single process.  So anything that assumes there is only a single runtime in each process is problematic.
msg308154 - (view) Author: Maxime Belanger (Maxime Belanger) Date: 2017-12-12 21:55
Interesting, would this imply potentially multiple GILs? The major thing we need out of the structure is the (`Py_tss_t`) `autoTSSKey` in order to associate a native thread with its `PyThreadState`.
msg345616 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-06-14 17:29
(Sorry for the delay!)

Is the need for a named section due to the fact that _PyRuntime is part of the "internal" C-API (hidden behind the Py_BUILD_CORE macro)? I.e., would it help to build with Py_BUILD_CORE?


Regarding the GIL:

The plan for 3.9 is to have one GIL per interpreter, so the relevant struct would be PyInterpreterState (a part of the public C-API).  That would be accessible at runtime via the fields of _PyRuntime.interpreters.


Regarding what I said before about multiple runtimes per process:

Honestly, I've yet to see any use case that would justify providing multi-runtime support.  However, there are code-health benefits to keeping *all* runtime state out of static globals.  So eliminating the _PyRuntime static variable is still a realistic possibility (for 3.9 or maybe later).

If that happens, there would have to be a new embedding C-API oriented around (opaque) PyRuntimeState pointers. See PEPs 432 and 587 for ongoing work to improve the embedding experience (for runtime initialization), including what the successors to Py_Initialize() look like.
msg351108 - (view) Author: Maxime Belanger (Maxime Belanger) Date: 2019-09-04 01:46
Thanks for taking a look! To answer your question, the need for the named section comes not from the API being "internal", but from the fact that we need to access it at runtime from a tool running in a separate process.

We have augmented the Crashpad tool (Google's crash reporter), which inspects the memory of a crashing process to create and upload a report, so that it can access each native thread's stack and reconstruct its Python stack frames. This allows us to quickly make sense of native crashes involving Python code.

To do this, the tool needs to know where the Python runtime stores state within thread-local storage. This is actually the only reason we need to access `_PyRuntime`: it allows us to retrieve `autoTSSKey` for the process, which we can use to look up the `PyThreadState` for each underlying/native thread.
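
To make the role of the key concrete, here is a small in-process model using POSIX threads: one process-wide key, with each native thread storing a pointer to its own state under it. This is only a sketch of the idea. `demo_thread_state`, `demo_auto_key` and the `demo_*` functions are hypothetical stand-ins for `PyThreadState` and `autoTSSKey`, and a real crash reporter would instead read the equivalent thread-local slots out of the crashed process's memory, which is platform-specific.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for PyThreadState. */
typedef struct {
    int thread_index;
} demo_thread_state;

/* One key per process; plays the role of autoTSSKey in this model. */
static pthread_key_t demo_auto_key;
static pthread_once_t demo_key_once = PTHREAD_ONCE_INIT;

static void demo_make_key(void) {
    pthread_key_create(&demo_auto_key, NULL);
}

/* Associate the calling native thread with its state. */
void demo_bind_state(demo_thread_state *ts) {
    pthread_once(&demo_key_once, demo_make_key);
    pthread_setspecific(demo_auto_key, ts);
}

/* Look up the state of the calling native thread (NULL if never bound). */
demo_thread_state *demo_current_state(void) {
    pthread_once(&demo_key_once, demo_make_key);
    return pthread_getspecific(demo_auto_key);
}

/* Thread body: bind the given state, then read it back through the key. */
static void *demo_worker(void *arg) {
    demo_bind_state(arg);
    return demo_current_state();
}

/* Returns 1 if each thread saw only its own state under the shared key. */
int demo_check_two_threads(void) {
    demo_thread_state a = { 1 }, b = { 2 };
    pthread_t ta, tb;
    void *ra = NULL, *rb = NULL;

    if (pthread_create(&ta, NULL, demo_worker, &a) != 0) {
        return 0;
    }
    if (pthread_create(&tb, NULL, demo_worker, &b) != 0) {
        pthread_join(ta, NULL);
        return 0;
    }
    pthread_join(ta, &ra);
    pthread_join(tb, &rb);

    /* The calling thread never bound a state, so its slot stays NULL. */
    return ra == &a && rb == &b && demo_current_state() == NULL;
}
```

The key itself is shared by every thread; only the value stored under it differs per thread, which is why knowing the key is enough to find each thread's state.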

At the time, `_PyRuntime` seemed like a natural structure to expose and a named section a simple-enough way of doing so. We're certainly open to alternatives if you think there's a better way.

Given your plans for 3.9: I'm assuming `autoTSSKey` will remain a single per-process key?
History
Date User Action Args
2019-09-04 01:46:26 Maxime Belanger set messages: + msg351108
2019-06-14 17:29:23 eric.snow set messages: + msg345616
versions: + Python 3.9, - Python 3.8
2019-04-05 21:37:50 eric.snow set versions: + Python 3.8, - Python 3.7
2017-12-12 21:58:13 vstinner set nosy: + vstinner
2017-12-12 21:55:54 Maxime Belanger set messages: + msg308154
2017-12-12 18:42:09 eric.snow set messages: + msg308149
2017-12-11 23:40:44 maxbelanger set keywords: + patch
stage: patch review
pull_requests: + pull_request4700
2017-12-11 23:35:25 steve.dower set nosy: + ncoghlan
2017-12-11 23:35:16 steve.dower set nosy: + eric.snow
2017-12-11 23:24:41 Maxime Belanger create