classification
Title: Add _PyEval_SetTrace(tstate, func, arg) function
Type: enhancement Stage: patch review
Components: Interpreter Core Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: fabioz, vstinner
Priority: normal Keywords: patch

Created on 2018-12-01 10:40 by fabioz, last changed 2020-03-16 19:01 by fabioz.

Pull Requests
URL Status Linked Edit
PR 18975 merged vstinner, 2020-03-13 13:39
PR 18977 merged vstinner, 2020-03-13 15:44
PR 19029 merged vstinner, 2020-03-16 16:45
Messages (21)
msg330849 - (view) Author: Fabio Zadrozny (fabioz) Date: 2018-12-01 10:40
Right now it's hard for debuggers to set the tracing function to be used for running threads.

This would be really handy for debuggers when attaching to a running program to debug all threads.

-- Note: currently there is a way to achieve that by pausing all the threads then selectively switching to a thread to make it current and setting the tracing function using the C-API (see: https://github.com/fabioz/PyDev.Debugger/blob/master/pydevd_attach_to_process/dll/attach.cpp#L1224), but I believe this is very hacky and not portable to other Python implementations.
msg364081 - (view) Author: Fabio Zadrozny (fabioz) Date: 2020-03-13 11:42
As a note, the workaround is now in https://github.com/fabioz/PyDev.Debugger/blob/pydev_debugger_1_9_0/pydevd_attach_to_process/common/py_settrace_37.hpp#L150
msg364082 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-13 12:05
Can't you use PyEval_SetTrace()?
https://docs.python.org/dev/c-api/init.html#c.PyEval_SetTrace
msg364085 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-13 12:14
> Note: currently there is a way to achieve that by pausing all the threads then selectively switching to a thread to make it current and setting the tracing function using the C-API (see: https://github.com/fabioz/PyDev.Debugger/blob/master/pydevd_attach_to_process/dll/attach.cpp#L1224), but I believe this is very hacky and not portable to other Python implementations.

I'm not sure that I understand your need. Do you need a variant of PyEval_SetTrace() which accepts a tstate argument?
msg364086 - (view) Author: Fabio Zadrozny (fabioz) Date: 2020-03-13 12:19
I'd like to, but it only sets the tracing to the currently running thread (the request is for setting the tracing for other threads).

Just for info, the problem I'm solving is that the debugger is multi-threaded, but the user can start the code without any tracing in place, and then, when it gets to an attach to process or a programmatic attach to the debugger, the debugger needs to set the tracing to threads that are already running (and any new thread created afterward) and Python doesn't really have any structure for that in place (so, I'm using the C-API as a workaround to do what PyEval_SetTrace does but targeting any thread).
msg364087 - (view) Author: Fabio Zadrozny (fabioz) Date: 2020-03-13 12:20
>> Note: currently there is a way to achieve that by pausing all the threads then selectively switching to a thread to make it current and setting the tracing function using the C-API (see: https://github.com/fabioz/PyDev.Debugger/blob/master/pydevd_attach_to_process/dll/attach.cpp#L1224), but I believe this is very hacky and not portable to other Python implementations.

> I'm not sure that I understand your need. Do you need a variant of PyEval_SetTrace() which accepts a tstate argument?

Yes
msg364088 - (view) Author: Fabio Zadrozny (fabioz) Date: 2020-03-13 12:21
Maybe better would be the thread id so that the tstate structure is not needed there.
msg364092 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-13 13:42
Attached PR 18975 adds _PyEval_SetTrace(tstate, func, arg) function. It requires that the caller holds the GIL. The function calls PySys_Audit() in the context of the current thread state (not in "tstate").
msg364093 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-13 13:46
> Maybe better would be the thread id so that the tstate structure is not needed there.

PyInterpreterState.tstate_head allows to iterate on all thread states, but I'm not aware of an API to get a PyThreadState from its identifier (tstate->id).

--

See also bpo-1021318: "PyThreadState_Next not thread safe".
msg364097 - (view) Author: Fabio Zadrozny (fabioz) Date: 2020-03-13 14:04
I'm iterating on all threads and getting its thread id to find out the thread state (in my use case) and then doing what you just did there...

So, while this solution does work for me, if the idea is making tstate opaque, then having (an optional) thread id in settrace (which iterates to find the proper thread if given) could solve it without having to rely on any CPython internals on my side (although it should probably return a bool to say if it did work then).
msg364098 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-13 15:39
New changeset 309d7cc5df4e2bf3086c49eb2b1b56b929554500 by Victor Stinner in branch 'master':
bpo-35370: Add _PyEval_SetTrace() function (GH-18975)
https://github.com/python/cpython/commit/309d7cc5df4e2bf3086c49eb2b1b56b929554500
msg364100 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-13 15:40
PyEval_SetTrace() and PyEval_SetProfile() function have no return value but can raise an exception. I'm not comfortable about these functions. Maybe one option would be to call PyErr_WriteUnraisable() on PySys_Audit() error. It might be better than "leaking" an exception which is unexpected to PyEval_SetTrace() and PyEval_SetProfile() callers.
msg364101 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-13 15:48
> PyEval_SetTrace() and PyEval_SetProfile() function have no return value but can raise an exception. I'm not comfortable about these functions. Maybe one option would be to call PyErr_WriteUnraisable() on PySys_Audit() error. It might be better than "leaking" an exception which is unexpected to PyEval_SetTrace() and PyEval_SetProfile() callers.

I wrote PR 18977 to implement this idea.
msg364102 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-13 15:50
Fabio: I added _PyEval_SetProfile() and _PyEval_SetTrace() which take a tstate parameter. These functions have a constraint: the caller must hold the GIL. Is it an acceptable constraint for you?

That's not something new, it's already the code in Python 3.8, it's just that it wasn't documented. In Python 3.7, it was less important: Python 3.8 added a call to PySys_Audit() which can execute arbitrary Python code.

Anyway, touching Python internals without holding the GIL is risky: see bpo-1021318 for example.
msg364109 - (view) Author: Fabio Zadrozny (fabioz) Date: 2020-03-13 16:36
Holding the GIL is a reasonable constraint.

As a note, the original request was for a Python-level tracing function (so that in the future other Python implementations also provide that function) -- does this need a PEP?
msg364337 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-16 16:41
New changeset f6a58507820c67e8d0fb07875cd1b1d9f5e510a8 by Victor Stinner in branch 'master':
bpo-35370: PyEval_SetTrace() logs unraisable error (GH-18977)
https://github.com/python/cpython/commit/f6a58507820c67e8d0fb07875cd1b1d9f5e510a8
msg364341 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-16 17:18
New changeset 046255c40fc0d9c5a4c528eb5955792fa08df66f by Victor Stinner in branch '3.8':
bpo-35370: PyEval_SetTrace() logs unraisable error (GH-18977) (GH-19029)
https://github.com/python/cpython/commit/046255c40fc0d9c5a4c528eb5955792fa08df66f
msg364342 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-16 17:19
> As a note, the original request was for a Python-level tracing function (so that in the future other Python implementations also provide that function) -- does this need a PEP?

What do you mean by a Python-level tracing function?
msg364344 - (view) Author: Fabio Zadrozny (fabioz) Date: 2020-03-16 17:42
>> As a note, the original request was for a Python-level tracing function (so that in the future other Python implementations also provide that function) -- does this need a PEP?

> What do you mean by a Python-level tracing function?

I mean that it's a function to be called from Python (not only from C) -- which hopefully could be adopted by other Python implementations in the long run. 

I.e.: something as adding a thread_id to sys.settrace -- sys.settrace(trace_func, thread_id=None).
msg364350 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-16 18:36
> I.e.: something as adding a thread_id to sys.settrace -- sys.settrace(trace_func, thread_id=None).

What is the use case for this feature?

It seems quite complicated to implement the thread_id for sys.settrace(trace_func, thread_id=None).

Currently, there is no way for a thread to execute code directly in another thread. In asyncio, it has to through call_soon_threadsafe() which queues the function calls and the function is called "later" and the caller doesn't get the result.

https://docs.python.org/dev/library/asyncio-eventloop.html#asyncio.loop.call_soon_threadsafe

sys.setprofile() and sys.settrace() use the current Python thread state (tstate). The threading.enumerate() function returns threading.Thread instances, but it's not currently possible to get the Python thread state (C structure PyThreadState) from a Python threading.Thread object.

At the C level, Python doesn't maintain a list of thread. There is only threading.enumerate() which is implemented in Python.

PyDev.Debugger seems to use the C API. Can it continue to use the C API?

Note: There is threading.setprofile() and threading.settrace() which set a profile/trace function when *new* threads are spawned.
msg364353 - (view) Author: Fabio Zadrozny (fabioz) Date: 2020-03-16 19:01
>> I.e.: something as adding a thread_id to sys.settrace -- sys.settrace(trace_func, thread_id=None).

> What is the use case for this feature?

The use case is having the user attach the debugger (either programmatically or by doing an attach to process) and be able to debug all threads, not just the current thread.

> It seems quite complicated to implement the thread_id for sys.settrace(trace_func, thread_id=None).

Humm, isn't it just a matter of passing the proper tstate to _PyEval_SetTrace? It seems reasonably simple to do at C (i.e.: iterate the existing thread states to get the thread id and then pass the proper tsate to _PyEval_SetTrace -- which is roughly what is done in the debugger, although it's a bit more complicated because it supports Python 2.7 up to Python 3.8...).

> At the C level, Python doesn't maintain a list of thread. There is only threading.enumerate() which is implemented in Python.

The tstate does contain the thread id, so, iterating the available tstates should be enough for that.

> PyDev.Debugger seems to use the C API. Can it continue to use the C API?

It can for CPython, but it can't for other Python implementations (and ideally I'd like to rely less on the CPython C-API -- because there's too much implementation details on it, things seem to break at each new version).

> Note: There is threading.setprofile() and threading.settrace() which set a profile/trace function when *new* threads are spawned

Yes, I know about those, but it's not enough if the user attaches the debugger after the process is already running.
History
Date User Action Args
2020-03-16 19:01:40fabiozsetmessages: + msg364353
2020-03-16 18:36:15vstinnersetmessages: + msg364350
2020-03-16 17:42:54fabiozsetmessages: + msg364344
2020-03-16 17:19:09vstinnersetmessages: + msg364342
2020-03-16 17:18:24vstinnersetmessages: + msg364341
2020-03-16 16:45:31vstinnersetpull_requests: + pull_request18378
2020-03-16 16:41:50vstinnersetmessages: + msg364337
2020-03-13 16:36:25fabiozsetmessages: + msg364109
2020-03-13 15:50:27vstinnersetmessages: + msg364102
2020-03-13 15:48:38vstinnersetmessages: + msg364101
2020-03-13 15:44:45vstinnersetpull_requests: + pull_request18325
2020-03-13 15:40:40vstinnersetmessages: + msg364100
2020-03-13 15:39:16vstinnersetmessages: + msg364098
2020-03-13 14:04:44fabiozsetmessages: + msg364097
2020-03-13 13:46:30vstinnersetmessages: + msg364093
2020-03-13 13:42:38vstinnersetmessages: + msg364092
2020-03-13 13:40:33vstinnersettitle: Provide API to set the tracing function to be used for running threads. -> Add _PyEval_SetTrace(tstate, func, arg) function
versions: + Python 3.9, - Python 3.8
2020-03-13 13:39:48vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request18324
2020-03-13 12:21:16fabiozsetmessages: + msg364088
2020-03-13 12:20:25fabiozsetmessages: + msg364087
2020-03-13 12:19:44fabiozsetmessages: + msg364086
2020-03-13 12:14:26vstinnersetmessages: + msg364085
2020-03-13 12:05:56vstinnersetnosy: + vstinner
messages: + msg364082
2020-03-13 11:42:17fabiozsetmessages: + msg364081
2018-12-01 10:40:47fabiozsettype: enhancement
2018-12-01 10:40:37fabiozcreate