classification
Title: Add instruction level tracing via sys.settrace
Type: enhancement Stage: test needed
Components: Interpreter Core Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: ammar2, gwk, haypo, xdegaye
Priority: normal Keywords: patch

Created on 2017-01-31 18:20 by gwk, last changed 2017-08-29 15:49 by gwk.

Files
File name Uploaded Description Edit
inst-tracing.diff gwk, 2017-02-01 04:36
inst-tracing-2.diff haypo, 2017-02-01 21:51 review
inst-trace-demo.py gwk, 2017-02-01 23:29
settraceinst.diff gwk, 2017-02-06 17:19 review
settracestate-demo.py gwk, 2017-08-29 15:48
Pull Requests
URL Status Linked Edit
PR 3236 open gwk, 2017-08-29 15:49
Messages (20)
msg286546 - (view) Author: George King (gwk) * Date: 2017-01-31 18:20
I have recently put some effort into developing a code coverage tool that shows correct results for intraline branches. In order to get intraline trace data, I patched CPython, adding an optional "trace instructions" flag to sys.settrace. The patch is fairly simple.

http://github.com/gwk/cpython/tree/inst-tracing

The crucial detail in ceval is here:

@@ -4521,6 +4523,9 @@ maybe_call_line_trace(Py_tracefunc func, PyObject *obj,
         frame->f_lineno = line;
         result = call_trace(func, obj, tstate, frame, PyTrace_LINE, Py_None);
     }
+    else if (tstate->inst_tracing) {
+        result = call_trace(func, obj, tstate, frame, PyTrace_INSTRUCTION, Py_None);
+    }

The call to maybe_call_line_trace is guarded by several checks, so I do not see how this would significantly impact performance, except for when tracing is enabled. I built python3.6 using clang -O3 -flto (I saw lots of test failures during the --with-optimizations PGO tests so I opted for a simpler build), with and without my patch. I then ran `python3 -m performance run --fast` and manually compared the results. The patched version was in most cases slightly slower but within the standard deviation of the normal version.

I would appreciate feedback before going any further with this. I looked at haypo's perf project but was unsure how to use it to run a more detailed comparison.

* First, will this feature be rejected on performance grounds or some other obvious reason?

* Second, does setting `tstate->inst_tracing` in PyEval_SetTraceInstructions require more elaborate guards similar to those in PyEval_SetTrace? I do not understand the intricacies of that function.

I am also curious about backwards compatibility. This adds an optional argument to settrace (haypo's recommendation) and some other things that appear to be under Py_LIMITED_API or internal (but I am not sure).
* Is a change like this eligible for 3.6.x or would it have to wait until 3.7?
* Would it be better to add a completely new function, e.g. sys.settrace_instructions instead of changing the signature of settrace?

Lastly, I have a few ideas for enhancing this patch that will require some more work on my end to verify their utility.
In particular:
* Remove the `else` to always trace with PyTrace_INSTRUCTION, rather than as an alternate to PyTrace_LINE as it does now.
* Pass the previous instruction offset as the trace function argument. This would give the tracer complete "edge" information. (In my coverage tool's trace function I track the previous edge as a nonlocal, but it would be better/faster to do it automatically in maybe_call_line_trace).
* Finally, if all of this works, it might also be useful (as a performance enhancement) to add a PyTrace_BRANCH option, which would omit runs of opcodes inbetween branches, as these edges tend not be be useful, and even counterproductive (at least for my use case). However it might be too tricky to get all the cases right.
msg286561 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-01-31 22:47
Can you please attach your patch?
msg286563 - (view) Author: Ammar Askar (ammar2) * Date: 2017-01-31 22:49
A git style patch can be found at https://github.com/gwk/cpython/commit/071d17cedfdf2db8b405aab5adabebe2ac5ef67b.patch
msg286587 - (view) Author: George King (gwk) * Date: 2017-02-01 04:36
Here is the patch from git; if you need a patch for hg I can work on that tomorrow!
msg286708 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-02-01 21:51
inst-tracing-2.diff:

* Rebased patch applied on Mercurial, I had to fix a conflict in sys_settrace()
* I replaced your PyEval_SetTraceInstructions() function with a new private _PyEval_SetTraceEx() method
* I changed sys.settrace() API to convert the trace_instructions parameter to keyword only: sys.settrace(func, True) raises a TypeError (I added an unit test for that).

About the API, I see an issue: the "old=sys.gettrace(); sys.settrace(old)" pattern doesn't work anymore, since old doesn't contain the trace_instructions flag :-/ So I suggest to add a new method and make it exclusive with sys.settrace(), sys.settraceinstr() / sys.gettraceinstr():

* sys.settrace() raises an exception of sys.gettraceinstr() is not None
* sys.settraceinstr() raises an exception of sys.gettrace() is not None

The sys module is a critical module, so the API should probably be discussed on python-dev (later?).

About performances: it's fine to add anything to maybe_call_line_trace(), it cannot impact performances when tracing is not used.

Can you please try to reimplement something like __lltrace__ on top of sys.settraceinstr()? It would be a nice example of the API, but also a good reason to add sys.settraceinstr().

See the issue #25571 "Improve the lltrace feature with the Py_Debug mode".

Currently, the __lltrace__ feature requires a Python compiled in debug mode, which is not convenient. Moreover, it requires to put a __ltrace__ variable in global variables.

The best would be able to add an instruction level debugger inside pdb! It would be a cool tool to learn Python and to play with the bytecode!
msg286715 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2017-02-01 22:28
> About performances: it's fine to add anything to maybe_call_line_trace(), it cannot impact performances when tracing is not used.

I do not agree, Python debuggers are already really really slow. They should not have to process 'instruction' trace events as this would happen if George does "Remove the `else` to always trace with PyTrace_INSTRUCTION, rather than as an alternate to PyTrace_LINE as it does now".
I understand that this was a suggestion :)
msg286716 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-02-01 22:37
> I do not agree, Python debuggers are already really really slow. They should not have to process 'instruction' trace events as this would happen if George does "Remove the `else` to always trace with PyTrace_INSTRUCTION, rather than as an alternate to PyTrace_LINE as it does now".

Hum, there are two things:

* attached patch adds an else to the C maybe_call_line_trace(): I don't think that it's possible to notice the overhead in a debugger implemented in pure Python. If you are concerned by the change, we need a micro-benchmark.

* existing debuggers may have to be extended to support PyTrace_INSTRUCTION and so may be slowed down. Maybe I misunderstood what you wrote? For me, it's an opt-in feature: you must call sys.settraceinstr() instead of sys.settrace(), otherwise you don't get PyTrace_INSTRUCTION events. From the user point of view, I expect that the debugger starts at LINE level, but only switch to instruction level when I explicitly ask it to do so.
msg286718 - (view) Author: George King (gwk) * Date: 2017-02-01 22:55
Xavier, this is a misunderstanding; sorry for not being more clear. When I said "remove the `else`", I was proposing this:

+ if (tstate->inst_tracing) {
+   result = call_trace(func, obj, tstate, frame, PyTrace_INSTRUCTION, Py_None);
+ }

Line-oriented tracers/debuggers would not have set trace_instructions, so they would not receive any 'instruction' events, and should not suffer.

The motivation for this suggestion was that for tracers that *are* interested in instructions, it *might* be simpler to get two calls, one of 'line', and one of 'instruction'. Maybe it's a bad idea though.
msg286720 - (view) Author: George King (gwk) * Date: 2017-02-01 23:29
Attached is a demo of using the feature as a fancy replacement for __ltrace__. It demonstrates using a closure for the local trace function to track the previous offset, and prints the offset transitions as src -> dst pairs. This helped me learn a lot about how exception handling works at the bytecode level.
msg286721 - (view) Author: George King (gwk) * Date: 2017-02-01 23:31
Thanks to both of you for your feedback. I will take a stab at updating the patch with Victor's suggestions as soon as I can.
msg286722 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-02-01 23:50
Hum, I don't know well how Python tracing works, but if we had a new
"instruction level debugging" feature, it must be an opt-in feature.
Existing debuggers must not be flooded by such events.
msg286765 - (view) Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2017-02-02 11:27
> The motivation for this suggestion was that for tracers that *are* interested in instructions, it *might* be simpler to get two calls, one of 'line', and one of 'instruction'. Maybe it's a bad idea though.

Thanks for the clarification. It's ok if instruction tracing remains an opt-in feature. As for this suggestion, I am not sure it is useful for debuggers based on the bdb module as trace_dispatch() (the bdb trace function) receives notification of the trace event type as the 'event' argument.
msg287157 - (view) Author: George King (gwk) * Date: 2017-02-06 17:19
Attached is a new patch, which does not settrace/gettrace and instead offers new settraceinst/gettraceinst per Victor's recommendation.

I did not implement the proposed behavior of raising an exception if the old APIs are used when the inst_tracing flag is set. However I do think it makes sense to do so.

The new sys.settraceinst function takes two arguments: a function and an integer flag that is currently interpreted as a boolean (whether or not to enable instruction level tracing).

I did it this way because I would like to consider adding a third mode, which would only trigger tracing for interesting control-flow events, namely steps for which the previous instruction was a branch. The motivation is that I expect per-instruction tracing to be very slow, and for the code coverage use case at least, these are the interesting events. For this to be useful, a (prev_offset, current_offset) pair would need to be passed as to the trace callback. I am not sure if this would be useful to other applications, e.g. pdb.
msg287158 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-02-06 17:28
> I did it this way because I would like to consider adding a third mode, which would only trigger tracing for interesting control-flow events, namely steps for which the previous instruction was a branch.

Hum, let's use https://en.wikipedia.org/wiki/Basic_block terminology.

The callback would be called when you enter and exit a basic block? Or only on exit?

Only on conditional branches? Or also on unconditional branches?

In term of instructions, can give examples of instructions which would be traced or not?

IMHO the two instruction level modes would be useful.
msg287165 - (view) Author: George King (gwk) * Date: 2017-02-06 19:01
I'm not sure exactly, but the way I see it (for code coverage), we want to trace transitions between basic blocks. So I would define it as: each entry into a BB is traced, with a tuple of (previous_offset, current_offset). This way when a function call starts, we callback with (-1, 0), and then get a callback for every transition between BBs. The code is well covered if we observe every possible transition.

I am not clear on the precise details, e.g. regarding unconditional branches. Since for code coverage this is essentially on optimization over per-instruction mode, I intend to first work out correctness details and test cases for the coverage tool, and then once the tests solid add the optimization.

I'm happy to discuss more, but this aspect is a lower priority for me (I still have to work out remaining correctness issues with my opcode analysis). Would it make sense to split it out into a second issue?
msg287403 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-02-09 11:08
See aslo issue #29502: "Should PyObject_Call() call the profiler on C functions, use C_TRACE() macro?"
msg298441 - (view) Author: George King (gwk) * Date: 2017-07-16 16:16
After reviewing the thread, I'm reminded that the main design problem concerns preserving behavior of this idiom:
"old=sys.gettrace(); ...; sys.settrace(old)"

If we add more state, i.e. the `trace_instructions` bool, then the above idiom no longer correctly stores/reloads the full state.

Here are the options that I see:

1. New APIs:
* `gettrace() -> Callable # no change.`
* `settrace(tracefunc: Callable) -> None # no change.`
* `gettraceinst() -> Callable # new.`
* `settraceinst(tracefunc: Callable) -> None # new.`
Behavior:
* `settrace()` raises if `gettraceinst()` is not None.
* `settraceinst()` raises if `gettrace()` is not None.


2. Add keyword arg to `settrace`.
* `gettrace() -> Callable # no change.`
* `settrace(tracefunc: Callable, trace_instructions:Optional[bool]=False) -> None # added keyword.`
* `gettracestate() -> Dict[str, Any] # new.`
Behavior:
* `gettracestate()` returns the complete trace-related state: currently, `tracefunc` and `trace_instructions` fields.
* `gettrace()` raises an exception if any of the trace state does not match the old behavior, i.e. if `trace_instructions` is set.


3. New API, but with extensible keyword args:
* `settracestate(tracefunc: Callable, trace_instructions:Optional[bool]=False) -> None # new.`
* `gettracestate() -> Dict[str, Any] # new.`
Behavior:
* `settrace()` raises if `gettracestate()` is not None.
* `settracestate()` raises if `gettrace()` is not None.


As I see it:
* approach #1 is more conservative because it does not modify the old API.
* approach #2 is more extensible because all future features can be represented by the state dictionary.
* approach #3 has both of these strengths, but is also the most work.

Examples of possible future features via keywords:
* branch-level tracing, as briefly disscussed above.
* instruction filtering set, which could be a more general version of the branch tracing.

I intend to prototype one or both of these, but I'm also feeling a bit of time pressure to get the basic feature on track for 3.7.
msg299887 - (view) Author: George King (gwk) * Date: 2017-08-08 01:36
I've updated the patch and I think it's ready for a more serious review. A few notes:

* settracestate now takes a flag `trace_instructions`. This describes slightly better the behavior, which is that line events take precedence over instructions.

* the old gettrace now raises when `trace_instructions` was set, to prevent silent failure of the old idiom `saved = gettrace(); ... settrace(saved)`.

* gettracestate returns a dictionary, which can be passed to settracestate as kwargs. This allows for a replacement idiom `saved = gettracestate(); ... settracestate(**saved)` which should be forward-compatible with any flags we might add in the future.

The patch can be viewed here: https://github.com/python/cpython/compare/master...gwk:trace-inst

Let me know if I should open up an official PR. Also I just filled out the CLA.
msg299888 - (view) Author: George King (gwk) * Date: 2017-08-08 01:38
(Also I did prototype instruction filtering but it had mild performance implications when tracing so I have shelved it for the moment.)
msg300992 - (view) Author: George King (gwk) * Date: 2017-08-29 15:48
Attached updated demo script.
History
Date User Action Args
2017-08-29 15:49:27gwksetpull_requests: + pull_request3278
2017-08-29 15:48:28gwksetfiles: + settracestate-demo.py

messages: + msg300992
2017-08-08 01:38:23gwksetmessages: + msg299888
2017-08-08 01:36:59gwksetmessages: + msg299887
2017-07-16 16:16:32gwksetmessages: + msg298441
2017-02-09 11:08:33hayposetmessages: + msg287403
2017-02-06 19:01:47gwksetmessages: + msg287165
2017-02-06 17:28:46hayposetmessages: + msg287158
2017-02-06 17:19:22gwksetfiles: + settraceinst.diff

messages: + msg287157
2017-02-03 22:36:37terry.reedysettitle: Instruction level tracing via sys.settrace -> Add instruction level tracing via sys.settrace
stage: test needed
versions: + Python 3.7, - Python 3.6
2017-02-02 11:27:42xdegayesetmessages: + msg286765
2017-02-01 23:50:34hayposetmessages: + msg286722
2017-02-01 23:31:30gwksetmessages: + msg286721
2017-02-01 23:29:12gwksetfiles: + inst-trace-demo.py

messages: + msg286720
2017-02-01 22:55:21gwksetmessages: + msg286718
2017-02-01 22:37:31hayposetmessages: + msg286716
2017-02-01 22:28:00xdegayesetmessages: + msg286715
2017-02-01 21:51:26hayposetfiles: + inst-tracing-2.diff

messages: + msg286708
2017-02-01 21:09:32xdegayesetnosy: + xdegaye
2017-02-01 04:36:18gwksetfiles: + inst-tracing.diff
keywords: + patch
messages: + msg286587
2017-01-31 22:49:19ammar2setnosy: + ammar2
messages: + msg286563
2017-01-31 22:47:17hayposetmessages: + msg286561
2017-01-31 18:20:52gwkcreate