Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add instruction level tracing via sys.settrace #73586

Closed
gwk mannequin opened this issue Jan 31, 2017 · 23 comments
Closed

Add instruction level tracing via sys.settrace #73586

gwk mannequin opened this issue Jan 31, 2017 · 23 comments
Labels
3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@gwk
Copy link
Mannequin

gwk mannequin commented Jan 31, 2017

BPO 29400
Nosy @freddrake, @vstinner, @jwilk, @xdegaye, @gwk, @ammaraskar
PRs
  • bpo-29400: sys.settracestate. #3236
  • Files
  • inst-tracing.diff
  • inst-tracing-2.diff
  • inst-trace-demo.py
  • settraceinst.diff
  • settracestate-demo.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-09-28.12:47:45.659>
    created_at = <Date 2017-01-31.18:20:52.106>
    labels = ['interpreter-core', 'type-feature', '3.7']
    title = 'Add instruction level tracing via sys.settrace'
    updated_at = <Date 2018-07-31.21:00:10.146>
    user = 'https://github.com/gwk'

    bugs.python.org fields:

    activity = <Date 2018-07-31.21:00:10.146>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-09-28.12:47:45.659>
    closer = 'gwk'
    components = ['Interpreter Core']
    creation = <Date 2017-01-31.18:20:52.106>
    creator = 'gwk'
    dependencies = []
    files = ['46475', '46485', '46486', '46547', '47107']
    hgrepos = []
    issue_num = 29400
    keywords = ['patch']
    message_count = 23.0
    messages = ['286546', '286561', '286563', '286587', '286708', '286715', '286716', '286718', '286720', '286721', '286722', '286765', '287157', '287158', '287165', '287403', '298441', '299887', '299888', '300992', '303134', '303198', '303227']
    nosy_count = 6.0
    nosy_names = ['fdrake', 'vstinner', 'jwilk', 'xdegaye', 'gwk', 'ammar2']
    pr_nums = ['3236']
    priority = 'normal'
    resolution = 'rejected'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue29400'
    versions = ['Python 3.7']

    @gwk
    Copy link
    Mannequin Author

    gwk mannequin commented Jan 31, 2017

    I have recently put some effort into developing a code coverage tool that shows correct results for intraline branches. In order to get intraline trace data, I patched CPython, adding an optional "trace instructions" flag to sys.settrace. The patch is fairly simple.

    http://github.com/gwk/cpython/tree/inst-tracing

    The crucial detail in ceval is here:

    @@ -4521,6 +4523,9 @@ maybe_call_line_trace(Py_tracefunc func, PyObject *obj,
             frame->f_lineno = line;
             result = call_trace(func, obj, tstate, frame, PyTrace_LINE, Py_None);
         }
    +    else if (tstate->inst_tracing) {
    +        result = call_trace(func, obj, tstate, frame, PyTrace_INSTRUCTION, Py_None);
    +    }

    The call to maybe_call_line_trace is guarded by several checks, so I do not see how this would significantly impact performance, except for when tracing is enabled. I built python3.6 using clang -O3 -flto (I saw lots of test failures during the --with-optimizations PGO tests so I opted for a simpler build), with and without my patch. I then ran python3 -m performance run --fast and manually compared the results. The patched version was in most cases slightly slower but within the standard deviation of the normal version.

    I would appreciate feedback before going any further with this. I looked at haypo's perf project but was unsure how to use it to run a more detailed comparison.

    • First, will this feature be rejected on performance grounds or some other obvious reason?

    • Second, does setting tstate->inst_tracing in PyEval_SetTraceInstructions require more elaborate guards similar to those in PyEval_SetTrace? I do not understand the intricacies of that function.

    I am also curious about backwards compatibility. This adds an optional argument to settrace (haypo's recommendation) and some other things that appear to be under Py_LIMITED_API or internal (but I am not sure).

    • Is a change like this eligible for 3.6.x or would it have to wait until 3.7?
    • Would it be better to add a completely new function, e.g. sys.settrace_instructions instead of changing the signature of settrace?

    Lastly, I have a few ideas for enhancing this patch that will require some more work on my end to verify their utility.
    In particular:

    • Remove the else to always trace with PyTrace_INSTRUCTION, rather than as an alternate to PyTrace_LINE as it does now.
    • Pass the previous instruction offset as the trace function argument. This would give the tracer complete "edge" information. (In my coverage tool's trace function I track the previous edge as a nonlocal, but it would be better/faster to do it automatically in maybe_call_line_trace).
    • Finally, if all of this works, it might also be useful (as a performance enhancement) to add a PyTrace_BRANCH option, which would omit runs of opcodes inbetween branches, as these edges tend not be be useful, and even counterproductive (at least for my use case). However it might be too tricky to get all the cases right.

    @gwk gwk mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Jan 31, 2017
    @vstinner
    Copy link
    Member

    Can you please attach your patch?

    @ammaraskar
    Copy link
    Member

    @gwk
    Copy link
    Mannequin Author

    gwk mannequin commented Feb 1, 2017

    Here is the patch from git; if you need a patch for hg I can work on that tomorrow!

    @vstinner
    Copy link
    Member

    vstinner commented Feb 1, 2017

    inst-tracing-2.diff:

    • Rebased patch applied on Mercurial, I had to fix a conflict in sys_settrace()
    • I replaced your PyEval_SetTraceInstructions() function with a new private _PyEval_SetTraceEx() method
    • I changed sys.settrace() API to convert the trace_instructions parameter to keyword only: sys.settrace(func, True) raises a TypeError (I added an unit test for that).

    About the API, I see an issue: the "old=sys.gettrace(); sys.settrace(old)" pattern doesn't work anymore, since old doesn't contain the trace_instructions flag :-/ So I suggest to add a new method and make it exclusive with sys.settrace(), sys.settraceinstr() / sys.gettraceinstr():

    • sys.settrace() raises an exception of sys.gettraceinstr() is not None
    • sys.settraceinstr() raises an exception of sys.gettrace() is not None

    The sys module is a critical module, so the API should probably be discussed on python-dev (later?).

    About performances: it's fine to add anything to maybe_call_line_trace(), it cannot impact performances when tracing is not used.

    Can you please try to reimplement something like __lltrace__ on top of sys.settraceinstr()? It would be a nice example of the API, but also a good reason to add sys.settraceinstr().

    See the issue bpo-25571 "Improve the lltrace feature with the Py_Debug mode".

    Currently, the __lltrace__ feature requires a Python compiled in debug mode, which is not convenient. Moreover, it requires to put a __ltrace__ variable in global variables.

    The best would be able to add an instruction level debugger inside pdb! It would be a cool tool to learn Python and to play with the bytecode!

    @xdegaye
    Copy link
    Mannequin

    xdegaye mannequin commented Feb 1, 2017

    About performances: it's fine to add anything to maybe_call_line_trace(), it cannot impact performances when tracing is not used.

    I do not agree, Python debuggers are already really really slow. They should not have to process 'instruction' trace events as this would happen if George does "Remove the else to always trace with PyTrace_INSTRUCTION, rather than as an alternate to PyTrace_LINE as it does now".
    I understand that this was a suggestion :)

    @vstinner
    Copy link
    Member

    vstinner commented Feb 1, 2017

    I do not agree, Python debuggers are already really really slow. They should not have to process 'instruction' trace events as this would happen if George does "Remove the else to always trace with PyTrace_INSTRUCTION, rather than as an alternate to PyTrace_LINE as it does now".

    Hum, there are two things:

    • attached patch adds an else to the C maybe_call_line_trace(): I don't think that it's possible to notice the overhead in a debugger implemented in pure Python. If you are concerned by the change, we need a micro-benchmark.

    • existing debuggers may have to be extended to support PyTrace_INSTRUCTION and so may be slowed down. Maybe I misunderstood what you wrote? For me, it's an opt-in feature: you must call sys.settraceinstr() instead of sys.settrace(), otherwise you don't get PyTrace_INSTRUCTION events. From the user point of view, I expect that the debugger starts at LINE level, but only switch to instruction level when I explicitly ask it to do so.

    @gwk
    Copy link
    Mannequin Author

    gwk mannequin commented Feb 1, 2017

    Xavier, this is a misunderstanding; sorry for not being more clear. When I said "remove the else", I was proposing this:

    + if (tstate->inst_tracing) {
    + result = call_trace(func, obj, tstate, frame, PyTrace_INSTRUCTION, Py_None);
    + }

    Line-oriented tracers/debuggers would not have set trace_instructions, so they would not receive any 'instruction' events, and should not suffer.

    The motivation for this suggestion was that for tracers that *are* interested in instructions, it *might* be simpler to get two calls, one of 'line', and one of 'instruction'. Maybe it's a bad idea though.

    @gwk
    Copy link
    Mannequin Author

    gwk mannequin commented Feb 1, 2017

    Attached is a demo of using the feature as a fancy replacement for __ltrace__. It demonstrates using a closure for the local trace function to track the previous offset, and prints the offset transitions as src -> dst pairs. This helped me learn a lot about how exception handling works at the bytecode level.

    @gwk
    Copy link
    Mannequin Author

    gwk mannequin commented Feb 1, 2017

    Thanks to both of you for your feedback. I will take a stab at updating the patch with Victor's suggestions as soon as I can.

    @vstinner
    Copy link
    Member

    vstinner commented Feb 1, 2017

    Hum, I don't know well how Python tracing works, but if we had a new
    "instruction level debugging" feature, it must be an opt-in feature.
    Existing debuggers must not be flooded by such events.

    @xdegaye
    Copy link
    Mannequin

    xdegaye mannequin commented Feb 2, 2017

    The motivation for this suggestion was that for tracers that *are* interested in instructions, it *might* be simpler to get two calls, one of 'line', and one of 'instruction'. Maybe it's a bad idea though.

    Thanks for the clarification. It's ok if instruction tracing remains an opt-in feature. As for this suggestion, I am not sure it is useful for debuggers based on the bdb module as trace_dispatch() (the bdb trace function) receives notification of the trace event type as the 'event' argument.

    @terryjreedy terryjreedy added the 3.7 (EOL) end of life label Feb 3, 2017
    @terryjreedy terryjreedy changed the title Instruction level tracing via sys.settrace Add instruction level tracing via sys.settrace Feb 3, 2017
    @gwk
    Copy link
    Mannequin Author

    gwk mannequin commented Feb 6, 2017

    Attached is a new patch, which does not settrace/gettrace and instead offers new settraceinst/gettraceinst per Victor's recommendation.

    I did not implement the proposed behavior of raising an exception if the old APIs are used when the inst_tracing flag is set. However I do think it makes sense to do so.

    The new sys.settraceinst function takes two arguments: a function and an integer flag that is currently interpreted as a boolean (whether or not to enable instruction level tracing).

    I did it this way because I would like to consider adding a third mode, which would only trigger tracing for interesting control-flow events, namely steps for which the previous instruction was a branch. The motivation is that I expect per-instruction tracing to be very slow, and for the code coverage use case at least, these are the interesting events. For this to be useful, a (prev_offset, current_offset) pair would need to be passed as to the trace callback. I am not sure if this would be useful to other applications, e.g. pdb.

    @vstinner
    Copy link
    Member

    vstinner commented Feb 6, 2017

    I did it this way because I would like to consider adding a third mode, which would only trigger tracing for interesting control-flow events, namely steps for which the previous instruction was a branch.

    Hum, let's use https://en.wikipedia.org/wiki/Basic_block terminology.

    The callback would be called when you enter and exit a basic block? Or only on exit?

    Only on conditional branches? Or also on unconditional branches?

    In term of instructions, can give examples of instructions which would be traced or not?

    IMHO the two instruction level modes would be useful.

    @gwk
    Copy link
    Mannequin Author

    gwk mannequin commented Feb 6, 2017

    I'm not sure exactly, but the way I see it (for code coverage), we want to trace transitions between basic blocks. So I would define it as: each entry into a BB is traced, with a tuple of (previous_offset, current_offset). This way when a function call starts, we callback with (-1, 0), and then get a callback for every transition between BBs. The code is well covered if we observe every possible transition.

    I am not clear on the precise details, e.g. regarding unconditional branches. Since for code coverage this is essentially on optimization over per-instruction mode, I intend to first work out correctness details and test cases for the coverage tool, and then once the tests solid add the optimization.

    I'm happy to discuss more, but this aspect is a lower priority for me (I still have to work out remaining correctness issues with my opcode analysis). Would it make sense to split it out into a second issue?

    @vstinner
    Copy link
    Member

    vstinner commented Feb 9, 2017

    See aslo issue bpo-29502: "Should PyObject_Call() call the profiler on C functions, use C_TRACE() macro?"

    @gwk
    Copy link
    Mannequin Author

    gwk mannequin commented Jul 16, 2017

    After reviewing the thread, I'm reminded that the main design problem concerns preserving behavior of this idiom:
    "old=sys.gettrace(); ...; sys.settrace(old)"

    If we add more state, i.e. the trace_instructions bool, then the above idiom no longer correctly stores/reloads the full state.

    Here are the options that I see:

    1. New APIs:
    • gettrace() -> Callable # no change.
    • settrace(tracefunc: Callable) -> None # no change.
    • gettraceinst() -> Callable # new.
    • settraceinst(tracefunc: Callable) -> None # new.
      Behavior:
    • settrace() raises if gettraceinst() is not None.
    • settraceinst() raises if gettrace() is not None.
    1. Add keyword arg to settrace.
    • gettrace() -> Callable # no change.
    • settrace(tracefunc: Callable, trace_instructions:Optional[bool]=False) -> None # added keyword.
    • gettracestate() -> Dict[str, Any] # new.
      Behavior:
    • gettracestate() returns the complete trace-related state: currently, tracefunc and trace_instructions fields.
    • gettrace() raises an exception if any of the trace state does not match the old behavior, i.e. if trace_instructions is set.
    1. New API, but with extensible keyword args:
    • settracestate(tracefunc: Callable, trace_instructions:Optional[bool]=False) -> None # new.
    • gettracestate() -> Dict[str, Any] # new.
      Behavior:
    • settrace() raises if gettracestate() is not None.
    • settracestate() raises if gettrace() is not None.

    As I see it:

    Examples of possible future features via keywords:

    • branch-level tracing, as briefly disscussed above.
    • instruction filtering set, which could be a more general version of the branch tracing.

    I intend to prototype one or both of these, but I'm also feeling a bit of time pressure to get the basic feature on track for 3.7.

    @gwk
    Copy link
    Mannequin Author

    gwk mannequin commented Aug 8, 2017

    I've updated the patch and I think it's ready for a more serious review. A few notes:

    • settracestate now takes a flag trace_instructions. This describes slightly better the behavior, which is that line events take precedence over instructions.

    • the old gettrace now raises when trace_instructions was set, to prevent silent failure of the old idiom saved = gettrace(); ... settrace(saved).

    • gettracestate returns a dictionary, which can be passed to settracestate as kwargs. This allows for a replacement idiom saved = gettracestate(); ... settracestate(**saved) which should be forward-compatible with any flags we might add in the future.

    The patch can be viewed here: master...gwk:trace-inst

    Let me know if I should open up an official PR. Also I just filled out the CLA.

    @gwk
    Copy link
    Mannequin Author

    gwk mannequin commented Aug 8, 2017

    (Also I did prototype instruction filtering but it had mild performance implications when tracing so I have shelved it for the moment.)

    @gwk
    Copy link
    Mannequin Author

    gwk mannequin commented Aug 29, 2017

    Attached updated demo script.

    @vstinner
    Copy link
    Member

    The main blocker question is the API change. I started a thread on python-dev to discuss it:
    https://mail.python.org/pipermail/python-dev/2017-September/149632.html

    @vstinner
    Copy link
    Member

    See also bpo-31344 and PR bpo-3417:

    New changeset 5a85167 by Nick Coghlan in branch 'master':
    bpo-31344: Per-frame control of trace events (GH-3417)
    5a85167

    @gwk
    Copy link
    Mannequin Author

    gwk mannequin commented Sep 28, 2017

    Nick's implementation of f_trace_lines/f_trace_opcodes serves the same purpose as my proposal, and is a simpler solution so I'm abandoning this patch and working with that feature instead.

    @gwk gwk mannequin closed this as completed Sep 28, 2017
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants