Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shrink recursive tracebacks #71010

Closed
Vgr255 mannequin opened this issue Apr 22, 2016 · 33 comments
Closed

Shrink recursive tracebacks #71010

Vgr255 mannequin opened this issue Apr 22, 2016 · 33 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@Vgr255
Copy link
Mannequin

Vgr255 mannequin commented Apr 22, 2016

BPO 26823
Nosy @birkenfeld, @ncoghlan, @ethanfurman, @Rosuav, @berkerpeksag, @serhiy-storchaka, @zhangyangyu, @Vgr255, @AraHaan
Files
  • short_tracebacks.patch
  • short_tracebacks_2.patch
  • short_tracebacks_3.patch
  • short_tracebacks_4.patch
  • short_tracebacks_5.patch
  • short_tracebacks_6.patch
  • short_tracebacks_7.patch
  • short_tracebacks_doc_1.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ncoghlan'
    closed_at = <Date 2016-08-16.01:00:00.997>
    created_at = <Date 2016-04-22.02:36:25.104>
    labels = ['interpreter-core', 'type-bug']
    title = 'Shrink recursive tracebacks'
    updated_at = <Date 2016-08-20.01:06:53.338>
    user = 'https://github.com/Vgr255'

    bugs.python.org fields:

    activity = <Date 2016-08-20.01:06:53.338>
    actor = 'python-dev'
    assignee = 'ncoghlan'
    closed = True
    closed_date = <Date 2016-08-16.01:00:00.997>
    closer = 'ncoghlan'
    components = ['Interpreter Core']
    creation = <Date 2016-04-22.02:36:25.104>
    creator = 'abarry'
    dependencies = []
    files = ['42561', '42562', '42570', '43646', '43843', '43863', '43866', '44120']
    hgrepos = []
    issue_num = 26823
    keywords = ['patch']
    message_count = 33.0
    messages = ['263948', '263949', '263950', '263951', '263952', '263953', '263954', '263955', '263959', '264022', '265750', '269909', '271085', '271089', '271140', '271152', '271156', '271162', '272642', '272703', '272705', '272706', '272707', '272708', '272709', '272710', '272711', '272724', '272734', '272780', '272812', '272814', '273177']
    nosy_count = 10.0
    nosy_names = ['georg.brandl', 'ncoghlan', 'ethan.furman', 'python-dev', 'Rosuav', 'berker.peksag', 'serhiy.storchaka', 'xiang.zhang', 'abarry', 'Decorater']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue26823'
    versions = ['Python 3.6']

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Apr 22, 2016

    I recently suggested on Python-ideas ( https://mail.python.org/pipermail/python-ideas/2016-April/039899.html ) to shrink long tracebacks if they were all the same noise (recursive calls). Seeing as the idea had a good reception, I went ahead and implemented a small patch for this.

    It doesn't keep track of call chains, and I'm not sure if we want to implement that, as the performance decrease needed to store all of that might not be worth it. But then again, if an error happened performance is probably not a concern in this case.

    I've never really coded in C before, so feedback is very much welcome.

    @Vgr255 Vgr255 mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Apr 22, 2016
    @Rosuav
    Copy link
    Contributor

    Rosuav commented Apr 22, 2016

    By "doesn't keep track of call chains", you mean that it can't handle mutually-recursive functions, right?

    Still useful.

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Apr 22, 2016

    Yes, can't handle mutually recursive functions. I could maybe check for the last two or three functions, but that seems like unnecessary work for something that might not happen as often (I can see it being the case with e.g. __getattr__ though).

    If enough people think it should keep track of all or most functions being called, I can do it.

    @ethanfurman
    Copy link
    Member

    If you can, give it a go. Make it a new patch, though -- don't delete the existing one.

    @ethanfurman ethanfurman added the type-bug An unexpected behavior, bug, or error label Apr 22, 2016
    @zhangyangyu
    Copy link
    Member

    With the current patch, a simple test gives the traceback:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in test
      File "<stdin>", line 2, in test
      File "<stdin>", line 2, in test
      [Previous message repeated 995 more times]
    RecursionError: maximum recursion depth exceeded

    I think the message [Previous message repeated 995 more times] is vague. Does it refer to the previous line or the whole previous messages? Hmm, at least it is vague for me, a non-native English speaker.

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Apr 22, 2016

    The message is mostly a placeholder, but "message" is singular so I figured it would be obvious. But alas, if you are confused, others might be too. Propositions for a better message are welcome :)

    I'll attempt to make it track chained calls (or mutually recursive functions) tomorrow, my sleep-deprived brain is unable to find a clean way to do this right now :)

    @ncoghlan
    Copy link
    Contributor

    +1 for the simple approach, and deferring the mutual recursion support - it's desirable to keep traceback printing simple in order to minimise the chance for failures during the display process.

    In addition to the C implementation of traceback printing, the standard library's traceback module would also need updating.

    Regarding the ambiguity of "Previous message", I'd suggest using "Previous line" instead - that way it's accurate regardless of whether people read it as "previous line of the traceback" or "previous line of quoted source code".

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Apr 22, 2016

    Attached patch also modifies Lib/traceback.py to present identical behaviour, and changes "Previous message" to "Previous line". I'll postpone the more complex implementation of that, and might just not do it as it's indeed better to avoid bugs where we're meant to handle bugs =)

    Should there be unit tests for that? I'm really not sure how to write tests for that particular case...

    @ncoghlan
    Copy link
    Contributor

    For testing, you can create a recursive scenario that terminates with an exception after a defined number of iterations:

    >>> def f(counter):
    ...     if counter:
    ...         f(counter-1)
    ...     else:
    ...         raise RuntimeError
    ... 
    >>> f(3)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in f
      File "<stdin>", line 3, in f
      File "<stdin>", line 3, in f
      File "<stdin>", line 5, in f
    RuntimeError

    It's probably also worth checking the handling of two distinct frame repetitions by having another recursion counter that terminates itself by calling the one that terminates by raising an exception

    Beyond that, https://hg.python.org/cpython/file/tip/Lib/test/test_traceback.py covers both traceback formatting implementations, so a new test case in TracebackFormatTests seems appropriate.

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Apr 22, 2016

    New version with tests now, I test both the C and Python implementations.

    @Vgr255 Vgr255 mannequin self-assigned this Apr 22, 2016
    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented May 17, 2016

    I realize that I haven't given any update on this since my last patch. I figured that trying to detect when an arbitrary number of functions call each other in a convoluted call chain isn't a very good idea.

    Not only is it way beyond my abilities, but I think it'll overcomplicate the code too much. Furthermore, the way Python/traceback.c (and, coincidentally, Lib/traceback.py) is done - print each frame as they come and then throw it away - means that remembering more than the last frame needs a refactor of both files. This is way out of the scope of this issue, which aims to solve the common cases (a typo when messing around in the REPL, __getattribute__ and __getattr__ are what get me the most).

    I think that if the idea of keeping a track of multiple frames still flies, it should be a separate issue - probably after yet another issue proposing to refactor the files for that, but I'm dropping the idea.

    Except for the ambigous message, I haven't seen any comments on any of my patches, which I take as a very good thing. Is there anything else preventing this from being merged?

    TL;DR - Ping

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Jul 6, 2016

    Sorry for the delay, but here's a patch with updated tests after Nick's suggestions. It should be ready to merge now.

    (I'm having some failures in test_traceback, but they fail even without my patch)

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Jul 23, 2016

    After discussing with someone, I realized that the tests actually test the Python implementation twice, and never the C implementation (since I use traceback.print_exc() to "test" the C implementation).

    I don't think it's possible to both catch the output from the C implementation and prevent it from exiting out of the function prematurely, but if anyone knows how to do that, be my guest! In the meantime, I'm going to mess about with _testcapi and see if I can add something there to help with this.

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Jul 23, 2016

    Turns out there's already some functions in _testcapi that do this, great! Here's an updated patch actually testing the C implementation (and the tests pass, obviously :)

    @ncoghlan ncoghlan assigned ncoghlan and unassigned Vgr255 Jul 24, 2016
    @ncoghlan
    Copy link
    Contributor

    Thanks Emanuel, and sorry for the long delay in getting a review!

    My main substantive comment is that the approach of hardcoding the recursion count won't work cross platform, as the default recursion limit is set differently depending on how the platform's C runtime behaves. Fortunately, sys.getrecursionlimit() and len(inspect.stack()) should make it possible to tweak the tests to avoid the hardcoded assumption.

    I also have some additional readability suggestions, which can be found in Rietveld.

    @ncoghlan ncoghlan assigned Vgr255 and unassigned ncoghlan Jul 24, 2016
    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Jul 24, 2016

    Thank you Nick for the comments! Updated patch attached.

    The new patch uses inspect.stack() to calculate the total stack size, but I'm unable to get the exact number needed, and I found that I'm either 19 or 20 above (depending on whether I run it with -m or manually). The test now "just works" with it, but I'm quite sure it's going to break at some point. Any idea?

    The other alternative would be to use traceback.walk_tb(exc.__traceback__), but the Python implementation uses that, so it'd be comparing itself; and that's about as useful as not caring about the size at all. (On the other hand, the C and Python implementations get the exact same number, so it can confirm it's fine)

    @ncoghlan
    Copy link
    Contributor

    Since you're ultimately comparing a list of lines, it may be useful to split the checks into two parts:

    1. Check the deterministic lines
    2. Check the variable line (which should always be at index -2 in the splitlines() result)

    For the recursion error, we're not too worried about the *exact* repetition count, we're mostly interested in checking that it abbreviated the traceback. Once you've pulled that line out of the main "Are they the same?" comparison, you can use a regex (or just string operations) to extract the repetition count, convert it to an integer and check that it gives the right answer to within (say) +/- 50 repetitions.

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Jul 24, 2016

    Alright, I think this works now. No more magic number, just some regex as well as checking that the number is in range(sys.getrecursionlimit()-50, sys.getrecursionlimit()) :)

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Aug 14, 2016

    Is it possible to get this in in time for 3.6.0a4? The feature itself hasn't been touched in about 4 months, only tests have been tweaked since.

    @ncoghlan
    Copy link
    Contributor

    I'm at the PyCon AU sprints today, and will take a look at getting this merged.

    @ncoghlan ncoghlan assigned ncoghlan and unassigned Vgr255 Aug 15, 2016
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 15, 2016

    New changeset 5a2ff215e841 by Nick Coghlan in branch 'default':
    Issue bpo-26823: Abbreviate recursive tracebacks
    https://hg.python.org/cpython/rev/5a2ff215e841

    @ncoghlan
    Copy link
    Contributor

    Thanks for the patch, Emanuel!

    The merged patch uses your implementation and the test cases you designed, with the following changes:

    • I refactored the test cases so the traceback module gets tested on all implementations, with only the CPython builtin rendering tests being specific to CPython

    • I added docs updates and a couple of What's New entries for the change

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Aug 15, 2016

    Awesome, thanks! The "What's new" entry is a bit confusing though; to a casual observer it might look like the traceback module was updated but the normal behaviour wasn't changed.

    @AraHaan
    Copy link
    Mannequin

    AraHaan mannequin commented Aug 15, 2016

    Be nice if 3.5 had this as well. As it can be very annoying at times. (Not to mention thread exceptions just cant be handled by anyone's code unless you modify threading.py

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Aug 15, 2016

    Pretty sure this falls under the "New features" category, and as such can't be applied on 3.5.

    @AraHaan
    Copy link
    Mannequin

    AraHaan mannequin commented Aug 15, 2016

    3.6 would make thread exceptions not print out to console or at least be configured to make them logged with the logging module in threading.py for 3.6? because I would like it to use the logger. Oh and how can I get the logs from the logging module from 2 different libs in 1 module?

    Like I can get the logs from 1 module just fine.
    However if I try to get from things (lets say from asyncio (then it would return nothing).

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Aug 15, 2016

    This is irrelevant to this issue.

    @ncoghlan
    Copy link
    Contributor

    As Emanuel noted, this is a new feature, and hence won't be backported to any earlier versions.

    For What's New, it turns out that didn't quite render the way I expected, since the "StackSummary" class reference is missing in addition to the "traceback" module reference.

    So I'll tweak that, and also try to make it clearer that all tracebacks are affected, not just those printed via the traceback module (PyTraceback_Print is mentioned further down, but what I'll probably do is add an example of an infinite recursion being truncated at the interactive prompt)

    @berkerpeksag
    Copy link
    Member

    So I'll tweak that [...]

    Hi Nick, could you also tweak the usage of versionchanged directive? :)

    + .. versionchanged:: 3.6
    +
    + Long sequences of repeated frames are now abbreviated.

    Description line needs to be indented:

    .. versionchanged:: 3.6
       Long sequences of repeated frames are now abbreviated.
    

    @Vgr255
    Copy link
    Mannequin Author

    Vgr255 mannequin commented Aug 15, 2016

    Doc patch.

    @Vgr255 Vgr255 mannequin reopened this Aug 15, 2016
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 16, 2016

    New changeset 86d062edb6ab by Nick Coghlan in branch 'default':
    Issue bpo-26823: fix traceback abbreviation docs
    https://hg.python.org/cpython/rev/86d062edb6ab

    @ncoghlan
    Copy link
    Contributor

    Thanks again Emanuel - I incorporated your suggested changes along with some other ones I had already made locally.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 20, 2016

    New changeset a7f3678f9509 by Victor Stinner in branch 'default':
    Fix reference leak in tb_printinternal()
    https://hg.python.org/cpython/rev/a7f3678f9509

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants