Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tracebacks eat up memory by holding references to locals and globals when they are not wanted #44031

Closed
ghazel mannequin opened this issue Sep 26, 2006 · 43 comments
Closed
Labels
easy performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@ghazel
Copy link
Mannequin

ghazel mannequin commented Sep 26, 2006

BPO 1565525
Nosy @tim-one, @loewis, @akuchling, @terryjreedy, @pitrou, @vstinner, @bitdancer, @phmc
Files
  • eat_memory.py
  • clear-tb-frames.txt
  • clear-tb-frames-2.txt
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2013-12-27.19:59:52.161>
    created_at = <Date 2006-09-26.06:58:18.000>
    labels = ['easy', 'library', 'performance']
    title = 'tracebacks eat up memory by holding references to locals and globals when they are not wanted'
    updated_at = <Date 2017-09-01.14:53:26.982>
    user = 'https://bugs.python.org/ghazel'

    bugs.python.org fields:

    activity = <Date 2017-09-01.14:53:26.982>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2013-12-27.19:59:52.161>
    closer = 'akuchling'
    components = ['Library (Lib)']
    creation = <Date 2006-09-26.06:58:18.000>
    creator = 'ghazel'
    dependencies = []
    files = ['2160', '31668', '31692']
    hgrepos = []
    issue_num = 1565525
    keywords = ['easy']
    message_count = 43.0
    messages = ['29994', '29995', '29996', '29997', '29998', '29999', '30000', '30001', '30002', '74329', '74401', '79815', '79817', '79818', '79819', '79820', '108646', '108653', '108658', '108665', '108668', '108672', '108677', '108686', '108742', '108748', '108751', '113450', '194513', '194520', '194943', '197267', '197278', '197340', '197347', '197371', '197396', '197397', '197429', '197776', '197777', '197838', '301115']
    nosy_count = 10.0
    nosy_names = ['tim.peters', 'loewis', 'akuchling', 'terry.reedy', 'pitrou', 'vstinner', 'ghazel', 'r.david.murray', 'python-dev', 'pconnell']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue1565525'
    versions = ['Python 3.4']

    @ghazel
    Copy link
    Mannequin Author

    ghazel mannequin commented Sep 26, 2006

    Attached is a file which demonstrates an oddity about
    traceback objects and the gc.

    The observed behaviour is that when the tuple from
    sys.exc_info() is stored on an object which is inside
    the local scope, the object, thus exc_info tuple, are
    not collected even when both leave scope.

    If you run the test with "s.e = sys.exc_info()"
    commented out, the observed memory footprint of the
    process quickly approaches and sits at 5,677,056
    bytes. Totally reasonable.

    If you uncomment that line, the memory footprint
    climbs to 283,316,224 bytes quite rapidly. That's a
    two order of magnitude difference!

    If you uncomment the "gc.collect()" line, the process
    still hits 148,910,080 bytes.

    This was observed in production code, where exc_info
    tuples are saved for re-raising later to get the stack-
    appending behaviour tracebacks and 'raise' perform.
    The example includes a large array to simulate
    application state. I assume this is bad behaviour
    occurring because the traceback object holds frames,
    and those frames hold a reference to the local
    objects, thus the exc_info tuple itself, thus causing
    a circular reference which involves the entire stack.
    Either the gc needs to be improved to prevent this
    from growing so wildly, or the traceback objects need
    to (optionally?) hold frames which do not have
    references or have weak references instead.

    @ghazel ghazel mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Sep 26, 2006
    @tim-one
    Copy link
    Member

    tim-one commented Sep 26, 2006

    Logged In: YES
    user_id=31435

    Your memory bloat is mostly due to the

    d = range(100000)

    line. Python has no problem collecting the cyclic trash,
    but you're creating 100000 * 100 = 10 million integer
    objects hanging off trash cycles before invoking
    gc.collect(), and those integers require at least 10 million

    • 12 ~= 120MB all by themselves. Worse, memory allocated to
      "short" integers is both immortal and unbounded: it can be
      reused for /other/ integer objects, but it never goes away.

    Note that memory usage in your program remains low and
    steady if you force gc.collect() after every call to bar().
    Then you only create 100K integers, instead of 10M, before
    the trash gets cleaned up.

    There is no simple-minded way to "repair" this, BTW. For
    example, /of course/ a frame has to reference all its
    locals, and moving to weak references for those instead
    would be insanely inefficient (among other, and deeper,
    problems).

    Note that the library reference manual warns against storing
    the result of exc_info() in a local variable (which you're
    /effectively/ doing, since the formal parameter s is a
    local variable within foo()), and suggests other approaches.
    Sorry, but I really couldn't tell from your description why
    you want to store this stuff in an instance attribute, so
    can't guess whether another more-or-less obvious approach
    would help.

    For example, no cyclic trash is created if you add this
    method to your class O:

        def get_traceback(self):
            self.e = sys.exc_info()

    and inside foo() invoke:

        s.get_traceback()

    instead of doing:

        s.e = sys.exc_info()

    Is that unreasonable? Perhaps simpler is to define a
    function like:

    def get_exc_info():
        return sys.exc_info()

    and inside foo() do:

        s.e = get_exc_info()

    No cyclic trash gets created that way either. These are the
    kinds of things the manual has suggested doing for the last
    10 years ;-)

    @ghazel
    Copy link
    Mannequin Author

    ghazel mannequin commented Sep 27, 2006

    Logged In: YES
    user_id=731668

    I have read the exc_info suggestions before, but they have
    never made any difference. Neither change you suggest
    modifies the memory footprint behaviour in any way.

    Weakrefs might be slow, I offered them as an alternative to
    just removing the references entirely. I understand this
    might cause problems with existing code, but the current
    situation causes a problem which is more difficult to work
    around. Code that needs locals and globals can explicity
    store a reference to eat - it is impossible to dig in to
    the traceback object and remove those references.
    The use-case of storing the exc_info is fairly simple, for
    example:
    Two threads. One queues a task for the other to complete.
    That task fails an raises an exception. The exc_info is
    caught, passed back to the first thread, the exc_info is
    raised from there. The goal is to get the whole execution
    stack, which it does quite nicely, except that it has this
    terrible memory side effect.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Sep 27, 2006

    Logged In: YES
    user_id=21627

    I'm still having problems figuring out what the bug is that
    you are reporting. Ok, in this case, it consumes a lot of
    memory. Why is that a bug?

    @ghazel
    Copy link
    Mannequin Author

    ghazel mannequin commented Sep 27, 2006

    Logged In: YES
    user_id=731668

    The bug is the circular reference which is non-obvious and
    unavoidable, and cleaned up at some uncontrolable (unless
    you run a full collection) time in the future.
    There are many better situations or solutions to this bug,
    depending on which you think it is. I think those should be
    investigated.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Sep 28, 2006

    Logged In: YES
    user_id=21627

    I disagree that the circular reference is non-obvious. I'm
    not sure what your application is, but I would expect that
    callers of sys.exc_info should be fully aware what a
    traceback is, and how it refers to the current frames. I do
    agree that it is unavoidable; I fail to see that it is a bug
    because of that (something unavoidable cannot be a bug).

    If you are saying that it is unavoidable in your
    application: I have difficulties believing this. For
    example, you could do

            s.e = sys.exc_info()[:2]

    This would drop the traceback, and thus not create a cyclic
    reference. Since, in the program you present, the traceback
    is never used, this looks like a "legal" simplification (of
    course, you don't use s.e at all in this program, so I can
    only guess that you don't need the traceback in your real
    application).

    As for the time of cleanup not being controllable: you can
    certainly control frequency of gc with gc.set_threshold; no
    need to invoke gc explicitly.

    tim_one: Why do you think your proposed modification of
    introducing get_traceback would help? The frame foo still
    refers to s (which is an O), and s.e will still refer to the
    traceback that includes foo.

    @tim-one
    Copy link
    Member

    tim-one commented Sep 28, 2006

    Logged In: YES
    user_id=31435

    [Martin]

    tim_one: Why do you think your proposed modification of
    introducing get_traceback would help? The frame foo still
    refers to s (which is an O), and s.e will still refer
    to the traceback that includes foo.

    Sorry about that! It was an illusion, of course. I wanted
    to suggest a quick fix, and "tested it" too hastily in a
    program that didn't actually bloat with /or/ without it.

    For the OP, I had need last year of capturing a traceback
    and (possibly) displaying it later in ZODB. It never would
    have occurred to me to try saving away exc_info(), though.
    Instead I used the traceback module to capture the
    traceback output (a string), which was (possibly) displayed
    later, with annotations, by a different thread. No cycles,
    no problems.

    BTW, I must repeat that there is no simple-minded way to
    'repair' this. That isn't based on general principle, but
    on knowledge of how Python is implemented.

    @akuchling
    Copy link
    Member

    Logged In: YES
    user_id=11375

    A quick grep of the stdlib turns up various uses of
    sys.exc_info that do put it in a local variable., e.g.
    doctest._exception_traceback, unittest._exc_info_to_string,
    SimpleXMLRPCServer._marshaled_dispatch. Do these all need
    to be fixed?

    @ghazel
    Copy link
    Mannequin Author

    ghazel mannequin commented Dec 29, 2006

    If you are saying that it is unavoidable in your
    application: I have difficulties believing this. For
    example, you could do

        s.e = sys.exc_info()[:2]
    

    This would drop the traceback, and thus not create a cyclic
    reference. Since, in the program you present, the traceback
    is never used, this looks like a "legal" simplification (of
    course, you don't use s.e at all in this program, so I can
    only guess that you don't need the traceback in your real
    application).

    "This was observed in production code, where exc_info
    tuples are saved for re-raising later to get the stack-
    appending behaviour tracebacks and 'raise' perform."

    I would like the traceback object so that I can re-raise the error. I can stringify it as tim_one suggests, but that can't be used with 'raise' and 'try','except' later.

    It is not important for my application to have all the references that the traceback object contains, which is what is causing the massive memory requirement. If I could replace the exc_info()[2] with a traceback look-alike that only held file, line, etc information for printing a standard traceback, that would solve this problem.

    @ghazel
    Copy link
    Mannequin Author

    ghazel mannequin commented Oct 4, 2008

    Or, being able to remove the references to the locals and globals from
    the traceback would be sufficient.

    Something like this:

    try:
    raise Exception()
    except:
    t, v, tb = sys.exc_info()
    tbi = tb
    while tbi:
    tbi.tb_frame.f_locals = None
    tbi.tb_frame.f_globals = None
    tbi = tbi.tb_next
    # now "tb" is cleaned of references

    @vstinner
    Copy link
    Member

    vstinner commented Oct 6, 2008

    Similar issue: bpo-4034 proposes to be able to set tb.tb_frame=None.
    It's easy to implement this, I can write a patch for this.

    @vstinner
    Copy link
    Member

    I wrote a patch to support <traceback object>.tb_frame=None. It works
    but the traceback becomes useless because you have unable to display
    the traceback. The frame is used by tb_printinternal() to get the
    filename (co_filename) and the code name (co_name).

    I also tried:
    while tbi:
    frame = tbi.tb_frame
    tbi = tbi.tb_next
    frame.f_locals.clear()
    frame.f_globals.clear()

    ... and it doesn't work because the tbi variable is also removed!

    A traceback object have to contain the full frame, but the frame
    contains "big" objects eating your memory. A solution to your initial
    problem (store exceptions) is to not store the traceback object or at
    least to store it as a (list of) string. Solution already proposed by
    loewis (msg29999).

    @ghazel
    Copy link
    Mannequin Author

    ghazel mannequin commented Jan 14, 2009

    But a list of strings is not re-raisable. The co_filename, co_name, and
    such used to print a traceback are not dependent on the locals or
    globals.

    @vstinner
    Copy link
    Member

    I tried to remove the frame from the traceback type (to keep only the
    filename and code name), but many functions rely on the frame object.
    Some examples:

    Lib/unittest.py:
    class TestResult(object):
    def _is_relevant_tb_level(self, tb):
    return '__unittest' in tb.tb_frame.f_globals

    Lib/traceback.py:
    print_tb() uses tb.tb_frame.f_globals for linecache.getline()

    Doc/tools/jinga/debugger.py:
    translate_exception() checks if __jinja_template__ variable is
    present in b.tb_frame.f_globals

    Lib/idlelib/StackViewer.py:
    StackTreeItem.get_stack() stores each tb.tb_frame in a list
    FrameTreeItem.GetText() reads frame.f_globals["__name__"] and gets
    the filename and code name using frame.f_code

    Lib/logging/init.py:
    currentframe() reads sys.exc_traceback.tb_frame.f_back

    Lib/types.py:
    Use tb.tb_frame to create the FrameType

    (...)

    co_name/co_filename can be stored directly in the traceback. But what
    about tb.tb_frame.f_back and tb.tb_frame.f_globals? I'm not motivated
    enough to change traceback API.

    @vstinner
    Copy link
    Member

    Greg Hazel> But a list of strings is not re-raisable

    Do you need the original traceback? Why not only raising the exception? Eg.
    ----

    import sys
    try:
        raise Exception("hm!")
    except:
        t, v, tb = sys.exc_info()
        raise v

    @ghazel
    Copy link
    Mannequin Author

    ghazel mannequin commented Jan 14, 2009

    STINNER Victor> Do you need the original traceback? Why not only raising
    the exception?

    If the exception was captured in one stack, and is being re-raised in
    another. It would be more useful to see the two stacks appended instead
    of just the place where it was re-raised (or the place where it was
    raised initially, which is what a string would get you - not to mention
    the inability to catch it).

    @terryjreedy
    Copy link
    Member

    Given comments like "I'm still having problems figuring out what the bug is that you are reporting. (Martin)" and "I must repeat that there is no simple-minded way to 'repair' this. (Tim)", and subsequent changes to Python, should this still be open? If so, for which version(s)?

    @loewis loewis mannequin closed this as completed Jun 25, 2010
    @loewis loewis mannequin closed this as completed Jun 25, 2010
    @ghazel
    Copy link
    Mannequin Author

    ghazel mannequin commented Jun 25, 2010

    This is still an issue.

    The bug I'm reporting had been explained well, I thought, but I'll repeat it in summary:
    There is no way to pass around traceback objects without holding references to an excessive number of objects.

    Traceback raising typically does not use these references at all, so having some way to discard them would be very valuable. This allows storing and passing tracebacks between threads (or coroutines or async tasks) without dying quickly due to memory bloat. The simple-minded way to fix this is to allow the user to break the reference themselves.

    Fixing this bug would invalidate the need for hacks like the one Twisted has come up with in their twisted.python.Failure object which stringifies the traceback object, making it impossible to re-raise the exception. Failure has a lot of re-implementations of Exceptions and traceback objects as a result.

    @ghazel ghazel mannequin reopened this Jun 25, 2010
    @ghazel ghazel mannequin changed the title gc allowing tracebacks to eat up memory tracebacks eat up memory by holding references to locals and globals when they are not wanted Jun 25, 2010
    @ghazel ghazel mannequin reopened this Jun 25, 2010
    @ghazel ghazel mannequin changed the title gc allowing tracebacks to eat up memory tracebacks eat up memory by holding references to locals and globals when they are not wanted Jun 25, 2010
    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Jun 26, 2010

    I still don't understand the issue. You say that you want a traceback, but then you say you don't want the objects in the traceback. So what *precisely* is it that you want, and what is it that you don't want?

    In any case, whatever the solution, it is likely a new feature, which aren't acceptable anymore for 2.x release. So please don't target this report for any 2.x version.

    @ghazel
    Copy link
    Mannequin Author

    ghazel mannequin commented Jun 26, 2010

    The objects I do want in the traceback are the objects necessary to print a traceback, but not the locals and globals of each frame.

    For example:

    def bar():
      x = "stuff"
      raise Exception("example!")
    bar()
    prints: 
    Traceback (most recent call last):
      Line 4, in <module>
        bar()
      Line 3, in bar
        raise Exception("example!")
    Exception: example!

    There is no reason in that example to have a reference to "x" in the traceback, since it's not used in the output. This becomes important when I try to save a reference to the traceback object and raise it later:

    import sys
    def bar():
      x = "stuff"
      raise Exception("example!")
    try:
      bar()
    except:
      exc_info = sys.exc_info()
    def foo(e):
      raise e[0], e[1], e[2]
    # sometime possibly much later...
    foo(exc_info)
    Traceback (most recent call last):
      Line 12, in <module>
        foo(exc_info)
      Line 6, in <module>
        bar()
      Line 4, in bar
        raise Exception("example!")
    Exception: example!

    During that "sometime possibly much later..." comment, a reference to "x" is held, when it will not be used in printing the traceback later. So, I would not like to keep a reference to "x", and currently there is no way to do that without also dropping a reference to the data needed to print the traceback.

    @terryjreedy
    Copy link
    Member

    To call this a bug for tracker purposes, there would have to be a specific discrepancy between doc promise and observed behavior. Every feature request fixes a 'design bug' ;-).

    @terryjreedy terryjreedy added the type-feature A feature request or enhancement label Jun 26, 2010
    @bitdancer
    Copy link
    Member

    Excellent.

    As for twisted, I'm just repeating what I understood of what he said when I asked. It could well be that this feature would help them, I don't know.

    @terryjreedy
    Copy link
    Member

    This seems to be about reducing internal resource usage in a way that would be mostly invisible to the normal user. A core surface feature request would have to be put off to 3.3, but I do not see that as such.

    @terryjreedy terryjreedy added performance Performance or resource usage and removed type-feature A feature request or enhancement labels Aug 9, 2010
    @pitrou
    Copy link
    Member

    pitrou commented Aug 5, 2013

    frame.clear() was committed in bpo-17934.

    @vstinner
    Copy link
    Member

    vstinner commented Aug 6, 2013

    frame.clear() was committed in bpo-17934.

    How should it be used to workaround this issue ("tracebacks eat up memory by holding references to locals and globals when they are not wanted")?

    We need maybe an helper to clear all frames referenced by a traceback?

    @pitrou
    Copy link
    Member

    pitrou commented Aug 12, 2013

    We need maybe an helper to clear all frames referenced by a traceback?

    Yes. Something in the traceback module would be fine.

    @pitrou pitrou added stdlib Python modules in the Lib dir and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Aug 12, 2013
    @akuchling
    Copy link
    Member

    Here's a patch implementing traceback.clear_tb_frames(). (Feel free to bikeshed about the name.)

    One more substantial question: the top frame of the traceback is possibly still running. Currently the code skips it by doing an initial 'tb = tb.tb_next'. Would it be better to catch and ignore the RuntimeError
    from frame.clear()?

    @pitrou
    Copy link
    Member

    pitrou commented Sep 8, 2013

    One more substantial question: the top frame of the traceback is
    possibly still running. Currently the code skips it by doing an
    initial 'tb = tb.tb_next'. Would it be better to catch and ignore the
    RuntimeError
    from frame.clear()?

    Yes, I think it would be better.
    Other than that, the doc lacks a "versionadded" tag.
    Thanks!

    @akuchling
    Copy link
    Member

    Revised version of the patch: catches RuntimeError instead of skipping the first frame; adds versionadded tag; adds entry to NEWS and whatsnew files.

    @vstinner
    Copy link
    Member

    vstinner commented Sep 9, 2013

    I would prefer a clear_frames() method on the traceback object rather than
    a function.

    @pitrou
    Copy link
    Member

    pitrou commented Sep 9, 2013

    Revised version of the patch: catches RuntimeError instead of
    skipping the first frame; adds versionadded tag; adds entry to NEWS
    and whatsnew files.

    Looks good to me, thank you.

    @vstinner
    Copy link
    Member

    vstinner commented Sep 9, 2013

    I tried to implement the feature as a new traceback.clear_frames() method. I tried to follow the chain of frame objects (using frame.f_back), but it does not work as expected. The method needs to follow the chain of traceback objects (tb.tb_next). So it makes sense to define a function instead of a method (a method usually only affect the object, not a chain of objects).

    clear-tb-frames-2.txt:

    • I didn't see the "tb" abbreviation in other places in Python, except for traceback attributes. I prefer clear_traceback_frames(). The name clear_frames() is maybe better because traceback is already known by the context (the module is called "tracback". Example: traceback.clear_frames(tb) instead of traceback.clear_traceback_frames(tb).

    • The documentation is wrong: frame.clear() does not guarantee to clear *all* locals, but only *most* locals:

    "F.clear(): clear most references held by the frame");

    So I suggest a more permissive documentation:

    "Clear most reference held by frames."

    @pitrou
    Copy link
    Member

    pitrou commented Sep 9, 2013

    • The documentation is wrong: frame.clear() does not guarantee to
      clear *all* locals, but only *most* locals:

    "F.clear(): clear most references held by the frame");

    Actually, the documentation is right: "references" != "locals" ;-)

    @akuchling
    Copy link
    Member

    I'm happy to change the function name, though I'll note that the traceback module does have print_tb(), format_tb() and extract_tb().

    I'm OK with both of Victor's suggestions but personally slightly prefer traceback.clear_frames(tb).

    Rationale: People who are keeping tracebacks around and want to save memory are probably using other functions from the traceback module, and the module has fairly short names (print_tb, format_exception) so I doubt they'll often do 'from traceback import clear_traceback_frames'.

    @akuchling
    Copy link
    Member

    Ping! I'd like to change the function name to clear_frames() and then commit this. Antoine or anyone, want to disagree with using clear_frames() as the name?

    @pitrou
    Copy link
    Member

    pitrou commented Sep 15, 2013

    Ping! I'd like to change the function name to clear_frames() and then
    commit this. Antoine or anyone, want to disagree with using
    clear_frames() as the name?

    clear_frames() sounds fine to me :-)

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 15, 2013

    New changeset 100606ef02cf by Andrew Kuchling in branch 'default':
    bpo-1565525: Add traceback.clear_frames() helper function to clear locals ref'd by a traceback
    http://hg.python.org/cpython/rev/100606ef02cf

    @vstinner
    Copy link
    Member

    vstinner commented Sep 1, 2017

    traceback.clear_frames() doesn't really clear all frames: see bpo-31321.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    easy performance Performance or resource usage stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants