This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: TracebackException or StackSummary.extract with capture_locals=True fail to catch exceptions raised by repr() on value of frame local variable in FrameSummary.__init__.
Type: enhancement Stage: patch review
Components: Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: open Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: andrei.avk, iritkatriel, jbw, moi90, rbcollins, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2021-03-29 09:36 by moi90, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 25062 closed moi90, 2021-03-29 10:07
PR 29299 closed moi90, 2021-10-28 21:00
Messages (43)
msg389679 - (view) Author: Martin (moi90) * Date: 2021-03-29 09:36
With `capture_locals=True`, `StackSummary.format` prints the local variables for every frame:
https://github.com/python/cpython/blob/4827483f47906fecee6b5d9097df2a69a293a85c/Lib/traceback.py#L440

This will fail, however, if string conversion fails.

StackSummary.format should be robust towards such possibilities.


An easy fix would be a utility function:

```
def try_str(x):
  try:
    return str(x)
  except:
    return "<some sensible hint>"
```
msg389680 - (view) Author: Martin (moi90) * Date: 2021-03-29 09:46
I have to correct myself: The conversion to string already happens during construction, in `FrameSummary.__init__`:

https://github.com/python/cpython/blob/4827483f47906fecee6b5d9097df2a69a293a85c/Lib/traceback.py#L273

The issue remains as severe and the fix remains as easy.
msg389794 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-03-30 07:20
Why str() is called at all? Would not be better to use repr()?
msg389815 - (view) Author: Martin (moi90) * Date: 2021-03-30 10:27
Yes, it is repr in FrameSummary.__init__.
msg389819 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-03-30 11:30
See issue39228.
msg389840 - (view) Author: Martin (moi90) * Date: 2021-03-30 17:24
I didn't know repr is supposed to always succeed. Does the documentation mention this?
msg389842 - (view) Author: Martin (moi90) * Date: 2021-03-30 17:31
I can't find any mention of this in the documentation:

https://docs.python.org/3/reference/datamodel.html#object.__repr__

https://docs.python.org/3/library/functions.html#repr

I think this should be mentioned somewhere.
msg393199 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-05-07 18:25
I think it's the other way around - the documentation usually states what exceptions a function can raise as part of its published API, not which exceptions it should not raise.
msg393298 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-05-09 05:39
@Martin it's more that logic in a function may raise exception and it will be handled somewhere upstream, but a repr should just give you a simple representation of an object, it would make no sense for upstream logic to be "object repr failed with such exception, instead we should do this now". It's hard to imagine when this would make sense.
msg393519 - (view) Author: Martin (moi90) * Date: 2021-05-12 14:48
Thanks for the explanations. I think this issue can be closed then.
msg404319 - (view) Author: Joe Wells (jbw) Date: 2021-10-19 17:02
I would like to request that this bug be repopened and fixed.

I've changed (or at least tried to change, I'm not sure if it will let me) the title of the bug to point out that the failure happens in FrameSummary.__init__.  It does not happen in StackSummary.format.

This problem makes the capture_locals=True feature of TracebackException and StackSummary and the "locals" parameter of FrameSummary unreliable.  If any one of the local variables in any frame on the stack is in an inconsistent state such that repr will raise an exception, the processing of the traceback fails.  This kind of inconsistent state is of course likely to happen during debugging, which is precisely when you would want the capture_locals feature to actually work so you can see what is going wrong.

Just one example of an exception traceback being created with an unsafe local variable value in one of the stack frames is in the following line:

  from _pydecimal import Decimal as D; D(None)

The _pydecimal.Decimal.__new__ method raises an exception when it sees a value it doesn't know how to convert to Decimal.  When it does this, the new object it was creating is left in an inconsistent state missing the _sign attribute.  When you try to inspect the resulting exception traceback with traceback.TracebackException(..., capture_locals=True), this raises an exception.

While I was tracking this bug down, I discovered that the traceback.TracebackException.__init__ method has the same problem: it initializes the _str attribute used by the __str__ method quite late and when it raised an exception before this point, the incompletely initialized TracebackException object caused repr to fail.  There's at least one more class in traceback.py that also has this problem, but I don't remember which one it is.

The cascade of exceptions causing exceptions causing exceptions makes the capture_locals feature difficult to use and debug.

Here is a short snippet that reproduces the bug on Python 3.9.7:

import traceback
class TriggerTracebackBug:
    def __init__(self):
        raise RuntimeError("can't build a TriggerTracebackBug object for some reason")
        self._repr = 'if we reached this line, this object would have a repr result'
    def __repr__(self):
        return self._repr
try:
    TriggerTracebackBug()
except Exception as e:
    # this method call fails because there is a stack frame with a local
    # variable (self) such that repr fails on that variable's value:
    traceback.TracebackException.from_exception(e, capture_locals=True)

It's clear that it is too much to expect every class to implement a safe __repr__ method.  So it also seems clear to me that traceback.FrameSummary.__init__ is the place to fix it.

My suggested fix is to replace this line in the latest Lib/traceback.py:

        self.locals = {k: repr(v) for k, v in locals.items()} if locals else None

with something like this code (written in a somewhat awkward way because that helped while I was debugging it):

        if locals:
            d = {}
            self.locals = d
            for k, v in locals.items():
                try:
                    d[k] = repr(v)
                except Exception:
                    d[k] = '''<an exception was raised while trying to format this local variable's value>'''
        else:
            self.locals = None

I've tested this code in an older version of Python and it fixed the problem for me.
msg404330 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-10-19 18:42
related issue: https://bugs.python.org/issue20853

similarly to this, if args cmd is used in pdb in the TestClass __init__ method, `self` will be displayed, which means the TestClass.__str__ or __repr__ will run, lacking any required state set in the __init__ after current line; - causing an exception and pdb quitting.

I wonder if one fix can be applied to both issues. I will think about it. I'll also think if this issue should be reopened.
msg404392 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-10-20 03:39
Joe:

I would argue that it should be expected that every object instantiated from a class should have a safe __repr__, because you will have logging and if a __repr__ is broken in some rare circumstances, it may bring down your production system in the worst case. 

Additionally, if you have some logic that handles error conditions and logs respective objects, and some of these objects have a broken __repr__, you may run into a situation where you have a rare bug, and as you use the logs to debug it, you will not be able to tell which object was involved because you will only see the traceback from the __repr__. You may have to wait for the rare bug to occur again to determine the cause.

To me it seems that this request is more for convenience of interactive debugging, which should not be a priority over system functional robustness and logging robustness.

In view of this, your example should be changed to something like:

class TriggerTracebackBug:
    _repr = None
    def __init__(self):
        raise RuntimeError("can't build a TriggerTracebackBug object for some reason")
        self._repr = 'if we reached this line, this object would have a repr result'
    def __repr__(self):
        return f'<Myclass: {self._repr}>'
msg404445 - (view) Author: Joe Wells (jbw) Date: 2021-10-20 13:12
Here are my thoughts inspired by Andrei's insightful comments.

1. In response to the major issue Andrei raises, I agree that it is desirable that repr would never raise an exception.  The reasons Andrei mentions seem quite correct to me.  However, I think the only practical way to make that change would be the change the code of the repr builtin.  (Expecting the authors of every class in all Python code ever written to make sure that their __repr__ method will never raise an exception is unrealistic.)

The bug that is the topic of the issue in this bug report can be fixed by merely handling exceptions raised by one particular call to repr in the code of FrameSummary.__init__.  That change can only affect code that uses the traceback module to get nicer tracebacks, and only if the capture_locals=True feature is requested

Changing the implementation of the repr builtin could conceivably cause unexpected problems for lots of deployed 3rd party code during normal use, because repr is widely used in deployed code, and hence more care and thought is needed.  In contrast, changing FrameSummary.__init__ as I propose in this bug report will only affect code using the capture_locals=True feature of the traceback module, and is likely to make such code more reliable because right now that feature is failure-prone due to this bug.

So I propose that making repr never raise exceptions should be a different bug.  This bug does not need to wait for that bug to be fixed.

2. In response to a minor point of Andrei, I would like to mention that I encountered this because I had given students a coursework template containing this code:

import traceback, sys

def format_exception_chunks(exception):
    return (traceback.TracebackException.from_exception(exception, capture_locals=True).format())

def print_exception(_ignored_type, exception, _ignored_traceback):
    for chunk in format_exception_chunks(exception):
        print(chunk, file=sys.stderr, end="")

# This change to sys.excepthook adds local variables to stack traces.
sys.excepthook = print_exception

This had the unfortunate effect that when a class constructor decided that it did not like its arguments, the students were overwhelmed by a baffling cascade of exception tracebacks.  So while this was indeed “for convenience of interactive debugging”, it had the actual effect of making it nearly impossible for these beginner Python programmers to do any debugging at all.  The overall effect of this bug is that it makes it potentially unwise to routinely rely on the capture_locals=True feature.  A feature that could be useful for beginners actually makes it worse for them.

3. In response to Andrei's suggested change to my minimal example to reproduce the bug, I have a few comments.  First, his minimal example is just as good as mine for reproducing the bug.  Second, my example is inspired by the two classes I mention whose constructors trigger the bug: both of them do it by leaving the incompletely initialized object missing an attribute that repr depends on, so they both raise a missing attribute exception when an attempt is made to print the incompletely initialized object.  I suspect this is a quite common pattern in deployed Python code.  I suspect that people have not made efforts to avoid this pattern because it only triggers a bug when exception tracebacks are investigated, because the only reference to the incompletely initialized object is in the stack frame of the constructor method (either __new__ or __init__) which in turn is only referenced from the traceback.  Third, my example deliberately has as little code as possible in the __repr__ method to convey the key issue.  Fourth, one of these two examples (mine or Andrei's) should go into the unit tests when the bug is fixed.

4. I notice that the code of unittest.util.safe_repr gives an example of how to use object.__repr__(obj) to get something formatted for an object whose normal __repr__ method has raised an exception.  So I alter my recommended fix to the following:

        if locals:
            d = {}
            self.locals = d
            for k, v in locals.items():
                try:
                    d[k] = repr(v)
                except Exception:
                    d[k] = '<repr failed for ' + object.__repr__(v) + '>'
        else:
            self.locals = None

5. Can someone please change the Stage of this issue to something other than “resolved” and the Resolution of this issue to something other than “not a bug”?

I hope these comments are helpful and this bug gets somehow fixed.
msg404452 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-10-20 13:46
Joe: when I ran your code sample with modification I posted previously, I don't get any errors. Can you try running it?

By the way my point was not that builtin __repr__ should be modified to never raise exceptions (which would not help in this case), but rather that users should try to write classes that can be safely repr()'ed with the __repr__ they have defined on the class.

I agree that this case is hard for new users to debug, but it indicates to the user that if a class and its __repr__ are implemented in this way, it can lead to more serious issue in production.

Possibly this can be improved by making a documentation update to https://docs.python.org/3/reference/datamodel.html?highlight=__repr__#object.__repr__  ?
msg404456 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-10-20 13:54
Joe: I've looked at https://bugs.python.org/issue39228 again and I see there was consensus to reject this idea, please take a look at the discussion there.
msg404457 - (view) Author: Joe Wells (jbw) Date: 2021-10-20 13:58
I'm sorry Andrei: I misread your alteration of my example and misunderstood its purpose.

For anyone else reading these messages: In my most recent comment above, please ignore the part of my comment about Andrei's example.

So yes, Andrei, that is how people should write their code!  Your code does not trigger the bug because it is written in a better way.  Agreed completely.

However, this bug still needs fixed because it is currently not possible to use the capture_locals=True feature of the traceback module when debugging code written by people who do not follow Andrei's example of best practice.
msg404464 - (view) Author: Joe Wells (jbw) Date: 2021-10-20 14:27
Andrei, thanks very much for the pointer to bug/issue https://bugs.python.org/issue39228.  I had not noticed the earlier comment by Irit pointing to that bug.  (Is there some way to merge bugs so that someone visiting one of the bugs will see the merged stream of comments?)

The remarks in that bug are precisely why my recommended fix has this line:

                    d[k] = '<repr failed for ' + object.__repr__(v) + '>'

instead of this:

                    d[k] = object.__repr__(v)

Does this not meet the concern expressed there?  Something that would more thoroughly meet that would be the following:

        if locals:
            d = {}
            self.locals = d
            for k, v in locals.items():
                try:
                    d[k] = repr(v)
                except Exception as e:
                    d[k] = '<repr failed with exception ' + object.__repr__(e) + ' for object ' + object.__repr__(v) + '>'
        else:
            self.locals = None

I use object.__repr__ instead of repr because when repr(v) has already failed it is likely that repr(e) on the exception e would also be likely to fail.  Although we could also try repr(e) first to see if it raises an exception.

Your suggested reaction to this bug (documenting Python best practice) is something that should be done regardless.  I completely agree.  Unfortunately, better documentation of how to write good Python does not address the point of this bug which is that debugging tools should not fail when used to debug buggy code.  The purpose of a debugging tool is to help with badly written code, not to work only with well written code.  Right now the capture_locals=True feature is not safe to use without wrapping it with an exception handler.  As a result of this bug, I have been forced to rewrite this:

def format_exception_chunks(exception):
    return (traceback.TracebackException.from_exception(exception, capture_locals=True).format())

to instead be this:

def format_exception_chunks(exception):
    try:
        tbe = (traceback.TracebackException
               . from_exception(exception, capture_locals=True))
        return tbe.format()
    except Exception:
        pass
    # the traceback module fails to catch exceptions that occur when
    # formatting local variables' values, so we retry without
    # requesting the local variables.
    try:
        tbe = traceback.TracebackException.from_exception(exception)
        return ('WARNING: Exceptions were raised while trying to'
                ' format exception traceback with local variable'
                ' values,\n'
                'so the exception traceback was formatted without'
                ' local variable values.\n',
                *tbe.format())
    except Exception:
        return ('WARNING: Exceptions were raised while trying to format'
                ' exception traceback,\n'
                'so no formatted traceback information is being'
                ' provided.\n',)

My argument is that it is better to fix the bug once in traceback.py instead of every user of the capture_locals=True feature having to discover the hard way (with hours of painful bug investigation) that they need to write lots of exception handling and bug workarounds around their use of the capture_locals=True feature.
msg404468 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-10-20 14:47
Joe: I understand your point but my concern is that this creates an impression for the users that Python is tolerant of failing __repr__'s, while that's not the case at all.

I agree that if StackSummary was only used for interactive debugging, there would be a stronger case for it to nicely handle failed __repr__; but I think the more important use for it is non-interactive.

The argument on the other issue was that exceptions should bubble up and we can't assume that user wants them silenced, that's the decision for user to make. I agree with that; -- and your proposed change does not address it.
msg404514 - (view) Author: Joe Wells (jbw) Date: 2021-10-20 18:48
In the hopes of convincing someone to install a fix to this bug, I will mention a few additional points.

When I mention “the capture_locals feature”, I mean calls of the form traceback.TracebackException(..., capture_locals=True) and traceback.StackSummary.extract(..., capture_locals=True).

1. Right now, the capture_locals feature is unreliable.  If any frame on the entire stack has a single local variable for which repr raises an exception, the user gets no information whatsoever back for the entire stack except for that exception, and that exception does not identify the offending stack frame or local variable.  Also, every use of the capture_locals feature must be inside a try-except statement.

2. The exceptions raised by the capture_locals feature will often be junk exceptions carrying no useful information.  The meaning of these exceptions will often be “there is an incompletely initialized object in a variable somewhere on the stack”.  Because this is a fairly normal state of affairs, this will often not be useful information.

3. Although it is a excellent idea to encourage Python coders to ensure that __repr__ method never raise exceptions, it seems unlikely this will cause many __repr__ methods to be fixed in the near future.  My impression is that __repr__ methods that raise exceptions on incompletely initialized objects are common.  One reason for this might be that authors of __repr__ methods rarely suffer from this problem personally, because they will generally supply correct arguments to their own class constructors, and even when they don't (e.g., while building unit tests), they are unlikely to try to format to strings all the local variable values on the stack in the exception traceback.

4. Right now, the caller of traceback.FrameSummary(..., locals=x) needs to go through the entire dict x and for every value v in x test whether repr(v) raises an exception and if so either remove the key/value pair or change the value to something which can be safely repr-ed.  Then FrameSummary.__init__ will repeat this work and run repr on every value in x again.  So using FrameSummary safely requires duplicating the work, i.e., running repr on every value in the dict both before and during the call to FrameSummary.

5. Anyone who is using the capture_locals feature on an exception traceback has already caught an exception, and therefore decided not to let that exception “bubble up”.  Their attempts to log or otherwise cope with the exception will be spoiled by the capture_locals feature raising an unrelated exception.  This is even worse when the exception raised by the capture_locals feature is a junk exception that merely indicates there is an incompletely initialized object on the stack.  This is likely to happen because exceptions will often happen during the initialization of an object.

6. The most recent suggested fix does in fact report the extra repr exception discovered by the capture_locals feature in the string that represents the local variable's value.  The main effect of the current code propagating this repr exception is to make it harder to deal with the original exception.

I hope these points are useful.
msg404567 - (view) Author: Martin (moi90) * Date: 2021-10-21 04:56
Thanks, Joe, this is a very good summary of all the problems.

Another idea to fix this could be an additional parameter "repr=repr" to StackSummary.extract. This way, the default behavior is not changed, but the user is allowed to supply their own repr (safe_repr for example).
msg404603 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-10-21 14:04
Martin: I was also thinking of adding a parameter to `extract()`. The issue I see is that it's still confusing and complicated for new students to understand the issue and find the parameter and understand why it's needed.

Joe: one important thing to note is that if an exception is caught and handled, it doesn't at all mean that unrelated exceptions from __repr__ can be silenced. Can we determine if they came from an initialized object or from object in the middle of initialization?

By the way, can you post a complete traceback that your students were getting? I would like to see how hard it is to tell from the traceback that a particular classes' __repr__ is at fault. If it's a long traceback please attach as a file..
msg404736 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-10-22 09:40
See also 23597.
msg404950 - (view) Author: Martin (moi90) * Date: 2021-10-25 07:41
> Can we determine if they came from an initialized object or from object in the middle of initialization?

That would be very nice because inside __init__ is the only place where we have to deal with partly initialized objects. But I think Python does not provide a way to detect this state.

Although I would very much like having FrameSummary robust to any kind error, I acknowledge that this might not be possible.

It might be frustrating for beginners, but I think the only way to "fix" this, is by having people implement their `repr`s correctly.

The documentation currently says[1]:

> This is typically used for debugging, so it is important that the representation is information-rich and unambiguous.

It should be added that __repr__ might be used to display partly initialized objects during debugging and therefore should deal with these gracefully.

[1] https://docs.python.org/3/reference/datamodel.html#object.__repr__
msg404988 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-10-25 18:10
I've been thinking that perhaps it makes sense to special case printing of `self` argument in `__init__` methods. The same exact issue happens with PDB `args` command in `__init__` methods.

My idea is that in the __init__, you generally don't want to print `self`  arg and trying to do so can cause this type of unexpected errors.

The reason is that __repr__ is not designed to be used for objects with unfinished initialization, because even if it doesn't break, it will give you incomplete or even misleading representation.

In case when __init__ has some complex logic that can raise an exception, it's likely that other local variables will help you identify the object. If there is no complex logic or other arguments, and __init__ still failed, you can say that there wasn't yet an actual object that can be uniquely represented.

Therefore I think it makes sense to simply omit representing `self` arg (or first arg of the method) in both `StackSummary.extract()` and PDB `args` command. It may break some existing code but I think it would be a small amount of code affected. Because of this it can only go into 3.11 version. I feel like on the balance it would be a good change to make, but I'm curious to hear other opinions.
msg404989 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-10-25 18:12
Martin, would you like to submit a patch with this addition to the doc?
msg405028 - (view) Author: Martin (moi90) * Date: 2021-10-26 07:42
Irit, I'm unsure about the wording. Something like ":meth:`__repr__` should always succeed, even if errors prevent a full description of the object."? "... even if the object is only partly initialized."?

Andrei, I think you can not simply omit the first argument. For virtually all methods, `self` is an important value for debugging. It could be omitted for __init__, but that would break consistency. Also, __init__ could call other methods that help initialize the object and therefore also deal with partly initialized objects.

I really think one important part of the solution to this problem is making people aware, that under some rare conditions, repr might receive a partly initialized object.

I'm also still convinced that the other part of the solution is a way to customize the behavior of StackSummary.extract and friends.
Context: I have written a tool (https://github.com/moi90/experitur) to run batches of (machine learning) experiments (so it is not interactive). In case of an error, the traceback is dumped into a file, together with the respective local variables (using `traceback.TracebackException.from_exception(...).format()`). I want this to succeed *in any case*, even if I was too ignorant to implement correct `__repr__`s in my experiment code (which is mostly on-time-use, so why should I care).

In the end, this whole problem only affects users of `capture_locals=True`, so we could just change the semantics of this parameter a bit:
False: do nothing, True: use repr (as before), <any callable>: use this callable to convert the value to a string.
This way, we could avoid adding an additional parameter to many of the methods in traceback AND give users an easy way to customize exception formatting for easy debugging.
msg405033 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-10-26 09:26
Martin, how about something like:

"This is typically used for debugging, so it is important that the representation is information-rich and unambiguous. Furthermore, this function should not raise exceptions because that can make it inappropriate for use in some debugging contexts. "
msg405049 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-10-26 16:10
Martin: It's true that exceptions raised in other methods called from __init__ would still have this issue, but I feel like ~90% of it would be solved by the proposed fix.

It does introduce an inconsistency but it does so because it reflects the inconsistency of `self` being NOT initialized in __init__ and `self` being initialized in all other methods used after __init__. It makes intuitive sense that you don't get a repr of an object before it exists in a full state.

The doc fix is useful in itself, but I'm not sure it's enough given the issue reported by Joe with new students getting this error. When they get this error, they will not understand why it happens, what's the connection between the error and the argument that needs to be provided to fix it, and whether it's the correct fix or not.

If they *do* understand it, fixing the __repr__ is probably the best solution, and then no fix on our side is required.

My concern with adding a safe repr argument is that code using that argument can be copy pasted unknowingly and then it's not explicit/obvious that it will silence the errors raised due to broken __repr__'s. (including __repr__'s that are broken even outside of __init__ method).

If such an parameter is added, it's much better to make it a new parameter, to avoid backwards incompatible change, which means it can be backported.
msg405214 - (view) Author: Joe Wells (jbw) Date: 2021-10-28 18:19
1. As background, it is worth remembering that a major motivation for why FrameSummary.__init__ stringifies the local variable values in its parameter locals is to prevent the resulting data structure from keeping those values live.  This enables them to be garbage collected.

2. It has been suggested that TracebackException.__init__ or StackSummary.extract could be given a function parameter.  This could either be a new parameter or could involve using the existing capture_locals parameter with a non-bool.  The purpose of this suggestion is to allow customizing the use of repr to stringify local variable values.  If this suggestion is followed, it would be quite useful if this function parameter was given not just the local variable values, but also the name of the local variable and maybe also the stack frame it is in.  This would enable filtering out undesired variables.  For example, I would usually prefer to filter most of the variables from the __main__ module frame, because they are just noise that makes it hard to see the important details.  Some ability to filter would also help with the security issue that is discussed in issue 23597 that Irit pointed to.

3. I doubt that new students will be setting up calls to TracebackException or modifying sys.excepthook on their own.  Although a simple interface is clearly good, it might be more important to have an interface that maximizes the usefulness of the feature.

4. I no longer have the tracebacks my students were getting.  You can look at the traceback from the example in msg 404319 for a traceback.  While I was debugging this, I got much more complicated tracebacks because two of the classes in traceback.py also throw exceptions during their __init__ method that leave the not-yet-initialized object in a repr-will-raise-an-exception state.  Despite having decades of programming experience, it actually took me a long time before I realized that the exceptions I was debugging were junk exceptions due to attempts to call repr on not-yet-initialized objects.  I think figuring this out would be extremely challenging for my second-year undergraduate students.

5. If one of the calls to repr in FrameSummary.__init__ raises an exception, this prevents all the other local variable values from being inspected.  If it is desired to report local variable values that cause repr to raise an exception, then it would be good to collect all such exceptions, which requires handling each exception and then continuing to traverse the traceback stack.  Because of point 1 above, it might often be best to turn each such exception into a string.  To avoid infinite loops in the debugging/logging tools, it would often be best not to attempt to traverse the traceback stack of each of these extra exceptions.  If this argument is a good argument, this seems to mean that my most recent proposed fix is a good balance.

6. It is not clear if there is a reliable way to detect whether repr raises an exception due to an object's initialization having been interrupted, but all of the failures that I observed of this sort were for the variable name “self”.  In one case, the repr failure was for a normal local variable of an __new__ method which was not a parameter of this method; the variable was named “self” by convention, but this convention would be difficult to automatically verify.  In the two other cases, the repr failure was for a parameter named “self” which was the first parameter of an __init__ method.  So it would help to give special treatment to the first parameter of __init__ methods, but this would not solve the problem for __new__ methods.

7. In some cases, it might be a bit complicated to ensure the object is always in a safe state for repr-ing during initialization.  This is because there may be many attributes that need to be modified and this is not generally done atomically.  One attribute could be designated to indicate that initialization is done, so that repr will be extra careful if that attribute does not have an expected value.  Given that this is only (so far) an issue for debuggers and traceback inspection tools, it is not clear how to motivate everyone who writes a __new__, __init__, or __repr__ method to do this safely.  Documentation can of course help.
msg405238 - (view) Author: Martin (moi90) * Date: 2021-10-28 19:39
Once again a very good summary, thanks Joe!

> it would be quite useful if this function parameter was given not just the local variable values, but also the name of the local variable and maybe also the stack frame it is in

So this would be something like `format_locals(filename, lineno, name, locals) -> dict[str,str]` that could be used inside FrameSummary.__init__. I like that!

I wouldn't bother with detecting un-repr-able objects in Python itself, but if the above `format_locals` was implemented, you could easily prepare such a formatter for your students that hides `self` et al. and/or catch exceptions during `repr` to your liking.

What is needed to get an OK for such a change?
msg405893 - (view) Author: Martin (moi90) * Date: 2021-11-07 07:12
Could the participants of this issue please have a look at my pull request:

https://github.com/python/cpython/pull/29299

What do you like, what don't you like? Does it work for your use case?
msg405918 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-11-07 23:19
Martin: I'm not sure what is the best way to fix this issue, so I hope someone else will look into this.
msg405935 - (view) Author: Martin (moi90) * Date: 2021-11-08 08:29
I see two scenarious discussed here:

Scenario 1 (Offline / Batch-Mode):
A system runs user-supplied jobs that may fail. In the case of an error, the system shouldn't crash but rather store a maximally helpful message and carry on with the next job. Most likely, the relevant information is the situation that lead to the error in the first place, not that repr() has also gone wrong as a result.

This could be a a system to automatically test the homework code of your students. Or, like in my case, a framework that runs experiments. You would very likely want a system that does not crash if a __repr__ is wrongly implemented but prints a readable traceback and carries on with the next job.

Here, format_locals could be used to fall back to a different representation of an object if repr() fails.

This is the use case that my pull request tries to address primarily.

Scenario 2 (Online / Write, Test, Repeat):
A user frequently rewrites and executes their code until the desired outcome appears.
Errors inevitably happen. In this case, a maximally helpful stack trace is needed. Again, the relevant information is the situation that lead to the error in the first place, not that repr() has also gone wrong as a result.

Yes, it would be hard for unexperienced users to come up with StackSummary.extract(format_locals=...) by themselves.
But, the correct way would be to use a debugger anyway, not some hand-written code to print exceptions on the fly.

However, if your students receive a template file where they fill in missing function bodies etc, there might already be a substantial amount of code which they don't understand at first, nor do they need to care. Therefore, a piece of code that formats exceptions nicely would hurt here.

I think the most useful change for Scenario 2 (if, for some reason, a proper debugger is not an option) could be to add a command line option (e.g. -X capture_locals) so that a complete stack trace is printed if an exception bubbles up to interpreter level. (Here, it wouldn't hurt to handle exceptions in repr() because the interpreter already stopped exection anyway.)
This option would be very easy to teach to inexperienced users.

My pull request would provide preparation for this more extensive change.
msg405994 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-11-09 01:55
Martin:

I have a couple of concerns:

 - Generally (AFAIK) Python is very conservative about silencing arbitrary exceptions. There are a few functions with args like `ignore_errors`, but those are for errors in the logic of respective functions. I don't recall examples where any error would be silenced via an argument, but if there are such cases, it would be interesting to look into how the design decision was made.

In this case of course arbitrary exceptions coming any objects' __repr__ may be silenced.

There is a clear and very explicit way to catch exceptions via try/except and as a dev, I would really want to be able to look at a module, look at all try/except clauses and be confident that exceptions are not silenced elsewhere.

 - You are targeting this fix to production use, but realistically, if merged, it will be used both in testing and production. Which means, by not seeing these exceptions in testing, you will have a higher chance of deploying them to production where they can surface in other circumstances.

IOW, as a dev I might prefer to see these errors early and often, rather than have a mechanism that ends up silencing errors more broadly than intended.

I'm not saying this fix should be rejected, but that there's a tricky balance here -- and I don't feel confident enough about this solution to approve the PR if I reviewed it.
msg406008 - (view) Author: Martin (moi90) * Date: 2021-11-09 10:28
Just to avoid misunderstandings: My pull request is not at all about silencing exceptions. It is about customizing the output of the traceback module. (Just like the introduction of capture_locals previously: #22936)

(-X capture_locals, on the other hand, is a hypothetical idea that might help with debugging, but we don't have to discuss this now.)

My main argument for the proposed change is that traceback is useless in certain situations, because capture_locals is not "robust" (it *might* raise exceptions; I could handle these in the calling code but that would again hide the stack trace that I was about to investigate) and might dump sensitive data like passwords into logfiles (msg237320, msg237323).

Nothing is taken away or hidden, but flexibility is added.

(The only other option to resolve these issues would be to re-implement much of the current functionality of traceback in a third-party module, which would lead to maintainability problems and fragmentation.)
msg406012 - (view) Author: Martin (moi90) * Date: 2021-11-09 11:49
I improved the example in traceback.rst to illustrate how format_locals works.
msg406042 - (view) Author: Joe Wells (jbw) Date: 2021-11-09 19:27
Some quick comments on Martin's pull request.

1. The changes are sufficient to let the user make things work the way it is requested that they work and anyone who does not start using the new format_locals parameter will get the old behavior.

2. A side comment: I just noticed that the docstring for FrameSummary.__init__ does not document the filename, lineno, and name parameters.  Perhaps this could be fixed at the same time?

3. The new parameter format_locals is slightly confusingly named, because it does not format a single string from all the locals but instead gives a dictionary of strings with one string for each local.

4. I personally think it would be better to include something like the example value for format_locals as an extra attribute of the traceback module so everybody doesn't have to write the same function.  That said, the documented example is sufficient.

5. It is not clear to me why this stringify-ing of the locals can't be done in StackSummary._extract_from_extended_frame_gen.  It passes the dictionary of locals and also the function to transform it to FrameSummary.  It would make more sense to transform it first.  I suppose there might be some user somewhere who is directly calling FrameSummary so the interface needs to stay.

6. I fantasize that the new format_locals parameter would have access to the frame object.  In order for this to happen, the frame object would have to be passed to FrameSummary instead of the 3 values (locals, name, filename) that are extracted from it.  I think the current interface of FrameSummary was designed to interoperate with very old versions of Python.  Oh well.

7. If anyone wanted to ever refactor FrameSummary (e.g., to enable my fantasy in point #6 just above), it becomes harder after this.  This change has the effect of exposing details of the interface of FrameSummary to users of StackSummary.extract and TracebackException.  The four parameters that the new parameter format_locals takes are most of the parameters of FrameSummary.
msg406107 - (view) Author: Martin (moi90) * Date: 2021-11-10 16:38
Thanks Joe!

> 1. The changes are sufficient to let the user make things work the way it is requested that they work and anyone who does not start using the new format_locals parameter will get the old behavior.

That was my goal :)

> 2. A side comment: I just noticed that the docstring for FrameSummary.__init__ does not document the filename, lineno, and name parameters.  Perhaps this could be fixed at the same time?

I agree and already updated the pull request.

> 3. The new parameter format_locals is slightly confusingly named, because it does not format a single string from all the locals but instead gives a dictionary of strings with one string for each local.

I think format_locals is OK here (local*s*: plural), but I'm open to better suggestions.

> 4. I personally think it would be better to include something like the example value for format_locals as an extra attribute of the traceback module so everybody doesn't have to write the same function.  That said, the documented example is sufficient.

I sort of agree, but I also don't know what such a general-purpose function would entail. Suggestions?

> 5. It is not clear to me why this stringify-ing of the locals can't be done in StackSummary._extract_from_extended_frame_gen.  It passes the dictionary of locals and also the function to transform it to FrameSummary.  It would make more sense to transform it first.  I suppose there might be some user somewhere who is directly calling FrameSummary so the interface needs to stay.

I agree! In principle, FrameSummary has been little more than a container without (much) logic of its own.
Memory efficiency requires that that FrameSummary does not hold references to the original locals, thats why repr() was used.
I think as well that it would have been better to keep FrameSummary as a mere container and do the conversion of the locals elsewhere.
But at this point, this shouldn't be changed anymore.

> 6. I fantasize that the new format_locals parameter would have access to the frame object.  In order for this to happen, the frame object would have to be passed to FrameSummary instead of the 3 values (locals, name, filename) that are extracted from it.  I think the current interface of FrameSummary was designed to interoperate with very old versions of Python.  Oh well.

Do you have a use case for that? I have no clue what you would do with the frame object at this point.

> 7. If anyone wanted to ever refactor FrameSummary (e.g., to enable my fantasy in point #6 just above), it becomes harder after this.  This change has the effect of exposing details of the interface of FrameSummary to users of StackSummary.extract and TracebackException.  The four parameters that the new parameter format_locals takes are most of the parameters of FrameSummary.

Previously, I totally misread your request to "not just the local variable values, but also the name of the local variable and maybe also the stack frame it is in". This is why I included filename, lineno and name as parameters to format_locals which now seems totally useless to me and I'll remove them (if nobody objects).
msg406997 - (view) Author: Martin (moi90) * Date: 2021-11-25 13:59
Irit, would you be able to take a look at the patch?

---

I found another application scenario for the patch: In Numpy and Pandas, the assert_* functions in testing have a __tracebackhide__ local that advises pytest to hide the frame. With the patch in place, we could at least not display any locals if __tracebackhide__ is present.
msg406998 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-11-25 14:36
While I do think this module should be more customisable than it is, I think it should be done via support for subclassing rather than injecting functions that get forwarded on as you do here.

There are other issues on bpo related to customising this module, with more convincing use cases than suppressing errors. I think it’s more likely that a patch will be accepted which solves the general problem of this module being inflexible (while being backwards compatible).

The patch you propose here solves a small part of the problem while making the api clunkier and it commits us to supporting this new parameter when we try to solve the general problem.
msg406999 - (view) Author: Martin (moi90) * Date: 2021-11-25 14:41
Thanks for your definitive answer, this is what I was waiting for.

I understand and I totally agree that subclassing is the way to go to make traceback more flexible.

Would you mind linking the other issues concerning the general improvement of traceback?
msg407001 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-11-25 14:52
Try the search function on the tracker (that's what I would need to do to find what to link).
History
Date User Action Args
2022-04-11 14:59:43adminsetgithub: 87822
2021-11-25 14:52:05iritkatrielsetmessages: + msg407001
2021-11-25 14:41:25moi90setmessages: + msg406999
2021-11-25 14:36:35iritkatrielsetmessages: + msg406998
2021-11-25 13:59:59moi90setmessages: + msg406997
2021-11-10 16:38:29moi90setmessages: + msg406107
2021-11-09 19:27:26jbwsetmessages: + msg406042
2021-11-09 11:57:11kjsetnosy: - kj
2021-11-09 11:49:01moi90setmessages: + msg406012
2021-11-09 10:28:01moi90setnosy: + rbcollins
messages: + msg406008
2021-11-09 01:55:02andrei.avksetmessages: + msg405994
2021-11-08 08:29:24moi90setmessages: + msg405935
2021-11-07 23:19:03andrei.avksetmessages: + msg405918
2021-11-07 07:12:34moi90setmessages: + msg405893
2021-10-28 21:00:59moi90setstage: resolved -> patch review
pull_requests: + pull_request27563
2021-10-28 19:39:22moi90setmessages: + msg405238
2021-10-28 18:19:11jbwsetmessages: + msg405214
2021-10-26 16:10:47andrei.avksetmessages: + msg405049
2021-10-26 09:26:31iritkatrielsetmessages: + msg405033
2021-10-26 07:42:14moi90setmessages: + msg405028
2021-10-25 18:12:41iritkatrielsetmessages: + msg404989
2021-10-25 18:10:45andrei.avksetmessages: + msg404988
2021-10-25 07:41:52moi90setmessages: + msg404950
2021-10-22 09:40:48iritkatrielsetmessages: + msg404736
2021-10-21 14:04:34andrei.avksetmessages: + msg404603
2021-10-21 04:56:05moi90setmessages: + msg404567
2021-10-20 18:48:10jbwsetmessages: + msg404514
2021-10-20 14:47:17andrei.avksetmessages: + msg404468
2021-10-20 14:27:19jbwsetmessages: + msg404464
2021-10-20 13:58:09jbwsetmessages: + msg404457
2021-10-20 13:54:03andrei.avksetmessages: + msg404456
2021-10-20 13:46:25andrei.avksetmessages: + msg404452
2021-10-20 13:12:39jbwsetmessages: + msg404445
2021-10-20 03:39:38andrei.avksetnosy: + kj
messages: + msg404392
2021-10-19 18:42:18andrei.avksetstatus: closed -> open

messages: + msg404330
2021-10-19 17:02:25jbwsetnosy: + jbw

messages: + msg404319
title: StackSummary.format fails if repr(value) fails -> TracebackException or StackSummary.extract with capture_locals=True fail to catch exceptions raised by repr() on value of frame local variable in FrameSummary.__init__.
2021-05-12 14:53:03iritkatrielsetstatus: open -> closed
resolution: not a bug
stage: patch review -> resolved
2021-05-12 14:48:25moi90setmessages: + msg393519
title: StackSummary.format fails if str(value) fails -> StackSummary.format fails if repr(value) fails
2021-05-09 05:39:56andrei.avksetnosy: + andrei.avk
messages: + msg393298
2021-05-07 18:25:51iritkatrielsetmessages: + msg393199
2021-04-03 01:27:05terry.reedysetversions: - Python 3.6, Python 3.7
2021-03-30 17:31:36moi90setmessages: + msg389842
2021-03-30 17:24:17moi90setmessages: + msg389840
2021-03-30 11:30:23iritkatrielsetmessages: + msg389819
2021-03-30 10:27:32moi90setmessages: + msg389815
2021-03-30 07:20:32serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg389794
2021-03-30 07:17:59iritkatrielsetnosy: + iritkatriel
2021-03-29 10:07:29moi90setkeywords: + patch
stage: patch review
pull_requests: + pull_request23812
2021-03-29 09:46:03moi90setmessages: + msg389680
2021-03-29 09:36:49moi90create