Title: Fix exception causes in tarfile module
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.10
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ethan.furman Nosy List: cool-RR, ethan.furman, martin.panter, terry.reedy, veky
Priority: normal Keywords: patch

Created on 2020-02-21 19:20 by cool-RR, last changed 2020-12-12 21:30 by ethan.furman. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 18595 closed cool-RR, 2020-02-21 19:20
PR 23739 merged ethan.furman, 2020-12-11 02:41
Messages (17)
msg362425 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-02-21 20:00
What do you think it is necessary to switch from implicit chaining to explicit chaining?

If anyone is currently relying on __context__ vs __cause__, this patch will break their code.

In a traceback, the only visible difference is in the text between the exceptions:

- During handling of the above exception, another exception occurred:
+ The above exception was the direct cause of the following exception:

While we haven't been 100% consistent about this, the norm has been to either use implicit chaining or use "from None" to turn-off chaining.  The "from e" approach can be used to alter the explicit chain, perhaps skipping over one or more exceptions, but that isn't the case here.
msg362427 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-02-21 20:28
If this were new code, there would be a better case for the direct-cause style even though it is more cluttered and a bit slower.

I'm only questioning whether it make sense to change it in already deployed code, possibly risking breakage.  AFAICT, nothing is currently broken and the minor change in traceback wording would not confer a real benefit.
msg362431 - (view) Author: Ram Rachum (cool-RR) * Date: 2020-02-21 20:46
> What do you think it is necessary to switch from implicit chaining to explicit chaining?

I didn't say it's necessary, but I think it's beneficial. I find that message between exceptions useful when I'm debugging and I'd like to keep it accurate in as many places as I can. 

I think that the more accurate it is, the more people will learn to trust that it has meaning and understand it. 

> If anyone is currently relying on __context__ vs __cause__, this patch will break their code.

I think that the standard you're setting for backward compatibility is unreasonably high.

The only code I can think of that inspects `__context__` and `__cause__` is code that formats exceptions to be shown to the user, such as Django's debug page, Wing IDE's Exceptions pane, whatever's going on in PyCharm etc. That kind of code is responsible for checking for both `__context__` and `__cause__` and showing the correct message.

If you've seen code outside of the category above that uses `__context__`, I'd be curious to see it.

Fortunately, this point is moot since using `raise foo from bar` sets the `__context__ = __cause__ = bar`, so the `__context__` will not be changed.
msg362440 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-02-21 23:59
Ethan, would you make the call on this?  My recommendation is to close because we usually don't churn APIs unless there is a demonstrable benefit.  Also, the current code reflects the dominant practice in the standard library which is both faster and more concise.
msg362444 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2020-02-22 02:02
I know we are not in the habit of making large-scale changes to take advantage of new features and enhancements, but I think this may be one of the few exceptions to the rule, and it has to do with what the text between the two tracebacks means:

"During handling of the above exception, another exception occurred:"

-> there is a bug in the exception handler.

"The above exception was the direct cause of the following exception:"

-> the first error is the cause and the second error is what we want the user to pay attention to.

Whether or not the stdlib is buggy is a pretty big distinction.

If this is a change worth making the follow-on question is should we be raising from the previous exception, or from None?  How much value is the previous exception adding?

In cases where the new exception contains the text of the old one I think `from None` should be preferred; in cases where import errors get translated into CompressionErrors those import errors are sometimes useful and sometimes just noise -- perhaps including `str(e)` in the new exception is the best way there as well.

I'll need to do some research before I decide.
msg362447 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2020-02-22 05:52
Please don’t use “from None” in library code. It hides exceptions raised by the calling application that would help debugging. E.g. <>
msg362451 - (view) Author: Ram Rachum (cool-RR) * Date: 2020-02-22 07:18
I'm also strongly against using `from None`. When I'm debugging, I'm like a man who got lost in the desert and is about to die of thirst. Any possible insight into what happened is like an oasis, even if there are just a few drops of water there.

Also, some tools like Django and Sentry show you all the local variables for your stacktraces, which is a godsend. These often have important information that sheds light on what went wrong, and if you remove the traceback they'll be gone.
msg362520 - (view) Author: Vedran Čačić (veky) * Date: 2020-02-23 16:24
Oh yes, this has bugged me often. Please fix it somehow.

Yes, using "from None" is probably the wrong way to go. Often we need more info in tracebacks, not less. But the "During handling" message is very misleading. Same as Ethan, many times I interpreted it as "something went wrong in the handler" when in fact the handler was doing exactly what it was supposed to do.

    except WhateverException as e:
        raise CustomException from e

might be too much to write every time, and while I understand that we cannot simply redefine `raise` under `except` to do that implicitly, maybe there is some middle solution. On Python-ideas, recently I saw an idea

    except WhateverException:
        raise as CustomException

which I like a lot, but I don't know how hard it is to implement.


If everything above seems like too much, at least we should consider changing the wording of the message. If it said

+"In the handler of the above exception, another exception was raised:"
-"During handling of the above exception, another exception occurred:"

I'd be much happier. And it would more often suggest the right thing.
msg362526 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2020-02-23 17:48
Looking back at PEP3134 [1], the 

    raise NewException from exc

is used as one of the examples.

msg362567 - (view) Author: Ram Rachum (cool-RR) * Date: 2020-02-24 05:47
Ethan, did we successfully convince you not to use `from None`?
msg362569 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2020-02-24 06:32
`Fraid not.  It is still going to be a case-by-case basis -- sometimes the previous exception just doesn't add any value, and sometimes it does.

PEP3134 did add a lot of justification for your changes, though.
msg362572 - (view) Author: Vedran Čačić (veky) * Date: 2020-02-24 08:44
I guess is a perfect example of "previous exception not adding any value". :-) And I think it isn't a coincidence that it happens in "your" module.

The morale: we think about exceptions in different ways, and it's hard to say what's the right way. Maybe we should just change the wording to imply that __context__ exception didn't spontaneously "occur", but was explicitly "raised" by a handler.
msg362936 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2020-02-28 23:01
While I have no specific opinion on tarfile, I strongly disagree with a blanket prohibition on 'from None'.  Its proper use is to maintain a defined API and hide irrelevant implementation details.  Realistic toy example:

def f(x, y):
    "Return (x+y)/y for non-zery y."

    if y == 0:  # Body 1: look ahead.
        raise ValueError('y cannot be 0')
        return (x+y)/y
# or
    try:  # Body 2: leap first.
        return (x+y)/y
    except ZeroDivisionError:
        raise ValueError('y cannot be 0') from None

'from e' instead of 'from None' would just add distracting noise.
msg364462 - (view) Author: Ram Rachum (cool-RR) * Date: 2020-03-17 17:46
Ethan, got a verdict?
msg364473 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2020-03-17 20:05

Some of the changes are good, others should be `from None`.

The `from None` raises should include the ones where the new exception includes the text of the caught exception.

What is needed is text captured from the proposed changes to see if the previous exception was useful, and not easily gotten in other ways -- for example, the bz2 ImportError can be easily regenerated by `import bz2` at the REPL, so `from None` makes sense there.
msg364475 - (view) Author: Ram Rachum (cool-RR) * Date: 2020-03-17 20:12
I understand. I've closed my PR and I'll let someone else implement this ticket-- I don't want to be the reason that someone didn't get the information they wanted in an error report. Thanks anyway for your time.
msg382923 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2020-12-12 21:26
New changeset b5a6db9111562454617b6771b61f2734ea0420c9 by Ethan Furman in branch 'master':
bpo-39717: [tarfile] update nested exception raising (GH-23739)
Date User Action Args
2020-12-12 21:30:10ethan.furmansetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2020-12-12 21:26:52ethan.furmansetmessages: + msg382923
2020-12-11 02:42:25ethan.furmansetversions: + Python 3.10, - Python 3.9
2020-12-11 02:41:04ethan.furmansetkeywords: + patch
stage: patch review
pull_requests: + pull_request22597
2020-03-17 20:12:50cool-RRsetmessages: + msg364475
2020-03-17 20:05:09ethan.furmansetmessages: + msg364473
2020-03-17 17:46:27cool-RRsetmessages: + msg364462
2020-02-28 23:01:10terry.reedysetnosy: + terry.reedy
messages: + msg362936
2020-02-24 08:44:33vekysetmessages: + msg362572
2020-02-24 06:32:03ethan.furmansetmessages: + msg362569
2020-02-24 05:47:28cool-RRsetmessages: + msg362567
2020-02-23 21:50:54rhettingersetnosy: - rhettinger
2020-02-23 17:48:43ethan.furmansetmessages: + msg362526
2020-02-23 16:24:28vekysetnosy: + veky
messages: + msg362520
2020-02-22 07:18:38cool-RRsetmessages: + msg362451
2020-02-22 05:52:36martin.pantersetnosy: + martin.panter
messages: + msg362447
2020-02-22 02:02:36ethan.furmansetmessages: + msg362444
2020-02-21 23:59:53rhettingersetassignee: ethan.furman

messages: + msg362440
nosy: + ethan.furman
2020-02-21 20:46:11cool-RRsetmessages: + msg362431
2020-02-21 20:28:49rhettingersetmessages: + msg362427
2020-02-21 20:00:32rhettingersetnosy: + rhettinger
messages: + msg362425
2020-02-21 19:20:01cool-RRcreate