Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix exception causes in tarfile module #83898

Closed
cool-RR mannequin opened this issue Feb 21, 2020 · 17 comments
Closed

Fix exception causes in tarfile module #83898

cool-RR mannequin opened this issue Feb 21, 2020 · 17 comments
Assignees
Labels
3.10 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@cool-RR
Copy link
Mannequin

cool-RR mannequin commented Feb 21, 2020

BPO 39717
Nosy @terryjreedy, @cool-RR, @ethanfurman, @ambv, @vadmium, @vedgar
PRs
  • bpo-39717: Fix exception causes in tarfile module #18595
  • bpo-39717: [tarfile] update nested exception raising #23739
  • [3.9] bpo-39039: tarfile raises descriptive exception from zlib.error (GH-27766) #28614
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ethanfurman'
    closed_at = <Date 2020-12-12.21:30:10.142>
    created_at = <Date 2020-02-21.19:20:01.857>
    labels = ['type-bug', 'library', '3.10']
    title = 'Fix exception causes in tarfile module'
    updated_at = <Date 2021-09-29.09:42:37.561>
    user = 'https://github.com/cool-RR'

    bugs.python.org fields:

    activity = <Date 2021-09-29.09:42:37.561>
    actor = 'lukasz.langa'
    assignee = 'ethan.furman'
    closed = True
    closed_date = <Date 2020-12-12.21:30:10.142>
    closer = 'ethan.furman'
    components = ['Library (Lib)']
    creation = <Date 2020-02-21.19:20:01.857>
    creator = 'cool-RR'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 39717
    keywords = ['patch']
    message_count = 17.0
    messages = ['362425', '362427', '362431', '362440', '362444', '362447', '362451', '362520', '362526', '362567', '362569', '362572', '362936', '364462', '364473', '364475', '382923']
    nosy_count = 6.0
    nosy_names = ['terry.reedy', 'cool-RR', 'ethan.furman', 'lukasz.langa', 'martin.panter', 'veky']
    pr_nums = ['18595', '23739', '28614']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue39717'
    versions = ['Python 3.10']

    @cool-RR cool-RR mannequin added 3.9 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Feb 21, 2020
    @rhettinger
    Copy link
    Contributor

    What do you think it is necessary to switch from implicit chaining to explicit chaining?

    If anyone is currently relying on __context__ vs __cause__, this patch will break their code.

    In a traceback, the only visible difference is in the text between the exceptions:

    • During handling of the above exception, another exception occurred:
      + The above exception was the direct cause of the following exception:

    While we haven't been 100% consistent about this, the norm has been to either use implicit chaining or use "from None" to turn-off chaining. The "from e" approach can be used to alter the explicit chain, perhaps skipping over one or more exceptions, but that isn't the case here.

    @rhettinger
    Copy link
    Contributor

    If this were new code, there would be a better case for the direct-cause style even though it is more cluttered and a bit slower.

    I'm only questioning whether it make sense to change it in already deployed code, possibly risking breakage. AFAICT, nothing is currently broken and the minor change in traceback wording would not confer a real benefit.

    @cool-RR
    Copy link
    Mannequin Author

    cool-RR mannequin commented Feb 21, 2020

    What do you think it is necessary to switch from implicit chaining to explicit chaining?

    I didn't say it's necessary, but I think it's beneficial. I find that message between exceptions useful when I'm debugging and I'd like to keep it accurate in as many places as I can.

    I think that the more accurate it is, the more people will learn to trust that it has meaning and understand it.

    If anyone is currently relying on __context__ vs __cause__, this patch will break their code.

    I think that the standard you're setting for backward compatibility is unreasonably high.

    The only code I can think of that inspects __context__ and __cause__ is code that formats exceptions to be shown to the user, such as Django's debug page, Wing IDE's Exceptions pane, whatever's going on in PyCharm etc. That kind of code is responsible for checking for both __context__ and __cause__ and showing the correct message.

    If you've seen code outside of the category above that uses __context__, I'd be curious to see it.

    Fortunately, this point is moot since using raise foo from bar sets the __context__ = __cause__ = bar, so the __context__ will not be changed.

    @rhettinger
    Copy link
    Contributor

    Ethan, would you make the call on this? My recommendation is to close because we usually don't churn APIs unless there is a demonstrable benefit. Also, the current code reflects the dominant practice in the standard library which is both faster and more concise.

    @ethanfurman
    Copy link
    Member

    I know we are not in the habit of making large-scale changes to take advantage of new features and enhancements, but I think this may be one of the few exceptions to the rule, and it has to do with what the text between the two tracebacks means:

    -----
    "During handling of the above exception, another exception occurred:"

    -> there is a bug in the exception handler.
    -----

    -----
    "The above exception was the direct cause of the following exception:"

    -> the first error is the cause and the second error is what we want the user to pay attention to.
    -----

    Whether or not the stdlib is buggy is a pretty big distinction.

    If this is a change worth making the follow-on question is should we be raising from the previous exception, or from None? How much value is the previous exception adding?

    In cases where the new exception contains the text of the old one I think from None should be preferred; in cases where import errors get translated into CompressionErrors those import errors are sometimes useful and sometimes just noise -- perhaps including str(e) in the new exception is the best way there as well.

    I'll need to do some research before I decide.

    @vadmium
    Copy link
    Member

    vadmium commented Feb 22, 2020

    Please don’t use “from None” in library code. It hides exceptions raised by the calling application that would help debugging. E.g. <https://bugs.python.org/issue30097#msg293185\>

    @cool-RR
    Copy link
    Mannequin Author

    cool-RR mannequin commented Feb 22, 2020

    I'm also strongly against using from None. When I'm debugging, I'm like a man who got lost in the desert and is about to die of thirst. Any possible insight into what happened is like an oasis, even if there are just a few drops of water there.

    Also, some tools like Django and Sentry show you all the local variables for your stacktraces, which is a godsend. These often have important information that sheds light on what went wrong, and if you remove the traceback they'll be gone.

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Feb 23, 2020

    Oh yes, this has bugged me often. Please fix it somehow.

    Yes, using "from None" is probably the wrong way to go. Often we need more info in tracebacks, not less. But the "During handling" message is very misleading. Same as Ethan, many times I interpreted it as "something went wrong in the handler" when in fact the handler was doing exactly what it was supposed to do.

        except WhateverException as e:
            raise CustomException from e

    might be too much to write every time, and while I understand that we cannot simply redefine raise under except to do that implicitly, maybe there is some middle solution. On Python-ideas, recently I saw an idea

        except WhateverException:
            raise as CustomException

    which I like a lot, but I don't know how hard it is to implement.

    ---

    If everything above seems like too much, at least we should consider changing the wording of the message. If it said

    +"In the handler of the above exception, another exception was raised:"
    -"During handling of the above exception, another exception occurred:"

    I'd be much happier. And it would more often suggest the right thing.

    @ethanfurman
    Copy link
    Member

    Looking back at PEP-3134 [1], the

        raise NewException from exc

    is used as one of the examples.

    [1] https://www.python.org/dev/peps/pep-3134/#id37

    @cool-RR
    Copy link
    Mannequin Author

    cool-RR mannequin commented Feb 24, 2020

    Ethan, did we successfully convince you not to use from None?

    @ethanfurman
    Copy link
    Member

    `Fraid not. It is still going to be a case-by-case basis -- sometimes the previous exception just doesn't add any value, and sometimes it does.

    PEP-3134 did add a lot of justification for your changes, though.

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Feb 24, 2020

    I guess https://bugs.python.org/issue39728 is a perfect example of "previous exception not adding any value". :-) And I think it isn't a coincidence that it happens in "your" module.

    The morale: we think about exceptions in different ways, and it's hard to say what's the right way. Maybe we should just change the wording to imply that __context__ exception didn't spontaneously "occur", but was explicitly "raised" by a handler.

    @terryjreedy
    Copy link
    Member

    While I have no specific opinion on tarfile, I strongly disagree with a blanket prohibition on 'from None'. Its proper use is to maintain a defined API and hide irrelevant implementation details. Realistic toy example:

    def f(x, y):
        "Return (x+y)/y for non-zery y."
    
        if y == 0:  # Body 1: look ahead.
            raise ValueError('y cannot be 0')
        else:
            return (x+y)/y
    # or
        try:  # Body 2: leap first.
            return (x+y)/y
        except ZeroDivisionError:
            raise ValueError('y cannot be 0') from None

    'from e' instead of 'from None' would just add distracting noise.

    @cool-RR
    Copy link
    Mannequin Author

    cool-RR mannequin commented Mar 17, 2020

    Ethan, got a verdict?

    @ethanfurman
    Copy link
    Member

    Yes.

    Some of the changes are good, others should be from None.

    The from None raises should include the ones where the new exception includes the text of the caught exception.

    What is needed is text captured from the proposed changes to see if the previous exception was useful, and not easily gotten in other ways -- for example, the bz2 ImportError can be easily regenerated by import bz2 at the REPL, so from None makes sense there.

    @cool-RR
    Copy link
    Mannequin Author

    cool-RR mannequin commented Mar 17, 2020

    I understand. I've closed my PR and I'll let someone else implement this ticket-- I don't want to be the reason that someone didn't get the information they wanted in an error report. Thanks anyway for your time.

    @ethanfurman ethanfurman added 3.10 only security fixes and removed 3.9 only security fixes labels Dec 11, 2020
    @ethanfurman
    Copy link
    Member

    New changeset b5a6db9 by Ethan Furman in branch 'master':
    bpo-39717: [tarfile] update nested exception raising (GH-23739)
    b5a6db9

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants