Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent behaviour between BytesParser.parse and Parser.parse #65675

Closed
ukaszKucharski mannequin opened this issue May 12, 2014 · 16 comments
Closed

Inconsistent behaviour between BytesParser.parse and Parser.parse #65675

ukaszKucharski mannequin opened this issue May 12, 2014 · 16 comments
Labels
topic-email type-bug An unexpected behavior, bug, or error

Comments

@ukaszKucharski
Copy link
Mannequin

ukaszKucharski mannequin commented May 12, 2014

BPO 21476
Nosy @warsaw, @bitdancer, @serhiy-storchaka, @vajrasky
Files
  • mail_test.py: Simple program that depicts the issue presented.
  • bytes_parser_dont_close_file.patch
  • bytes_parser_dont_close_file_v2.patch
  • bytes_parser_dont_close_file_v3.patch
  • bytes_parser_dont_close_file_v4.patch
  • bytes_parser_dont_close_file_v5.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2014-06-26.17:34:41.572>
    created_at = <Date 2014-05-12.02:58:05.642>
    labels = ['type-bug', 'expert-email']
    title = 'Inconsistent behaviour between BytesParser.parse and Parser.parse'
    updated_at = <Date 2014-06-26.17:34:41.571>
    user = 'https://bugs.python.org/ukaszKucharski'

    bugs.python.org fields:

    activity = <Date 2014-06-26.17:34:41.571>
    actor = 'r.david.murray'
    assignee = 'none'
    closed = True
    closed_date = <Date 2014-06-26.17:34:41.572>
    closer = 'r.david.murray'
    components = ['email']
    creation = <Date 2014-05-12.02:58:05.642>
    creator = '\xc5\x81ukasz.Kucharski'
    dependencies = []
    files = ['35220', '35413', '35418', '35420', '35424', '35537']
    hgrepos = []
    issue_num = 21476
    keywords = ['patch']
    message_count = 16.0
    messages = ['218309', '219395', '219398', '219406', '219439', '219449', '219452', '219453', '219461', '219469', '219471', '219489', '219841', '220038', '221623', '221624']
    nosy_count = 6.0
    nosy_names = ['barry', 'r.david.murray', 'python-dev', 'serhiy.storchaka', 'vajrasky', '\xc5\x81ukasz.Kucharski']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue21476'
    versions = ['Python 3.4', 'Python 3.5']

    @ukaszKucharski
    Copy link
    Mannequin Author

    ukaszKucharski mannequin commented May 12, 2014

    The behaviour of the method for both classes seem to be a little different. Executing Parser.parse(fp) does not close the file pointer passed while Executing BytesParser.parse does.

    I think this caused by fact that BytesParser.parse implementation is using with statement. Writing this

        fp = TextIOWrapper(file_pointer, encoding='ascii', errors='surrogateescape')
        with fp:
          return self.parser.parse(fp, headersonly)
        file_pointer.seek(0)

    The original file_pointer gets closed and the call to seek fails.

    I am not sure whether such behaviour is intended, and whether, the with behaves badly, or the TextIOWrapper, or the BytesParser, thus I am unable to suggest or provide a patch. But I think the behaviour should be consistent and/or documented.

    I attached a file that depicts the issue. The problem originated from SO:
    http://stackoverflow.com/questions/23599457/how-to-parse-an-email-in-python-without-closing-the-file
    I think it's a minor issue, but it did cause a code to fail with no apparent reason.

    @ukaszKucharski ukaszKucharski mannequin added topic-email type-bug An unexpected behavior, bug, or error labels May 12, 2014
    @bitdancer
    Copy link
    Member

    I think the code should be using 'detach' after passing the fp to parser.parse, instead of using 'with'.

    @vajrasky
    Copy link
    Mannequin

    vajrasky mannequin commented May 30, 2014

    Here is the simple patch based on David Murray's thought.

    @serhiy-storchaka
    Copy link
    Member

    Use try/finally to detach even if an exception is raised during parsing.

    @vajrasky
    Copy link
    Mannequin

    vajrasky mannequin commented May 31, 2014

    Thank you, Serhiy, for the review! Here is the updated patch.

    @serhiy-storchaka
    Copy link
    Member

    Could you please add a test with parse() raising an exception?

    Yet one nitpick. Instead of

            fp = openfile('msg_02.txt', 'rb')
            self.addCleanup(fp.close)
            ...

    you can write

            with openfile('msg_02.txt', 'rb') as fp:
                ...

    as in other tests.

    @vajrasky
    Copy link
    Mannequin

    vajrasky mannequin commented May 31, 2014

    Serhiy, here is the latest patch incorporating your request.

    @serhiy-storchaka
    Copy link
    Member

    Sorry, I meant to test parser with invalid message, so that parse() raises an exception, and file shouldn't be closed after this.

    @vajrasky
    Copy link
    Mannequin

    vajrasky mannequin commented May 31, 2014

    The Parse class does not throw exception if given invalid message:

    Assume /tmp/a.txt contains garbage, such as: "&&&&&"

    With this code:

        with open("/tmp/a.txt", "r") as fp:
            msg = email.parser.Parser().parse(fp) # does not throw exception
            print(msg) # => &&&&&
            msg['from'] # => None

    It is just you can not get useful information, such as msg['to'].

    @bitdancer
    Copy link
    Member

    Right, part of the parser contract is to not throw exceptions. Traditionally, a bug could result in an exception, but not an invalid message.

    However, using the new email policies, it is possible to *request* that it raise exceptions instead of registering defects. See policy.raise_on_defect.

    @serhiy-storchaka
    Copy link
    Member

    If the parser itself doesn't raise exceptions, we should test with input stream raising an exception.

    @vajrasky
    Copy link
    Mannequin

    vajrasky mannequin commented Jun 1, 2014

    Ok, here is the updated patch based on R. David Murray's help. Thanks!

    @bitdancer
    Copy link
    Member

    I believe there are msg_NN files that have defects. I'd rather use one of those in the exception test.

    @vajrasky
    Copy link
    Mannequin

    vajrasky mannequin commented Jun 8, 2014

    Here is the patch based on R. David Murray's nitpick.

    @PCManticore PCManticore mannequin changed the title Inconsitent behaviour between BytesParser.parse and Parser.parse Inconsistent behaviour between BytesParser.parse and Parser.parse Jun 25, 2014
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 26, 2014

    New changeset 0a16756dfcc0 by R David Murray in branch '3.4':
    bpo-21476: Unwrap fp in BytesParser so the file isn't unexpectedly closed.
    http://hg.python.org/cpython/rev/0a16756dfcc0

    New changeset a3ee325fd489 by R David Murray in branch 'default':
    Merge bpo-21476: Unwrap fp in BytesParser so the file isn't unexpectedly closed.
    http://hg.python.org/cpython/rev/a3ee325fd489

    @bitdancer
    Copy link
    Member

    Thanks, Vajrasky. And to the reviewers as well.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    topic-email type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants