tarfile.is_tarfile() and tarfile.open() when used with file object may cause tarfile operations to fail #88455

Closed
amateja mannequin opened this issue Jun 2, 2021 · 4 comments
Labels: 3.9 (only security fixes), 3.10 (only security fixes), 3.11 (bug and security fixes), stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)

Comments

amateja mannequin commented Jun 2, 2021

BPO 44289
Nosy @gvanrossum, @gustaebel, @Fidget-Spinner, @akulakov, @amateja
PRs
  • bpo-44289: Keep argument file object's current position in tarfile.is_tarfile #26488
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-06-02.15:09:05.223>
    labels = ['type-bug', 'library', '3.9', '3.10', '3.11']
    title = 'tarfile.is_tarfile() and tarfile.open() when used with file object may cause tarfile operations to fail'
    updated_at = <Date 2022-02-09.16:19:26.161>
    user = 'https://github.com/amateja'

    bugs.python.org fields:

    activity = <Date 2022-02-09.16:19:26.161>
    actor = 'gvanrossum'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2021-06-02.15:09:05.223>
    creator = 'mateja.and'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44289
    keywords = ['patch']
    message_count = 3.0
    messages = ['394922', '408062', '412919']
    nosy_count = 6.0
    nosy_names = ['gvanrossum', 'lars.gustaebel', 'python-dev', 'kj', 'andrei.avk', 'mateja.and']
    pr_nums = ['26488']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue44289'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    Since Python 3.9, tarfile.is_tarfile accepts not only paths but also files and file-like objects (bpo-29435).

    Checking whether a file or file-like object is a tar file changes the object's current position.

    Imagine a function that lists the names of all members of a tar archive, but first checks that the archive is valid. When its argument is a str or pathlib.Path, this is straightforward. If the argument is a file or file-like object, the current position must be reset after the check, or TarFile.getmembers() returns an empty list.

    import tarfile
    
    
    def list_tar(archive):
        if tarfile.is_tarfile(archive):
            kwargs = {'fileobj' if hasattr(archive, 'read') else 'name': archive}
            t = tarfile.open(**kwargs)
            return [member.name for member in t.getmembers()]
        return []
    
    
    if __name__ == '__main__':
        path = 'archive.tar.gz'
        print(list_tar(path))
        print(list_tar(open(path, 'rb')))

    ['spam.py', 'ham.py', 'bacon.py', 'eggs.py']
    []
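A possible workaround on affected versions, sketched here on the assumption that the file object is seekable and that Python 3.9 or later is in use: remember the position before calling is_tarfile() and restore it afterwards. The names make_tar_bytes and list_tar_fileobj are made up for this illustration and are not part of tarfile.

```python
import io
import tarfile


def make_tar_bytes(names):
    """Build a small uncompressed tar archive in memory (illustration helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            data = name.encode()
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_tar_fileobj(fileobj):
    """List member names of a seekable file object holding a tar archive."""
    start = fileobj.tell()           # remember where the caller left the stream
    try:
        ok = tarfile.is_tarfile(fileobj)
    finally:
        fileobj.seek(start)          # undo any movement caused by the check
    if not ok:
        return []
    with tarfile.open(fileobj=fileobj) as tar:
        return tar.getnames()
```

Restoring the position in a finally block keeps the stream usable even if the check raises. This does not help for non-seekable streams such as pipes or sockets.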

    @amateja amateja mannequin added 3.9 only security fixes 3.10 only security fixes 3.11 bug and security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jun 2, 2021

    akulakov commented Dec 9, 2021

    This affects more use cases than just is_tarfile() followed by getmembers().

    The root cause is that is_tarfile() calls open(), which moves the file object's position. Calling open() two or more times on the same file object causes the same problem.

    In addition to getmembers(), extracting the archive will also fail silently, and possibly other operations as well.

    I've suggested a different fix in the comment on the PR:
    #26488 (comment)
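To illustrate the repeated-open() case: a minimal sketch, assuming a seekable file object, in which the caller seeks back to a known offset before every tarfile.open() call. The helpers build_archive and names_from are hypothetical names used only for this example, and this is not the fix proposed on the PR.

```python
import io
import tarfile


def build_archive(names):
    """Create an uncompressed tar archive in memory (illustration helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            payload = name.encode()
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def names_from(fileobj, start=0):
    """Seek to `start`, then open the archive and return its member names."""
    fileobj.seek(start)              # do not rely on wherever a previous open() stopped
    with tarfile.open(fileobj=fileobj) as tar:
        return tar.getnames()
```

Because names_from() always rewinds first, calling it several times on the same file object returns the same member list each time. Closing the TarFile does not close a file object that the caller passed in, so the stream stays usable between calls.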

    @akulakov changed the title from "tarfile.is_tarfile() modifies file object's current position" to "tarfile.is_tarfile() and tarfile.open() when used with file object may cause tarfile operations to fail" on Dec 9, 2021
    @gvanrossum
    Member

    New changeset 128ab09 by Andrzej Mateja in branch 'main':
    bpo-44289: Keep argument file object's current position in tarfile.is_tarfile (GH-26488)
    128ab09

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @furkanonder
    Sponsor Contributor

    @gvanrossum The issue seems to be resolved. I think we can close it.
