
classification
Title: tarfile.is_tarfile() and tarfile.open() when used with file object may cause tarfile operations to fail
Type: behavior
Stage: patch review
Components: Library (Lib)
Versions: Python 3.11, Python 3.10, Python 3.9

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: andrei.avk, gvanrossum, kj, lars.gustaebel, mateja.and, python-dev
Priority: normal
Keywords: patch

Created on 2021-06-02 15:09 by mateja.and, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 26488 merged python-dev, 2021-06-02 15:23
Messages (3)
msg394922 - Author: Andrzej Mateja (mateja.and) Date: 2021-06-02 15:09
Since Python 3.9 tarfile.is_tarfile accepts not only paths but also files and file-like objects (bpo-29435).

Checking whether a file or file-like object is a tar file changes the file object's current position.

Imagine a function that lists the names of all members of a tar archive but first checks that it is a valid tar archive. When its argument is a str or pathlib.Path, this is straightforward. If the argument is a file or file-like object, the current position must be reset after the check, or TarFile.getmembers() returns an empty list.

import tarfile


def list_tar(archive):
    if tarfile.is_tarfile(archive):
        kwargs = {'fileobj' if hasattr(archive, 'read') else 'name': archive}
        t = tarfile.open(**kwargs)
        return [member.name for member in t.getmembers()]
    return []


if __name__ == '__main__':
    path = 'archive.tar.gz'
    print(list_tar(path))
    print(list_tar(open(path, 'rb')))


['spam.py', 'ham.py', 'bacon.py', 'eggs.py']
[]
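
A possible workaround on affected versions, sketched under the assumption that the file object is seekable: remember the position before calling is_tarfile() and restore it before opening the archive. The names list_tar_safe and make_tar are hypothetical helpers introduced only for this illustration; they are not part of tarfile.

```python
import io
import tarfile


def make_tar(names):
    """Build a small uncompressed tar archive in memory (illustration only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as t:
        for name in names:
            data = b"print('hi')\n"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            t.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


def list_tar_safe(archive):
    """List member names; restore the position of a seekable file object."""
    if not hasattr(archive, "read"):
        # Path-like argument: no shared file position to worry about.
        if not tarfile.is_tarfile(archive):
            return []
        with tarfile.open(name=archive) as t:
            return t.getnames()
    pos = archive.tell()
    try:
        ok = tarfile.is_tarfile(archive)
    finally:
        # Undo whatever reading the check did (assumes a seekable object).
        archive.seek(pos)
    if not ok:
        return []
    with tarfile.open(fileobj=archive) as t:
        return t.getnames()
```

Restoring the position in a finally block keeps the caller's file object usable even if the check raises. Non-seekable streams would need a different approach, such as buffering the data first.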
msg408062 - Author: Andrei Kulakov (andrei.avk) (Python triager) Date: 2021-12-09 01:23
This affects more use cases than just is_tarfile() and getmembers() results.

is_tarfile() calls tarfile.open(), which is the root cause of the issue. Calling open() two or more times on the same file object causes the same issue.

In addition to getmembers() returning an empty list, extracting the archive also fails silently, and other operations may be affected as well.

I've suggested a different fix in the comment on the PR:
https://github.com/python/cpython/pull/26488#issuecomment-989367707
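
The repeated-open case can be illustrated with a short sketch. It assumes a seekable file object and rewinds explicitly before each tarfile.open() call; names_after_rewind is a hypothetical helper, and this is not the fix proposed on the PR.

```python
import io
import tarfile


def names_after_rewind(fileobj, start=0):
    """Seek to the archive start, then open it and list member names."""
    fileobj.seek(start)
    with tarfile.open(fileobj=fileobj) as t:
        return t.getnames()


# Build a tiny in-memory archive for the demonstration.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as t:
    payload = b"print('spam')\n"
    info = tarfile.TarInfo("spam.py")
    info.size = len(payload)
    t.addfile(info, io.BytesIO(payload))

first = names_after_rewind(buf)
# The first open leaves the position past the archive data, so the
# helper rewinds before the second open reads from the same object.
second = names_after_rewind(buf)
```

The start parameter defaults to 0, which assumes the archive begins at the start of the stream; a caller whose archive is embedded at another offset would pass that offset instead.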
msg412919 - Author: Guido van Rossum (gvanrossum) (Python committer) Date: 2022-02-09 16:19
New changeset 128ab092cad984b73a117f58fa0e9b4105051a04 by Andrzej Mateja in branch 'main':
bpo-44289: Keep argument file object's current position in tarfile.is_tarfile (GH-26488)
https://github.com/python/cpython/commit/128ab092cad984b73a117f58fa0e9b4105051a04
History
Date User Action Args
2022-04-11 14:59:46 admin set github: 88455
2022-02-09 16:19:26 gvanrossum set nosy: + gvanrossum; messages: + msg412919
2021-12-11 03:56:45 andrei.avk set nosy: + lars.gustaebel
2021-12-09 01:23:47 andrei.avk set nosy: + kj
2021-12-09 01:23:31 andrei.avk set nosy: + andrei.avk; messages: + msg408062; title: tarfile.is_tarfile() modifies file object's current position -> tarfile.is_tarfile() and tarfile.open() when used with file object may cause tarfile operations to fail
2021-06-02 15:23:44 python-dev set keywords: + patch; nosy: + python-dev; pull_requests: + pull_request25084; stage: patch review
2021-06-02 15:09:05 mateja.and create