This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Add tarinfo.Path
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: FFY00, barneygale, ethan.furman, jaraco
Priority: normal Keywords:

Created on 2021-10-28 15:59 by FFY00, last changed 2022-04-11 14:59 by admin.

Messages (6)
msg405194 - (view) Author: Filipe Laíns (FFY00) * (Python triager) Date: 2021-10-28 15:59
It would be helpful to have a pathlib-compatible object in tarfile, similarly to zipfile.Path.
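For reference, here is a short sketch of what zipfile.Path already offers today, which is roughly the behaviour being requested for tar archives. The archive contents and file names are made up for illustration; only `zipfile.ZipFile` and `zipfile.Path` from the standard library are used.

```python
import io
import zipfile

# Build a small archive in memory so the example is self-contained.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("docs/readme.txt", "hello")
    zf.writestr("setup.py", "")

# zipfile.Path wraps an open ZipFile and exposes pathlib-like navigation.
root = zipfile.Path(zipfile.ZipFile(buf))

# Top-level entries; "docs" is an implied directory with no entry of its own.
names = sorted(p.name for p in root.iterdir())

# The "/" operator and read_text() work much like they do on pathlib.Path.
readme = root / "docs" / "readme.txt"
text = readme.read_text(encoding="utf-8")
```

A tar equivalent would presumably aim for the same navigation style (`iterdir()`, `/`, `read_text()`), subject to the constraints discussed in the replies.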
msg405206 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-10-28 17:23
I vaguely recall exploring this concept and finding that tarfiles don’t supply the requisite interface because they’re not random access. I’m only 10% confident in that recollection, so worth exploring.
msg405210 - (view) Author: Filipe Laíns (FFY00) * (Python triager) Date: 2021-10-28 18:08
That is good to know. This isn't very high on my priority list, but I will try to explore when I have some time.
msg409479 - (view) Author: Barney Gale (barneygale) * Date: 2022-01-01 21:34
It's possible to do, but will be a little slow due to the nature of tar files. They're a big linked list of files, so you need to do a bunch of reads/seeks from the start to the end to enumerate all files.
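A small sketch of the point above, assuming an uncompressed in-memory archive: each member's header sits directly after the previous member's padded data, so the only way to list everything is to walk the archive front to back. The `build_tar` helper and the file names are hypothetical; `TarInfo.offset` and `TarInfo.offset_data` are existing tarfile attributes.

```python
import io
import tarfile


def build_tar(files):
    """Return an in-memory, uncompressed tar archive for a {name: bytes} mapping."""
    buf = io.BytesIO()
    # USTAR is chosen here only to keep every header at a fixed 512 bytes.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


archive = build_tar({"a.txt": b"x" * 10, "b.txt": b"y" * 600, "c.txt": b"z"})

with tarfile.open(fileobj=archive, mode="r:") as tf:
    # There is no central index: iterating reads one header, skips that
    # member's data blocks, then reads the next header, and so on.
    offsets = [(m.name, m.offset, m.offset_data) for m in tf]
```

Each tuple holds the member name, the position of its header, and the position of its data. Finding the last member therefore means visiting every earlier header first, and with a compressed archive the skipped data also has to be decompressed.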

I'd ask that we try to get issue24132 solved first. That would let us write:

    # tarfile.py
    import pathlib

    # pathlib.AbstractPath does not exist yet; it is what issue24132 would add.
    class Path(pathlib.AbstractPath):
        def iterdir(self):
            ...

        def stat(self):
            ...
We'd fill in a smallish number of abstract methods to get a full `Path`-compatible class with methods such as `read_text()` and `is_symlink()`.
msg409481 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2022-01-01 23:19
I'd recommend not blocking on issue24132. It's not obvious to me that subclassing would be valuable. It depends on how it's implemented, but in my experience, zipfile.Path doesn't and cannot implement the full interface of pathlib.Path. Instead, zipfile.Path attempts to implement a protocol. At the time, the protocol was undefined, but now there exists importlib.resources.abc.Traversable (https://docs.python.org/3/library/importlib.html#importlib.abc.Traversable), the interface needed by importlib.resources. I'd honestly just create a standalone class, see if it can implement Traversable, and only then consider whether it should implement a more complicated interface (such as something with symlink support, or perhaps even later subclassing from pathlib.Path).
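As a rough illustration of the standalone-class approach, the sketch below implements a subset of the Traversable methods (`name`, `is_dir`, `is_file`, `iterdir`, `joinpath`, `/`, `read_bytes`, `read_text`) on top of an open `TarFile`. The class name `TarPath` and its internals are hypothetical and not part of tarfile; `open()` and symlink handling are deliberately left out.

```python
import io
import posixpath
import tarfile


class TarPath:
    """Hypothetical standalone path over an open TarFile (illustration only)."""

    def __init__(self, tar, at=""):
        self._tar = tar
        self._at = at.strip("/")

    @property
    def name(self):
        return posixpath.basename(self._at)

    def _names(self):
        # getnames() needs a full scan of the archive; TarFile caches the result.
        return [n.strip("/") for n in self._tar.getnames()]

    def is_file(self):
        try:
            return self._tar.getmember(self._at).isfile()
        except KeyError:
            return False

    def is_dir(self):
        if not self._at:
            return True  # the archive root
        try:
            if self._tar.getmember(self._at).isdir():
                return True
        except KeyError:
            pass
        # Directories may be implied by members stored below them.
        return any(n.startswith(self._at + "/") for n in self._names())

    def iterdir(self):
        prefix = self._at + "/" if self._at else ""
        seen = set()
        for n in self._names():
            if n.startswith(prefix) and n != self._at:
                child = n[len(prefix):].split("/", 1)[0]
                if child and child not in seen:
                    seen.add(child)
                    yield TarPath(self._tar, prefix + child)

    def joinpath(self, *parts):
        return TarPath(self._tar, posixpath.join(self._at, *parts))

    __truediv__ = joinpath

    def read_bytes(self):
        f = self._tar.extractfile(self._at)  # KeyError if the member is missing
        if f is None:
            raise IsADirectoryError(self._at)
        with f:
            return f.read()

    def read_text(self, encoding="utf-8"):
        return self.read_bytes().decode(encoding)


# Build a tiny archive in memory and navigate it.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tf:
    for member_name, data in {"pkg/data.txt": b"hello", "top.txt": b""}.items():
        info = tarfile.TarInfo(member_name)
        info.size = len(data)
        tf.addfile(info, io.BytesIO(data))
buf.seek(0)

tar = tarfile.open(fileobj=buf, mode="r:")
root = TarPath(tar)
children = sorted(p.name for p in root.iterdir())
text = (root / "pkg" / "data.txt").read_text()
```

Every lookup here goes through the member list, so the first call pays for a full pass over the archive, which matches the performance concern raised earlier in the thread.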
msg409482 - (view) Author: Barney Gale (barneygale) * Date: 2022-01-02 00:01
If you're only aiming for Traversable compatibility, sure.

The original bug description asks for something that's pathlib-compatible and similar to zipfile.Path, which goes beyond the Traversable interface in attempting to emulate pathlib.Path.

The pathlib.Path interface is a good one - I see no reason it can't apply to zip and tar archives in full. Methods of Path objects already raise NotImplementedError if an operation isn't supported (e.g. creating symlinks).

Some prototyping from a couple years back, including a tar path implementation: https://github.com/barneygale/pathlab/tree/master/pathlab
History
Date                 User          Action  Args
2022-04-11 14:59:51  admin         set     github: 89812
2022-01-02 00:01:26  barneygale    set     messages: + msg409482
2022-01-01 23:19:11  jaraco        set     messages: + msg409481
2022-01-01 21:34:37  barneygale    set     nosy: + barneygale; messages: + msg409479
2021-10-28 20:20:39  ethan.furman  set     nosy: + ethan.furman
2021-10-28 18:08:11  FFY00         set     messages: + msg405210
2021-10-28 17:23:48  jaraco        set     messages: + msg405206
2021-10-28 15:59:21  FFY00         create