classification
Title: tarfile should expose supported formats
Type: enhancement
Stage: patch review
Components: Library (Lib)
Versions: Python 3.5
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: berker.peksag, docs@python, eric.araujo, lars.gustaebel, nadeem.vawda
Priority: normal
Keywords: patch

Created on 2012-02-14 16:27 by eric.araujo, last changed 2022-04-11 14:57 by admin.

Files
File name Uploaded
add-tarfile.formats.diff eric.araujo, 2012-02-14 16:29
add-tarfile.compression_formats.diff eric.araujo, 2012-02-20 02:19
issue14013.diff berker.peksag, 2014-06-23 19:36
Messages (4)
msg153350 - Author: Éric Araujo (eric.araujo) (Python committer) Date: 2012-02-14 16:27
shutil contains high-level functions to create a zipfile or a tarball.  When a new format is added to the tarfile module, then shutil needs to be updated manually.  If tarfile exposed the names of the compressors it supports, then shutil could just automatically support everything that tarfile supports instead of having to re-do import dances for optional modules (bz2, lzma, zlib) and also duplicate formats in its doc.

This may also be useful for other code wanting to do some introspection.

Attached patch implements tarfile.formats, a list of strings (I thought about using a frozenset but then followed the precedent set by the 3.3 crypt module).  Tests and docs not updated, I wanted to get Lars’ approval on the principle first.

One could argue that this is not needed: compression modules are not added often; updating shutil after updating tarfile is not hard; it is not that useful to have access to the list of supported formats.
msg153650 - Author: Lars Gustäbel (lars.gustaebel) (Python committer) Date: 2012-02-18 18:10
I think this is a reasonable proposal. I think it is good style to let tarfile figure out which supported compression methods are available instead of shutil or the user. So far I have no objections.

Following 3.3's crypt module, I think the name `methods' is superior to `formats' (maybe `compression_methods' is even better). Also, crypt's concept of a sorted list from stronger to weaker could also make sense here: ["xz", "bz2", "gz"]. Why not?
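A minimal sketch of the ordered list suggested above, assuming availability is detected by trying to import each optional module. The function name `available_compression_formats` is hypothetical and is not part of tarfile or of the attached patches.

```python
import importlib

# (tarfile compression name, optional module that implements it),
# listed from stronger to weaker compression as suggested in the thread.
_CANDIDATES = [("xz", "lzma"), ("bz2", "bz2"), ("gz", "zlib")]


def available_compression_formats():
    """Return the compression names whose backing module can be imported."""
    formats = []
    for name, module_name in _CANDIDATES:
        try:
            importlib.import_module(module_name)
        except ImportError:
            # The optional module was not built; skip this format.
            continue
        formats.append(name)
    return formats


print(available_compression_formats())
```

The result depends on how the interpreter was built, so the printed list can be shorter on a Python compiled without one of the optional modules.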
msg153758 - Author: Éric Araujo (eric.araujo) (Python committer) Date: 2012-02-20 02:19
Thanks for the quick reply.

> I think it is good style to let tarfile figure out which supported compression methods are
> available instead of shutil or the user.
Note that shutil will not be wholly transparent when I’m done with the refactoring, as it will be able to translate 'xztar', 'bztar' and 'gztar' to tarfile mode strings, but will need to have a special case to morph 'bztar' to 'bz2'.  It will be a small ugliness.
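The translation described above might look roughly like the following. The helper name `tar_mode_for` and its `compression_formats` parameter are assumptions made for illustration; this is not the code from the attached patch or from shutil.

```python
def tar_mode_for(archive_format, compression_formats=("xz", "bz2", "gz")):
    """Translate a shutil-style name such as 'gztar' into a tarfile write mode."""
    if archive_format == "tar":
        # Uncompressed tarball.
        return "w"
    if not archive_format.endswith("tar"):
        raise ValueError("not a tar-based format: %r" % (archive_format,))
    prefix = archive_format[:-len("tar")]
    # The one special case: shutil says 'bztar', tarfile says 'bz2'.
    compression = "bz2" if prefix == "bz" else prefix
    if compression not in compression_formats:
        raise ValueError("unsupported compression: %r" % (compression,))
    return "w:" + compression


for name in ("tar", "gztar", "bztar", "xztar"):
    print(name, "->", tar_mode_for(name))
```

Passing the list of supported compressors as a parameter keeps the sketch independent of any particular attribute name on tarfile.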

(There will also be ugliness in packaging: even if I make it transparently support all formats that shutil supports, I’ll need a bit of duplication because packaging has a preferred format per platform.  Well.)

> Following 3.3's crypt module, I think the name `methods' is superior to `formats' (maybe
> `compression_methods' is even better).
Note that crypt’s methods really are instances of something called Method.  hashlib has algorithms_guaranteed and algorithms_available since 3.2 and shutil uses get_archive_formats and get_unpack_formats.  I went for tarfile.compression_formats.
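For comparison, the existing introspection points named above can be queried as follows. This only shows the hashlib and shutil precedents; it does not use the proposed tarfile attribute, and the output varies by build.

```python
import hashlib
import shutil

# hashlib exposes two sets of algorithm names.
print(sorted(hashlib.algorithms_guaranteed))
print(sorted(hashlib.algorithms_available))

# shutil exposes functions that return lists of tuples.
for name, description in shutil.get_archive_formats():
    print(name, "-", description)

for name, extensions, description in shutil.get_unpack_formats():
    print(name, extensions, "-", description)
```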

> Also, crypt's concept of a sorted list from stronger to weaker could also make sense here:
Sure.  In my first patch I put gz first as it should be universally available, and then put xz before bz2 as I think bz2 is quickly losing ground to xz (even GNU and Debian are switching to xz for their archives).  The attached patch follows your idea.

BTW I will gladly wait for commits related to the other bugs (misc bugs and misc doc edits) and refresh my patch then.
msg221374 - Author: Berker Peksag (berker.peksag) (Python committer) Date: 2014-06-23 19:36
I've updated Éric's patch. Minor changes:
- Updated versionadded directive
- A couple of cosmetic changes (e.g. removed brackets in the list comprehension)
History
Date User Action Args
2022-04-11 14:57:26 admin set github: 58221
2014-06-23 19:36:25 berker.peksag set files: + issue14013.diff
    assignee: docs@python ->
    components: - Documentation
    versions: + Python 3.5, - Python 3.3
    nosy: + berker.peksag
    messages: + msg221374
2012-02-23 01:22:05 eric.araujo unlink issue5411 dependencies
2012-02-20 02:19:39 eric.araujo set files: + add-tarfile.compression_formats.diff
    messages: + msg153758
2012-02-18 18:10:03 lars.gustaebel set messages: + msg153650
2012-02-14 16:34:22 eric.araujo link issue5411 dependencies
2012-02-14 16:29:16 eric.araujo set files: + add-tarfile.formats.diff
    keywords: + patch
2012-02-14 16:27:26 eric.araujo create