classification
Title: [Feature Request]: Add zstd support in tarfile
Type: enhancement
Stage:
Components: Library (Lib)
Versions: Python 3.10
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Anatol Pomozov, Jeffrey.Kintscher, Jerrod Frost, daniel.ugra, erlendaasland, evan0greenup, lars.gustaebel, lilydjwg, malin, serhiy.storchaka, wicher
Priority: normal
Keywords:

Created on 2019-05-30 03:42 by evan0greenup, last changed 2022-04-11 14:59 by admin.

Messages (7)
msg343945 - Author: Evan Greenup (evan0greenup) Date: 2019-05-30 03:42
Zstandard is getting more and more popular. It would be awesome if tarfile supported this compression format for .tar.zst files.
msg356498 - Author: Jerrod Frost (Jerrod Frost) Date: 2019-11-12 22:31
Curious about this as well.
msg373583 - Author: Anatol Pomozov (Anatol Pomozov) Date: 2020-07-13 04:24
Is there any progress on this feature?

Arch Linux uses Python's tarfile library in its toolset. Arch devs are looking to add zstd support to the toolset, but that requires this feature to be implemented.
msg373634 - Author: Ma Lin (malin) * Date: 2020-07-14 10:14
> Add zstd support in tarfile

This requires the stdlib to contain a Zstandard module.

You can ask in the Ideas forum:
https://discuss.python.org/c/ideas
msg374123 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-07-23 06:14
The tarfile module supports arbitrary compressions by using the stream mode. You only need to use a third-party library which provides zstd support.
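
The stream-mode approach described above can be sketched roughly as follows. This is a minimal illustration, not the tarfile maintainers' recommended recipe: the helper names `pack` and `unpack` are hypothetical, and the stdlib's `lzma.LZMAFile` is used as a stand-in for a zstd file object so the example runs without third-party packages. With the third-party `zstandard` package, the stand-ins would presumably be replaced by the objects returned from `zstandard.ZstdCompressor().stream_writer(f)` and `zstandard.ZstdDecompressor().stream_reader(f)`; that substitution is an assumption and is not exercised here.

```python
import io
import lzma
import tarfile


def pack(files, sink):
    """Write `files` (name -> bytes) as a tar stream into `sink`.

    `sink` is any writable binary file object that compresses what it
    receives. Mode "w|" tells tarfile to emit a plain, forward-only tar
    stream and leave compression to the file object.
    """
    with tarfile.open(fileobj=sink, mode="w|") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))


def unpack(source):
    """Read a tar stream from `source` and return {name: bytes}.

    `source` is any readable binary file object that yields decompressed
    bytes. Mode "r|" reads the archive strictly front to back, so each
    member is read while it is the current one.
    """
    contents = {}
    with tarfile.open(fileobj=source, mode="r|") as tar:
        for member in tar:
            if member.isfile():
                contents[member.name] = tar.extractfile(member).read()
    return contents


# Round trip through an in-memory buffer. LZMAFile is only a stand-in
# (assumption): a zstd-capable file object would take its place.
buf = io.BytesIO()
with lzma.LZMAFile(buf, "wb") as sink:
    pack({"hello.txt": b"hello zstd"}, sink)

buf.seek(0)
with lzma.LZMAFile(buf, "rb") as source:
    result = unpack(source)
```

Because tarfile only calls `write()` or `read()` on the supplied object in stream mode, the compression format never has to be known to tarfile itself.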

Recent versions of the tar utility have options that explicitly support newer compression formats: --lzip, --lzma, --lzop, --zstd, so corresponding modes can be added to the tarfile module. But that would require including support for these compression formats in the stdlib. It should be discussed on the Python-ideas mailing list.

https://mail.python.org/mailman3/lists/python-ideas.python.org/
msg375472 - Author: Ma Lin (malin) * Date: 2020-08-15 15:51
There are two zstd modules on PyPI:

    https://pypi.org/project/zstd/
    https://pypi.org/project/zstandard/
    
The first one is too simple.

The second one is powerful, but has too many APIs:
    ZstdCompressorIterator
    ZstdDecompressorIterator
    ZstdCompressionReader
    ZstdCompressionWriter
    ZstdCompressionChunkerIterator
    (multi-thread compression)

IMO these are not necessary for stdlib.

In addition, a few things would need to be added, such as a `max_length` parameter and a `ZstdFile` class that can be integrated with the tarfile module. This is not much work.
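
For readers unfamiliar with `max_length`: the existing stdlib decompressors already have a parameter of that name, and a zstd module would presumably mirror it. The sketch below uses `lzma.LZMADecompressor` purely as an illustration of the behavior (an assumption about what the proposed zstd API would look like, not a description of it): each call returns at most `max_length` bytes, and the remaining output is drained with further calls.

```python
import lzma

original = b"a" * 100_000
compressed = lzma.compress(original)

decomp = lzma.LZMADecompressor()
chunks = []
data = compressed  # all input is handed over on the first call

while not decomp.eof:
    # Never return more than 4096 bytes per call, however well the
    # input compresses. This bounds memory use per step.
    chunk = decomp.decompress(data, max_length=4096)
    chunks.append(chunk)
    data = b""  # later calls only drain output that is still pending
    if decomp.needs_input and not decomp.eof:
        # The decompressor wants more input but there is none left.
        raise EOFError("compressed stream ended early")

restored = b"".join(chunks)
```

A file-like wrapper such as the proposed `ZstdFile` can build `read(n)` on top of this kind of bounded call, which is what lets tarfile pull data in fixed-size pieces.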

I looked at the zstd API; it's a bit simpler than lzma/bz2/zlib. With about a month of work, it should be possible to write a zstd module for the stdlib, and then discuss the detailed API on Python-Ideas.
 
I once wanted to do this job, but it seems my time does not allow it. If anyone wants to do this work, please reply here.

FYI, Python 3.10 schedule:
    3.10.0 beta 1: 2021-05-03 (No new features beyond this point.)
msg376095 - Author: Ma Lin (malin) * Date: 2020-08-30 04:14
I have spent two weeks on this and the code is almost complete. A preview:
https://github.com/animalize/cpython/pull/8/files

It is written directly for the stdlib, since there are already zstd modules on PyPI.
In addition, the API of zstd is simple, not as complicated as lzma.

It can also use these:
1. Argument Clinic
2. multi-phase init
3. the internal function _PyLong_AsInt
History
Date                 User               Action  Args
2022-04-11 14:59:15  admin              set     github: 81276
2021-10-26 10:45:36  yan12125           set     nosy: - yan12125
2021-10-26 07:23:28  erlendaasland      set     nosy: + erlendaasland
2020-08-30 04:14:28  malin              set     messages: + msg376095
2020-08-15 15:51:47  malin              set     messages: + msg375472
2020-07-23 06:23:33  Jeffrey.Kintscher  set     nosy: + Jeffrey.Kintscher
2020-07-23 06:14:24  serhiy.storchaka   set     messages: + msg374123
2020-07-22 16:16:30  wicher             set     nosy: + wicher
2020-07-14 10:14:16  malin              set     nosy: + malin; messages: + msg373634
2020-07-13 11:32:16  lukasz.langa       set     versions: + Python 3.10, - Python 3.8, Python 3.9
2020-07-13 04:24:42  Anatol Pomozov     set     nosy: + Anatol Pomozov; messages: + msg373583
2020-04-03 12:13:57  daniel.ugra        set     nosy: + daniel.ugra
2019-11-25 06:45:47  yan12125           set     nosy: + yan12125
2019-11-24 14:11:30  lilydjwg           set     nosy: + lilydjwg
2019-11-12 22:31:00  Jerrod Frost       set     nosy: + Jerrod Frost; messages: + msg356498
2019-05-30 03:46:44  xtreak             set     nosy: + lars.gustaebel, serhiy.storchaka; versions: + Python 3.8, Python 3.9, - Python 3.7
2019-05-30 03:42:48  evan0greenup       create