classification
Title: Add --fast, --best to the gzip CLI
Type:
Stage: resolved
Components:
Versions: Python 3.8
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: matrixise
Nosy List: matrixise, mdk, pmpp, serhiy.storchaka, vstinner
Priority: normal
Keywords: patch

Created on 2018-10-13 05:45 by matrixise, last changed 2018-11-03 21:18 by mdk. This issue is now closed.

Pull Requests
URL Status Linked
PR 9833 merged matrixise, 2018-10-13 05:48
Messages (10)
msg327626 - (view) Author: Stéphane Wirtel (matrixise) * (Python triager) Date: 2018-10-13 05:45
The gzip module has a CLI, but it does not let you specify the compression method (slow or fast).
msg327628 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2018-10-13 06:33
Hi Stéphane thanks for the proposal and the PR.

But are those options useful in real life? (I may be biased as a Linux user.)

I see this gzip CLI as useful for decompressing a gzip file on platforms that have no gzip program installed, but I don't think it's useful for compressing. (Still, I'm OK with the current status quo: if we allow decompression, let's allow compression too, for consistency.)

But why drift from "let's allow compression just for consistency" to "let's replace the gzip command"?

I mean, unless it's useful in some cases that I don't see.

Also, does it mean we'll end up having to add the 16 other gzip options?
msg327631 - (view) Author: pmpp (pmpp) * Date: 2018-10-13 07:51
Hi, on platforms without gzip (there are some, including some widely used OSes, e.g. https://github.com/bazelbuild/rules_docker/issues/507),
the ability to use the Python gzip CLI is extremely useful as a fallback.

Though, as discussed on IRC, a default compression level of 6 is a good tradeoff alongside the basic options fast (-1) / best (-9), and it requires less memory for decompression on the target (which could be embedded).
msg327637 - (view) Author: Stéphane Wirtel (matrixise) * (Python triager) Date: 2018-10-13 08:43
Julien,

Currently, the default compresslevel for gzip.open and GzipFile is 9, the best compression. Maybe we could define 6 as the tradeoff and offer 1 (for fast), 9 (for best), and 6 (tradeoff).

Maybe in another issue?
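
For readers who want to see what the three levels mentioned above do in practice, here is a minimal sketch using the `compresslevel` parameter of `gzip.compress`. The sample data is made up for illustration, and the sizes you get will depend entirely on your input.

```python
import gzip

# Hypothetical sample data; repetitive text, so it compresses well.
data = b"the quick brown fox jumps over the lazy dog\n" * 2000

# Levels discussed in this thread: 1 (fast), 6 (tradeoff), 9 (best).
levels = (1, 6, 9)

# Compressed size in bytes for each level.
sizes = {level: len(gzip.compress(data, compresslevel=level)) for level in levels}

for level, size in sizes.items():
    print(f"level {level}: {size} bytes (from {len(data)} bytes)")

# Whatever the level, decompression returns the original bytes.
roundtrip_ok = all(
    gzip.decompress(gzip.compress(data, compresslevel=level)) == data
    for level in levels
)
```

The level only changes how hard the compressor works; the output is a valid gzip stream at every level.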
msg328158 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-10-20 16:27
I'm not sure those options are that useful. If you want to control more details, it is not hard to write a tiny Python script. Python is a programming language, after all.

But if I add options for controlling the compression level, I would add options -1 ... -9. I never used verbose forms --fast and --best.
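
As an illustration of the "tiny Python script" approach, the following sketch compresses a file at a caller-chosen level. The function name `compress_file` and its default level of 6 are assumptions made for this example, not part of the gzip module.

```python
import gzip
import shutil


def compress_file(src_path, level=6):
    """Compress src_path into src_path + '.gz' and return the new path.

    `level` is passed to gzip.open as compresslevel (1 = fastest, 9 = smallest).
    The default of 6 is an assumption borrowed from the GNU gzip convention.
    """
    dst_path = src_path + ".gz"
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb", compresslevel=level) as dst:
        # Stream in chunks so large files are not loaded into memory at once.
        shutil.copyfileobj(src, dst)
    return dst_path
```

Calling `compress_file("data.txt", level=1)` would write `data.txt.gz` next to the input and leave the original file in place.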
msg328799 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-10-29 10:44
> But if I add options for controlling the compression level, I would add options -1 ... -9. I never used verbose forms --fast and --best.

If we add options, I would prefer to add only --fast and --best, which are easy to understand. I really have no idea what the difference is between -3 and -4, for example. In practice, I don't think that anyone uses these -N options on the command line.

To be honest, I have never passed any option to gzip: I always use "gzip file" to get "file.gz". I don't really care about the resulting file. I just hope that it's smaller :-)

But I'm not against adding --best and --fast. By the way, on Linux, the default gzip compression level is 6, whereas Python uses 9 by default. I agree with making Python more consistent with Unix tools. In that case, again, it makes sense to add an option to get --best (level 9) back.
msg328801 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-10-29 11:25
If we add these options to the gzip CLI, I would suggest adding them to the bzip2 and lzma CLIs in the same PR. And maybe open separate issues for the zipfile and tarfile CLIs.
msg328803 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-10-29 11:32
man lzma:

       --fast
       --best These are somewhat misleading aliases for -0 and -9, respectively.
              These are provided only for backwards compatibility with LZMA Utils.
              Avoid using these options.

man bzip2:

       -1 (or --fast) to -9 (or --best)
              Set the block size to 100 k, 200 k .. 900 k when compressing.
              Has no effect when decompressing. See MEMORY MANAGEMENT below.
              The --fast and --best aliases are primarily for GNU gzip
              compatibility. In particular, --fast doesn't make things
              significantly faster. And --best merely selects the default behaviour.
msg328825 - (view) Author: Stéphane Wirtel (matrixise) * (Python triager) Date: 2018-10-29 13:18
Hi @Serhiy

I would like to add them to lzma, bz2, zipfile and tarfile.
msg329201 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2018-11-03 15:24
New changeset 3e28eed9ec2249bb11ad0db4629271b7ce9b7918 by Julien Palard (Stéphane Wirtel) in branch 'master':
bpo-34969: Add --fast, --best on the gzip CLI (GH-9833)
https://github.com/python/cpython/commit/3e28eed9ec2249bb11ad0db4629271b7ce9b7918
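
For readers curious how a pair of mutually exclusive --fast / --best flags can be wired up with argparse, here is a minimal sketch. It is not the code from the changeset above: the names `build_parser` and `parse_level`, the constants, and the choice of 6 as the default are assumptions for illustration, following the levels discussed in this thread.

```python
import argparse

# Assumed level values, following the GNU gzip convention discussed in the thread.
LEVEL_FAST = 1
LEVEL_TRADEOFF = 6
LEVEL_BEST = 9


def build_parser():
    """Build a hypothetical gzip-like parser with --fast and --best."""
    parser = argparse.ArgumentParser(description="Sketch of a gzip-like CLI.")
    # A mutually exclusive group makes argparse reject "--fast --best".
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--fast", action="store_true", help="compress faster")
    group.add_argument("--best", action="store_true", help="compress better")
    # "-" stands for standard input when no file is given.
    parser.add_argument("files", nargs="*", default=["-"], metavar="file")
    return parser


def parse_level(argv):
    """Return the compression level selected by the given argument list."""
    args = build_parser().parse_args(argv)
    if args.fast:
        return LEVEL_FAST
    if args.best:
        return LEVEL_BEST
    return LEVEL_TRADEOFF
```

With this sketch, `parse_level(["--fast", "file.txt"])` selects level 1, `parse_level(["--best"])` selects level 9, and passing both flags makes argparse exit with a usage error.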
History
Date User Action Args
2018-11-03 21:18:09 mdk set status: open -> closed
    resolution: fixed
    stage: patch review -> resolved
2018-11-03 15:24:26 mdk set messages: + msg329201
2018-10-29 13:18:28 matrixise set messages: + msg328825
2018-10-29 11:32:27 vstinner set messages: + msg328803
2018-10-29 11:25:58 serhiy.storchaka set messages: + msg328801
2018-10-29 10:44:45 vstinner set nosy: + vstinner
    messages: + msg328799
2018-10-20 16:27:04 serhiy.storchaka set nosy: + serhiy.storchaka
    messages: + msg328158
2018-10-13 08:43:25 matrixise set messages: + msg327637
2018-10-13 07:51:51 pmpp set nosy: + pmpp
    messages: + msg327631
2018-10-13 06:33:16 mdk set messages: + msg327628
    stage: patch review
2018-10-13 06:20:46 matrixise set nosy: + mdk
    stage: patch review -> (no value)
2018-10-13 05:48:59 matrixise set keywords: + patch
    stage: patch review
    pull_requests: + pull_request9207
2018-10-13 05:45:43 matrixise create