
Author rhpvorderman
Recipients rhpvorderman, serhiy.storchaka
Date 2021-11-23.06:12:50
Message-id <1637647971.4.0.782807409586.issue45509@roundup.psfhosted.org>
Content
I improved the performance of the patch and attached the file used for benchmarking. The FHCRC changes are now tested as well.

The benchmark tests headers with different flags, each concatenated to a DEFLATE block with no data and a gzip trailer. The data is fed to gzip.decompress. Please note that this is the *worst-case* performance overhead. When there is actual data to decompress, the relative overhead will be smaller. When GzipFile is used, the overhead will be smaller as well.
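
A minimal sketch of how inputs like these could be built, assuming the member layout from RFC 1952. This is not the attached benchmark file; `make_member` and the flag constants are hypothetical names used only for illustration.

```python
import gzip
import struct
import zlib

# Flag bits of the gzip FLG byte (RFC 1952).
FTEXT, FHCRC, FEXTRA, FNAME, FCOMMENT = 1, 2, 4, 8, 16


def make_member(flags=0, fname=b"name", extra=b"", comment=b"comment"):
    """Build a gzip member with the given header flags and no payload."""
    # Fixed 10-byte header: magic, CM=8 (deflate), FLG, MTIME=0, XFL=0, OS=255.
    header = struct.pack("<BBBBIBB", 0x1F, 0x8B, 8, flags, 0, 0, 255)
    if flags & FEXTRA:
        header += struct.pack("<H", len(extra)) + extra
    if flags & FNAME:
        header += fname + b"\x00"  # zero-terminated file name
    if flags & FCOMMENT:
        header += comment + b"\x00"  # zero-terminated comment
    if flags & FHCRC:
        # CRC16 is the low 16 bits of the CRC32 over the header so far.
        header += struct.pack("<H", zlib.crc32(header) & 0xFFFF)
    # Raw DEFLATE stream (negative wbits) that contains no data.
    compressor = zlib.compressobj(wbits=-15)
    body = compressor.compress(b"") + compressor.flush()
    # Trailer: CRC32 and size of the (empty) uncompressed data.
    trailer = struct.pack("<II", zlib.crc32(b""), 0)
    return header + body + trailer


cases = {
    "with_fname": make_member(FNAME),
    "with_noflags": make_member(0),
    "all_flags": make_member(FTEXT | FHCRC | FEXTRA | FNAME | FCOMMENT),
}
for name, data in cases.items():
    # Every member decompresses to an empty payload.
    print(name, gzip.decompress(data))
```

Because the payload is empty, nearly all of the time spent in gzip.decompress on such input goes to header and trailer handling, which is why this measures the worst case.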

Before (current main branch):
$ ./python benchmark_gzip_read_header.py
with_fname
average: 3.01, range: 2.9-4.79 stdev: 0.19
with_noflags
average: 2.99, range: 2.93-3.04 stdev: 0.02
All flags (incl FHCRC)
average: 3.13, range: 3.05-3.16 stdev: 0.02


After (bpo-45509 PR):
with_fname
average: 3.09, range: 3.01-4.63 stdev: 0.16
with_noflags
average: 3.1, range: 3.03-3.38 stdev: 0.04
All flags (incl FHCRC)
average: 4.09, range: 4.05-4.49 stdev: 0.04

That is an increase of about 0.1 microseconds, roughly 3%, in the most common use cases. In return, the FNAME field is now correctly checked for truncation.

With FHCRC set, the overhead is about 31% (an average of 3.13 versus 4.09). This is worth it, because the header is now actually checked, as it should be.
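
For reference, the relative overhead can be recomputed from the averages listed above. This is a small sketch; the dictionary keys simply mirror the benchmark labels, and `overhead_percent` is a hypothetical helper name.

```python
# Averages copied from the benchmark output above (before, after).
averages = {
    "with_fname": (3.01, 3.09),
    "with_noflags": (2.99, 3.10),
    "all_flags": (3.13, 4.09),
}


def overhead_percent(before, after):
    """Relative slowdown of `after` compared to `before`, in percent."""
    return (after - before) / before * 100


for name, (before, after) in averages.items():
    print(f"{name}: +{after - before:.2f} ({overhead_percent(before, after):.1f}%)")
```

This gives roughly 2.7% and 3.7% for the two common cases and roughly 30.7% when all flags, including FHCRC, are set.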
History
Date User Action Args
2021-11-23 06:12:51 rhpvorderman set recipients: + rhpvorderman, serhiy.storchaka
2021-11-23 06:12:51 rhpvorderman set messageid: <1637647971.4.0.782807409586.issue45509@roundup.psfhosted.org>
2021-11-23 06:12:51 rhpvorderman link issue45509 messages
2021-11-23 06:12:51 rhpvorderman create