Message 406758 - Python tracker


Author	rhpvorderman
Recipients	rhpvorderman, serhiy.storchaka
Date	2021-11-22.10:44:01
Message-id	<1637577841.58.0.423673007131.issue45509@roundup.psfhosted.org>

Content

1. Quite a lot

I tested it for the two most common use cases.
import timeit
import statistics

WITH_FNAME = """
from gzip import GzipFile, decompress
import io
fileobj = io.BytesIO()
g = GzipFile(fileobj=fileobj, mode='wb', filename='compressable_file')
g.write(b'')
g.close()
data=fileobj.getvalue()
"""
WITH_NO_FLAGS = """
from gzip import decompress
import zlib
data = zlib.compress(b'', wbits=31)
"""

def benchmark(name, setup, loops=10000, runs=10):
    print(f"{name}")
    results = [timeit.timeit("decompress(data)", setup, number=loops) for _ in range(runs)]
    # Convert each run's total seconds into microseconds per call
    results = [(result / loops) * 1_000_000 for result in results]
    print(f"average: {round(statistics.mean(results), 2)}, "
          f"range: {round(min(results), 2)}-{round(max(results),2)} "
          f"stdev: {round(statistics.stdev(results),2)}")


if __name__ == "__main__":
    benchmark("with_fname", WITH_FNAME)
    benchmark("with_noflags", WITH_NO_FLAGS)
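
The two setups are meant to differ only in the gzip header: one carries the FNAME flag, the other has no flags set. A minimal sketch to check that, assuming the header layout from RFC 1952 (FLG is the fourth byte, FNAME is bit 0x08). The helper names are hypothetical, and `zlib.compressobj` is used in place of `zlib.compress(..., wbits=31)` so the sketch also runs on Python versions where `zlib.compress` has no `wbits` parameter:

```python
import gzip
import io
import zlib

FNAME = 0x08  # FLG bit: original file name present (RFC 1952)


def gzip_with_fname() -> bytes:
    """Empty gzip member written by GzipFile with a filename set."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", filename="compressable_file") as g:
        g.write(b"")
    return buf.getvalue()


def gzip_no_flags() -> bytes:
    """Empty gzip member written by zlib itself (wbits=31 selects gzip framing)."""
    compressor = zlib.compressobj(wbits=31)
    return compressor.compress(b"") + compressor.flush()


def header_flags(data: bytes) -> int:
    """Return the FLG byte of a gzip member (bytes 0-1 magic, byte 2 method, byte 3 FLG)."""
    return data[3]


for name, data in (("with_fname", gzip_with_fname()),
                   ("with_noflags", gzip_no_flags())):
    flg = header_flags(data)
    print(f"{name}: FLG={flg:#04x}, FNAME set: {bool(flg & FNAME)}, "
          f"payload: {gzip.decompress(data)!r}")
```

Both members decompress to an empty payload, so any timing difference between them comes from header handling rather than from inflating data.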

BEFORE (microseconds per call):

with_fname
average: 3.27, range: 3.21-3.36 stdev: 0.05
with_noflags
average: 3.24, range: 3.14-3.37 stdev: 0.07

AFTER (microseconds per call):

with_fname
average: 4.98, range: 4.85-5.14 stdev: 0.1
with_noflags
average: 4.87, range: 4.69-5.05 stdev: 0.1

That is a dramatic increase in overhead: roughly 50% more time per call. (Granted, the decompressed data is empty, but still.)
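
For reference, the relative slowdown can be worked out from the averages reported above. This is only arithmetic on those four numbers; the `slowdown_percent` helper is a hypothetical name introduced for illustration:

```python
# Reported averages, in microseconds per decompress() call
before = {"with_fname": 3.27, "with_noflags": 3.24}
after = {"with_fname": 4.98, "with_noflags": 4.87}


def slowdown_percent(before_us: float, after_us: float) -> float:
    """Relative increase of after_us over before_us, in percent."""
    return (after_us / before_us - 1) * 100


slowdowns = {name: slowdown_percent(before[name], after[name]) for name in before}

for name, pct in slowdowns.items():
    print(f"{name}: +{pct:.1f}%")
```

Both cases come out at about 50% slower per call.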

2. I haven't tested this yet, but the regression above is already unacceptable.

3. Not that I know of. But if it is set, it is safe to assume they care. Nevertheless, this is a bit of an edge case.
