Message 160377 - Python tracker

Author	serhiy.storchaka
Recipients	Jimbofbx, docs@python, serhiy.storchaka, xuanji
Date	2012-05-10.22:24:06
Message-id	<1336688646.86.0.522953899714.issue10376@psf.upfronthosting.co.za>

Content
This is not because the zipfile module is unbuffered. It is the difference between an expensive function call and cheap bytes slicing. Replace `zf.open(namelist[0])` with `io.BufferedReader(zf.open(namelist[0]))` to see the effect of good buffering. In 3.2, zipfile's read() is not implemented optimally, so it is about twice as slow, but in 3.3 it will be almost as fast as using io.BufferedReader. It is still several times slower than bytes slicing, but there is nothing that can be done about that.
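To make the comparison concrete, here is a minimal sketch of the three ways of reading discussed above: calling read(1) on the object returned by zf.open(), calling read(1) on an io.BufferedReader wrapped around it, and slicing a bytes object that is already in memory. The archive is built in memory, and the member name `member.bin`, the payload size, and the helper names are assumptions made for illustration; they do not come from the original report. Timings are printed but will vary by machine and Python version.

```python
import io
import time
import zipfile


def make_zip(payload: bytes, name: str = "member.bin") -> bytes:
    """Build an in-memory zip archive holding a single member (hypothetical name)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, payload)
    return buf.getvalue()


def read_bytewise(f) -> bytes:
    """Read a file object one byte at a time: one method call per byte."""
    chunks = []
    while True:
        b = f.read(1)
        if not b:
            break
        chunks.append(b)
    return b"".join(chunks)


def slice_bytewise(data: bytes) -> bytes:
    """Walk an in-memory bytes object one byte at a time using slicing."""
    return b"".join(data[i:i + 1] for i in range(len(data)))


payload = bytes(range(256)) * 64  # 16 KiB of sample data
archive = make_zip(payload)

with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    name = zf.namelist()[0]

    start = time.perf_counter()
    with zf.open(name) as f:
        direct = read_bytewise(f)
    t_direct = time.perf_counter() - start

    start = time.perf_counter()
    # Closing the BufferedReader also closes the wrapped zip member.
    with io.BufferedReader(zf.open(name)) as f:
        buffered = read_bytewise(f)
    t_buffered = time.perf_counter() - start

    whole = zf.read(name)  # decompress everything in one call
    start = time.perf_counter()
    sliced = slice_bytewise(whole)
    t_sliced = time.perf_counter() - start

print(f"zf.open().read(1):         {t_direct:.4f} s")
print(f"io.BufferedReader.read(1): {t_buffered:.4f} s")
print(f"bytes slicing:             {t_sliced:.4f} s")
```

All three approaches return the same data; only the per-byte cost differs. The first two pay for a method call on every byte, while slicing pays only for the slice itself.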

Here is a patch that speeds up reading from a zip file in small chunks by about 20%. Microbenchmark:

./python -m zipfile -c test.zip python
./python -m timeit -n 1 -s "import zipfile;zf=zipfile.ZipFile('test.zip')"  "with zf.open('python') as f:"  "  while f.read(1):pass"

Python 3.3 (vanilla):  1 loops, best of 3: 36.4 sec per loop
Python 3.3 (patched):  1 loops, best of 3: 30.1 sec per loop
Python 3.3 (with io.BufferedReader):  1 loops, best of 3: 30.2 sec per loop
And, for comparison, Python 3.2:  1 loops, best of 3: 74.5 sec per loop
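As a quick check of the "+20%" and "twice" figures, the ratios can be computed directly from the timings reported above. This is only arithmetic on those four numbers; the dictionary labels are made up for the sketch.

```python
# Timings in seconds per loop, copied from the microbenchmark results above.
timings = {
    "3.3 vanilla": 36.4,
    "3.3 patched": 30.1,
    "3.3 with io.BufferedReader": 30.2,
    "3.2": 74.5,
}

baseline = timings["3.3 vanilla"]

# Ratio > 1 means faster than vanilla 3.3; ratio < 1 means slower.
speedup = {label: baseline / secs for label, secs in timings.items()}

for label, secs in timings.items():
    print(f"{label}: {secs:.1f} s, {speedup[label]:.2f}x the speed of 3.3 vanilla")
```

The patched build and the io.BufferedReader wrapper both come out at roughly 1.2 times the speed of vanilla 3.3, and 3.2 runs at roughly half that speed.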
