New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zipfile._ZipDecryptor generates wasteful crc32 table on import #62715
Comments
http://hg.python.org/cpython/file/e7305517260b/Lib/zipfile.py#l460 I noticed this table taking up time on import. I'd guess that it pre-dates zlib.crc32 which is imported at the top of the file. I also suspect that most of the time this table isn't even used at all. |
Someone who has a better understanding of zipfile may be able to get zlib.crc32(bytes[ch], running_crc) to work correctly. This patch that passes the zipfile tests generates the crctable only when _ZipDecrypter() is instantiated. |
How much time take it? |
It takes 706 microseconds. On my machine %timeit import sys; del sys.modules['zipfile']; import zipfile "import takes 2.51 ms without the patch and 1.7 ms with the patch. On Sat, Jul 20, 2013, at 12:13 PM, Serhiy Storchaka wrote:
|
I tried it on a raspberry pi. zipfile takes 36 ms to import and 10 ms if it does not generate the crctable. |
The objection to zipfile-no-crc32.patch is that binascii.crc32() and _crc32() have different signatures. binascii.crc32() accepts a byte object while _crc32() accepts a single integer. With packing this value into a bytes object _crc32() will be much slower. As for zdlazy.patch, there is a proposed patch in bpo-10030 which speedups pure Python decryption. I'm afraid that with zdlazy.patch this path will lose a part of it's speedup. |
I am withdrawing zipfile-no-crc32.patch. It did not work correctly. zdlazy.patch should go in. It avoids creating the rarely-needed crc32 table until the first time it is needed, saving some memory and the majority of time needed to import the module. zdlazy.patch should not affect its C replacement at all, except that the rarely-needed CRC32 table that we are not generating becomes a never-needed table. |
I meant my pure Python patch in bpo-10030. Binding crctable to local variable is one of microoptimizations. Not the largest one however. So in general I not objects. Your patch LGTM. Only one nitpick -- instead "not _ZipDecrypter.crctable" use "_ZipDecrypter.crctable is None". |
Since this isn't a bugfix, it was inappropriate to check this in after feature-freeze for 3.4. However it looks harmless enough, so I'm not asking you to revert it at this time. I guess it's easier to get forgiveness than permission, huh. |
Thanks. I guess I know who to ask now. It was just painful seeing so much import-time computation, pretty much On Mon, Jan 6, 2014, at 10:22 AM, Larry Hastings wrote:
|
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: