classification
Title: zipimport is a bit slow
Type: performance
Stage: patch review
Components: Interpreter Core
Versions: Python 3.2
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Goplat, brett.cannon, ysj.ray
Priority: normal
Keywords: patch

Created on 2010-05-18 06:08 by Goplat, last changed 2011-06-27 00:33 by brett.cannon.

Files
File name Uploaded Description
zipimport_speedup.patch Goplat, 2010-05-18 06:08 patch to read .zip's central directory sequentially
Messages (3)
msg105954 - Author: (Goplat) Date: 2010-05-18 06:08
Reading the list of files in a .zip takes a while because several seeks are done for each entry, which (on Windows at least) flushes stdio's buffer and forces a system call on the next read. For large .zips the effect on startup speed is noticeable, being perhaps 50ms per thousand files. Changing the read_directory function to read the central directory entirely sequentially would cut this time by more than half.
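As a rough illustration of the idea (not the attached patch, which changes the C read_directory function), the sketch below reads a .zip's central directory with one seek to the end-of-central-directory record, one seek to the directory itself, and a single read, then parses every entry from memory. The function name read_directory_sequential and the returned dictionary layout are hypothetical. It assumes an archive with no trailing comment and no ZIP64 extensions.

```python
import io
import struct

# Fixed-size record layouts from the ZIP format (little-endian).
EOCD = struct.Struct("<4s4H2LH")     # end of central directory record, 22 bytes
CDIR = struct.Struct("<4s6H3L5H2L")  # central directory file header, 46 bytes


def read_directory_sequential(fp):
    """Return {name: info} for a zip opened in binary mode (hypothetical helper).

    Assumes no archive comment and no ZIP64 records.
    """
    # Seek 1: the end-of-central-directory record sits in the last 22 bytes.
    fp.seek(-EOCD.size, io.SEEK_END)
    sig, _, _, _, count, cd_size, cd_offset, _ = EOCD.unpack(fp.read(EOCD.size))
    if sig != b"PK\x05\x06":
        raise ValueError("no end-of-central-directory record at end of file")

    # Seek 2: jump to the central directory and read all of it in one call.
    fp.seek(cd_offset)
    data = fp.read(cd_size)

    entries = {}
    pos = 0
    for _ in range(count):
        fields = CDIR.unpack_from(data, pos)
        if fields[0] != b"PK\x01\x02":
            raise ValueError("bad central directory entry")
        flags = fields[3]
        name_len, extra_len, comment_len = fields[10:13]
        name_start = pos + CDIR.size
        raw_name = data[name_start:name_start + name_len]
        # General-purpose flag bit 11 marks UTF-8 names; otherwise cp437.
        name = raw_name.decode("utf-8" if flags & 0x800 else "cp437")
        entries[name] = {
            "compressed_size": fields[8],
            "uncompressed_size": fields[9],
            "local_header_offset": fields[16],
        }
        # Step over the variable-length tail without touching the file again.
        pos = name_start + name_len + extra_len + comment_len
    return entries
```

Because the per-entry work happens on an in-memory buffer, the number of seeks no longer grows with the number of files, which is the cost this report is concerned with.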
msg105960 - Author: ysj.ray (ysj.ray) Date: 2010-05-18 10:09
When I tested on Debian 5.0, the timing results were almost the same before and after applying your patch (I modified the patch to apply against the trunk).

Could you give some test results on Windows? I can't see the speedup on Debian 5.0.
msg106191 - Author: (Goplat) Date: 2010-05-20 21:09
As a test, I zipped up the Lib directory from the Python source (1735 files). Reading the zip took 0.10 sec on average before the patch and 0.04 sec after.

(To test the time taken by zipimport isolated from other startup costs, instead of actually getting the zip in the import path, I just ran

import time, zipimport; start = time.clock(); 
zipimport.zipimporter("lib.zip"); print time.clock() - start)
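The one-liner above is Python 2 code, and time.clock() was removed in Python 3.8. A roughly equivalent measurement on a current Python 3 might look like the sketch below. The helper name time_zipimporter and the repeats parameter are hypothetical. The sketch assumes the zipimporter constructor reads the archive directory eagerly, and it clears zipimport._zip_directory_cache, a private attribute that may not exist in every version, so that repeated runs do not simply hit the cache.

```python
import time
import zipimport


def time_zipimporter(path, repeats=5):
    """Return the best wall-clock time, in seconds, to construct a zipimporter.

    Hypothetical helper for illustration; `path` must name an existing zip.
    """
    timings = []
    for _ in range(repeats):
        # Private implementation detail (assumption): zipimport caches the
        # parsed directory per archive, so drop it before each run.
        cache = getattr(zipimport, "_zip_directory_cache", None)
        if cache is not None:
            cache.clear()
        start = time.perf_counter()
        zipimport.zipimporter(path)
        timings.append(time.perf_counter() - start)
    # The minimum is the run least disturbed by other system activity.
    return min(timings)
```

For example, print(time_zipimporter("lib.zip")) would report the best of five runs for the archive used in the test above.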
History
Date User Action Args
2011-06-27 00:33:10 brett.cannon set assignee: brett.cannon ->
2010-05-20 21:09:04 Goplat set messages: + msg106191
2010-05-18 11:35:34 pitrou set assignee: brett.cannon; stage: patch review; nosy: + brett.cannon; versions: + Python 3.2, - Python 2.6
2010-05-18 10:09:12 ysj.ray set nosy: + ysj.ray; messages: + msg105960
2010-05-18 06:08:09 Goplat create