Message49701
Logged In: YES
user_id=580910
I've found some time to work on this. I've added
zipfile-zip64-version2.patch. This version:
* Makes zip64 behaviour optional (defaults to off because zip(1) doesn't
support zip64)
* Is significantly faster for large zipfiles because it doesn't scan the entire
zipfile just to check that the file headers are consistent with the central
directory w.r.t. filename (this check is now done when trying to read a file)
* Updates the reference documentation.
* Adds unittests. There are two sets of tests: one set tests the behaviour of
zip64 extensions using small files by lowering the zip64 cutoff point and is
run every time; the other set tests huge zipfiles and is run only when the
largefile feature is enabled for the test run.
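The optional zip64 behaviour and the small-file test approach can be sketched roughly as follows. This is only an illustration, not code from the patch: it assumes the switch is exposed as the `allowZip64` keyword of `zipfile.ZipFile` and that the cutoff is the undocumented module-level constant `zipfile.ZIP64_LIMIT`, as in released versions of the module. `roundtrip_with_low_cutoff` is a hypothetical helper.

```python
import io
import zipfile


def roundtrip_with_low_cutoff(data, allow_zip64, cutoff=1000):
    """Write and re-read one member while the zip64 cutoff is lowered.

    Hypothetical helper; ZIP64_LIMIT is an undocumented module constant.
    """
    saved = zipfile.ZIP64_LIMIT
    zipfile.ZIP64_LIMIT = cutoff  # treat anything above `cutoff` bytes as "huge"
    try:
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, "w", allowZip64=allow_zip64) as zf:
            zf.writestr("member.bin", data)
        with zipfile.ZipFile(buf) as zf:
            return zf.read("member.bin")
    finally:
        zipfile.ZIP64_LIMIT = saved  # always restore the real cutoff


payload = b"x" * 5000  # larger than the lowered cutoff

# With zip64 enabled the member round-trips unchanged.
restored = roundtrip_with_low_cutoff(payload, allow_zip64=True)

# With zip64 disabled the same write is refused.
try:
    roundtrip_with_low_cutoff(payload, allow_zip64=False)
    refused = False
except zipfile.LargeZipFile:
    refused = True
```

Lowering the cutoff this way exercises the zip64 code paths without creating multi-gigabyte files, which is why such tests can run every time.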
There is one backward-incompatible change: ZipInfo objects no longer have a
file_offset attribute. That was the other reason for scanning the entire zipfile
when opening it. IMNSHO this should have been a private attribute and the
cost of this feature is not worth its *very* limited usefulness. As an indication
of its cost: I got a 6x speedup when I removed the calculation of the
file_offset attribute, something that adds up when you are dealing with huge
zipfiles (I wrote this patch because I'm dealing with 10+GByte zipfiles with
tens of thousands of files at work).
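For code that relied on file_offset, the value can still be derived on demand for a single member instead of for every member at open time. A minimal sketch, assuming the standard 30-byte local file header layout and an archive with no data prepended to it. `data_offset` is a hypothetical helper, not part of zipfile or of the patch. It also compares the header's filename with the central directory entry, so that check happens lazily as well.

```python
import io
import struct
import zipfile

# Fixed-size part of a local file header: signature, version, flags,
# compression, mod time, mod date, CRC, compressed size, uncompressed
# size, filename length, extra-field length (30 bytes in total).
LOCAL_HEADER = struct.Struct("<4s5H3L2H")


def data_offset(fp, zinfo):
    """Return where the member's data starts, checking the header's name."""
    fp.seek(zinfo.header_offset)
    fields = LOCAL_HEADER.unpack(fp.read(LOCAL_HEADER.size))
    if fields[0] != b"PK\x03\x04":
        raise zipfile.BadZipFile("bad magic number for file header")
    flags, name_len, extra_len = fields[2], fields[9], fields[10]
    # Flag bit 0x800 marks a UTF-8 filename; otherwise cp437 is used.
    name = fp.read(name_len).decode("utf-8" if flags & 0x800 else "cp437")
    if name != zinfo.orig_filename:
        raise zipfile.BadZipFile("file header and directory names differ")
    return zinfo.header_offset + LOCAL_HEADER.size + name_len + extra_len


# Build a small uncompressed archive in memory to try it out.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("hello.txt", b"hello zip")

with zipfile.ZipFile(buf) as zf:
    info = zf.getinfo("hello.txt")
    offset = data_offset(buf, info)
    buf.seek(offset)
    stored = buf.read(info.compress_size)  # raw bytes, since ZIP_STORED
```

Only the headers of members that are actually accessed get read, so the cost no longer grows with the number of files in the archive.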
I noticed that zipfile raises RuntimeError in some places. I've changed one of
those to zipfile.BadZipfile, but others remain. I don't like this; most of them
should be replaced by TypeError or ValueError exceptions.
BTW, this patch also supports storing files >4GByte in the zipfile, but that
feature isn't very useful because zipfile doesn't have an API for reading file
data incrementally.
| Date | User | Action | Args |
| 2007-08-23 15:46:47 | admin | link | issue1446489 messages |
| 2007-08-23 15:46:47 | admin | create | |