classification
Title: DeprecationWarning in zipfile.py while zipping 113000 files
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.5
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: alanmcintyre, amaury.forgeotdarc, bialix, gnezdo, gvanrossum, jcea, loewis
Priority: low Keywords:

Created on 2007-11-30 13:26 by bialix, last changed 2011-05-27 06:05 by amaury.forgeotdarc. This issue is now closed.

Files
File name Uploaded Description
zipfile_lotsafiles.diff alanmcintyre, 2008-01-06 10:03
Messages (12)
msg57979 - Author: Alexander Belchenko (bialix) Date: 2007-11-30 13:26
C:\Python\2.5.1\lib\zipfile.py:719: DeprecationWarning: 'H' format
requires 0 <= number <= 65535
  0, 0, count, count, pos2 - pos1, pos1, 0)
msg57992 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-11-30 17:33
Hmm... Seems there's a 16-bit-wide field somewhere. How do other ZIP
implementations deal with this?
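
The 16-bit limit behind the warning can be checked directly with the struct module. This is a minimal sketch, assuming a current Python 3, where an out-of-range value raises struct.error instead of emitting the DeprecationWarning seen in 2.5. The helper name fits_in_h is made up for illustration.

```python
import struct

FILE_COUNT = 113000  # number of files from the original report

# 'H' is an unsigned 16-bit integer, so it occupies exactly two bytes.
field_size = struct.calcsize("<H")

def fits_in_h(value):
    """Return True if value can be packed with the 'H' format."""
    try:
        struct.pack("<H", value)
    except struct.error:
        # Python 3 refuses out-of-range values outright.
        return False
    return True

print(field_size, fits_in_h(65535), fits_in_h(FILE_COUNT))
```

Any count above 65535 cannot be represented in such a field, which is why the file count in the report triggers the warning.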
msg59282 - Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2008-01-05 10:41
The reported warning was being produced when writing the "end of central
directory record", in ZipFile.close().

Based on a little experiment with 70k test text files, the default
archiver in OS X appears to just use the number of files mod 64k in the
end of central directory record. I tweaked the ZipFile close() method to
do this, and the resulting ZIP file appears to work just fine, both with
the OS X archiver and with ZipFile (without ZIP64 enabled).  
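
The "mod 64k" idea can be sketched as follows. This is an illustration only, not the tweak or the attached patch, whose contents do not appear in this thread. The function name pack_eocd_mod_64k and the sample sizes are assumptions; the field order follows the arguments visible in the reported warning (two disk numbers, two counts, directory size, directory offset, comment length).

```python
import struct

# End of central directory record: signature, two disk numbers, two entry
# counts, directory size, directory offset, comment length.
EOCD_FORMAT = "<4s4H2LH"
EOCD_SIGNATURE = b"PK\x05\x06"

def pack_eocd_mod_64k(count, dir_size, dir_offset):
    """Pack the record, storing only the low 16 bits of the entry count."""
    wrapped = count % 65536  # the "mod 64k" approach described above
    return struct.pack(EOCD_FORMAT, EOCD_SIGNATURE, 0, 0,
                       wrapped, wrapped, dir_size, dir_offset, 0)

# Sample values; the directory size and offset here are arbitrary.
record = pack_eocd_mod_64k(113000, 1024, 4096)
stored_count = struct.unpack(EOCD_FORMAT, record)[3]
print(len(record), stored_count)
```

A reader that trusts the stored count would see fewer entries than the archive holds, so this only works with tools that walk the central directory instead of relying on the count.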

There's a blurb in the ZIP format description about this sort of thing:
"If an archive is in ZIP64 format and the value in this field is 0xFFFF,
the size will be in the corresponding 8 byte zip64 end of central
directory field."  I don't know if that means "the right thing" is to
switch the archive to ZIP64 format if more than 64k files are added, though.
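
One possible reading of the quoted passage is sketched below: when the count no longer fits, store 0xFFFF in the 16-bit field and keep the real count in the ZIP64 record. The names FIELD_MAX and eocd_count_field are hypothetical, and raising an error when ZIP64 is disallowed is an assumed policy, not something the thread settles.

```python
FIELD_MAX = 0xFFFF  # largest 16-bit value, doubling as the ZIP64 marker

def eocd_count_field(count, allow_zip64=True):
    """Return (value for the 16-bit field, whether a ZIP64 record is needed)."""
    if count <= FIELD_MAX:
        # The count fits, so the classic record is enough.
        return count, False
    if not allow_zip64:
        # Assumed policy: refuse instead of silently wrapping the count.
        raise ValueError("too many entries for a non-ZIP64 archive")
    # The real count would go into the 8-byte ZIP64 field instead.
    return FIELD_MAX, True

print(eocd_count_field(100), eocd_count_field(70000))
```

Compared with the "mod 64k" approach, this keeps the true count recoverable, at the cost of requiring ZIP64 support in the reader.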

If I have time I'll go look at some other open source ZIP
implementations, but I won't swear I'll ever get around to it. :)
msg59345 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-06 00:07
Sounds like a plan.  Can you cook up a patch?  Otherwise perhaps the Jan
19 bug day can look into this?
msg59351 - Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2008-01-06 10:03
Here's a patch that just uses the "mod 64k" approach.  If I get time to
look at some other implementations, and find a better way to handle it,
I'll submit an update.  Otherwise, maybe on bug day people can try it
out with a variety of archiving utilities to see if there are any
compatibility issues.
msg59427 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-07 03:39
I was about to check this in, when I noticed that the test runs for ~16
seconds on my state-of-the-art hardware.  I think that's too long. 
Perhaps it should only be run when -ulargefile is enabled?
msg59428 - Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2008-01-07 04:03
Oh thanks, I meant to ask whether or not the run time was too long, but
forgot. Only running when -ulargefile is enabled seems fine to me.  I
can tweak the patch for that if you'd like; moving it to test_zipfile64
should do that, right?
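
The general idea of keeping a slow test out of the default run can be sketched with plain unittest. This is a stand-in, not the regrtest mechanism discussed above: the environment variable RUN_LARGE_ZIP_TESTS and the class name ManyFilesTest are made up for illustration. The test body assumes a current Python 3, where ZIP64 is allowed by default.

```python
import io
import os
import unittest
import zipfile

# Hypothetical opt-in switch standing in for regrtest's -ulargefile resource.
RUN_LARGE = os.environ.get("RUN_LARGE_ZIP_TESTS") == "1"

class ManyFilesTest(unittest.TestCase):
    @unittest.skipUnless(RUN_LARGE, "slow; set RUN_LARGE_ZIP_TESTS=1 to run")
    def test_more_than_64k_files(self):
        count = 65536 + 1  # one more than a 16-bit field can count
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, "w") as zf:
            for i in range(count):
                zf.writestr("f%d.txt" % i, b"")
        # Reopen the archive and check that every entry is listed.
        with zipfile.ZipFile(buf) as zf:
            self.assertEqual(len(zf.namelist()), count)
```

Run under python -m unittest, the test is reported as skipped unless the variable is set, so the default run stays fast.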
msg59432 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-01-07 04:32
Sounds like a plan.
msg59811 - Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2008-01-12 10:35
I just noticed that my changes for this issue are included in the patch
for issue 1622; if that gets committed then this issue should be closed.
msg69194 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-07-03 12:53
The patch for issue1622 was committed as r64688; closing this patch as
outdated.
msg137016 - Author: Greg Steuck (gnezdo) Date: 2011-05-26 23:43
There may be a related issue that I still hit with 2.6.5.

% cat /tmp/a.py 
import zipfile
import os

z = zipfile.ZipFile('/tmp/a.zip', 'w')
open("/tmp/a", "w")
os.utime("/tmp/a", (0,0))
z.write("/tmp/a", "a")
% python -V
Python 2.6.5
% uname -mo
x86_64 GNU/Linux
% uname -mor
2.6.32-gg426-generic x86_64 GNU/Linux
% python /tmp/a.py
/usr/lib/python2.6/zipfile.py:1047: DeprecationWarning: struct integer overflow masking is deprecated
  self.fp.write(zinfo.FileHeader())
/usr/lib/python2.6/zipfile.py:1047: DeprecationWarning: 'H' format requires 0 <= number <= 65535
  self.fp.write(zinfo.FileHeader())
/usr/lib/python2.6/zipfile.py:1123: DeprecationWarning: struct integer overflow masking is deprecated
  self.close()
/usr/lib/python2.6/zipfile.py:1123: DeprecationWarning: 'H' format requires 0 <= number <= 65535
  self.close()
msg137026 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-05-27 06:05
The ZIP file format is unable to store dates before 1980.  With version 3.2, your script even raises an exception.  Please file this in a different issue.
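
The 1980 limit follows from how ZIP headers encode dates: the year is stored as an offset from 1980 inside a 16-bit field. A minimal sketch of that encoding is below. The helper names dos_date and can_store are made up for illustration, and the behavior shown assumes a current Python 3, where struct raises struct.error for out-of-range values.

```python
import struct

def dos_date(year, month, day):
    """Encode a date the way ZIP headers do: years are counted from 1980."""
    return (year - 1980) << 9 | month << 5 | day

def can_store(year, month, day):
    """Return True if the encoded date fits the 16-bit 'H' field."""
    try:
        struct.pack("<H", dos_date(year, month, day))
    except struct.error:
        # A year before 1980 produces a negative value, which 'H' rejects.
        return False
    return True

print(dos_date(1980, 1, 1), dos_date(1970, 1, 1))
print(can_store(1980, 1, 1), can_store(1970, 1, 1))
```

A file whose modification time is set to the Unix epoch, as in the script above, lands before 1980 and so produces a negative encoded date, which is the same 'H' overflow reported in the warnings.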
History
Date User Action Args
2011-05-27 06:05:17 amaury.forgeotdarc set nosy: + amaury.forgeotdarc
messages: + msg137026
2011-05-26 23:43:40 gnezdo set nosy: + gnezdo
messages: + msg137016
2008-07-03 12:53:42 loewis set status: open -> closed
nosy: + loewis
resolution: out of date
messages: + msg69194
2008-01-16 02:38:02 jcea set nosy: + jcea
2008-01-12 10:35:47 alanmcintyre set messages: + msg59811
2008-01-07 04:32:26 gvanrossum set messages: + msg59432
2008-01-07 04:03:16 alanmcintyre set messages: + msg59428
2008-01-07 03:39:30 gvanrossum set messages: + msg59427
2008-01-06 10:03:27 alanmcintyre set files: + zipfile_lotsafiles.diff
messages: + msg59351
2008-01-06 00:07:05 gvanrossum set priority: low
messages: + msg59345
2008-01-05 10:41:41 alanmcintyre set nosy: + alanmcintyre
messages: + msg59282
2007-11-30 17:33:23 gvanrossum set nosy: + gvanrossum
messages: + msg57992
2007-11-30 13:26:57 bialix set messages: + msg57979
components: + Library (Lib)
2007-11-30 13:26:20 bialix create