classification
Title: zipfile and winzip
Type: behavior
Stage:
Components: Extension Modules
Versions: Python 3.0, Python 3.1, Python 2.7, Python 2.6
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: amaury.forgeotdarc
Nosy List: amaury.forgeotdarc, barry, benjamin.peterson, brett.cannon, loewis, pitrou, vgeorge
Priority: release blocker
Keywords: needs review, patch

Created on 2008-09-29 15:10 by vgeorge, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name               Uploaded                               Description
bug.py                  vgeorge, 2008-09-29 15:10              replicates the reported issue
zip-64k-3.patch         amaury.forgeotdarc, 2009-01-17 14:04
zip64-alwaystry2.patch  amaury.forgeotdarc, 2009-01-17 22:27
Messages (16)
msg74030 - Author: vali (vgeorge) Date: 2008-09-29 15:10
Using the zipfile library with Python 2.6 or an earlier version creates
archives that are not compatible with Windows Compress or WinZip.
Other programs like 7-Zip do not have a problem with the format.

Bug Description:
If one attempts to create an archive with more than 65535 files (e.g.
2^16 + 10), WinZip or Windows Compress will show only what is above
65535 (in this case 9 files).

The attached script tries to create an archive with 2^16 + 1 files, and
Windows Compress or WinZip will show an empty archive.
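
The attached bug.py is not reproduced in this thread. A minimal sketch in the same spirit, assuming a current Python 3 in which the fix discussed below is already present, could look like the following. The helper name `build_archive` and the member names are made up for illustration. The sketch only checks that Python itself reads back every member; whether WinZip or Windows Compress accepts the result still has to be checked by hand.

```python
import io
import zipfile


def build_archive(file_count):
    """Return the bytes of a ZIP archive holding `file_count` empty members."""
    buffer = io.BytesIO()
    # allowZip64=True lets zipfile emit ZIP64 records once the 2-byte
    # entry counter of the classic format can no longer hold the count.
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED, allowZip64=True) as archive:
        for index in range(file_count):
            archive.writestr("file%06d.txt" % index, b"")
    return buffer.getvalue()


# One member more than the 16-bit counter can represent, as in the report.
data = build_archive(2**16 + 1)
with zipfile.ZipFile(io.BytesIO(data)) as archive:
    member_count = len(archive.namelist())
print(member_count)
```

To test another tool, write `data` to a file on disk and open that file with the tool.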
msg74031 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-09-29 15:20
An archive with more than 65535 files must use the "64-bit extensions"
of the standard Zip format.

Such archives cannot be opened by programs that do not understand these
extensions. See http://www.winzip.com/wzdic.htm

Which version of Winzip did you try?
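
As a rough illustration of the 65535-file limit, the sketch below asks a current Python 3 `zipfile` to write without ZIP64 support and counts how many members it accepts. This reflects the behaviour of recent Python versions as far as I know, not necessarily that of Python 2.6 at the time of this report. The function name `count_until_refused` is made up for illustration.

```python
import io
import zipfile

ZIP_FILECOUNT_LIMIT = 0xFFFF  # largest value a 2-byte entry counter can hold


def count_until_refused():
    """Write empty members with ZIP64 disabled until zipfile refuses."""
    buffer = io.BytesIO()
    written = 0
    with zipfile.ZipFile(buffer, "w", allowZip64=False) as archive:
        try:
            for index in range(ZIP_FILECOUNT_LIMIT + 1):
                archive.writestr("f%05d" % index, b"")
                written += 1
        except zipfile.LargeZipFile:
            # Raised when one more member would require the ZIP64 extensions.
            pass
    return written


written = count_until_refused()
print(written)
```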
msg75003 - Author: vali (vgeorge) Date: 2008-10-20 20:33
The version I used should not have this limitation, since archives created
with other languages such as Java or C# open fine in the WinZip 11.2
evaluation version. The same issue can also be observed in the Windows
compress utility.
msg75014 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-10-20 22:23
Right, this is a bug in zipfile.py.
The official PKZIP specification says:
http://www.pkware.com/documents/casestudies/APPNOTE.TXT

      total number of entries in the central dir: (2 bytes)

          The total number of files in the .ZIP file. If an 
          archive is in ZIP64 format and the value in this field
          is 0xFFFF, the size will be in the corresponding 8 byte 
          zip64 end of central directory field.

Patch is attached. With it I can open the file with WinZip.
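
To make the quoted field concrete, here is a small sketch of the classic 22-byte end-of-central-directory record packed with `struct`. It is not the code from the patch. The helper name `pack_end_record` and the sample size and offset values are made up for illustration. It contrasts what a 2-byte field keeps when a large count is simply truncated with the 0xFFFF marker the specification describes.

```python
import struct

# Classic end-of-central-directory record (22 bytes): signature, disk number,
# disk holding the central directory, entries on this disk, total entries,
# central directory size, central directory offset, comment length.
END_RECORD = struct.Struct("<4s4H2LH")
END_SIGNATURE = b"PK\x05\x06"


def pack_end_record(entry_count, cd_size, cd_offset):
    """Pack the record, clamping the entry count to the 0xFFFF marker."""
    count_field = min(entry_count, 0xFFFF)
    return END_RECORD.pack(END_SIGNATURE, 0, 0, count_field, count_field,
                           cd_size, cd_offset, 0)


entry_count = 2**16 + 10
wrapped = entry_count & 0xFFFF  # what survives if the count is just truncated
record = pack_end_record(entry_count, cd_size=1234, cd_offset=5678)
total_entries = END_RECORD.unpack(record)[4]
print(wrapped, hex(total_entries))  # prints: 10 0xffff
```

A reader that sees 0xFFFF in this field knows to look for the real count in the zip64 end of central directory record. A truncated count gives it no such hint.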
msg75036 - Author: vali (vgeorge) Date: 2008-10-21 19:24
Thank you for the quick fix. I could verify that the issue is fixed in
Python 2.6 when I use WinZip to open an archive with more than 2^16
files created with the attached script (bug.py). However, the Windows
native compress utility does not seem to recognize archives that have
more than 2^16 files (the error says that the archive is not valid).
If I create an archive with exactly the same files using WinZip or 7zip,
the Windows compress utility is able to open it. The same applies if I
create the archive using the zlib C# library.
msg75040 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-10-21 20:41
OK, it seems that the "central directory size" and "central directory
offset" fields must contain their actual values if they fit in a 32-bit
int, even though the spec says "If an archive is in ZIP64 format and the
value in this field is 0xFFFFFFFF, the value is in the corresponding
zip64 end of central directory field."

Attached a new patch. The file generated by bug.py seems very similar to 
one generated with WinZip, and the WindowsXP explorer can now open it.
msg75145 - Author: vali (vgeorge) Date: 2008-10-23 16:23
I could verify that the patch works with the Windows Compress utility,
WinZip and 7zip.
Thank you!
msg79181 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-01-05 18:31
The patch looks nearly fine. AFAICT, care must be taken to always write
a ZIP64 end-of-cd record whenever an end-of-cd field overflows; I think
this patch is missing the condition centDirSize > ZIP64_LIMIT.
msg80015 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-01-17 14:04
Here is an updated version (zip-64k-3.patch).
Now the condition for writing a "ZIP64 end-of-archive" depends on the 
size of all three values.
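
The rule discussed in the last few messages can be sketched as follows. This is not the text of zip-64k-3.patch, which is not shown in this thread. The function name `plan_end_records` is made up for illustration, and the limit constants are assumed to match the values used by current versions of zipfile.py.

```python
ZIP_FILECOUNT_LIMIT = (1 << 16) - 1  # largest count the 2-byte field can hold
ZIP64_LIMIT = (1 << 31) - 1          # assumed size/offset threshold


def plan_end_records(entry_count, cd_size, cd_offset):
    """Decide whether a ZIP64 end record is needed and what the classic
    end record should carry in its three size-limited fields."""
    # A ZIP64 end record is required as soon as ANY of the three overflows.
    needs_zip64 = (entry_count > ZIP_FILECOUNT_LIMIT
                   or cd_offset > ZIP64_LIMIT
                   or cd_size > ZIP64_LIMIT)
    # Each classic field keeps its real value when it fits, and is clamped
    # to its all-ones marker only when it does not.
    classic_fields = (min(entry_count, 0xFFFF),
                      min(cd_size, 0xFFFFFFFF),
                      min(cd_offset, 0xFFFFFFFF))
    return needs_zip64, classic_fields


print(plan_end_records(10, 100, 200))
print(plan_end_records(2**16 + 10, 100, 200))
print(plan_end_records(10, 1 << 32, 200))
```

Clamping field by field is what keeps the size and offset readable for tools such as the Windows XP explorer when only the entry count overflows.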
msg80016 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-01-17 14:17
Looks fine to me. Please apply.
msg80022 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-01-17 16:57
Fixed with r68661 (trunk), r68662 (py3k), r68663 (2.6) and r68664 (3.0).
msg80047 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-01-17 21:03
Reopening, some tests have started deterministically failing:

======================================================================
ERROR: testAbsoluteArcnames (test.test_zipfile.TestZip64InSmallFiles)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/test/test_zipfile.py", line 506, in testAbsoluteArcnames
    zipfp = zipfile.ZipFile(TESTFN2, "r", zipfile.ZIP_STORED)
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/zipfile.py", line 710, in __init__
    self._GetContents()
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/zipfile.py", line 730, in _GetContents
    self._RealGetContents()
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/zipfile.py", line 767, in _RealGetContents
    raise BadZipfile, "Bad magic number for central directory"
BadZipfile: Bad magic number for central directory

======================================================================
ERROR: testDeflated (test.test_zipfile.TestZip64InSmallFiles)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/test/test_zipfile.py", line 499, in testDeflated
    self.zipTest(f, zipfile.ZIP_DEFLATED)
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/test/test_zipfile.py", line 434, in zipTest
    zipfp = zipfile.ZipFile(f, "r", compression)
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/zipfile.py", line 710, in __init__
    self._GetContents()
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/zipfile.py", line 730, in _GetContents
    self._RealGetContents()
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/zipfile.py", line 767, in _RealGetContents
    raise BadZipfile, "Bad magic number for central directory"
BadZipfile: Bad magic number for central directory

======================================================================
ERROR: testStored (test.test_zipfile.TestZip64InSmallFiles)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/test/test_zipfile.py", line 493, in testStored
    self.zipTest(f, zipfile.ZIP_STORED)
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/test/test_zipfile.py", line 434, in zipTest
    zipfp = zipfile.ZipFile(f, "r", compression)
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/zipfile.py", line 710, in __init__
    self._GetContents()
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/zipfile.py", line 730, in _GetContents
    self._RealGetContents()
  File "/home/pybot/buildarea/trunk.klose-debian-ia64/build/Lib/zipfile.py", line 767, in _RealGetContents
    raise BadZipfile, "Bad magic number for central directory"
BadZipfile: Bad magic number for central directory

----------------------------------------------------------------------
msg80049 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-01-17 21:39
Oops, I did see this before, but forgot to merge the patches.

It appears that "centDirOffset == 0xffffffff" is not the correct test
for detecting a zip64 structure. From the PKZip notes, a value of
0xffffffff does imply a zip64 format, but not the other way round.

So it seems necessary to always try to read the zip64 info (the
_EndRecData64 function does not fail if the format is not zip64; it
simply returns the previous structure).
Patch is attached, and now the test passes.
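
The "always try" approach can be sketched outside of zipfile.py as below. This is not the code of `_EndRecData64` or of the attached patch. The names `make_archive` and `read_entry_count` are made up for illustration, and the sketch assumes an archive without a trailing comment, so that the classic end record is exactly the last 22 bytes. It looks for the zip64 locator unconditionally and falls back to the classic record when none is found.

```python
import io
import struct
import zipfile

END_RECORD = struct.Struct("<4s4H2LH")          # classic end record, 22 bytes
ZIP64_LOCATOR = struct.Struct("<4sLQL")         # zip64 locator, 20 bytes
ZIP64_END_RECORD = struct.Struct("<4sQ2H2L4Q")  # zip64 end record, 56 bytes


def make_archive(file_count):
    """Build an in-memory archive of empty members (no archive comment)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for index in range(file_count):
            archive.writestr("f%06d" % index, b"")
    return buffer.getvalue()


def read_entry_count(data):
    """Return the total entry count, preferring zip64 data when present."""
    end_start = len(data) - END_RECORD.size
    fields = END_RECORD.unpack_from(data, end_start)
    if fields[0] != b"PK\x05\x06":
        raise ValueError("end-of-central-directory record not found")
    count = fields[4]  # classic 2-byte total entry count
    # Always look for the locator; do not wait for a 0xFFFF/0xFFFFFFFF marker.
    locator_start = end_start - ZIP64_LOCATOR.size
    if locator_start >= 0 and data[locator_start:locator_start + 4] == b"PK\x06\x07":
        zip64_start = locator_start - ZIP64_END_RECORD.size
        if zip64_start >= 0:
            zip64_fields = ZIP64_END_RECORD.unpack_from(data, zip64_start)
            if zip64_fields[0] == b"PK\x06\x06":
                count = zip64_fields[7]  # 8-byte total entry count
    return count


small_count = read_entry_count(make_archive(3))
large_count = read_entry_count(make_archive(2**16 + 1))
print(small_count, large_count)
```

When no locator is present the function just returns the classic value, which mirrors the description above of a reader that does not fail on non-zip64 archives.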
msg80057 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-01-17 22:44
Applied the zip64-alwaystry2.patch in trunk, waiting for buildbots.
msg80672 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2009-01-27 22:27
Amaury, did the buildbots verify this worked and thus this bug can be
closed?
msg80708 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-01-28 13:35
Yes, the correction went in r68678, r68700, r68734 and r68735.
History
Date  User  Action  Args
2022-04-11 14:56:39  admin  set  nosy: + barry, benjamin.peterson; github: 48247
2009-01-28 13:35:43  amaury.forgeotdarc  set  status: open -> closed; resolution: fixed; messages: + msg80708
2009-01-27 22:27:53  brett.cannon  set  nosy: + brett.cannon; messages: + msg80672
2009-01-27 21:16:24  vstinner  set  files: - zip-64k-2.patch
2009-01-27 21:16:17  vstinner  set  files: - zip-64k.patch
2009-01-17 22:44:56  amaury.forgeotdarc  set  messages: + msg80057
2009-01-17 22:27:27  amaury.forgeotdarc  set  files: - zip64-alwaystry.patch
2009-01-17 22:27:19  amaury.forgeotdarc  set  files: + zip64-alwaystry2.patch
2009-01-17 21:39:49  amaury.forgeotdarc  set  keywords: + needs review; files: + zip64-alwaystry.patch; messages: + msg80049
2009-01-17 21:03:09  pitrou  set  status: closed -> open; nosy: + pitrou; resolution: fixed -> (no value); messages: + msg80047
2009-01-17 16:57:46  amaury.forgeotdarc  set  status: open -> closed; resolution: accepted -> fixed; messages: + msg80022
2009-01-17 14:17:46  loewis  set  keywords: - needs review; messages: + msg80016
2009-01-17 14:04:03  amaury.forgeotdarc  set  files: + zip-64k-3.patch; messages: + msg80015
2009-01-05 18:31:28  loewis  set  assignee: amaury.forgeotdarc; resolution: accepted; messages: + msg79181
2009-01-04 01:04:14  loewis  set  versions: + Python 3.0, Python 3.1, Python 2.7
2008-12-30 12:14:52  loewis  set  priority: release blocker
2008-10-23 16:23:11  vgeorge  set  messages: + msg75145
2008-10-21 20:41:14  amaury.forgeotdarc  set  files: + zip-64k-2.patch; messages: + msg75040
2008-10-21 19:24:59  vgeorge  set  messages: + msg75036
2008-10-20 22:23:51  amaury.forgeotdarc  set  keywords: + needs review, patch; nosy: + loewis; messages: + msg75014; files: + zip-64k.patch
2008-10-20 20:33:49  vgeorge  set  messages: + msg75003
2008-09-29 15:20:27  amaury.forgeotdarc  set  nosy: + amaury.forgeotdarc; messages: + msg74031
2008-09-29 15:10:58  vgeorge  create