Message 115250 - Python tracker


Author	craigds
Recipients	craigds
Date	2010-08-31.01:02:13
Message-id	<1283216540.17.0.624049939978.issue9720@psf.upfronthosting.co.za>
In-reply-to

Content

Steps to reproduce:

# create a large (>4 GiB) file: 5 * 1024 writes of 1 MiB each = 5 GiB
# (Python 2 syntax, as tested on 2.5)
f = open('foo.txt', 'wb')
text = 'a' * 1024**2  # 1 MiB of data
for i in xrange(5 * 1024):
    f.write(text)
f.close()

# now zip the file
import zipfile
z = zipfile.ZipFile('foo.zip', mode='w', allowZip64=True)
z.write('foo.txt')
z.close()


Now inspect the file headers in foo.zip using a hex editor. The written headers are incorrect: for a file this large, the file size and compressed size fields should both be written as 0xffffffff, and the Zip64 'extra field' should contain the actual sizes.
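As an alternative to a hex editor, the first local file header can be decoded with struct. This is only a rough sketch written for Python 3: the function name parse_local_header and the returned keys are made up for illustration, and the field layout (30-byte fixed header, extra field ID 0x0001 for Zip64) is taken from the ZIP format rather than from the patch.

```python
import struct

# Fixed part of a ZIP local file header (30 bytes, little-endian):
# signature, version, flags, method, mod time, mod date, CRC-32,
# compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct("<4sHHHHHLLLHH")
ZIP64_EXTRA_ID = 0x0001


def parse_local_header(data):
    """Decode the local file header at the start of `data` (hypothetical helper)."""
    (sig, _version, _flags, _method, _mtime, _mdate, _crc,
     csize, usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(data, 0)
    if sig != b"PK\x03\x04":
        raise ValueError("data does not start with a local file header")

    extra_start = LOCAL_HEADER.size + name_len
    extra = data[extra_start:extra_start + extra_len]

    # Walk the extra field records: (id, length, body) repeated.
    zip64_sizes = None
    pos = 0
    while pos + 4 <= len(extra):
        field_id, field_len = struct.unpack_from("<HH", extra, pos)
        body = extra[pos + 4:pos + 4 + field_len]
        if field_id == ZIP64_EXTRA_ID and len(body) >= 16:
            # In a local header: 8-byte uncompressed size, then 8-byte compressed size.
            zip64_sizes = struct.unpack_from("<QQ", body, 0)
        pos += 4 + field_len

    return {
        "compressed_size": csize,
        "uncompressed_size": usize,
        "zip64_sizes": zip64_sizes,  # None if no Zip64 extra field is present
    }


# Usage (assumes foo.zip from the steps above exists):
# with open('foo.zip', 'rb') as f:
#     print(parse_local_header(f.read(1024)))
```

For a correctly written header of the 5 GiB file, both 32-bit size fields would be 0xffffffff and zip64_sizes would hold the real sizes.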


Tested on Python 2.5, but judging by the latest code in 3.2, it still looks broken.

The problem is that the ZipInfo.FileHeader() is written before the filesize is populated, so Zip64 extensions are not written. Later, the sizes in the header are filled in, but Zip64 extensions are not taken into account, so the filesize just wraps around the 32-bit field (7gb becomes 3gb, for instance).
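The wrap-around is plain 32-bit truncation. A small sketch of the arithmetic follows; the helper names are made up for illustration, and the check against 0xffffffff reflects the format's 32-bit fields, not necessarily the exact threshold zipfile uses internally.

```python
GIB = 1024 ** 3
MAX_32 = 0xFFFFFFFF  # largest value of a 32-bit field; also the Zip64 sentinel


def wrapped_size(size):
    """Value that ends up in a 32-bit header field if the size is simply truncated."""
    return size & MAX_32


def fits_in_32_bits(size):
    """True if the size can be stored directly (0xffffffff itself is reserved)."""
    return size < MAX_32


assert wrapped_size(7 * GIB) == 3 * GIB   # 7gb becomes 3gb
assert wrapped_size(5 * GIB) == 1 * GIB   # the 5 GiB test file would show up as 1 GiB
assert not fits_in_32_bits(5 * GIB)
```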

My patch fixes the problem on Python 2.5; it might need minor porting to fix trunk. It works by assigning the uncompressed filesize to the ZipInfo header initially, then writing the header. Later on, once the compressed size is known, I re-write the header in place (this is okay since the header size will not have increased).
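To illustrate why setting the size up front matters, here is a rough sketch (not the attached patch) that builds headers with ZipInfo.FileHeader() on Python 3. The helper local_header is hypothetical, and CRC is set to a placeholder because a bare ZipInfo may not define it.

```python
import zipfile

GIB = 1024 ** 3


def local_header(file_size, compress_size):
    """Build a local file header for foo.txt with the given sizes (illustrative only)."""
    zinfo = zipfile.ZipInfo("foo.txt")
    zinfo.CRC = 0  # placeholder; normally filled in once the data has been read
    zinfo.file_size = file_size
    zinfo.compress_size = compress_size
    return zinfo.FileHeader()


# Header written before the size is known: no Zip64 extra field.
size_unknown = local_header(0, 0)

# Header written with the uncompressed size assigned first: Zip64 extra field present.
size_known = local_header(5 * GIB, 0)

# Header re-written after compression, with both sizes filled in.
final = local_header(5 * GIB, 5 * GIB)

# The re-written header has the same length as the reserved one, so it can
# safely overwrite it without touching the file data that follows.
assert len(final) == len(size_known)

# Without the size up front, the reserved header is too short for the Zip64 extra.
assert len(size_unknown) < len(final)

# Both 32-bit size fields (offsets 18..26) hold the 0xffffffff sentinel.
assert final[18:26] == b"\xff" * 8
```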

History
Date	User	Action	Args
2010-08-31 01:02:20	craigds	set	recipients: + craigds
2010-08-31 01:02:20	craigds	set	messageid: <1283216540.17.0.624049939978.issue9720@psf.upfronthosting.co.za>
2010-08-31 01:02:16	craigds	link	issue9720 messages
2010-08-31 01:02:15	craigds	create