classification
Title: zipfile.ZipFile behavior inconsistent.
Type: behavior Stage: test needed
Components: Library (Lib) Versions: Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: BreamoreBoy, Retro, alanmcintyre, georg.brandl, markaflacy, nnorwitz, pitrou
Priority: normal Keywords: patch

Created on 2007-05-01 16:43 by markaflacy, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
zipfile_empty.diff georg.brandl, 2010-07-29 17:43
zipfile_empty3.diff alanmcintyre, 2010-10-02 20:49 latest patch against svn trunk
Messages (15)
msg31920 - (view) Author: Mark Flacy (markaflacy) Date: 2007-05-01 16:43
In short, ZipFile() will not write the Central Directory entry unless you have added a file to it.  That makes it impossible to create a valid empty zip archive.

In one of my applications, I have the need to extract a partial set of information from one zip file and insert it into another.  There are valid use cases where the source zip archive will not have any of the files which I am looking for.  In Python 2.4, I would end up with an empty file which was considered to be a valid empty zip archive.  In Python 2.5, an empty file is not considered a valid zip archive.  

One would reasonably expect that creating a new ZipFile(mode="w") and successfully closing it without writing any entries would result in a valid zip archive that could be re-opened later without throwing an exception.
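
For illustration (not part of the original report), the expected round trip can be sketched as below. It assumes a Python version that includes the fix from this issue; the helper names and the file name are placeholders chosen for the example.

```python
import os
import tempfile
import zipfile


def make_empty_archive(path):
    """Create a zip archive with no members and return its size in bytes."""
    # Open for writing and close immediately without adding any entries.
    with zipfile.ZipFile(path, "w"):
        pass
    return os.path.getsize(path)


def list_members(path):
    """Re-open the archive read-only and return its member names."""
    with zipfile.ZipFile(path, "r") as archive:
        return archive.namelist()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "empty.zip")  # placeholder file name
        size = make_empty_archive(target)
        print(size, zipfile.is_zipfile(target), list_members(target))
```

With the 2.5 behavior described above, the file written by the "w" step would be empty and the re-open step would fail.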

msg31921 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2007-05-02 06:06
Mark, can you create a patch for zipfile to make it do what you want?  Do the docs mention anything about this either way?  Perhaps the docs also need updating?

I don't know what happened here, but I'm guessing there was some bug fix.  This change could have been intentional or not.  A patch will help us figure out what went wrong and how to proceed.
msg31922 - (view) Author: Mark Flacy (markaflacy) Date: 2007-05-04 07:26
No wonder you're confused.  My description of how 2.4 worked was flat-out wrong; empty files opened as zip files will raise IOError and have done so since 2.4 at least (I didn't look further back than that).  However, it *is* the case that 2.4 would correctly write the Central Directory entry on zipfile close for "w" and "a" modes, even for zip files that never had any entries written into them.

In 2.4, the ZipFile.close() method contains the line...

        if self.mode in ("w", "a"):             # write ending records

...while in 2.5, the test was changed to...

        if self.mode in ("w", "a") and self._didModify: # write ending records

That change was added in revision 46967 as part of the ZIP64 support and that change breaks backwards compatibility (as well as not making a lot of sense for the "w" case).
msg31923 - (view) Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2007-05-09 21:40
I tried out a change to set the modified flag (_didModify) if the ZipFile constructor ends up having mode 'w' or decides that it's appending to a file with no existing zip structure at the end.  I'm waiting on the full regression test suite to run against it, but it passes everything in test_zipfile.py (and I added new tests to check for the behavior with empty files).  I can post the patch if Mark hasn't had a chance to work one up yet.

The docs don't seem to say anything about what happens if you open a ZipFile in 'w' or 'a' and then just close it.  I wouldn't mind updating the docs to cover this if desired.
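
As a rough sketch of the 'a' case (an illustration under assumptions, not the code from the patch): on a Python version with this fix, opening a missing or zero-length file in append mode and closing it straight away should leave a valid empty archive. The helper names here are made up for the example.

```python
import os
import tempfile
import zipfile


def append_nothing(path):
    """Open path in append mode, add no entries, and close it."""
    with zipfile.ZipFile(path, "a"):
        pass


def is_valid_empty_archive(path):
    """Return True if path is a readable zip archive with no members."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        return archive.namelist() == []


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Case 1: the file does not exist yet.
        fresh = os.path.join(tmp, "fresh.zip")
        append_nothing(fresh)
        print(is_valid_empty_archive(fresh))

        # Case 2: the file exists but has no zip structure (zero bytes).
        blank = os.path.join(tmp, "blank.zip")
        open(blank, "wb").close()
        append_nothing(blank)
        print(is_valid_empty_archive(blank))
```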

As a side note, when attempting to open an empty file in 'r' mode, a mostly unhelpful IOError (with message "invalid parameter") gets raised in _EndRecData when attempting to seek backwards.  It seems that it would be preferable to catch any exceptions raised by _EndRecData and raise a BadZipFile so that it's not as cryptic.
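
A minimal sketch of what the caller's side could look like once the error is reported as BadZipFile, assuming Python 3.2 or later (where zipfile.BadZipFile exists). The function name and the message text are hypothetical.

```python
import os
import tempfile
import zipfile


def open_or_explain(path):
    """Return the member names of path, or a short message if it is not a zip."""
    try:
        with zipfile.ZipFile(path, "r") as archive:
            return archive.namelist()
    except zipfile.BadZipFile as exc:
        # A zero-length file has no end-of-archive record to seek back to.
        return "not a zip archive: %s" % exc


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        blank = os.path.join(tmp, "blank.zip")  # placeholder file name
        open(blank, "wb").close()  # create a zero-length file
        print(open_or_explain(blank))
```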
msg59841 - (view) Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2008-01-12 21:51
Here's a quick patch that covers the issues mentioned in my post from
2007-05-09.
msg111979 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-07-29 17:43
Alan, I've updated the patch for current 3.2 trunk, see attached, but the new test now fails.  Can you find out why?
msg112164 - (view) Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2010-07-31 16:24
Sure thing; I'll see if I can have a look within the next week or so.
msg113934 - (view) Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2010-08-14 22:44
Apparently _EndRecData64 needed the same kind of check that _EndRecData has when trying to seek to the end-of-archive record.  So I added that, and everything seems to work correctly now.  All tests pass on my 64-bit Linux box (including test_zipfile64).

The updated patch against py3k/trunk is attached as zipfile_empty2.diff.
msg113950 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-08-15 12:33
Two nits:
- bug fixes shouldn't have a "versionadded" or "versionchanged" entry (it's only for new features)
- Misc/NEWS should be in antichronological order
msg113952 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-08-15 12:50
Agreed with Antoine.  Do you want to commit?
msg116677 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-09-17 16:19
I don't think the latest patch has been committed, could someone wave their magic wand please :)
msg117896 - (view) Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2010-10-02 20:49
My apologies if Georg was waiting on me to say, "Yes." :-)

I've attached an updated patch that has the NEWS/doc changes Antoine mentioned.  I also just checked that the tests still pass on Linux against the current trunk, and that the docs still build properly.
msg118627 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-10-14 07:02
Okay, applied as r85455.
msg122660 - (view) Author: Boštjan Mejak (Retro) Date: 2010-11-28 14:43
Please fix this patch to raise BadZipFile instead of BadZipfile. See http://docs.python.org/dev/library/zipfile.html?highlight=zipfile#zipfile.BadZipfile. The use of the class name BadZipfile is deprecated. The class name BadZipFile is preferred.
msg122695 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-11-28 18:23
Not sure what you're referring to, all occurrences of BadZipfile (except for its definition and its documentation) have been removed in py3k.
History
Date                 User          Action  Args
2022-04-11 14:56:24  admin         set     github: 44916
2010-11-28 18:23:51  georg.brandl  set     messages: + msg122695
2010-11-28 14:43:04  Retro         set     nosy: + Retro
                                           messages: + msg122660
2010-10-14 07:02:08  georg.brandl  set     status: open -> closed
                                           resolution: fixed
                                           messages: + msg118627
2010-10-02 20:49:49  alanmcintyre  set     files: - zipfile_empty2.diff
2010-10-02 20:49:09  alanmcintyre  set     files: + zipfile_empty3.diff
                                           messages: + msg117896
2010-10-02 20:10:06  alanmcintyre  set     files: - empty-zipfile.diff
2010-09-17 16:19:20  BreamoreBoy   set     nosy: + BreamoreBoy
                                           messages: + msg116677
2010-08-15 12:50:52  georg.brandl  set     messages: + msg113952
2010-08-15 12:33:22  pitrou        set     nosy: + pitrou
                                           messages: + msg113950
2010-08-14 22:44:26  alanmcintyre  set     files: + zipfile_empty2.diff
                                           messages: + msg113934
2010-07-31 16:24:15  alanmcintyre  set     messages: + msg112164
2010-07-29 17:43:31  georg.brandl  set     files: + zipfile_empty.diff
                                           versions: + Python 3.2, - Python 2.7
                                           nosy: + georg.brandl
                                           messages: + msg111979
2009-07-04 02:35:15  ezio.melotti  set     keywords: + patch
                                           stage: test needed
                                           type: behavior
                                           versions: + Python 2.7, - Python 2.5
2008-01-12 21:51:34  alanmcintyre  set     files: + empty-zipfile.diff
                                           messages: + msg59841
2007-05-01 16:43:50  markaflacy    create