Title: ZipFile doesn't range check in _EndRecData()
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.2, Python 3.3, Python 3.4, Python 2.7
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: alanmcintyre, ebfe, mcherm, neologix, pitrou, python-dev, serhiy.storchaka, ymgve
Priority: normal Keywords: patch

Created on 2009-01-05 15:24 by ymgve, last changed 2022-04-11 14:56 by admin. This issue is now closed.

File name Uploaded Description Edit
ymgve, 2009-01-05 15:36
issue4844.diff gpolo, 2009-01-05 21:05 review
issue4844-with-test.diff alanmcintyre, 2010-08-22 01:39 added test to patch review
zipfile_unpack_check.patch serhiy.storchaka, 2013-01-26 17:01 review
Messages (11)
msg79155 - (view) Author: Yngve AAdlandsvik (ymgve) Date: 2009-01-05 15:24
If you have a .zip file with an incomplete "End of Central Directory" 
record, _EndRecData() will throw a struct.error:

D:\c64workdir\Ultimate_Mag_Archive> "old - 
Handling A-z\0\
Traceback (most recent call last):
  File "E:\wwwroot\c64db\tools\", line 48, in <module>
    ok = handle_file(data, rel_filename)
  File "E:\wwwroot\c64db\tools\", line 19, in handle_file
    z = zipfile.ZipFile(cStringIO.StringIO(data), "r")
  File "C:\Python26\lib\", line 698, in __init__
  File "C:\Python26\lib\", line 718, in _GetContents
  File "C:\Python26\lib\", line 728, in _RealGetContents
    endrec = _EndRecData(fp)
  File "C:\Python26\lib\", line 219, in _EndRecData
    endrec = list(struct.unpack(structEndArchive, recData))
struct.error: unpack requires a string argument of length 22

The fix is to check that there is enough data for the whole record
before attempting to unpack it.
msg79156 - (view) Author: Lukas Lueg (ebfe) Date: 2009-01-05 15:28
Please attach the file if possible.
msg79158 - (view) Author: Yngve AAdlandsvik (ymgve) Date: 2009-01-05 15:36
Here is the file. Note that this can be reproduced with any zip file if 
you delete the last byte of the file.
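
The reproduction described in this message can be sketched without the attached file. This is only an illustration: it builds a small archive in memory, drops its last byte, and tries to open the result. The helper name `open_truncated`, the member name, and the contents are placeholders, not taken from the report. On interpreters that include the fix for this issue, the failure is expected to surface as `zipfile.BadZipFile` rather than `struct.error`.

```python
import io
import zipfile


def open_truncated(cut=1):
    """Build a tiny ZIP in memory, drop its last `cut` bytes, and try to open it.

    Returns the name of the exception class that was raised,
    or None if the damaged archive opened without error.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("foo.txt", b"placeholder contents")  # hypothetical member

    damaged = buf.getvalue()[:-cut]  # incomplete "End of Central Directory" record
    try:
        with zipfile.ZipFile(io.BytesIO(damaged), "r"):
            pass
    except Exception as exc:
        return type(exc).__name__
    return None


print(open_truncated())
```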
msg114636 - (view) Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2010-08-22 01:39
I wrote a test for this and tried out the patch on the Python3 trunk, and it seems to work ok.  I've attached an updated patch that includes the test.

It probably wouldn't hurt to go look for other places where a struct is being unpacked without checking lengths first, and see if it makes sense to add a similar check in those places, too.  I may do that later if I have some more free time.
msg116885 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2010-09-19 21:06
Following the EAFP principle, it would be better (cleaner and more efficient) to put the struct.unpack call inside a try/except clause than to check the lengths beforehand.
msg116889 - (view) Author: Alan McIntyre (alanmcintyre) * (Python committer) Date: 2010-09-19 21:59
I had to look up the abbreviation (Easier to Ask Forgiveness than Permission), but that does sound like a good idea.  Thanks for mentioning it. :-)
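
The two approaches discussed in this thread, checking the length first versus catching `struct.error`, can be compared in a minimal sketch. It is not the code from any of the attached patches. The function names are hypothetical, and the format string is an assumption that mirrors the 22-byte record size quoted in the traceback.

```python
import struct

# Assumed layout of the 22-byte "End of Central Directory" record.
END_RECORD_FORMAT = "<4s4H2LH"
END_RECORD_SIZE = struct.calcsize(END_RECORD_FORMAT)


def unpack_checked(rec_data):
    """Length check first: return None when the record is incomplete."""
    if len(rec_data) != END_RECORD_SIZE:
        return None
    return list(struct.unpack(END_RECORD_FORMAT, rec_data))


def unpack_eafp(rec_data):
    """EAFP: attempt the unpack and treat struct.error as 'incomplete record'."""
    try:
        return list(struct.unpack(END_RECORD_FORMAT, rec_data))
    except struct.error:
        return None
```

Both helpers return `None` for a short record, so a caller could turn that into a single "bad ZIP file" error instead of letting `struct.error` escape.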
msg176744 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-01 14:21
Here is a patch for 3.4, which adds checks for the other unpacks (except one, for which issue14315 exists). Also, BadZipfile is replaced by BadZipFile and trailing whitespace is deleted.

For 2.7, BadZipFile should be changed back to BadZipfile.
msg176803 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-12-02 18:15
In test_damaged_zipfile:

+        for N in range(len(s) - 2):
+            with open(TESTFN, "wb") as f:
+                f.write(s[:N])

why not `range(len(s))` instead?
msg176809 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-12-02 20:59
I just copied it from Alan's test. Actually, the `- 2` is not needed; `range(len(s))` can be used.
msg180683 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-01-26 17:01
Patch updated. Now the test uses io.BytesIO() for input too. The loop limit changed from len() - 2 to len().

If there are no objections I'll commit this patch next week.
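
The loop described in this message might look roughly like the sketch below. It is not the committed test, only an illustration of the idea under stated assumptions: every proper prefix of a valid archive, fed through `io.BytesIO`, should be rejected with `zipfile.BadZipFile`. The function name `truncations_not_rejected` and the archive contents are hypothetical.

```python
import io
import zipfile


def truncations_not_rejected():
    """Return the prefix lengths N for which a truncated archive still opens."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("foo.txt", b"placeholder contents")  # hypothetical member
    s = buf.getvalue()

    survivors = []
    for n in range(len(s)):  # full range, not len(s) - 2
        try:
            with zipfile.ZipFile(io.BytesIO(s[:n])):
                pass
        except zipfile.BadZipFile:
            continue  # expected: damaged archive rejected cleanly
        survivors.append(n)
    return survivors


print(truncations_not_rejected())
```

Any exception other than `BadZipFile` (for example a leaked `struct.error`) propagates out of the function, which is the failure mode this issue is about.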
msg181019 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-01-31 13:33
New changeset 32de35f0f877 by Serhiy Storchaka in branch '2.7':
Issue #4844: ZipFile now raises BadZipfile when opens a ZIP file with an

New changeset 01147e468c8c by Serhiy Storchaka in branch '3.2':
Issue #4844: ZipFile now raises BadZipFile when opens a ZIP file with an

New changeset 46f24a18a4ab by Serhiy Storchaka in branch '3.3':
Issue #4844: ZipFile now raises BadZipFile when opens a ZIP file with an

New changeset e406b8bd7b38 by Serhiy Storchaka in branch 'default':
Issue #4844: ZipFile now raises BadZipFile when opens a ZIP file with an
Date                 User              Action  Args
2022-04-11 14:56:43  admin             set     github: 49094
2013-01-31 14:15:48  serhiy.storchaka  set     status: open -> closed; resolution: fixed; stage: patch review -> resolved
2013-01-31 13:33:39  python-dev        set     nosy: + python-dev; messages: + msg181019
2013-01-26 17:01:19  serhiy.storchaka  set     files: + zipfile_unpack_check.patch; assignee: mcherm -> serhiy.storchaka; messages: + msg180683
2013-01-26 16:55:31  serhiy.storchaka  set     files: - zipfile_unpack_check.patch
2012-12-02 20:59:57  serhiy.storchaka  set     messages: + msg176809
2012-12-02 18:15:09  pitrou            set     nosy: + pitrou; messages: + msg176803
2012-12-01 14:21:49  serhiy.storchaka  set     files: + zipfile_unpack_check.patch; versions: + Python 2.7, Python 3.2, Python 3.3, Python 3.4, - Python 2.6; messages: + msg176744; type: behavior; stage: patch review
2012-04-07 19:17:15  serhiy.storchaka  set     nosy: + serhiy.storchaka
2010-09-19 21:59:23  alanmcintyre      set     messages: + msg116889
2010-09-19 21:06:09  neologix          set     nosy: + neologix; messages: + msg116885
2010-08-22 01:39:50  alanmcintyre      set     files: + issue4844-with-test.diff; messages: + msg114636
2010-08-21 23:43:10  georg.brandl      set     assignee: mcherm; nosy: + alanmcintyre, mcherm
2009-01-05 21:05:22  gpolo             set     files: + issue4844.diff; keywords: + patch
2009-01-05 15:36:03  ymgve             set     files: +; messages: + msg79158
2009-01-05 15:28:17  ebfe              set     nosy: + ebfe; messages: + msg79156
2009-01-05 15:24:10  ymgve             create