classification
Title: zipfile.BadZipFile: File is not a zip file
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.6
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: Yasar L. Ahmed, alexei.romanov, ronaldoussoren, serhiy.storchaka
Priority: normal Keywords:

Created on 2015-07-12 19:24 by Yasar L. Ahmed, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
not_working.zip Yasar L. Ahmed, 2015-07-12 19:24 non working zip-file
Messages (7)
msg246662 - (view) Author: Yasar L. Ahmed (Yasar L. Ahmed) Date: 2015-07-12 19:24
I have a zip-file that can be opened/extracted with 7zip but gives an error when I try to work with it in Python:
Py2 (2.7.3 / Linux/Debian7):
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/zipfile.py", line 714, in __init__
    self._GetContents()
  File "/usr/lib/python2.7/zipfile.py", line 748, in _GetContents
    self._RealGetContents()
  File "/usr/lib/python2.7/zipfile.py", line 763, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
zipfile.BadZipfile: File is not a zip file


Py3 (3.4.3 / Windows 7 /64bit):
Traceback (most recent call last):
  File "<pyshell#20>", line 1, in <module>
    badzip = ZipFile(bad)
  File "C:\Python34\lib\zipfile.py", line 937, in __init__
    self._RealGetContents()
  File "C:\Python34\lib\zipfile.py", line 978, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

The zip-file is attached.
msg246794 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-07-16 07:29
unzip can't process this file either.

$ unzip -v not_working.zip 
Archive:  not_working.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of not_working.zip or
        not_working.zip.zip, and cannot find not_working.zip.ZIP, period.
msg246857 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-07-17 15:05
not_working.zip has 85972 extra null bytes at the end. This doesn't look like a common kind of ZIP file, and adding support for such files would be a new feature (if it is worth doing at all). How did you get this file, Yasar?
msg246858 - (view) Author: Alexei Romanov (alexei.romanov) Date: 2015-07-17 15:47
The 7z archiver can extract this ZIP archive without any problems:
~/tmp $ 7z x not_working.zip 

7-Zip [64] 9.20  Copyright (c) 1999-2010 Igor Pavlov  2010-11-18
p7zip Version 9.20 (locale=en_US.utf8,Utf16=on,HugeFiles=on,4 CPUs)

Processing archive: not_working.zip

Extracting  CoordinateData.AmplitudesDataType
Extracting  CoordinateData.Amplitudes
Extracting  CoordinateData.VolumesDataType
Extracting  CoordinateData.Volumes

Everything is Ok

Files: 4
Size:       103746
Compressed: 176384
msg246861 - (view) Author: Yasar L. Ahmed (Yasar L. Ahmed) Date: 2015-07-17 17:34
@Serhiy These files are inside another Zip-bundle exported from a commercial control software for chromatography (UNICORN 6+ by GE Healthcare). Some of the other Zip-Files in the bundle work fine but some (like this one) don't.

I'm writing a script to extract/decode the data, so I'd be happy to get them extracted via Python in some way (without requiring external dependencies).

Would it help to remove the offending bytes and then feed the bytes-object to ZipFile?
msg246864 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-07-17 18:31
> Would it help to remove the offending bytes and then feed the bytes-object to ZipFile?

Yes, it will.

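import io
import struct
import zipfile

with open('not_working.zip', 'rb') as f:
    data = f.read()

# Locate the end-of-central-directory record (its fixed part is 22 bytes).
i = data.rindex(b'PK\5\6') + 22
# The last two bytes of that record give the length of the archive comment.
i += struct.unpack('<H', data[i-2: i])[0]
# Drop everything after the comment, but only if it is all null bytes.
if data[i:].strip(b'\0') == b'':
    data = data[:i]

zf = zipfile.ZipFile(io.BytesIO(data))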
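
For reuse in a script, the same idea can be wrapped in a function. This is only a sketch of the workaround above, not part of zipfile: the name open_zip_ignoring_null_padding and its return value are made up for illustration, and it assumes the last b'PK\5\6' in the data really is the end-of-central-directory record (a comment containing those bytes would fool it).

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # end-of-central-directory signature
EOCD_SIZE = 22                  # size of the fixed part of that record


def open_zip_ignoring_null_padding(data):
    """Hypothetical helper: open ZIP bytes that may end in null padding.

    Returns (ZipFile, number_of_stripped_bytes).
    """
    start = data.rfind(EOCD_SIGNATURE)
    if start < 0 or start + EOCD_SIZE > len(data):
        raise zipfile.BadZipFile("no end-of-central-directory record found")
    # The last two bytes of the fixed record hold the comment length.
    fixed_end = start + EOCD_SIZE
    (comment_len,) = struct.unpack("<H", data[fixed_end - 2:fixed_end])
    end = fixed_end + comment_len
    tail = data[end:]
    # Refuse to discard anything other than null bytes.
    if tail.strip(b"\0"):
        raise zipfile.BadZipFile("trailing data is not all null bytes")
    return zipfile.ZipFile(io.BytesIO(data[:end])), len(tail)
```

Calling it on the contents of not_working.zip should give a usable ZipFile plus the number of padding bytes removed; it raises BadZipFile instead of guessing when the trailing data is anything other than nulls.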
msg247422 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-07-26 10:27
It is common to prepend data (e.g. a self-extractor executable) to a ZIP file, but it is not common to append data to one. In this case all the appended data is zero bytes, which looks like a quirk of one particular piece of software rather than a common case. Thus I am closing this issue.
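
The prepended-data case can be checked with a small in-memory experiment. This is a sketch under assumptions: the helper names are hypothetical, and the prefix is an arbitrary byte string standing in for a self-extractor stub, not a real executable.

```python
import io
import zipfile


def make_zip_bytes():
    """Build a tiny archive in memory with one member, a.txt."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", b"hello")
    return buf.getvalue()


def read_member_with_prefix(prefix, name="a.txt"):
    """Open prefix + archive with zipfile and return one member's bytes."""
    blob = prefix + make_zip_bytes()
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return zf.read(name)
```

For example, read_member_with_prefix(b"#!/bin/sh\nexit 0\n") still returns the member contents, because zipfile locates the archive from its end and adjusts the stored offsets for the extra leading bytes.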
History
Date User Action Args
2022-04-11 14:58:18 admin set github: 68809
2015-07-26 10:27:07 serhiy.storchaka set status: open -> closed; resolution: rejected; messages: + msg247422; stage: resolved
2015-07-17 18:31:42 serhiy.storchaka set messages: + msg246864
2015-07-17 17:34:00 Yasar L. Ahmed set messages: + msg246861
2015-07-17 15:47:12 alexei.romanov set nosy: + alexei.romanov; messages: + msg246858
2015-07-17 15:05:06 serhiy.storchaka set type: behavior -> enhancement; messages: + msg246857; versions: + Python 3.6, - Python 2.7, Python 3.4
2015-07-16 07:29:51 serhiy.storchaka set messages: + msg246794
2015-07-16 07:10:31 ronaldoussoren set nosy: + ronaldoussoren
2015-07-12 19:36:15 serhiy.storchaka set assignee: serhiy.storchaka; nosy: + serhiy.storchaka
2015-07-12 19:24:01 Yasar L. Ahmed create