This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: zipfile zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019.pdf'
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Jozef Cernak, alanmcintyre, serhiy.storchaka, twouters
Priority: normal Keywords:

Created on 2019-04-09 09:49 by Jozef Cernak, last changed 2022-04-11 14:59 by admin.

Messages (7)
msg339722 - (view) Author: Jozef Cernak (Jozef Cernak) Date: 2019-04-09 09:49
Hi,
this short program works well for a password of 4 characters, but when I change the password length (parameter MAXD) I get this error:

Traceback (most recent call last):
  File "p33.py", line 54, in <module>
    zf.extractall( pwd=password.encode('cp850','replace'))
  File "/usr/lib/python3.5/zipfile.py", line 1347, in extractall
    self.extract(zipinfo, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1335, in extract
    return self._extract_member(member, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1399, in _extract_member
    shutil.copyfileobj(source, target)
  File "/usr/lib/python3.5/shutil.py", line 73, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python3.5/zipfile.py", line 844, in read
    data = self._read1(n)
  File "/usr/lib/python3.5/zipfile.py", line 934, in _read1
    self._update_crc(data)
  File "/usr/lib/python3.5/zipfile.py", line 862, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019.pdf'

program:
import  string, zipfile, zlib

from zipfile import ZipFile

zf= ZipFile('11_02_2019.pdf.zip')

MAXD=6

upper_case=string.ascii_uppercase
uc=list(upper_case)

n=len(uc)
print (n)

pos=[]
for k in range(0,MAXD):
    pos.append(0)
    
print (pos) 


for let in range(0,n):
    print (let, uc[let]) 








let=0
koniec=0;
k3=0
p=0

while koniec != MAXD :
    
 
    
    k=0
    
    password=''
    for k2 in range(0,MAXD):
        
        password=password+uc[pos[k2]]
        
    print   (password)
  
           
    try:

        with zipfile.ZipFile('11_02_2019.pdf.zip') as zf:
            zf.extractall( pwd=password.encode('cp850','replace'))
            print ("Password found:" + password)
            exit(0)
        
    except RuntimeError:
        pass
    
    except zlib.error:
        pass
        
    
    #print "ppppppppppppppppppppppppp",p,  paswd

    
    pos[0]=pos[0]+1
    
    for k2  in range(0,MAXD-1):
        if pos[k2]>=n:
            pos[k2]=0
            pos[k2+1]=pos[k2+1]+1
    
    koniec=0
    
    for k2 in range(0,MAXD):
        if pos[k2] >= n-1:
            koniec=koniec+1
            


I observed similar behaviour in an older version of Python (2.7) and the corresponding library.

The zip archive is protected by the simple password 'ABCD'; the file is not big, less than 1 MB.

Best regards
Jozef
msg339727 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-04-09 10:47
Do you get an error when you try to extract the file using the valid password?
msg339728 - (view) Author: Jozef Cernak (Jozef Cernak) Date: 2019-04-09 11:01
Dear Serhiy,
in the case of correct password, the program works well:

OACD
PACD
QACD
RACD
SACD
TACD
UACD
VACD
WACD
XACD
YACD
ZACD
ABCD
Password found:ABCD

for five characters:

RRJBA
Traceback (most recent call last):
  File "p33.py", line 54, in <module>
    zf.extractall( pwd=password.encode('cp850','replace'))
  File "/usr/lib/python3.5/zipfile.py", line 1347, in extractall
    self.extract(zipinfo, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1335, in extract
    return self._extract_member(member, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1399, in _extract_member
    shutil.copyfileobj(source, target)
  File "/usr/lib/python3.5/shutil.py", line 73, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python3.5/zipfile.py", line 844, in read
    data = self._read1(n)
  File "/usr/lib/python3.5/zipfile.py", line 934, in _read1
    self._update_crc(data)
  File "/usr/lib/python3.5/zipfile.py", line 862, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019.pdf'

specially for RRJBA
AAAAA
Traceback (most recent call last):
  File "p33.py", line 54, in <module>
    zf.extractall( pwd=password.encode('cp850','replace'))
  File "/usr/lib/python3.5/zipfile.py", line 1347, in extractall
    self.extract(zipinfo, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1335, in extract
    return self._extract_member(member, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1399, in _extract_member
    shutil.copyfileobj(source, target)
  File "/usr/lib/python3.5/shutil.py", line 73, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python3.5/zipfile.py", line 844, in read
    data = self._read1(n)
  File "/usr/lib/python3.5/zipfile.py", line 934, in _read1
    self._update_crc(data)
  File "/usr/lib/python3.5/zipfile.py", line 862, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019.pdf'

for six characters:

KMQAAA
LMQAAA
MMQAAA
NMQAAA
OMQAAA
PMQAAA
QMQAAA
RMQAAA
SMQAAA
TMQAAA
UMQAAA
VMQAAA
WMQAAA
XMQAAA
YMQAAA
ZMQAAA
ANQAAA
Traceback (most recent call last):
  File "p33.py", line 54, in <module>
    zf.extractall( pwd=password.encode('cp850','replace'))
  File "/usr/lib/python3.5/zipfile.py", line 1347, in extractall
    self.extract(zipinfo, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1335, in extract
    return self._extract_member(member, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1399, in _extract_member
    shutil.copyfileobj(source, target)
  File "/usr/lib/python3.5/shutil.py", line 73, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python3.5/zipfile.py", line 844, in read
    data = self._read1(n)
  File "/usr/lib/python3.5/zipfile.py", line 934, in _read1
    self._update_crc(data)
  File "/usr/lib/python3.5/zipfile.py", line 862, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019.pdf'

It seems that after a certain number of attempts, the following call behaves
differently than it did in the previous attempts:
   zf.extractall( pwd=password.encode('cp850','replace'))

Best regards

Jozef

msg339729 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-04-09 11:05
If you try to extract the file using an invalid password, this is expected behavior.
msg339730 - (view) Author: Jozef Cernak (Jozef Cernak) Date: 2019-04-09 11:09
OK, however, the behaviour only appears after several attempts, i.e. it is
not regular but depends on the previous history: how, or how many times, the
function was called. I think such behaviour indicates that the function
stores previous data, i.e. history.
Best regards
Jozef

msg339734 - (view) Author: Jozef Cernak (Jozef Cernak) Date: 2019-04-09 12:05
Hi,
I changed the password of the zipped file to the new string "RRJBB", which is
a combination that comes after RRJBA, to see what would happen.
At the input combination KWFEA
I got this message:

KWFEA
Traceback (most recent call last):
  File "p33.py", line 54, in <module>
    zf.extractall( pwd=password.encode('cp850','replace'))
  File "/usr/lib/python3.5/zipfile.py", line 1347, in extractall
    self.extract(zipinfo, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1335, in extract
    return self._extract_member(member, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1399, in _extract_member
    shutil.copyfileobj(source, target)
  File "/usr/lib/python3.5/shutil.py", line 73, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python3.5/zipfile.py", line 844, in read
    data = self._read1(n)
  File "/usr/lib/python3.5/zipfile.py", line 934, in _read1
    self._update_crc(data)
  File "/usr/lib/python3.5/zipfile.py", line 862, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019B.pdf'

  Jozef

msg339736 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-04-09 12:06
This is how the weak encryption in ZIP files works. In 255 cases out of 256 a wrong password is detected early (which makes the encryption even weaker). But in 1 case out of 256 this check passes, and you will get either a mismatched CRC error or, if compression is used, a compressor-specific error. There is even a very small chance (1 in 2**32 or so) that you will silently get incorrectly decrypted data.
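
To illustrate the one-byte check described above, here is a minimal sketch of the traditional ZIP cipher as it is commonly described. The names `ZipCryptoKeys` and `passes_check`, the fixed filler bytes and the check byte value 0xAB are assumptions made for this demonstration; a real archive uses random filler bytes and takes the check byte from the CRC (or the modification time). The sketch encrypts a 12-byte header with the password ABCD and counts how many 3-letter uppercase candidates still pass the check.

```python
import itertools
import string


def _make_crc_table():
    # Standard reflected CRC-32 table (polynomial 0xEDB88320).
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC_TABLE = _make_crc_table()


class ZipCryptoKeys:
    """Key state of the traditional ZIP cipher (illustrative helper)."""

    def __init__(self, password: bytes):
        self.k0, self.k1, self.k2 = 0x12345678, 0x23456789, 0x34567890
        for b in password:
            self._update(b)

    @staticmethod
    def _crc32(crc, byte):
        return (crc >> 8) ^ _CRC_TABLE[(crc ^ byte) & 0xFF]

    def _update(self, plain_byte):
        self.k0 = self._crc32(self.k0, plain_byte)
        self.k1 = (self.k1 + (self.k0 & 0xFF)) & 0xFFFFFFFF
        self.k1 = (self.k1 * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = self._crc32(self.k2, self.k1 >> 24)

    def _stream_byte(self):
        t = self.k2 | 2
        return ((t * (t ^ 1)) >> 8) & 0xFF

    def encrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for p in data:
            out.append(p ^ self._stream_byte())
            self._update(p)  # keys always advance with the plaintext byte
        return bytes(out)

    def decrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for c in data:
            p = c ^ self._stream_byte()
            self._update(p)
            out.append(p)
        return bytes(out)


def passes_check(header12: bytes, password: bytes, check_byte: int) -> bool:
    # Only the last byte of the decrypted 12-byte header is compared.
    return ZipCryptoKeys(password).decrypt(header12)[11] == check_byte


check_byte = 0xAB  # stand-in for the real check byte
header = ZipCryptoKeys(b"ABCD").encrypt(bytes(range(11)) + bytes([check_byte]))

candidates = ["".join(t) for t in itertools.product(string.ascii_uppercase, repeat=3)]
false_passes = [pw for pw in candidates if passes_check(header, pw.encode(), check_byte)]

# Roughly len(candidates) / 256 wrong passwords are expected to slip through.
print(len(candidates), "candidates,", len(false_passes), "pass the one-byte check")
```

Each candidate that slips through here would, in a real archive, go on to decrypt garbage and fail later with a CRC or decompression error. The result depends only on the password, not on earlier attempts.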

It is better not to use the weak encryption in ZIP files. If you need to encrypt data safely, use third-party encryption libraries.
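
For a script like the one in the first message, this means a wrong password can surface as more than one exception type. A minimal sketch, assuming a hypothetical helper `password_opens` (not part of `zipfile`) and members that are stored or deflate-compressed; other compression methods may raise further error types that are not handled here:

```python
import zipfile
import zlib


def password_opens(archive_path, password: bytes) -> bool:
    """Return True if every member can be read and passes its CRC check."""
    try:
        with zipfile.ZipFile(archive_path) as zf:
            for name in zf.namelist():
                # read() decrypts in memory, so nothing is written to disk
                # for passwords that turn out to be wrong.
                zf.read(name, pwd=password)
        return True
    except RuntimeError:
        # Wrong password rejected by the one-byte header check.
        return False
    except (zipfile.BadZipFile, zlib.error):
        # The check byte matched by chance, but the decrypted data was
        # garbage: CRC mismatch or a decompression failure.
        return False
```

Note that a True result is still not absolute proof of the right password, given the small chance of a silent CRC collision mentioned above.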
History
Date                 User              Action  Args
2022-04-11 14:59:13  admin             set     github: 80754
2019-04-09 12:06:00  serhiy.storchaka  set     messages: + msg339736
2019-04-09 12:05:05  Jozef Cernak      set     messages: + msg339734
2019-04-09 11:09:34  Jozef Cernak      set     messages: + msg339730
2019-04-09 11:05:44  serhiy.storchaka  set     messages: + msg339729
2019-04-09 11:01:46  Jozef Cernak      set     messages: + msg339728
2019-04-09 10:47:03  serhiy.storchaka  set     messages: + msg339727
2019-04-09 10:11:01  SilentGhost       set     nosy: + twouters, alanmcintyre, serhiy.storchaka; type: crash -> behavior
2019-04-09 09:49:43  Jozef Cernak      create