
classification
Title: ValueError in zipfile.ZipFile
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: iritkatriel, jvoisin, sam_ezeh, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2019-12-16 12:58 by jvoisin, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description
crash-4da08e9ababa495ac51ecad588fd61081a66b5bb6e7a0e791f44907fa274ec62 jvoisin, 2019-12-16 12:58
Pull Requests
URL Status Linked
PR 30863 closed iritkatriel, 2022-01-24 22:35
PR 32291 open sam_ezeh, 2022-04-03 18:36
Messages (8)
msg358484 - Author: jvoisin (jvoisin) Date: 2019-12-16 12:58
The attached file produces the following stacktrace when opened via `zipfile.ZipFile`, on Python 3.7.5rc1:

```
$ cat ziprepro.py 
import zipfile
import sys

zipfile.ZipFile(sys.argv[1])
```

```
$ python3 ziprepro.py crash-4da08e9ababa495ac51ecad588fd61081a66b5bb6e7a0e791f44907fa274ec62
Traceback (most recent call last):
  File "ziprepro.py", line 4, in <module>
    zipfile.ZipFile(sys.argv[1])
  File "/usr/lib/python3.7/zipfile.py", line 1225, in __init__
    self._RealGetContents()
  File "/usr/lib/python3.7/zipfile.py", line 1310, in _RealGetContents
    fp.seek(self.start_dir, 0)
ValueError: cannot fit 'int' into an offset-sized integer
```

`ValueError` isn't documented as a possible exception when using `zipfile.ZipFile` ( https://docs.python.org/3/library/zipfile.html ).
msg410707 - Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2022-01-16 18:26
It's unlikely that anyone will download a binary from bpo and open it. Can you help us reproduce the issue without that?

The first question is whether you can reproduce this on a version of Python that is still in maintenance (3.9 or higher).
msg410760 - Author: jvoisin (jvoisin) Date: 2022-01-17 11:31
Yes, I can reproduce it:

```
$ python3 --version
Python 3.9.9

$ python3.9 ziprepo.py ./crash-4da08e9ababa495ac51ecad588fd61081a66b5bb6e7a0e791f44907fa274ec62 
Traceback (most recent call last):
  File "/home/jvoisin/Downloads/ziprepo.py", line 4, in <module>
    zipfile.ZipFile(sys.argv[1])
  File "/usr/lib/python3.9/zipfile.py", line 1257, in __init__
    self._RealGetContents()
  File "/usr/lib/python3.9/zipfile.py", line 1342, in _RealGetContents
    fp.seek(self.start_dir, 0)
ValueError: cannot fit 'int' into an offset-sized integer
$
```

> It's unlikely that anyone will download a binary from bpo and open it. Can you help us reproduce the issue without that?

The *binary* is a corrupted zip file meant to be opened with `zipfile.ZipFile()`; it can't be executed on its own.
msg411523 - Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2022-01-24 22:37
It's easy enough to convert the exception type (see patch), but I don't know how to write a unit test for this.
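
A minimal sketch of what converting the exception type could look like, assuming the only change needed is around the `fp.seek(self.start_dir, 0)` call shown in the tracebacks. The helper name `seek_to_central_directory` is hypothetical and is not part of `zipfile`; the error message is the one that appears later in this thread.

```python
import io
import zipfile


def seek_to_central_directory(fp, start_dir):
    """Hypothetical helper: seek to start_dir, reporting bad offsets as BadZipFile."""
    try:
        fp.seek(start_dir, 0)
    except ValueError:
        # A negative or oversized offset means the archive metadata is corrupt,
        # so surface it as the documented zipfile error type.
        raise zipfile.BadZipFile("Bad offset for central directory")


message = None
try:
    # io.BytesIO rejects a negative absolute seek with ValueError.
    seek_to_central_directory(io.BytesIO(b""), -1)
except zipfile.BadZipFile as exc:
    message = str(exc)
print(message)
```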
msg416632 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2022-04-03 16:41
Try creating a normal ZIP file (it can be empty), then set some byte to FF (or a pair of bytes to FFFF, or 4 consecutive bytes to FFFFFFFF) until you get exactly the same error. Then you can just add a binary dump of that file to the tests.
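
The byte-flipping search described above could be automated roughly as follows. This is only a sketch: the function names are hypothetical, and which mutations (if any) produce an exception other than `BadZipFile` depends on the Python version in use.

```python
import io
import zipfile


def empty_zip_bytes():
    """Return the bytes of an empty archive written by zipfile itself."""
    buf = io.BytesIO()
    zipfile.ZipFile(buf, "w").close()
    return buf.getvalue()


def find_unexpected_errors(data, widths=(1, 2, 4)):
    """Overwrite runs of 1, 2 and 4 bytes with FF and record non-BadZipFile errors."""
    findings = []
    for width in widths:
        for start in range(len(data) - width + 1):
            mutated = bytearray(data)
            mutated[start:start + width] = b"\xff" * width
            try:
                zipfile.ZipFile(io.BytesIO(bytes(mutated))).close()
            except zipfile.BadZipFile:
                continue  # the documented failure mode, not interesting here
            except Exception as exc:
                findings.append((start, width, type(exc).__name__, str(exc)))
    return findings


findings = find_unexpected_errors(empty_zip_bytes())
for start, width, name, text in findings:
    print(f"offset {start}, {width} byte(s) -> {name}: {text}")
```

Any entry it prints identifies a candidate mutation whose resulting bytes could be embedded in a test.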
msg416634 - Author: Sam Ezeh (sam_ezeh) * Date: 2022-04-03 18:11
One way to do this is to make the computed central directory offset negative: take a zip file containing only an EOCD (end of central directory) record, then set the total size of the central directory records to a positive value.

```
Python 3.11.0a4+ (heads/bpo-39064:eb1935dacf, Apr  3 2022, 19:09:53) [GCC 11.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import zipfile
>>> import io
>>> b = [80, 75, 5, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
>>> b[12] = 1
>>> f = io.BytesIO(bytes(b))
>>> zipfile.ZipFile(f)
Traceback (most recent call last):
  File "/run/media/sam/OS/Git/cpython/Lib/zipfile.py", line 1370, in _RealGetContents
    fp.seek(self.start_dir, 0)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: negative seek value -1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/run/media/sam/OS/Git/cpython/Lib/zipfile.py", line 1284, in __init__
    self._RealGetContents()
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/sam/OS/Git/cpython/Lib/zipfile.py", line 1372, in _RealGetContents
    raise BadZipFile("Bad offset for central directory")
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
zipfile.BadZipFile: Bad offset for central directory
>>> 
```
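
The same input can be built with `struct` so that the mutated field is named rather than addressed by index. This is a sketch under the assumption that bytes 12 to 15 of the EOCD record hold the central directory size, as in the ZIP format's EOCD layout; `make_bad_eocd` and `open_bad_archive` are hypothetical names. Which exception is raised depends on whether the interpreter includes the fix, so both are caught.

```python
import io
import struct
import zipfile

# EOCD layout (little-endian): signature, disk number, disk holding the central
# directory, entries on this disk, total entries, central directory size,
# central directory offset, comment length.
EOCD_FORMAT = "<4s4H2LH"


def make_bad_eocd(cd_size=1):
    """Build an EOCD-only archive that claims a non-empty central directory."""
    return struct.pack(EOCD_FORMAT, b"PK\x05\x06", 0, 0, 0, 0, cd_size, 0, 0)


def open_bad_archive():
    """Try to open the archive and report the exception type and message."""
    try:
        zipfile.ZipFile(io.BytesIO(make_bad_eocd()))
    except (zipfile.BadZipFile, ValueError) as exc:
        # ValueError on unpatched versions, BadZipFile where the fix is present.
        return type(exc).__name__, str(exc)
    return None


result = open_bad_archive()
print(result)
```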
msg416635 - Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2022-04-03 18:16
Sam, can you put that in a PR please?
msg416636 - Author: Sam Ezeh (sam_ezeh) * Date: 2022-04-03 18:17
Yes, of course.
History
Date User Action Args
2022-04-11 14:59:24 admin set github: 83245
2022-04-03 18:36:35 sam_ezeh set pull_requests: + pull_request30353
2022-04-03 18:17:25 sam_ezeh set messages: + msg416636
2022-04-03 18:16:10 iritkatriel set messages: + msg416635
2022-04-03 18:11:13 sam_ezeh set messages: + msg416634
2022-04-03 16:41:04 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg416632
2022-04-03 16:13:15 sam_ezeh set nosy: + sam_ezeh
2022-01-24 22:37:27 iritkatriel set messages: + msg411523
2022-01-24 22:35:30 iritkatriel set keywords: + patch; stage: patch review; pull_requests: + pull_request29044
2022-01-17 11:43:00 iritkatriel set versions: + Python 3.9, - Python 3.7
2022-01-17 11:31:49 jvoisin set status: pending -> open; messages: + msg410760
2022-01-16 18:26:48 iritkatriel set status: open -> pending; nosy: + iritkatriel; messages: + msg410707
2019-12-16 12:58:42 jvoisin create