
classification
Title: zipfile should catch ValueError as well as OSError to detect bad seek calls
Type: Stage:
Components: Library (Lib) Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: FFY00, iritkatriel, llllllllll
Priority: normal Keywords:

Created on 2017-02-07 03:14 by llllllllll, last changed 2022-04-11 14:58 by admin.

Messages (3)
msg287185 - (view) Author: Joe Jevnik (llllllllll) * Date: 2017-02-07 03:14
In zipfile.py only OSError is checked to see if seek fails:

```
def _EndRecData64(fpin, offset, endrec):
    """
    Read the ZIP64 end-of-archive records and use that to update endrec
    """
    try:
        fpin.seek(offset - sizeEndCentDir64Locator, 2)
    except OSError:
        # If the seek fails, the file is not large enough to contain a ZIP64
        # end-of-archive record, so just return the end record we were given.
        return endrec
```

I believe that this should also catch ValueError so that other file-like objects may be passed to ZipFile. The particular case I ran into was passing an mmap object:

```
"""
$ python p.py
sys.version_info(major=3, minor=6, micro=0, releaselevel='final', serial=0)
[]
Traceback (most recent call last):
  File "p.py", line 34, in <module>
    with mmap_shared_raw_zipfile(f.name) as zf:
  File "/usr/lib64/python3.6/contextlib.py", line 82, in __enter__
    return next(self.gen)
  File "p.py", line 23, in mmap_shared_raw_zipfile
    ZipFile(mm) as zf:
  File "/usr/lib64/python3.6/zipfile.py", line 1100, in __init__
    self._RealGetContents()
  File "/usr/lib64/python3.6/zipfile.py", line 1163, in _RealGetContents
    endrec = _EndRecData(fp)
  File "/usr/lib64/python3.6/zipfile.py", line 264, in _EndRecData
    return _EndRecData64(fpin, -sizeEndCentDir, endrec)
  File "/usr/lib64/python3.6/zipfile.py", line 196, in _EndRecData64
    fpin.seek(offset - sizeEndCentDir64Locator, 2)
ValueError: seek out of range
"""
from contextlib import contextmanager
import mmap
import sys
from tempfile import NamedTemporaryFile
from zipfile import ZipFile


print(sys.version_info)


@contextmanager
def mmap_shared_raw_zipfile(path):
    """Open a zipfile with mmap shared so the data can be shared in multiple
    processes.

    Parameters
    ----------
    path : str
        The path to the zipfile to open.

    Notes
    -----
    The context manager returns a :class:`zipfile.ZipFile` on enter.
    """
    with open(path) as f, \
            mmap.mmap(f.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ) as mm, \
            ZipFile(mm) as zf:
        yield zf


with NamedTemporaryFile() as f:
    ZipFile(f, mode='w').close()
    print(ZipFile(f.name).infolist())


with NamedTemporaryFile() as f:
    ZipFile(f, mode='w').close()
    with mmap_shared_raw_zipfile(f.name) as zf:
        print(zf.infolist())
```
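The proposed change can be sketched outside of zipfile roughly as follows. This is a minimal illustration, not the actual patch: `try_seek_from_end` is a hypothetical helper name, and the anonymous 22-byte mmap only stands in for an empty archive (22 bytes is the size of a bare end-of-central-directory record, and the ZIP64 locator is another 20 bytes).

```python
import mmap


def try_seek_from_end(fp, offset):
    """Hypothetical helper: seek relative to the end of ``fp``.

    Returns False instead of raising when the seek is rejected, treating
    ValueError (raised by mmap) the same way as OSError (raised by real files).
    """
    try:
        fp.seek(offset, 2)  # whence=2: relative to the end of the file
    except (OSError, ValueError):
        return False
    return True


# An anonymous 22-byte map stands in for an empty zip file.
mm = mmap.mmap(-1, 22)
# -22 - 20 = -42 would land before the start of the map, so mmap raises
# ValueError, which the helper reports as a failed seek.
print(try_seek_from_end(mm, -42))
# A seek that stays inside the map succeeds.
print(try_seek_from_end(mm, -22))
mm.close()
```

Catching both exception types at the call site keeps mmap's behaviour unchanged and only widens what zipfile treats as "file too small".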
msg393314 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-05-09 10:27
Maybe a better option is to change mmap's seek() to raise an OSError, because it's supposed to behave like a file object.
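Until either change lands, the same effect can be approximated in user code by wrapping the mmap object. The sketch below is an assumption-laden workaround, not part of mmap or zipfile: `OSErrorSeekMmap` is a hypothetical class name, and it only translates the exception raised by `seek()` while delegating everything else to the wrapped object.

```python
import mmap


class OSErrorSeekMmap:
    """Hypothetical wrapper that makes mmap.seek() fail like a file object."""

    def __init__(self, mm):
        self._mm = mm

    def seek(self, offset, whence=0):
        try:
            self._mm.seek(offset, whence)
        except ValueError as exc:
            # Re-raise as OSError, the exception type zipfile already handles.
            raise OSError(str(exc)) from exc
        # Report the new position via tell(), as file objects do from seek().
        return self._mm.tell()

    def __getattr__(self, name):
        # Delegate read(), tell(), close(), etc. to the underlying mmap.
        return getattr(self._mm, name)


mm = mmap.mmap(-1, 22)  # anonymous map standing in for a tiny file
wrapped = OSErrorSeekMmap(mm)
try:
    wrapped.seek(-42, 2)
except OSError as exc:
    print("seek failed with OSError:", exc)
mm.close()
```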
msg394263 - (view) Author: Filipe Laíns (FFY00) * (Python triager) Date: 2021-05-24 18:42
That would not stay true to its meaning. AFAIK there are no implied exceptions in file objects. Given the meaning of ValueError, I'd say it is appropriate here.
History
Date                 User         Action  Args
2022-04-11 14:58:42  admin        set     github: 73654
2021-05-24 18:42:06  FFY00        set     nosy: + FFY00; messages: + msg394263
2021-05-09 10:27:50  iritkatriel  set     nosy: + iritkatriel; messages: + msg393314; versions: + Python 3.11, - Python 3.6
2017-02-07 03:14:05  llllllllll   create