classification
Title: zipfile.Path / importlib.resources raises KeyError if a file wasn't found
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.9, Python 3.8
process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: The Compiler, alanmcintyre, brett.cannon, eamanu, jaraco, lukasz.langa, serhiy.storchaka, twouters
Priority: normal
Keywords:

Created on 2021-01-29 16:56 by The Compiler, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (4)
msg385919 - Author: Florian Bruhin (The Compiler) Date: 2021-01-29 16:56
When a package is installed as an egg, importlib.resources.files returns a zipfile.Path rather than a pathlib.Path (maybe it returns other things too, seeing that it's documented to return an importlib.abc.Traversable - I didn't check).

However, those two have a rather odd inconsistency when it comes to reading files which don't actually exist. In that case, zipfile.Path raises KeyError rather than FileNotFoundError. After a "zip tmp/test.zip somefile":

    >>> import zipfile
    >>> p = zipfile.Path('tmp/test.zip')
    >>> (p / 'helloworld').read_text()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.9/zipfile.py", line 2318, in read_text
        with self.open('r', *args, **kwargs) as strm:
      File "/usr/lib/python3.9/zipfile.py", line 2306, in open
        stream = self.root.open(self.at, zip_mode, pwd=pwd)
      File "/usr/lib/python3.9/zipfile.py", line 1502, in open
        zinfo = self.getinfo(name)
      File "/usr/lib/python3.9/zipfile.py", line 1429, in getinfo
        raise KeyError(
    KeyError: "There is no item named 'helloworld' in the archive"

Note that the "zipp" backport (used by the "importlib_resources" backport) does raise FileNotFoundError instead:

    >>> import zipp
    >>> p2 = zipp.Path('tmp/test.zip')
    >>> (p2 / 'helloworld').read_text()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.9/site-packages/zipp.py", line 267, in read_text
        with self.open('r', *args, **kwargs) as strm:
      File "/usr/lib/python3.9/site-packages/zipp.py", line 250, in open
        raise FileNotFoundError(self)
    FileNotFoundError: tmp/test.zip/helloworld

And of course, so does pathlib.Path:

    >>> import pathlib
    >>> p3 = pathlib.Path('tmp')
    >>> (p3 / 'helloworld').read_text()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.9/pathlib.py", line 1255, in read_text
        with self.open(mode='r', encoding=encoding, errors=errors) as f:
      File "/usr/lib/python3.9/pathlib.py", line 1241, in open
        return io.open(self, mode, buffering, encoding, errors, newline,
      File "/usr/lib/python3.9/pathlib.py", line 1109, in _opener
        return self._accessor.open(self, flags, mode)
    FileNotFoundError: [Errno 2] No such file or directory: 'tmp/helloworld'
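
For code that has to run against all three path types, one option is to catch both exception classes at the call site. The following is a minimal sketch, not part of the original report: the helper name `read_or_none` is hypothetical, and the archive is built in a temporary directory instead of `tmp/test.zip`.

```python
import pathlib
import tempfile
import zipfile


def read_or_none(path):
    """Return the text at `path`, or None if the file or zip entry is missing.

    `read_or_none` is a hypothetical helper, not a stdlib or zipp API.
    """
    try:
        return path.read_text(encoding="utf-8")
    except (KeyError, FileNotFoundError):
        # zipfile.Path on 3.8/3.9 raises KeyError here; pathlib.Path and the
        # zipp backport raise FileNotFoundError.
        return None


with tempfile.TemporaryDirectory() as tmp:
    archive = pathlib.Path(tmp) / "test.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("somefile", "hello")

    with zipfile.ZipFile(archive) as zf:
        zip_root = zipfile.Path(zf)
        found = read_or_none(zip_root / "somefile")
        missing_in_zip = read_or_none(zip_root / "helloworld")

    missing_on_disk = read_or_none(pathlib.Path(tmp) / "helloworld")
```

Catching `KeyError` this broadly is only reasonable around a single read call; a wider `try` block could hide unrelated lookup errors.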

When using `importlib.resources.files`, this can be very surprising - especially because during testing, the package might not be installed as an egg, so this never turns up until users complain about it (which is what happened in my case - no bad feelings though!).

This seems to have been fixed by jaraco in ebbe8033b1c61854c4b623aaf9c3e170d179f875, by introducing an explicit:

    if not self.exists() and zip_mode == 'r':
        raise FileNotFoundError(self)

in open(), as part of what seems like an unrelated change (bpo-40564 / GH-22371).
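
The same existence check can be applied from the outside on 3.8 and 3.9 without patching the stdlib. This is only a sketch under assumptions: `read_text_checked` is a hypothetical wrapper, and it relies on nothing beyond the `exists()` and `read_text()` methods that both zipfile.Path and pathlib.Path provide.

```python
import pathlib
import tempfile
import zipfile


def read_text_checked(path, encoding="utf-8"):
    """Read text from a zipfile.Path or pathlib.Path.

    Hypothetical wrapper: raises FileNotFoundError for a missing target,
    mirroring the explicit check quoted above, whatever the Python version.
    """
    if not path.exists():
        raise FileNotFoundError(str(path))
    return path.read_text(encoding=encoding)


with tempfile.TemporaryDirectory() as tmp:
    archive = pathlib.Path(tmp) / "test.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("somefile", "hello")

    with zipfile.ZipFile(archive) as zf:
        root = zipfile.Path(zf)
        text = read_text_checked(root / "somefile")
        try:
            read_text_checked(root / "helloworld")
        except FileNotFoundError:
            outcome = "FileNotFoundError"
        else:
            outcome = "no error"
```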

While this is arguably a backwards-compatible change between 3.9 and 3.10, it might be a good idea to adjust Python 3.9 to have the same behavior (perhaps with a new exception which inherits from both KeyError and FileNotFoundError?). At the very least, I feel like this should be documented prominently in the importlib.resources.files, importlib.abc.Traversable and zipfile.Path documentation for 3.9.
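
To illustrate the dual-inheritance idea, here is a rough sketch of such an exception. The class name `ZipEntryNotFoundError` and the `lookup` function are hypothetical and exist in neither CPython nor zipp; the base-class ordering follows the pattern of `io.UnsupportedOperation`, which derives from both OSError and ValueError.

```python
class ZipEntryNotFoundError(FileNotFoundError, KeyError):
    """Hypothetical exception that satisfies both old and new handlers."""


def lookup(entries, name):
    """Stand-in for an archive lookup; `entries` is a plain dict here."""
    try:
        return entries[name]
    except KeyError:
        raise ZipEntryNotFoundError(f"no item named {name!r}") from None


# Handlers written for either exception type catch the same error.
caught_as = []
for exc_type in (KeyError, FileNotFoundError):
    try:
        lookup({}, "helloworld")
    except exc_type:
        caught_as.append(exc_type.__name__)
```

Code written against 3.9 (`except KeyError`) and code written against 3.10 (`except FileNotFoundError`) would both keep working with an exception like this.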

As an aside, the error message when using `.iterdir()` is similarly confusing, though that's at least consistent between the stdlib and the zipp backport:

    >>> (p / 'helloworld').iterdir()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.9/zipfile.py", line 2342, in iterdir
        raise ValueError("Can't listdir a file")
    ValueError: Can't listdir a file

    >>> (p2 / 'helloworld').iterdir()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.9/site-packages/zipp.py", line 291, in iterdir
        raise ValueError("Can't listdir a file")
    ValueError: Can't listdir a file

    >>> list((p3 / 'helloworld').iterdir())
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.9/pathlib.py", line 1149, in iterdir
        for name in self._accessor.listdir(self):
    FileNotFoundError: [Errno 2] No such file or directory: 'tmp/helloworld'
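
A caller who wants clearer errors from `.iterdir()` could check the target first. The sketch below is an assumption-laden illustration, not existing behavior: `iterdir_checked` and `error_name` are hypothetical helpers that use only `is_dir()`, `exists()` and `iterdir()`, which zipfile.Path and pathlib.Path both provide.

```python
import pathlib
import tempfile
import zipfile


def iterdir_checked(path):
    """List the children of `path`, raising OSError subclasses otherwise.

    Hypothetical helper: a missing target raises FileNotFoundError and a
    regular file raises NotADirectoryError, instead of ValueError.
    """
    if path.is_dir():
        return list(path.iterdir())
    if path.exists():
        raise NotADirectoryError(str(path))
    raise FileNotFoundError(str(path))


def error_name(path):
    """Return the name of the error raised for `path`, or None."""
    try:
        iterdir_checked(path)
    except OSError as exc:
        return type(exc).__name__
    return None


with tempfile.TemporaryDirectory() as tmp:
    archive = pathlib.Path(tmp) / "test.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("somefile", "hello")
        zf.writestr("pkg/data.txt", "data")

    with zipfile.ZipFile(archive) as zf:
        root = zipfile.Path(zf)
        children = [child.name for child in iterdir_checked(root / "pkg")]
        missing_error = error_name(root / "helloworld")
        file_error = error_name(root / "somefile")
```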
msg385920 - Author: Florian Bruhin (The Compiler) Date: 2021-01-29 17:16
Whoops, I was mistaken about Python 3.8 not being affected. I thought it wouldn't be because importlib.resources.files was added in 3.9, but zipfile.Path was added in 3.8.

With that in mind, I guess changing the behavior of 3.9 would be rather confusing, though I wonder what others think.

(Also, "arguably a backwards-compatible change" in my previous message should say *incompatible* - sorry about that!)
msg385956 - Author: Jason R. Coombs (jaraco) (Python committer) Date: 2021-01-29 22:32
The change to error handling for zipp.Path was added in https://github.com/jaraco/zipp/issues/46 and released as [3.1.0](https://zipp.readthedocs.io/en/latest/history.html#v3-1-0). Probably that change was incorporated into CPython shortly thereafter with bpo-40564, as you observed.

I agree with you, backporting these as bugfixes doesn't feel appropriate. It is a change in behavior. On the other hand, it's a change within the documented scope of the API (that is, it was never stipulated what the behavior would be). My slight preference is to leave the CPython version alone and to recommend the use of the backports to get Python 3.10 compatibility.

It really comes down to the judgment of the release manager. Łukasz, how do you feel about changing the exception that's raised when a directory or file doesn't exist from a KeyError to a more-appropriate OSError for future compatibility?
msg394503 - Author: Jason R. Coombs (jaraco) (Python committer) Date: 2021-05-27 01:07
If this issue affects you, please use the `zipp` backport. I realize there are some use-cases that aren't readily amenable to relying on the backport. Please report any such use-case here as they may provide a justification for back-porting the change.
History
Date                 User          Action  Args
2022-04-11 14:59:40  admin         set     github: 87229
2021-05-27 01:07:47  jaraco        set     status: open -> closed; resolution: wont fix; messages: + msg394503; stage: resolved
2021-02-01 19:59:38  eamanu        set     nosy: + eamanu
2021-01-29 22:32:55  jaraco        set     nosy: + lukasz.langa; messages: + msg385956
2021-01-29 17:16:42  The Compiler  set     messages: + msg385920; versions: + Python 3.8
2021-01-29 16:56:11  The Compiler  create