classification
Title: importlib.metadata.version can return None
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: jaraco Nosy List: David Robertson, jaraco
Priority: normal Keywords:

Created on 2022-03-18 18:11 by David Robertson, last changed 2022-04-11 14:59 by admin.

Messages (2)
msg415516 - Author: David Robertson (David Robertson) Date: 2022-03-18 18:11
Originally written up at the typeshed repo: https://github.com/python/typeshed/issues/7513. The conclusion was that this is a bug in the implementation rather than an incorrect annotation.

To my surprise, I discovered in https://github.com/matrix-org/synapse/issues/12223 that it is possible for `importlib.metadata.version(...)` to return `None`. To reproduce this:

1. Create a new virtual environment. I'm using CPython 3.10.2 as my interpreter.
2. Within the venv, `pip install bottle`. (Any package will do; I chose `bottle` because it's small and doesn't have any dependencies.)
3. Check that `importlib.metadata` reports the `version` of `bottle`:
   ```python
   >>> import importlib.metadata as m
   >>> m.version('bottle')
   '0.12.19'
   ```
4. Here's the dirty bit: remove the metadata files for that package but keep the metadata directory.
   - Use `pip show bottle` to find the `site-packages` location
   - From there, remove all files in the `bottle-VERSION.dist-info` directory: `rm /path/to/site-packages/bottle-VERSION.dist-info/*`.
5. The `version` of `bottle` is now judged to be `None`:
   ```python
   >>> import importlib.metadata as m
   >>> m.version("bottle") is None
   True
   ```
   `pip show bottle` now determines that `bottle` isn't installed:
   ```shell
   $ pip show bottle
   WARNING: Package(s) not found: bottle
   ```

As well as importlib.metadata.version, importlib.metadata.Distribution.version and importlib.metadata.Distribution.name return None in this situation.
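The steps above can also be sketched without touching a real environment. This is only an illustration: `demo_pkg` is a made-up distribution name, the helper `probe_empty_dist_info` is hypothetical, and the search is restricted to a temporary directory rather than `sys.path`. What the attributes report (a `None` value, as observed above on 3.10.2, or an exception) may differ between Python versions, so the sketch records whichever happens instead of assuming one outcome.

```python
import importlib.metadata as md
import tempfile
from pathlib import Path


def probe_empty_dist_info(name="demo_pkg", version="1.0"):
    """Create an empty `<name>-<version>.dist-info` directory and inspect it.

    `demo_pkg` is a made-up distribution name used only for illustration.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Equivalent of step 4: the metadata directory exists but holds no files.
        (Path(tmp) / f"{name}-{version}.dist-info").mkdir()

        # Search only the temporary directory instead of sys.path.
        dists = list(md.distributions(name=name, path=[tmp]))
        report = {"found": len(dists)}

        for attr in ("name", "version"):
            try:
                report[attr] = repr(getattr(dists[0], attr))
            except Exception as exc:  # behaviour differs between Python versions
                report[attr] = f"raised {type(exc).__name__}"
        return report


print(probe_empty_dist_info())
```

The `found` entry shows that the empty directory is still discovered as a distribution, which is why no `PackageNotFoundError` is raised.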

I couldn't see any suggestion in the stdlib docs (https://docs.python.org/3.10/library/importlib.metadata.html#distribution-versions) that this was possible. (Aside: it'd be great if the docs mentioned that PackageNotFoundError is raised if a package is not installed.)

No-one in their right mind should do step 4 willingly, but I have seen it happen in the wild (https://github.com/matrix-org/synapse/issues/12223). We suspected a botched backup or similar was to blame.

I'm not familiar with all the machinery of Python package management, but I think I'd expect there to be a PackageNotFoundError raised in this situation? (I can imagine a package that doesn't declare its version, where `version()` returning `None` might make sense; but that feels odd.) Is the behaviour as intended?

It looks like this might be related to https://github.com/python/importlib_metadata/issues/371?
msg415529 - Author: Jason R. Coombs (jaraco) (Python committer) Date: 2022-03-18 21:42
Thanks for the report.

Yes, the issues are related, where .version and .name returning None are specific manifestations of the metadata not having that key and the behavior being ill-defined.

I haven't yet decided if metadata items being undefined should result in None or raise an Exception (maybe KeyError).

For the specific case of a missing Name or Version, however, the packaging spec says that these fields are required (https://packaging.python.org/en/latest/specifications/core-metadata/#core-metadata-specifications), so it may be reasonable to leave the behavior undefined when the specification is not met (i.e. importlib.metadata should be able to assume the specification). It's outside the scope of importlib.metadata to detect, report, and repair invalid metadata. I would welcome and even encourage a third-party package to take on the responsibility of validating all distributions in an environment and reporting on non-compliant aspects.

In that sense, the type declaration is correct. `.name` and `.version` should always return `str` or raise an exception.

This additional example pushes me further toward the position that `.metadata[missing]` should raise a KeyError, which would also fix this issue.
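To illustrate the difference between the two lookup behaviors, here is a minimal sketch built on the stdlib `email` package, whose `Message` objects return `None` when indexed with an absent header. `StrictHeaders` is a hypothetical wrapper written for this example and is not part of `importlib.metadata`; it only shows what a `KeyError`-raising lookup might look like from the caller's side.

```python
import email


class StrictHeaders:
    """Hypothetical wrapper: index lookups raise KeyError for absent headers."""

    def __init__(self, message):
        self._message = message

    def __getitem__(self, key):
        value = self._message.get(key)
        if value is None:
            # Absent header: fail loudly instead of returning None.
            raise KeyError(key)
        return value


# Metadata text that lacks the required Name and Version fields.
msg = email.message_from_string("Metadata-Version: 2.1\n")
strict = StrictHeaders(msg)

print(msg["Version"])              # lenient lookup of an absent header
print(strict["Metadata-Version"])  # present header, returned as a str
try:
    strict["Version"]
except KeyError as exc:
    print("KeyError:", exc)
```

With a lookup of this kind, an accessor such as `.version` would either return a `str` or raise, which matches the type declaration.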

I'd also argue that if the metadata file is missing altogether, that should perhaps be a different error. That is, missing metadata is different from null metadata. Right now, the two are indistinguishable from the interface.
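A caller can approximate that distinction today by reading the metadata file directly. The following is a sketch under stated assumptions, not a proposal for the library: `metadata_state` is a hypothetical helper, it only looks at `METADATA` and `PKG-INFO`, and it treats only `Name` and `Version` as required.

```python
import email
import importlib.metadata as md


def metadata_state(dist):
    """Classify a distribution as 'missing', 'incomplete', or 'ok'.

    Hypothetical helper for illustration only.
    """
    # read_text() returns None when the file cannot be read.
    text = dist.read_text("METADATA")
    if text is None:
        text = dist.read_text("PKG-INFO")
    if text is None:
        return "missing"  # no metadata file at all

    message = email.message_from_string(text)
    absent = [key for key in ("Name", "Version") if message.get(key) is None]
    # A file that exists but lacks required fields is "null" metadata.
    return "incomplete" if absent else "ok"
```

A distribution object for a specific directory can be obtained with `md.Distribution.at(path)` and passed to the helper; an empty `.dist-info` directory would then be classified as `missing`, while a `METADATA` file without a `Version` field would be `incomplete`.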

> I'd expect there to be a PackageNotFoundError raised in this situation

That doesn't sound quite right to me. If there's a `.dist-info` directory, that implies a package is present. e.g.:

```
~ $ mkdir foo.dist-info
~ $ py -c "import importlib.metadata as md; print(md.distribution('foo'))"
<importlib.metadata.PathDistribution object at 0x1032f8580>
```

I'm going to ponder this one some more and probably address the `.metadata` issue(s) first before making any pronouncements on the best approach here.
History
Date                 User             Action  Args
2022-04-11 14:59:57  admin            set     github: 91216
2022-03-20 19:58:49  jaraco           set     assignee: jaraco
2022-03-18 21:42:56  jaraco           set     messages: + msg415529
2022-03-18 18:16:59  zach.ware        set     nosy: + jaraco
2022-03-18 18:11:51  David Robertson  create