classification
Title: Port importlib_metadata to Python 3.8
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: barry Nosy List: arne, barry, brett.cannon, jaraco, rhettinger, vstinner, yan12125
Priority: normal Keywords: patch

Created on 2018-09-11 18:16 by barry, last changed 2019-10-11 17:07 by arne. This issue is now closed.

Pull Requests
URL Status Linked
PR 9327 closed barry, 2018-09-14 23:43
PR 12547 merged jaraco, 2019-03-26 02:27
PR 13563 merged yan12125, 2019-05-25 07:51
PR 13565 closed yan12125, 2019-05-25 10:01
PR 13566 merged jaraco, 2019-05-25 13:42
Messages (20)
msg325043 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2018-09-11 18:16
https://importlib_metadata.rtfd.org

We're fleshing out the API and implementation in the standalone library, but once we're confident of the API and semantics, we will want to port this into Python 3.8.
msg343440 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2019-05-24 23:59
New changeset 1bbf7b661f0ac8aac12d5531928d9a85c98ec1a9 by Barry Warsaw (Jason R. Coombs) in branch 'master':
bpo-34632: Add importlib.metadata (GH-12547)
https://github.com/python/cpython/commit/1bbf7b661f0ac8aac12d5531928d9a85c98ec1a9
msg343441 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2019-05-24 23:59
Thanks @jaraco!  This is now merged into 3.8.
msg343445 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-05-25 00:35
Unhappy buildbot: AMD64 Fedora Rawhide Clang Installed 3.x

https://buildbot.python.org/all/#/builders/188/builds/302

Example:

0:00:28 load avg: 4.02 [182/422/1] test_importlib failed
Failed to import test module: test.test_importlib.test_main
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-fedora.installed/build/target/lib/python3.8/unittest/loader.py", line 436, in _find_test_path
    module = self._get_module_from_name(name)
  File "/home/buildbot/buildarea/3.x.cstratak-fedora.installed/build/target/lib/python3.8/unittest/loader.py", line 377, in _get_module_from_name
    __import__(name)
  File "/home/buildbot/buildarea/3.x.cstratak-fedora.installed/build/target/lib/python3.8/test/test_importlib/test_main.py", line 6, in <module>
    import importlib.metadata
ModuleNotFoundError: No module named 'importlib.metadata'
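
A failure like this can be checked from inside the affected interpreter. The following is a minimal sketch, not part of the buildbot tooling: the helper name `can_locate` is hypothetical, and it only asks the import system whether a dotted module name can be found, which is one way to confirm that an installed build is missing a subpackage.

```python
import importlib.util


def can_locate(dotted_name):
    """Return True if the import system can find `dotted_name`.

    Hypothetical helper for illustration only. Looking up a dotted name
    imports its parent package as a side effect.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised (as ModuleNotFoundError) when a parent package of the
        # requested name is itself missing.
        return False


if __name__ == "__main__":
    name = "importlib.metadata"
    print(name, "found" if can_locate(name) else "MISSING")
```

Run with the installed interpreter (for example `/usr/bin/python3.8`), a "MISSING" result would point at the installation step rather than at the source tree.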
msg343457 - (view) Author: Chih-Hsuan Yen (yan12125) * Date: 2019-05-25 07:53
I got the same ModuleNotFoundError on Arch Linux and https://github.com/python/cpython/pull/13563 fixes it. I believe it can fix the issue on Fedora buildbots, too.
msg343459 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2019-05-25 08:08
I started trying to replicate the failure. I got as far as this Dockerfile:

```
FROM fedora:rawhide

RUN yum install -y clang make git

RUN git clone https://github.com/python/cpython
WORKDIR cpython
RUN ./configure
RUN make
```

I then ran `./python Tools/scripts/run_tests.py test_importlib`, but the tests fail due to zlib not being installed.

Sounds like yan12125 has the fix, so I'm shelving my investigation.
msg343460 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2019-05-25 08:09
New changeset c3738cfe63b1f2c1dc4a28d0ff9adb4e9e3aae1f by Jason R. Coombs (Chih-Hsuan Yen) in branch 'master':
bpo-34632: fix installation of importlib.metadata (#13563)
https://github.com/python/cpython/commit/c3738cfe63b1f2c1dc4a28d0ff9adb4e9e3aae1f
msg343461 - (view) Author: Chih-Hsuan Yen (yan12125) * Date: 2019-05-25 08:25
Oops, apparently my fix is incomplete. From the builder "AMD64 Fedora Rawhide Clang Installed 3.x" [1]:

ModuleNotFoundError: No module named 'test.test_importlib.data'

[1] https://buildbot.python.org/all/api/v2/logs/824407/raw
msg343462 - (view) Author: Chih-Hsuan Yen (yan12125) * Date: 2019-05-25 08:27
By the way, I think Python.framework is not needed? https://github.com/python/cpython/commit/1bbf7b661f0ac8aac12d5531928d9a85c98ec1a9#diff-206dc381e448d5121da9a6040a2b13c1
msg343465 - (view) Author: Chih-Hsuan Yen (yan12125) * Date: 2019-05-25 10:24
I managed to create a setup similar to the buildbot builder "AMD64 Fedora Rawhide Clang Installed 3.x" [1] on Arch Linux. Running test_importlib on an installed CPython copy is fine now:

$ /usr/bin/python3.8 -m test.regrtest test_importlib
Run tests sequentially
0:00:00 load avg: 0.14 [1/1] test_importlib

== Tests result: SUCCESS ==

1 test OK.

Total duration: 1 sec 288 ms
Tests result: SUCCESS

I apologize for not checking things carefully and misunderstanding the issue on "AMD64 Fedora Rawhide Clang Installed 3.x".

[1] https://github.com/python/buildmaster-config/blob/master/master/custom/factories.py
msg343480 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2019-05-25 13:59
> By the way, I think Python.framework is not needed?

Correct. That was an artifact that I unintentionally added.

I've submitted https://github.com/python/cpython/pull/13566 to address the two concerns.

I've also opened issue37043 and issue37044 to address the causes of these emergent failures.
msg343481 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2019-05-25 14:00
New changeset f7fba6cfb62edfc22e9b2e12a00ebaf5f348398e by Jason R. Coombs in branch 'master':
bpo-34632 fix buildbots and remove artifact (GH-13566)
https://github.com/python/cpython/commit/f7fba6cfb62edfc22e9b2e12a00ebaf5f348398e
msg343483 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2019-05-25 14:10
I believe buildbots are fixed. Please re-open if you find otherwise.
msg349415 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-11 22:27
Quick question: Is there a reason that requires() and files() return iterators instead of lists? ISTM that a list-based solution would be more usable than returning a starmap() object or somesuch. I suspect almost every user would have to call list(files(package)) rather than files(package). An iterator return type would only make sense if the values needed to be produced lazily or if a known consumer required an iterator input.

Also consider changing the parameter from files(package) to files(package_name). When I first tried out this API, I typed "import requests; files(requests)" instead of "files('requests')".

Sorry to bring this up at a late stage, but the purpose of a beta release is to let other users try out the API while there is still a chance to make adjustments.
msg349416 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2019-08-11 23:09
@jaraco will be able to answer that better than me.  I actually thought those did return concrete lists.

I also thought that the APIs accepted either a module or a package name, but maybe I'm thinking about importlib.resources.  Again, @jaraco can clarify, but I think the problem is that there's no unambiguous mapping between packages and package names for metadata the way there is for resources.
msg349421 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2019-08-12 00:27
> Is there a reason that requires() and files() return iterators instead of lists?

I'm a huge fan of `itertools` and Python 3's move to prefer iterables over materialized lists, and I feel that forcing materialized results gives the caller less control over the results.

Following the same pattern that many standard Python objects return (`open`, `map`, `filter`), the approach is less constrained in that it can support arbitrarily large results. I wished to leave it to the caller to materialize a list if that was needed and my assumption was that 90% of the use-cases would be iterating over the results once.

> I also thought that the APIs accepted either a module or a package name,

Early on, I had hoped to have the API accept either the distribution package name or a Python package... and I even started creating a protocol for package vendors to provide a reference from their module or package back to the distribution package name. But I decided that approach was too invasive and unlikely to get widespread support, and also that it added little value.

What importlib.metadata really works with is distribution packages (also known as Projects in PyPI) and not Python packages... and it works at an earlier abstraction (often you want to know metadata about a package without importing it).

> Also consider changing the parameter from files(package) to files(package_name).

I think at one point, the parameter name was distribution_name_or_package. We removed the acceptance of packages, but then renamed the parameter to 'package' for brevity. This parameter is used in many functions (files, requires, version, metadata, distribution). We'd want to change it in all of those. Once it becomes a parameter of the Distribution class (such as in Distribution.from_name), the 'distribution' is implied, so 'name' is clear enough. I do try to avoid long and multi-word parameters when possible. Perhaps more appropriate would be 'distribution_name' or 'dist_name'.

I'm leaning toward 'dist_name' right now. What do you think?
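
To make the distinction between a distribution name and an importable package concrete, here is a minimal sketch. It assumes a throwaway distribution called `exampledist` (a made-up name) written to a temporary directory, so that it does not depend on what is installed. The calls `version`, `requires`, and `files` are the functions discussed above. The `list()` wrapper is used so the sketch behaves the same whether a call returns an iterator or a concrete list.

```python
import importlib.metadata as md
import pathlib
import sys
import tempfile

# Build a throwaway distribution; "exampledist" is a made-up name and
# only the metadata directory is created, not an importable package.
site_dir = pathlib.Path(tempfile.mkdtemp())
dist_info = site_dir / "exampledist-1.0.dist-info"
dist_info.mkdir()
(dist_info / "METADATA").write_text(
    "Metadata-Version: 2.1\n"
    "Name: exampledist\n"
    "Version: 1.0\n"
    "Requires-Dist: requests\n"
)
(dist_info / "RECORD").write_text("exampledist/__init__.py,,\n")
sys.path.insert(0, str(site_dir))

# Every lookup takes the distribution *name* as a string; nothing is
# imported, which is why a module object is not an accepted argument.
dist_version = md.version("exampledist")
# list() materializes the result whether the call hands back an
# iterator or an already-concrete list.
dist_requires = list(md.requires("exampledist"))
dist_files = [str(path) for path in md.files("exampledist")]

print(dist_version, dist_requires, dist_files)
```

Note that the lookups succeed even though no `exampledist` package can be imported, which illustrates the "earlier abstraction" described above.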
msg349423 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-12 00:52
> Following the same pattern that many standard Python objects 
> return (`open`, `map`, `filter`), the approach is less 
> constrained in that it can support arbitrarily large results.
> I wished to leave it to the caller to materialize a list if 
> that was needed and my assumption was that 90% of the use-cases
> would be iterating over the results once.

My recommendation is to return a concrete list.  Shifting the responsibility to the user makes the API less convenient, particularly if someone is running this at the interactive prompt and just wants to see the results.

We replaced the list version of map() with itertools.imap() for memory efficiency with potentially enormous or infinite inputs. However, this came at a cost for usability. In every single course where I present map() there is an immediate stumble over the need to wrap it in list() just to see the output. In general, we should save the iterators for cases where there is real value in lazy execution. Otherwise, usability, inspectability, sliceability, and re-iterability win. (Just think of how awkward it would be if dir() or os.listdir() returned an iterator instead of a list.)
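
The trade-off can be seen with map() alone. This is a small sketch using hypothetical helper names (`squares_lazy`, `squares_list`) that stand in for an API returning an iterator versus one returning a list; it is not the importlib.metadata implementation.

```python
def squares_lazy(limit):
    """Iterator-returning variant: values are produced on demand."""
    return map(lambda n: n * n, range(limit))


def squares_list(limit):
    """List-returning variant: values are materialized up front."""
    return list(squares_lazy(limit))


lazy = squares_lazy(5)
first_pass = list(lazy)    # consumes the iterator
second_pass = list(lazy)   # nothing left: the iterator is exhausted

concrete = squares_list(5)
first_two = concrete[:2]   # slicing works directly on a list
again = list(concrete)     # and it can be iterated repeatedly

print(first_pass, second_pass, first_two, again)
```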

FWIW, the doc string for requires() is:

   Return a list of requirements for the indicated distribution


> Perhaps more appropriate would be 'distribution_name' or 'dist_name'.

I recommend 'distribution_name'. It will normally be used as a positional parameter, but the full name will show up in the tool tips, providing guidance on how to use it.



When you get a chance, please look at https://github.com/python/cpython/pull/15204
where I'm presenting your creation to the world.

Overall, I think this was a nice addition to the standard library.  Thanks for your work :-)
msg352105 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2019-09-12 11:00
I've addressed the requests made by rhettinger in issue38086 and issue38121.
msg354466 - (view) Author: Arne Recknagel (arne) Date: 2019-10-11 16:53
Is there a reason the object returned by importlib.metadata.metadata is an EmailMessage and not a dict? If it quacks like a duck it should be a duck, no?
msg354469 - (view) Author: Arne Recknagel (arne) Date: 2019-10-11 17:07
I just learned that metadata is stored as an email, and changing the format was rejected in PEP 426. Be that as it may, if it isn't too much of an issue it might still be something that should be hidden from users of the module. No one wants to know that this particular duck is actually powered by fins under the surface, right?
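
One practical difference between an email-style message and a dict can be shown with the standard `email` package on its own. This is a minimal sketch using made-up metadata text rather than a real distribution. It only illustrates that email-style headers may repeat a field, which a plain dict cannot represent without losing values.

```python
import email

# Made-up metadata in the email header format; note the repeated field.
raw_metadata = (
    "Metadata-Version: 2.1\n"
    "Name: exampledist\n"
    "Version: 1.0\n"
    "Classifier: Programming Language :: Python\n"
    "Classifier: Programming Language :: Python :: 3\n"
)

msg = email.message_from_string(raw_metadata)

# Mapping-style access returns a single value per field name.
name = msg["Name"]
# get_all() returns every occurrence of a repeated field.
classifiers = msg.get_all("Classifier")
# Converting to a dict keeps only the last value of a repeated field.
as_dict = dict(msg.items())

print(name, classifiers, as_dict["Classifier"])
```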
History
Date User Action Args
2019-10-11 17:07:27 arne set messages: + msg354469
2019-10-11 16:53:36 arne set nosy: + arne; messages: + msg354466
2019-09-12 11:00:18 jaraco set status: open -> closed; messages: + msg352105
2019-08-12 00:52:13 rhettinger set messages: + msg349423
2019-08-12 00:27:51 jaraco set messages: + msg349421
2019-08-11 23:09:59 barry set messages: + msg349416
2019-08-11 22:27:11 rhettinger set status: closed -> open; nosy: + rhettinger; messages: + msg349415
2019-05-25 14:10:49 jaraco set status: open -> closed; resolution: fixed; messages: + msg343483; stage: patch review -> resolved
2019-05-25 14:00:29 jaraco set messages: + msg343481
2019-05-25 13:59:46 jaraco set messages: + msg343480
2019-05-25 13:42:01 jaraco set pull_requests: + pull_request13476
2019-05-25 10:24:58 yan12125 set messages: + msg343465
2019-05-25 10:01:50 yan12125 set stage: resolved -> patch review; pull_requests: + pull_request13475
2019-05-25 08:27:55 yan12125 set messages: + msg343462
2019-05-25 08:25:08 yan12125 set messages: + msg343461
2019-05-25 08:09:43 jaraco set messages: + msg343460
2019-05-25 08:08:47 jaraco set messages: + msg343459; stage: patch review -> resolved
2019-05-25 07:53:21 yan12125 set nosy: + yan12125; messages: + msg343457
2019-05-25 07:51:34 yan12125 set stage: resolved -> patch review; pull_requests: + pull_request13474
2019-05-25 00:35:20 vstinner set status: closed -> open; nosy: + vstinner; messages: + msg343445; resolution: fixed -> (no value)
2019-05-24 23:59:46 barry set status: open -> closed; resolution: fixed; messages: + msg343441; stage: patch review -> resolved
2019-05-24 23:59:09 barry set messages: + msg343440
2019-03-26 02:27:12 jaraco set pull_requests: + pull_request12496
2018-09-14 23:43:26 barry set keywords: + patch; stage: patch review; pull_requests: + pull_request8750
2018-09-11 18:16:43 barry create