classification
Title: 3.10 beta 1: breaking change in importlib.metadata entry points
Type: Stage: resolved
Components: Versions: Python 3.11, Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Anthony Sottile, Zac Hatfield-Dodds, domdfcoding, jaraco, kj, miss-islington, steve.dower
Priority: normal Keywords: patch

Created on 2021-05-27 15:30 by Anthony Sottile, last changed 2021-06-10 12:38 by jaraco.

Pull Requests
URL Status Linked Edit
PR 26408 merged jaraco, 2021-05-27 16:16
PR 26415 merged miss-islington, 2021-05-27 23:36
PR 26467 merged jaraco, 2021-05-31 15:29
PR 26468 merged jaraco, 2021-05-31 15:36
PR 26469 merged miss-islington, 2021-05-31 15:52
PR 26471 merged miss-islington, 2021-05-31 16:19
PR 26472 merged jaraco, 2021-05-31 16:28
PR 26473 merged miss-islington, 2021-05-31 16:29
Messages (42)
msg394548 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2021-05-27 15:30
this is breaking code that's unfortunately out of my control (vendor) -- also it looks really wrong

```python
import importlib.metadata
print('looks like a list:')
print(importlib.metadata.distribution('pip').entry_points)
print('first item:')
print(importlib.metadata.distribution('pip').entry_points[0])
```

output in 3.9:

```console
$ ./venv39/bin/python t.py
looks like a list:
[EntryPoint(name='pip', value='pip._internal.cli.main:main', group='console_scripts'), EntryPoint(name='pip3', value='pip._internal.cli.main:main', group='console_scripts'), EntryPoint(name='pip3.8', value='pip._internal.cli.main:main', group='console_scripts')]
first item:
EntryPoint(name='pip', value='pip._internal.cli.main:main', group='console_scripts')
```

```console
$ venv310/bin/python t.py
looks like a list:
(EntryPoint(name='pip', value='pip._internal.cli.main:main', group='console_scripts'), EntryPoint(name='pip3', value='pip._internal.cli.main:main', group='console_scripts'), EntryPoint(name='pip3.8', value='pip._internal.cli.main:main', group='console_scripts'))
first item:
Traceback (most recent call last):
  File "/usr/lib/python3.10/importlib/metadata/__init__.py", line 217, in __getitem__
    return next(iter(self.select(name=name)))
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/y/t.py", line 5, in <module>
    print(importlib.metadata.distribution('pip').entry_points[0])
  File "/usr/lib/python3.10/importlib/metadata/__init__.py", line 219, in __getitem__
    raise KeyError(name)
KeyError: 0
```
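
A version-independent way to get positional access is to copy the entry points into a plain list first. This is only a sketch, not the fix adopted upstream. The helper name `entry_points_as_list` is hypothetical, and it assumes only that `Distribution.entry_points` is iterable, which holds on both 3.9 and 3.10.

```python
import importlib.metadata


def entry_points_as_list(dist):
    """Copy a distribution's entry points into a plain list (hypothetical helper)."""
    # Iteration works whether entry_points is a list (3.9) or an
    # EntryPoints object (3.10), so the copy supports [0], .sort(), etc.
    return list(dist.entry_points)


# Print the first entry point of the first installed distribution that has any.
for dist in importlib.metadata.distributions():
    eps = entry_points_as_list(dist)
    if eps:
        print(eps[0])
        break
```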
msg394550 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2021-05-27 15:37
Seems this was reported in https://github.com/python/importlib_metadata/issues/300 . Closed in https://github.com/python/importlib_metadata/commit/5ca9bc7dcf73d72260486afb28dadf5e532cf657.
msg394552 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-27 15:48
This backward incompatibility was unintentionally introduced in importlib_metadata 3.6 (https://importlib-metadata.readthedocs.io/en/latest/history.html#v3-6-0, released Feb 23) and was previously reported in https://github.com/python/importlib_metadata/issues/300. While technically it's a breaking change, here's my analysis from that issue:

> Basically what it boils down to is that access by index of a `Distribution.entry_points` was dropped, but that appears not to be a problem. At least, this is the first report of such a problem. The assumption has been, and the tests bear this out implicitly, that the consumer of `Distribution.entry_points` will be iterated over and not accessed by index.

My assessment is that the issue is theoretically incompatible but in practice, this usage is not found in the wild. If there was a substantial need to maintain compatibility for this use-case, I'd definitely consider adding such compatibility. As it stands, however, I'm unaware of even a single use case that demands this interface.

Given the non-issue this has been for importlib_metadata, I expect it to be a non-issue for Python 3.10 as well.
msg394553 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2021-05-27 15:49
the "what's new" mentions nothing of this break, nor the aggressive deprecation warnings, nor the various shifted interfaces (such as this one): https://docs.python.org/3.10/whatsnew/3.10.html#importlib-metadata
msg394557 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-27 15:54
Yes, perhaps the What's New could be refreshed.

There is a Compatibility Note in the docs for `entry_points` about the deprecated usage (https://docs.python.org/3.10/library/importlib.metadata.html?highlight=importlib%20metadata#entry-points).

I want the What's New to be only those aspects that are particularly salient to the users, so I probably would not even mention the compatibility change reported above unless I expected it to affect users, but I wouldn't be opposed to more thorough messaging if you believe that would address the concern.
msg394559 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2021-05-27 16:03
personally I think they should be reverted -- they were not committed in spirit with the backwards compatibility policy: https://www.python.org/dev/peps/pep-0387/#making-incompatible-changes

- I don't think they were discussed thoroughly, and when opposition was presented it was not listened to thoroughly: https://github.com/python/importlib_metadata/pull/278
- the change significantly complicates importlib.metadata with lots of sneaky types (they all look like builtin types but do not act like them)
- it simultaneously introduces new apis and old apis which will both be around for extended periods of time but immediately moves to DeprecationWarning
- the new apis aren't remarkably better than the old apis -- the motivation was written as "typing issues" but without citing actual issues. in fact, the new items are significantly more complicated to type properly
- the change breaks many significantly important projects, from perusing related issues it's at the very least flake8, pandas, virtualenv, astropy, pytest, hypothesis -- and hundreds more from a quick github code search
msg394560 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-27 16:06
In the What's New, I used :func:`importlib.metadata.entry_points`, but that doesn't seem to resolve to the docs for the function. I need to figure out how to link to the entry_points anchor that's there.
msg394588 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-27 19:00
Although it feels the topic has shifted from the original concern (an unintentional incompatibility) to the broader topic of the API change to entry_points generally, I'm happy to address your comments:

> - I don't think they were discussed thoroughly, and when opposition was presented it was not listened to thoroughly: https://github.com/python/importlib_metadata/pull/278

Anyone reading that thread will see that I was responsive to your concerns, adapted the solution based on your concerns, and spent extra energy documenting the motivations for the change and exploring solutions until I came up with something that I believed would address the concerns and has since largely borne out that goal in the release.

> - the change significantly complicates importlib.metadata with lots of sneaky types (they all look like builtin types but do not act like them)

The solution that I ultimately settled on in python/importlib_metadata#278 does not use "sneaky" or "cutesy" types, but in fact reduces the number of magic types. EntryPoints is a tuple subclass, EntryPoint is a namedtuple subclass (with __iter__ magic deprecated), and SelectableGroups is a compatibility shim to be removed. Which types are sneaky and look like built-in types but do not act like them?

> - it simultaneously introduces new apis and old apis which will both be around for extended periods of time but immediately moves to DeprecationWarning

Given that backports are available, I saw no strong reason to delay the DeprecationWarning. I was open to the possibility that the transition could prove too taxing and the deprecation would have to be delayed. This early exposure means that most projects will already have addressed the deprecation concerns prior to the release of Python 3.10. In https://github.com/python/importlib_metadata/issues/298, the SQLAlchemy team helped me understand a nuanced use-case I hadn't considered, where libraries don't necessarily have the luxury of requiring newer backport releases, but in that case, I developed a solution that would provide future compatibility even with older stdlib and backport releases (through backports.entry_points_selectable). As far as I understand, there are no known use-cases that aren't satisfied by this design.

> - the new apis aren't remarkably better than the old apis -- the motivation was written as "typing issues" but without citing actual issues. in fact, the new items are significantly more complicated to type properly

The typing issues were discussed here (https://github.com/pypa/twine/pull/728#pullrequestreview-573342005) in the PR referenced by python/importlib_metadata#278. The magic casting of a two tuple to an item of a dict was found to be incompatible and unsupported by mypy (https://github.com/python/mypy/issues/9938). I further expanded on the motivations that led to this approach in https://github.com/python/importlib_metadata/issues/282 and https://github.com/python/importlib_metadata/issues/284.

So while Twine is able to declare types properly using this new design, it was infeasible to do so in the old design. After the compatibility layers are removed, `entry_points` will simply return an `EntryPoints` object, which presents an iterable of `EntryPoint` objects but with some facilities for selection. It's straightforward and clean. Please demonstrate the complication you see with the current approach.

The new APIs are not only easier to describe with types, but they are easier to describe in documentation.

> - the change breaks many significantly important projects, from perusing related issues it's at the very least flake8, pandas, virtualenv, astropy, pytest, hypothesis -- and hundreds more from a quick github code search

I'm not aware of a single breakage. The code is compatible for all known use-cases, but does present a DeprecationWarning in some usages. In the case of flake8, I've proposed a solution to avoid the DeprecationWarning and move to the preferred design, even without requiring an updated importlib_metadata backport.

I'm invested in providing as seamless a transition as possible to all projects, and I believe this change accomplishes that. Please let me know if there's a project or application where that's not the case.

> personally I think [all API changes] should be reverted 

Given the amount of adoption already, reverting these changes is likely to cause more disruption than moving forward with them.

If you would like to see the changes reverted or adapted further, please provide an example of a use-case that's broken by the current approach. So far, the only example presented above appears contrived and not an actual concern. It's common for Python minor releases to introduce undocumented changes that are technically incompatible but in ways that affect few or no users.
msg394616 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-27 23:36
New changeset 28f12c9f4f39d283d823d81e311d863526dfdb54 by Jason R. Coombs in branch 'main':
bpo-44246: Update What's New for importlib.metadata. (#26408)
https://github.com/python/cpython/commit/28f12c9f4f39d283d823d81e311d863526dfdb54
msg394620 - (view) Author: miss-islington (miss-islington) Date: 2021-05-27 23:57
New changeset 59f9594f9018450101079eb47e06e3c44cfeee36 by Miss Islington (bot) in branch '3.10':
[3.10] bpo-44246: Update What's New for importlib.metadata. (GH-26408) (GH-26415)
https://github.com/python/cpython/commit/59f9594f9018450101079eb47e06e3c44cfeee36
msg394625 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2021-05-28 01:27
> Which types are sneaky and look like built-in types but do not act like them?

well for starters, there's the tuple subclass which pretends to be a dict.  but it violates substitutability for both `tuple` and `Mapping` so it's not useful in either contexts.  mypy complains about incorrect types in overrides for both.  the worst part of this is that the `__getitem__` moves from O(1) to O(N) (in some private code this makes importlib.metadata on 3.10 _10x slower than on 3.9_).  next there's the `EntryPoints` tuple subclass which looks like a tuple but doesn't at all act like one (__getitem__ fails substitutability for example) -- this is an api break with 3.9 which returned a `list` (can't `.sort()` `.extend(...)`, etc. any more)

> Given that backports are available

I don't think this is appropriate.  re-introducing a backport brings in a tree of dependencies that have been shaky at best with backward compatibility.  in other words, using standard library importlib.metadata provides significantly improved compatibility and stability over the moving target backport (and its tree of dependencies, zipp being the one that breaks the most often from experience).  you'll notice I closed the flake8 PRs specifically because I didn't want to reintroduce the backport.  This backport also globally monkeypatches the import machinery breaking any other consumer as well -- in a tool as popular as flake8 I can't really make that global mutation decision for all of the other consumers.

> The typing issues

these were *trivially solved* by a dictionary comprehension using `entrypoint.name` -- it really did not need a full rework and break of the api to solve (could have just deprecated the `__iter__` which I actually suggested on the *original implementation* way back in 3.8)
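
The kind of comprehension referred to here might look like the sketch below. The sample entry points are made up for illustration, and this is not code taken from twine or importlib_metadata.

```python
from importlib.metadata import EntryPoint

# Stand-in data; real code would iterate over entry points read from metadata.
eps = [
    EntryPoint(name="pip", value="pip._internal.cli.main:main", group="console_scripts"),
    EntryPoint(name="pip3", value="pip._internal.cli.main:main", group="console_scripts"),
]

# Key by attribute instead of relying on EntryPoint.__iter__ yielding (name, ep).
by_name = {ep.name: ep for ep in eps}
print(by_name["pip3"].value)
```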

> The new APIs are not only easier to describe with types

the types describing the new apis require *significant* `# type: ignore`s to pass mypy because they violate basic substitutability.  they also cannot be used in any of the contexts they were appropriate for in <3.10 (Dict[str, List[EntryPoint]] or List[EntryPoint] depending on the api).

> I'm not aware of a single breakage.

I'm sorry but you have to have realized from the many issues on importlib-metadata or the many issues linking to your deprecation issue that there is significant toil caused directly by your change.  cpython is meant to be a stable substrate to build upon, please do not force the community to shoulder the burden of your poor api decisions.

> even without requiring an updated importlib_metadata backport.

your proposed change introduced a different, unrelated package.  not without its own maintenance problems (an additional dependency that has to ~work indefinitely, a hack at best to support this breaking change)

> Given the amount of adoption already, reverting these changes is likely to cause more disruption than moving forward with them.

I disagree, bigger things have been reverted (see: __future__.annotations)

>  So far, the only example presented above appears contrived and not an actual concern

I promise you this is not a contrived case, if you look at your issue tracker it has been reported before and by others.  For every issue reported there's likely tens or hundreds of others which are not reported.

> It's common for Python minor releases to introduce undocumented changes that are technically incompatible

I've seen this as a rationalization for intentional surprise breaking changes but I don't buy it.

Additionally your comments about (paraphrased) "the testsuite didn't demonstrate this usecase so I'm free to change it" are frankly a cop out -- your api returned a dict and a list, that's part of the public api -- changing that is a breaking change.
msg394726 - (view) Author: gaborjbernat (gaborjbernat) * Date: 2021-05-29 09:17
After reading through the points here, I must say I agree with Anthony here. The standard library has clear rules regarding how previously working interfaces should be deprecated, and this changeset is violating those. At no point was documented that relying on the list/dict trait of the existing interface is not part of the interface. I don't think the importlib libraries are special enough to warrant exclusion from this rule (as opposed let's say the zoneinfo).
msg394727 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-29 09:29
> there's the tuple subclass which pretends to be a dict.

There's no tuple subclass that's pretending to be a dict. It overrides __getitem__ for convenience. It never claims to support Mapping.

> mypy complains about incorrect types in overrides for both.

I'm unsure what the concern is. If there's an issue, it hasn't been reported to the project. importlib_metadata runs mypy tests with the test suite (all passing) and [twine uses the API with strict mypy checking enabled](https://github.com/pypa/twine/blob/eff3a454df49c6e998d3d21d07ef846d8318e446/twine/cli.py#L43-L69) without any exclusions.

> the worst part of this is that the `__getitem__` moves from O(1) to O(N) (in some private code this makes importlib.metadata on 3.10 _10x slower than on 3.9_)

This issue was revealed during the review and I acknowledged the concern and agreed to address the issue if it mattered. This project has demonstrated its concern for performance issues as are apparent through a number of optimizations in the changelog. In every use case I've seen, the performance is improved by the current approach (a group/sort operation is avoided). If the performance is a concern, I once again welcome a bug report describing the use-case and the impact, though I suspect it's an isolated case and likely would best be addressed outside the official codebase.

> this is an api break with 3.9 which returned a `list`

I acknowledge this break, though I believe the concerns are overblown. The API specifically sought to reduce dependence on receiving a list and instead to provide a more abstract collection. 

> I don't think [introducing behavior in backports] is appropriate.

It's true, the "backport" moniker is a false one here. From the very beginning, these modules first introduced their behavior outside the stdlib and were then ported into CPython. Moreover, the past couple of years have seen substantial refinement and innovation, and the project was able to move much faster and reach stability much faster and with wider adoption than if the library had followed the standard Python development cadence.

It's quite likely that this project will eventually stabilize to the point that most users do not need the backport, but while it exists, it's providing massive value. Consider the most recent example (https://importlib-metadata.readthedocs.io/en/latest/history.html#v4-3-1) where a performance improvement caused a regression. The regression was detected and fixed within a day. Now when CPython adopts that behavior, we can all have higher confidence in the viability of the implementation.

It would be a pretty big shift to block this approach, but it's not out of the realm of consideration. Still, it's out of scope for this discussion. Feel free to raise it separately.

> [the typing issues] were *trivially solved* by a dictionary comprehension

No such solution was proposed by anybody, but more importantly, I don't believe the solution would have been so trivial and still met the objectives.

> the types describing the new apis require *significant* `# type: ignore`s to pass mypy because they violate basic substitutability.

I'm unaware of this issue and it's not been reported, but I also don't believe it's an issue. Both twine and keyring have adopted the latest API and pass mypy tests.

> they also cannot be used in any of the contexts they were appropriate for in <3.10 (Dict[str, List[EntryPoint]] or List[EntryPoint] depending on the api).

That's right. The API has changed.

> many issues on importlib-metadata

Where "many" ~= 1 (https://github.com/python/importlib_metadata/issues?q=is%3Aissue+DeprecationWarning).

> many issues linking to your deprecation issue

Do you mean https://github.com/python/importlib_metadata/pull/289 or something else? I see ~4 projects (astropy, pytest-randomly, keyring, virtualenv) making mention there. I'd expected the number of projects to be affected to be more than that.

> there is significant toil

I care about toil. A lot. I don't make incompatible changes lightly, and I spent a good deal of time documenting the motivations and providing guidance on how to transition. I've actively worked with each project that's requested help to minimize their toil and provide a one-shot transition to the new API.

> if you look at your issue tracker it has been reported before and by others

I looked and didn't find it. Help me see what I'm missing.

> "the testsuite didn't demonstrate this usecase so I'm free to change it"

That's not the spirit of my words. The API had an intended usage that was borne out by the documentation and tests. If users relied on other interfaces that were incidentally present, the user bears some risk in relying on those behaviors. Still, I accept responsibility to provide a transitional support even for those cases.
msg394728 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-29 09:51
> The standard library has clear rules regarding how previously working interfaces should be deprecated, and this changeset is violating those. At no point was documented that relying on the list/dict trait of the existing interface is not part of the interface.

Thanks Gábor for chiming in.

A minor correction, the "dict" trait was documented, and compatibility is retained for that trait.

It's only the "list" trait of the less-commonly-used "Distribution.entry_points" that's a concern here, and I've yet to see an example of it being an actual concern. I've already offered to add compatibility if a compelling use case is presented.

> I don't think the importlib libraries are special enough to warrant exclusion from this rule (as opposed let's say the zoneinfo).

The docs do explicitly call out that the module is provisional. https://docs.python.org/3/library/importlib.metadata.html

Still, I believe it's best for this module to honor the stdlib practices as best as possible, and I believe the indicated change does that.
msg394749 - (view) Author: Zac Hatfield-Dodds (Zac Hatfield-Dodds) * Date: 2021-05-30 01:43
Just chiming in with a plea to slow down the rate of changes to importlib.metadata - I understand that you want to tidy up the API, but even deprecations cause substantial work downstream.  Would it really be so bad to support the older APIs until they go end-of-life in CPython?

For example, in Hypothesis we care a great deal about compatibility with all supported Python versions (including treating warnings as errors) and try to minimize the hard dependencies.  As a result, our entry-points handling looks like this:
https://github.com/HypothesisWorks/hypothesis/blob/0a90ed6edf56319149956c7321d4110078a5c228/hypothesis-python/src/hypothesis/entry_points.py

Change can be worth the cost, but there *is* a cost and the packaging ecosystem is already painfully complex and fragmented.  Compatibility should be a foundational principle, not an optional extra _if someone presents a compelling use case!_

I'm also seriously concerned that you take GitHub links as an indication of who is affected.  Python is very very widely used, including in environments that don't feed much back into the GitHub-open-source space, and I think it's important to avoid breaking things for low-visibility users too.
msg394767 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-30 16:11
Thanks Zac for your input.

> Just chiming in with a plea to slow down the rate of changes to importlib.metadata - I understand that you want to tidy up the API, but even deprecations cause substantial work downstream.

It would be difficult to go much slower. Are you suggesting delaying the deprecation warning? My rationale for not delaying the deprecation warning is because it's possible using the backport to support the newer APIs all the way back to some end-of-life Pythons. If the deprecation warning is delayed, that seems to only delay the inevitable - that most projects will ultimately have to endure the toil of transitioning the code and relying on backports to support older Pythons.

Still, projects have the option to use the older APIs indefinitely by pinning the backport, or delay their adoption of the newer APIs by suppressing the warning. There's a lot of flexibility to limit the toil.

What approach would you recommend?

> Would it really be so bad to support the older APIs until they go end-of-life in CPython?

At this point, I believe the code is compatible with all known use cases and it's conceivable the compatibility layer could be supported in Python releases until Python 3.9 is EOL. Is that what you're proposing? Would that help the Hypothesis case (or others)? My instinct is the value proposition there is small.

> For example, in Hypothesis we care a great deal about compatibility with all supported Python versions (including treating warnings as errors) and try to minimize the hard dependencies.  As a result, our entry-points handling looks like this...

Project maintainers are allowed of course to treat warnings like errors, but since deprecation warnings are the primary mechanism for an upstream API to signal to the downstream that something is changing, projects should expect an increased amount of toil by transforming the default behavior of warnings.

I suspect that the hypothesis project could achieve forward compatibility within its constraints by vendoring `backports.entry_points_selectable`, and thus re-using a shared implementation of the conditional import dance. I've added a note to the project's README indicating that option. The implementation you have seems suitable, though.
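
A minimal sketch of such a conditional dance, using only the standard library, is shown below. It is neither the Hypothesis implementation nor the code in `backports.entry_points_selectable`. The helper name `entry_points_for_group` is hypothetical, and the sketch assumes that the object returned by `entry_points()` offers either `.select()` (3.10 and later) or dict-style `.get()` (3.8 and 3.9).

```python
import importlib.metadata


def entry_points_for_group(group):
    """Return the entry points in `group` as a list (hypothetical helper)."""
    eps = importlib.metadata.entry_points()
    if hasattr(eps, "select"):
        # Newer API (Python 3.10+): selection by keyword.
        return list(eps.select(group=group))
    # Older API (Python 3.8/3.9): a plain dict keyed by group name.
    return list(eps.get(group, ()))


for ep in entry_points_for_group("console_scripts"):
    print(ep.name, "->", ep.value)
```

Feature detection with `hasattr` is used instead of a version check so that the same code also works with a backport that provides `.select()` on an older interpreter.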

> Change can be worth the cost, but there *is* a cost and the packaging ecosystem is already painfully complex and fragmented.  Compatibility should be a foundational principle, not an optional extra _if someone presents a compelling use case!_

Agreed: compatibility is a foundational principle. Compatibility was built into the first release of this new behavior (importlib_metadata 3.6). Thanks to Anthony's feedback in the PR and extensive exploration and rewrites, the solution presented there has a fairly straightforward transition and clean separation of concerns. The case reported above, where compatibility was not achieved, is an acknowledged missed concern, and I'm happy to invest the time to restore that compatibility if it's worth the trouble. The reason I'd thought it's not worth the trouble is that, contrary to Anthony's claim, no user has reported an issue with index-based access on Distribution.entry_points results for the three months that this functionality has been in place, which is why a note about the incompatibility seemed sufficient (after the fact).

I'll proceed with adding compatibility for this reported case, although I worry that won't satisfy the concerns. Is there a satisfactory solution that doesn't involve reverting the changes? Is there an approach that meets the goals of the change with less disruption?

> I'm also seriously concerned that you take GitHub links as an indication of who is affected.  Python is very very widely used, including in environments that don't feed much back into the GitHub-open-source space, and I think it's important to avoid breaking things for low-visibility users too.

I surely recognize that Github links and reports are only one indicator of one segment of the user base, but it's the sole indicator these projects have to solicit user concerns. That's why I pinned the issue reported about the Deprecation warning and used that forum to express concern and support for the users' issues and to present a variety of approaches for any number of users to avail themselves. I wanted to increase the visibility of the issue and limit the difficulty of addressing the intentional deprecation.

I mainly rely on Github reports and +1s on those reports as an indication of the impact of an issue. I use Github links as a means of networking. It was Anthony who suggested the links were an indication of a widespread issue. I only meant to contrast that concern to other breakages (in my experience) that showed dozens of links to affected issues. Linked issues are a secondary indicator at best.

I do expect that if users have an issue that they would report it through python/importlib_metadata or bpo, but I also expect that absence of a report demonstrates stability. At least, that's been my experience in the hundreds of projects I've developed on Sourceforge, Bitbucket, GitLab, and Github.

After employing defensive, empathetic programming, developing docs, and writing comprehensive tests, what  approaches should I be taking to solicit user concerns other than to have an open forum for soliciting issues and responding to those issues promptly and with proactive solutions?
msg394769 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2021-05-30 16:17
I also need `.sort(key=...)` for what it's worth, the error in this issue was just the first encountered

I think my only satisfactory outcome would be:

- the original api returns actual dicts
- the sub-api returns actual lists
- the new select is implemented as a separate *new* api without changing the existing api
msg394770 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2021-05-30 16:22
also https://github.com/miurahr/aqtinstall/issues/221

(this links to importlib-metadata tracker, not sure how you missed it)
msg394773 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2021-05-30 17:04
the `.select(...)` api is at least twice as slow as indexing as well:

setup:
```
virtualenv venv39 -p python3.9
venv39/bin/pip install flake8 pytest pytest-randomly
virtualenv venv310 -p python3.10
venv310/bin/pip install flake8 pytest pytest-randomly
```

```python
import importlib.metadata
import sys
import time


def f():
    eps = importlib.metadata.entry_points()
    if sys.version_info >= (3, 10):
        eps.select(name='console_scripts')
    else:
        eps['console_scripts']


t0 = time.time()
for _ in range(100):
    f()
t1 = time.time()
print(f'{t1-t0}')
```

```
$ ./venv39/bin/python t.py
0.687570333480835
$ ./venv310/bin/python t.py
1.3486714363098145
```

it is *way* worse when involving multiple entry points:

```python
import importlib.metadata
import sys
import time


# moved outside of the loop, already showed this component is slower
eps = importlib.metadata.entry_points()
def f():
    # common for plugin systems to look up multiple entry points
    for ep in ('console_scripts', 'flake8.extension', 'pytest11'):
        if sys.version_info >= (3, 10):
            eps.select(name=ep)
        else:
            eps[ep]


t0 = time.time()
for _ in range(10000):
    f()
t1 = time.time()
print(f'{t1-t0}')
```

```console
$ ./venv39/bin/python t.py
0.01629471778869629
$ ./venv310/bin/python t.py
8.569908380508423
```
msg394774 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2021-05-30 17:12
oops, tiny typo in those code examples, they should say `group=` instead of `name=` -- though the performance is unchanged:

(first example)
```console
$ ./venv39/bin/python t.py
0.6641988754272461
$ ./venv310/bin/python t.py
1.3172023296356201
```

(second example)
```console
$ ./venv39/bin/python t.py
0.014233589172363281
$ ./venv310/bin/python t.py
8.910593271255493
```
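
For callers that look up several groups repeatedly, one possible workaround (not proposed by anyone in this thread) is to build a plain `dict` index once and reuse it, so each lookup is a hash access instead of a scan. The helper name `build_group_index` is hypothetical. The sketch assumes only that `EntryPoint` objects expose a `group` attribute. Unlike `entry_points()` on 3.10, it does not de-duplicate distributions.

```python
import importlib.metadata
from collections import defaultdict


def build_group_index():
    """Map each group name to a list of its entry points (hypothetical helper)."""
    index = defaultdict(list)
    for dist in importlib.metadata.distributions():
        for ep in dist.entry_points:
            index[ep.group].append(ep)
    return dict(index)


# Build once, then each lookup is a dict access rather than a scan.
index = build_group_index()
for group in ("console_scripts", "flake8.extension", "pytest11"):
    print(group, len(index.get(group, [])))
```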
msg394775 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-30 17:36
There are known performance concerns. I recommend setting those aside for now or moving them to a separate issue because

(a) The performance is theoretically better in the nominal case because it avoids a sort/group operation.
(b) There are known performance degradations introduced by importlib_metadata 3.5 to de-duplicate distributions, degradations mitigated somewhat by importlib_metadata 4.3.
(c) Compatibility layers may be confounding performance concerns.

May I suggest addressing performance concerns in the importlib_metadata project as that project provides much better granularity on the different changes?

> I think my only satisfactory outcome would be:
> - the original api returns actual dicts

The original API returns an actual dict subclass (SelectableGroups).

> - the sub-api returns actual lists

With https://github.com/python/importlib_metadata/pull/323, this expectation is also met as EntryPoints is a list.

> - the new select is implemented as a separate *new* api without changing the existing api

The new API is invoked only through opt-in calls not previously available in the old API. I believe this achieves your goal without requiring a new name for `entry_points` or `Distribution.entry_points` (and thus creating less toil for consumers).
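
For illustration, a minimal sketch of a consumer that handles both the dict-style result from 3.9 and the opt-in `select()` call. The helper name `entry_points_for` is hypothetical, and branching on `hasattr` is only one possible approach; this is not code from CPython or importlib_metadata.

```python
import importlib.metadata


def entry_points_for(group):
    """Return the entry points in ``group`` as a plain list.

    Hypothetical helper: uses ``select()`` when the returned object
    offers it, and falls back to dict-style lookup otherwise.
    """
    eps = importlib.metadata.entry_points()
    if hasattr(eps, "select"):
        # Opt-in selection API (Python 3.10 and later).
        return list(eps.select(group=group))
    # Python 3.9 and earlier: a plain dict mapping group name to entry points.
    return list(eps.get(group, ()))


scripts = entry_points_for("console_scripts")
print(f"{len(scripts)} console_scripts entry points found")
```

A caller written this way never indexes the result of `entry_points()` directly, so it should not depend on the deprecated dict-style access on newer versions.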
msg394777 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2021-05-30 19:11
the toil is still present, the existing, good apis are deprecated and the new, bad apis are slow -- and the odd subclasses are still present
msg394778 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2021-05-30 19:11
I have also shown that the performance is indeed not better in the nominal case, as demonstrated in the first case
msg394806 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-31 14:59
importlib_metadata 4.4 restores compatibility for the reported concerns. I'll merge those into CPython later.
msg394807 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2021-05-31 15:03
it does not, it restores apis but in a way which requires a huge performance hit to avoid deprecation warnings

it also still has the 2x to 500x performance regression I've stated above
msg394811 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-31 15:52
New changeset 410b70d39d9d77384f8b8597560f6731530149ca by Jason R. Coombs in branch 'main':
bpo-44246: Entry points performance improvements. (GH-26467)
https://github.com/python/cpython/commit/410b70d39d9d77384f8b8597560f6731530149ca
msg394812 - (view) Author: miss-islington (miss-islington) Date: 2021-05-31 16:12
New changeset d1480ad2f5d5f02ecbe4b4091e8c428ddfa39ff6 by Miss Islington (bot) in branch '3.10':
bpo-44246: Entry points performance improvements. (GH-26467)
https://github.com/python/cpython/commit/d1480ad2f5d5f02ecbe4b4091e8c428ddfa39ff6
msg394813 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-31 16:19
New changeset c34ed08d975fb7daa7b329f7c631647782290393 by Jason R. Coombs in branch 'main':
bpo-44246: Restore compatibility in entry_points (GH-26468)
https://github.com/python/cpython/commit/c34ed08d975fb7daa7b329f7c631647782290393
msg394814 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-31 16:23
The compatibility concerns are addressed with PR 26468 and some of the performance concerns may be mitigated with PR 26467. As I mentioned before, if there are continuing performance concerns, please raise them separately (preferably with python/importlib_metadata), and please include use-cases that exemplify not just the effect, but the impact.
msg394815 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-31 16:29
New changeset 78d9a9b1904f0e1d9db1e941c19782f4f5a881d4 by Jason R. Coombs in branch 'main':
bpo-44246: Remove note about access by index now that a compatibility shim is offered. (GH-26472)
https://github.com/python/cpython/commit/78d9a9b1904f0e1d9db1e941c19782f4f5a881d4
msg394816 - (view) Author: miss-islington (miss-islington) Date: 2021-05-31 16:42
New changeset 7207203e1d71e4bf65e5b4991f60e7dc1e35e813 by Miss Islington (bot) in branch '3.10':
[3.10] bpo-44246: Restore compatibility in entry_points (GH-26468) (GH-26471)
https://github.com/python/cpython/commit/7207203e1d71e4bf65e5b4991f60e7dc1e35e813
msg394830 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-05-31 22:29
New changeset d0991e2db3bb932e2411ee9dca54fd69ff2611c4 by Miss Islington (bot) in branch '3.10':
bpo-44246: Remove note about access by index now that a compatibility shim is offered. (GH-26472) (#26473)
https://github.com/python/cpython/commit/d0991e2db3bb932e2411ee9dca54fd69ff2611c4
msg395202 - (view) Author: Ken Jin (kj) * (Python triager) Date: 2021-06-06 10:38
The new test `test_entry_points_by_index` (added in c34ed08d975fb7daa7b329f7c631647782290393) seems to fail on some Windows buildbots:

https://dev.azure.com/Python/cpython/_build/results?buildId=81807&view=logs&j=c8a71634-e5ec-54a0-3958-760f4148b765&t=599737bc-ad72-560d-1530-0f89b05729e4

A copy of the error output for everyone's convenience:
======================================================================
ERROR: test_entry_points_by_index (test.test_importlib.test_metadata_api.APITests)
Prior versions of Distribution.entry_points would return a
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\a\1\b\layout-appx-amd64\lib\test\test_importlib\test_metadata_api.py", line 145, in test_entry_points_by_index
    expected = next(iter(caught))
StopIteration

----------------------------------------------------------------------

BTW, the same buildbot is currently failing on main with a different error which masks the error above. I'll do more digging if no one takes this up by next week. Currently I'm not able to reproduce it locally on my Windows machine. Thanks all!
msg395207 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-06-06 15:04
The line where the failure occurs is the point where it's checking that the warning was issued. The fact that a StopIteration is raised indicates that no warnings were caught. I can think of a couple of scenarios where that could happen:

- That warning is somehow disabled.
- The `Distribution` object returned by `distribution('distinfo-pkg')` is somehow an older implementation (perhaps an older importlib_metadata is present).

Given that the DeprecationWarnings aren't missed on other tests, the latter seems to be a more likely candidate.
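
As a minimal sketch of that failure mode (not the actual test code): when nothing warns inside a recording `catch_warnings` block, the recorded list is empty and `next(iter(caught))` raises `StopIteration`. The function `legacy_access` is a hypothetical stand-in for a code path that was expected to warn but did not.

```python
import warnings


def legacy_access():
    # Hypothetical stand-in: expected to emit a DeprecationWarning,
    # but in this scenario it emits nothing.
    return "value"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_access()

try:
    # Same pattern as the failing line in the traceback above.
    expected = next(iter(caught))
    outcome = "warning caught"
except StopIteration:
    outcome = "no warning caught"

print(outcome)
```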

I notice that the regular tests are passing. It's only in the 'appx' environment where the test fails. I'm not familiar with appx, but it seems likely that something from the appx environment creation is a factor in the divergent behavior.

Steve, can you advise on how appx environments are created and how one could replicate a test failure that only occurs in that environment?
msg395282 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-06-07 20:17
The appx layout is also the only one in CI that actually uses an installed layout - all the rest run tests from the source tree. So it could be related to that.

If it's a warning, it could also be that the warning is being triggered somewhere else first. Since tests run in a random order, you'll want to look for how reliable the failure is.
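
A rough sketch of the general mechanism, assuming the "default" filter action is in effect: a given warning is reported only the first time it is raised from a particular location. This illustrates how the `warnings` module behaves; it does not reproduce the buildbot failure. The function name `deprecated_call` is made up for the example.

```python
import warnings


def deprecated_call():
    # Every call warns from this same line, so the location is identical.
    warnings.warn("deprecated_call is deprecated", DeprecationWarning)


with warnings.catch_warnings(record=True) as caught:
    # "default" reports each unique (message, category, location) once.
    warnings.simplefilter("default")
    deprecated_call()  # recorded
    deprecated_call()  # same location, suppressed by the registry

print(len(caught))
```

Tests that need to see every occurrence usually switch the filter to "always" inside the recording block.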
msg395301 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-06-08 00:58
Thanks Steve for the feedback.

> If it's a warning, it could also be that the warning is being triggered somewhere else first. Since tests run in a random order, you'll want to look for how reliable the failure is.

It is a warning, but it seems unlikely that any other code is calling it, given that the supporting codepath was not present until the same PR.

> The appx layout is also the only one in CI that actually uses an installed layout - all the rest run tests from the source tree. So it could be related to that.

If someone could help by producing a docker image that can build the appx layout and run the tests, that would help me as I don't have a lot of proficiency with installing build tools on Windows through the CLI... and the experience I do have has been fraught with challenges.
msg395333 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2021-06-08 14:17
here are the performance regressions, they affect any callers of `distributions()` and are even worse on callers of the new apis.

a call to distributions() is about 3x slower than in 3.9

here is the setup I am using:

virtualenv venv39 -ppython3.9
venv39/bin/pip install flake8 pytest twine pre-commit
virtualenv venv310 -ppython3.10
venv310/bin/pip install flake8 pytest twine pre-commit

to test just the `distributions()` call I'm using the following:

```console
$ venv39/bin/python -m timeit -n 20 -r 20 -s 'from importlib.metadata import entry_points' 'entry_points()'
20 loops, best of 20: 12.5 msec per loop
$ venv310/bin/python -m timeit -n 20 -r 20 -s 'from importlib.metadata import entry_points' 'entry_points()'
20 loops, best of 20: 36.7 msec per loop
```

this is a less-extreme example, many applications have more dependencies installed -- but even in this case this is adding ~24ms startup to any application using `entry_points()` -- and it gets worse

the return value of `entry_points()` alone isn't all that useful, next an application needs to retrieve its entry points.  let's start for the somewhat normal case of retrieving a single category of entry points:

```console
$ venv39/bin/python -m timeit -n 20 -r 20 -s 'from importlib.metadata import entry_points' 'entry_points()["flake8.extension"]'
20 loops, best of 20: 12.7 msec per loop
$ venv310/bin/python -m timeit -n 20 -r 20 -s 'from importlib.metadata import entry_points' 'entry_points(name="flake8.extension")'
20 loops, best of 20: 37.1 msec per loop
$ venv310/bin/python -m timeit -n 20 -r 20 -s 'from importlib.metadata import entry_points' 'entry_points().select(group="flake8.extension")'
20 loops, best of 20: 37.1 msec per loop
```

again, 3x slower and very real time to the end user (~24-25ms)

now let's show an example usage that something like flake8 uses where multiple groups are requested (this is common for apps and plugin systems which provide multiple distinct functionalities)

```console
$ venv39/bin/python -m timeit -n 20 -r 20 -s 'from importlib.metadata import entry_points' 'eps = entry_points(); eps["flake8.extension"]; eps["flake8.report"]'
 
20 loops, best of 20: 12.6 msec per loop
$ venv310/bin/python -m timeit -n 20 -r 20 -s 'from importlib.metadata import entry_points' 'eps = entry_points(); eps.select(group="flake8.extension"); eps.select(group="flake8.report")'
20 loops, best of 20: 38.2 msec per loop
```

also slower, plus roughly an additional ms for each call to `.select(...)`


and it only gets worse with more and more packages installed

here's the versions I'm using to ensure they are up to date:

```console
$ venv39/bin/python --version --version
Python 3.9.5 (default, May 19 2021, 11:32:47) 
[GCC 9.3.0]
$ venv310/bin/python --version --version
Python 3.10.0b2 (default, Jun  2 2021, 00:22:18) [GCC 9.3.0]
```
msg395360 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-06-08 20:29
As mentioned in msg394775, I'd like to decouple the performance concerns from the original incompatibility. I recognize that performance regressions are in their own way a form of incompatibility, but there have been a lot of changes to entry points with respect to performance, both prior to beta 1 and in beta 2, including changes that intentionally traded performance for correctness (https://github.com/python/importlib_metadata/pull/281).

To that end, I've filed https://github.com/python/importlib_metadata/issues/324 to track the concerns.
msg395361 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2021-06-08 20:33
they are directly coupled which is why I commented here

the api redesign forces O(N) lookups and O(N) constructions which directly impact performance causing the regression
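
A toy model of that difference, assuming a flat sequence of entry points that is filtered on each lookup versus a dict that is grouped once. The `EntryPoint` namedtuple and the `select` function here are stand-ins written for this sketch, not the importlib.metadata implementation.

```python
from collections import namedtuple

# Stand-in record type for illustration only.
EntryPoint = namedtuple("EntryPoint", "name value group")

entries = [
    EntryPoint(f"plugin{i}", f"pkg{i}:main", f"group{i % 50}")
    for i in range(5000)
]


def select(eps, **params):
    # Linear scan: each lookup visits every entry and builds a new list.
    return [
        ep for ep in eps
        if all(getattr(ep, key) == value for key, value in params.items())
    ]


# Grouping once up front makes each later lookup a single dict access.
by_group = {}
for ep in entries:
    by_group.setdefault(ep.group, []).append(ep)

scanned = select(entries, group="group7")
indexed = by_group["group7"]
print(len(scanned), scanned == indexed)
```

Both lookups return the same entries; the scan repeats its O(N) work on every call, while the grouped dict pays that cost once.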
msg395495 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-06-10 00:13
In [this Dockerfile](https://github.com/jaraco/jaraco.windows/blob/feature/vs-2/Dockerfile), I've attempted to install Visual Studio, but without success. Docker fails to build on [line 6](https://github.com/jaraco/jaraco.windows/blob/4de80c897cb92362bd1084a9851f3f08b6ed697e/Dockerfile#L6). The install fails with this error: 

```
ERROR: Running ["C:\Users\ContainerAdministrator\AppData\Local\Temp\chocolatey\visualstudio2019buildtools\16.10.0.0\vs_BuildTools.exe" --quiet --add Microsoft.VisualStudio.Workload.ManagedDesktopBuildTools --add Microsoft.VisualStudio.Workload.NetCoreBuildTools --norestart --wait] was not successful. Exit code was '-2147024770'. See log for possible error messages.
```

Any tips on creating a Windows environment with build support would be appreciated.
msg395500 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-06-10 02:10
I managed to put together a Dockerfile that seemingly has the build tools installed (https://github.com/jaraco/jaraco.windows/blob/d2edad2e2af9d469189d7ac6a14a4ba6f6270348/Dockerfile).

When I attempt to build CPython, however, it fails with this error:

```
PS C:\code\public\cpython> PCBuild\build.bat
Using py -3.9 (found 3.9 with py.exe)
Fetching external libraries...
bzip2-1.0.6 already exists, skipping.
sqlite-3.35.5.0 already exists, skipping.
xz-5.2.2 already exists, skipping.
zlib-1.2.11 already exists, skipping.
Fetching external binaries...
libffi already exists, skipping.
openssl-bin-1.1.1k-1 already exists, skipping.
tcltk-8.6.11.0 already exists, skipping.
Finished.
Using "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\MSBuild\Current\Bin\MSBuild.exe"  (found in the PATH)
Using py -3.9 (found 3.9 with py.exe)

C:\code\public\cpython>"C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\MSBuild\Current\Bin\MSBuild.exe"  "C:\code\public\cpython\PCbuild\pcbuild.proj" /t:Build /m /nologo /v:m /clp:summary /p:Configuration=Release /p:Platform=x64 /p:IncludeExternals=true /p:IncludeCTypes=true /p:IncludeSSL=true /p:IncludeTkinter=true /p:UseTestMarker= /p:GIT="C:\Program Files\Git\cmd\git.exe"
C:\code\public\cpython\PCbuild\python.props(111,31): error MSB4184: The expression "[System.Version]::Parse('')" cannot be evaluated. Version string portion was too short or too long. [C:\code\public\cpython\PCbuild\pythoncore.vcxproj]

Build FAILED.

C:\code\public\cpython\PCbuild\python.props(111,31): error MSB4184: The expression "[System.Version]::Parse('')" cannot be evaluated. Version string portion was too short or too long. [C:\code\public\cpython\PCbuild\pythoncore.vcxproj]
    0 Warning(s)
    1 Error(s)

Time Elapsed 00:00:00.15
```
msg395531 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-06-10 12:38
The build error is tracked in issue43298.
History
Date User Action Args
2021-06-10 12:38:35 jaraco set messages: + msg395531
2021-06-10 02:10:08 jaraco set messages: + msg395500
2021-06-10 00:13:54 jaraco set messages: + msg395495
2021-06-08 20:33:34 Anthony Sottile set status: closed -> open
    resolution: fixed ->
    messages: + msg395361
2021-06-08 20:29:54 jaraco set status: open -> closed
    resolution: fixed
    messages: + msg395360
2021-06-08 14:17:33 Anthony Sottile set status: closed -> open
    resolution: fixed -> (no value)
    messages: + msg395333
2021-06-08 00:58:58 jaraco set messages: + msg395301
2021-06-07 20:17:54 steve.dower set messages: + msg395282
2021-06-06 15:04:57 jaraco set nosy: + steve.dower
    messages: + msg395207
2021-06-06 10:38:52 kj set nosy: + kj
    messages: + msg395202
2021-05-31 22:29:27 jaraco set messages: + msg394830
2021-05-31 16:42:02 miss-islington set messages: + msg394816
2021-05-31 16:29:44 miss-islington set pull_requests: + pull_request25068
2021-05-31 16:29:34 jaraco set messages: + msg394815
2021-05-31 16:28:51 jaraco set pull_requests: + pull_request25067
2021-05-31 16:23:46 jaraco set status: open -> closed
    versions: + Python 3.11
    messages: + msg394814
    resolution: fixed
    stage: patch review -> resolved
2021-05-31 16:19:49 miss-islington set pull_requests: + pull_request25066
2021-05-31 16:19:49 jaraco set messages: + msg394813
2021-05-31 16:12:36 miss-islington set messages: + msg394812
2021-05-31 15:52:41 miss-islington set pull_requests: + pull_request25064
2021-05-31 15:52:37 jaraco set messages: + msg394811
2021-05-31 15:36:54 jaraco set pull_requests: + pull_request25063
2021-05-31 15:29:41 jaraco set pull_requests: + pull_request25062
2021-05-31 15:03:54 Anthony Sottile set messages: + msg394807
2021-05-31 14:59:34 jaraco set messages: + msg394806
2021-05-30 19:11:43 Anthony Sottile set messages: + msg394778
2021-05-30 19:11:04 Anthony Sottile set messages: + msg394777
2021-05-30 17:36:55 jaraco set messages: + msg394775
2021-05-30 17:12:51 Anthony Sottile set messages: + msg394774
2021-05-30 17:04:11 Anthony Sottile set messages: + msg394773
2021-05-30 16:22:42 Anthony Sottile set messages: + msg394770
2021-05-30 16:17:31 Anthony Sottile set messages: + msg394769
2021-05-30 16:11:09 jaraco set messages: + msg394767
2021-05-30 01:43:05 Zac Hatfield-Dodds set nosy: + Zac Hatfield-Dodds
    messages: + msg394749
2021-05-29 09:51:21 jaraco set messages: + msg394728
2021-05-29 09:29:12 jaraco set nosy: - gaborjbernat
    messages: + msg394727
2021-05-29 09:17:41 gaborjbernat set nosy: + gaborjbernat
    messages: + msg394726
2021-05-28 13:39:39 domdfcoding set nosy: + domdfcoding
2021-05-28 01:27:35 Anthony Sottile set messages: + msg394625
2021-05-27 23:57:11 miss-islington set messages: + msg394620
2021-05-27 23:36:21 jaraco set messages: + msg394616
2021-05-27 23:36:15 miss-islington set nosy: + miss-islington
    pull_requests: + pull_request25009
2021-05-27 19:00:33 jaraco set messages: + msg394588
2021-05-27 16:16:49 jaraco set keywords: + patch
    stage: patch review
    pull_requests: + pull_request25001
2021-05-27 16:06:22 jaraco set messages: + msg394560
2021-05-27 16:03:16 Anthony Sottile set messages: + msg394559
2021-05-27 15:54:56 jaraco set messages: + msg394557
2021-05-27 15:49:15 Anthony Sottile set messages: + msg394553
2021-05-27 15:48:13 jaraco set nosy: - xtreak
    messages: + msg394552
2021-05-27 15:37:28 xtreak set nosy: + xtreak
    messages: + msg394550
2021-05-27 15:33:15 xtreak set nosy: + jaraco
2021-05-27 15:30:11 Anthony Sottile create