Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

importlib.metadata documentation deficiencies #82775

Closed
indygreg mannequin opened this issue Oct 26, 2019 · 7 comments
Closed

importlib.metadata documentation deficiencies #82775

indygreg mannequin opened this issue Oct 26, 2019 · 7 comments
Assignees
Labels
3.8 only security fixes 3.9 only security fixes docs Documentation in the Doc dir type-feature A feature request or enhancement

Comments

@indygreg
Copy link
Mannequin

indygreg mannequin commented Oct 26, 2019

BPO 38594
Nosy @warsaw, @jaraco, @indygreg
PRs
  • bpo-39022, bpo-38594: Sync with importlib_metadata 1.3 #17568
  • [3.8] bpo-39022, bpo-38594: Sync with importlib_metadata 1.3 (GH-17568) #17569
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/jaraco'
    closed_at = <Date 2019-12-11.02:45:25.863>
    created_at = <Date 2019-10-26.03:24:54.750>
    labels = ['type-feature', '3.8', '3.9', 'docs']
    title = 'importlib.metadata documentation deficiencies'
    updated_at = <Date 2019-12-11.02:45:25.855>
    user = 'https://github.com/indygreg'

    bugs.python.org fields:

    activity = <Date 2019-12-11.02:45:25.855>
    actor = 'jaraco'
    assignee = 'jaraco'
    closed = True
    closed_date = <Date 2019-12-11.02:45:25.863>
    closer = 'jaraco'
    components = ['Documentation']
    creation = <Date 2019-10-26.03:24:54.750>
    creator = 'indygreg'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 38594
    keywords = ['patch']
    message_count = 7.0
    messages = ['355402', '358038', '358041', '358234', '358237', '358241', '358243']
    nosy_count = 4.0
    nosy_names = ['barry', 'jaraco', 'docs@python', 'indygreg']
    pr_nums = ['17568', '17569']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue38594'
    versions = ['Python 3.8', 'Python 3.9']

    @indygreg
    Copy link
    Mannequin Author

    indygreg mannequin commented Oct 26, 2019

    As I was attempting to implement the find_distributions() interface for PyOxidizer, I got confused by importlib.metadata's documentation.

    The documentation for this module states:

    What this means in practice is that to support finding distribution package
    metadata in locations other than the file system, you should derive from
    ``Distribution`` and implement the ``load_metadata()`` method. Then from
    your finder, return instances of this derived ``Distribution`` in the
    ``find_distributions()`` method.
    

    The reference to load_metadata() is the only occurrence of the string load_metadata in the CPython and importlib_metadata code bases. I therefore believe the documentation in both CPython and the importlib_metadata standalone package are wrong because they are referring to a method that is never implemented nor called.

    Looking at the documentation and source code for importlib.metadata, I'm also a bit confused about how exactly I'm supposed to implement a custom Distribution which isn't based on filesystems. For example, I see that certain APIs return Path-like objects (which I will need to implement). But it isn't clear exactly which attributes are mandated to exist! Am I expected to implement the full pathlib.Path interface or just a subset?

    Regarding how find_distributions() is called, I also don't understand why the Context is optional and how Context could be used in some situations. For example, the implementation of discover() can construct Context instances with no arguments, which is then fed into find_distributions(). So I guess context=None or context.name=None implies "return Distribution's for every known package?" If so, this behavior is undocumented.

    I'm also not sure what Context.path is for. I /think/ it is only used for the path-based finder/distribution. But the way it is documented implies it should always exist, which doesn't seem appropriate for cases like PyOxidizer which will retrieve metadata from in-memory without filesystem I/O.

    I think what I'm trying to say is that the existing documentation for importlib.metadata is not sufficient to robustly implement a custom find_distributions() + Distribution type. I would kindly request that a domain expert revise the documentation such that a 3rd party can implement a custom solution. My preferred solution would be for there to be formal interfaces in importlib.abc like there are for everything else in the importlib realm. (The interfaces for finders and loaders are super useful when implementing a finder/loader from scratch.)

    FWIW I think I like the new metadata API and I think it is flexible enough to allow tools like PyOxidizer to do crazy things like divorce resources from the filesystem! But it is hard to say for sure since the interfaces aren't clearly defined at present.

    @indygreg indygreg mannequin assigned docspython Oct 26, 2019
    @indygreg indygreg mannequin added 3.8 only security fixes 3.9 only security fixes docs Documentation in the Doc dir type-feature A feature request or enhancement labels Oct 26, 2019
    @jaraco jaraco assigned jaraco and unassigned docspython Dec 6, 2019
    @jaraco
    Copy link
    Member

    jaraco commented Dec 8, 2019

    Good suggestions. Thanks for taking the time to articulate in such a friendly way the shortcomings you encountered. I'm happy to help.

    In this ticket, I've mirrored this ticket in the backport project, where I can iterate much faster.

    I'll provide brief answers to some of your questions/concerns here and then work out the wording for the documentation (and code changes) necessary to communicate that effectively.

    The reference to load_metadata() is the only occurrence of the string load_metadata in the CPython and importlib_metadata code bases. I therefore believe the documentation in both CPython and the importlib_metadata standalone package are wrong because they are referring to a method that is never implemented nor called.

    That's right. The documentation is wrong. It should say to implement a Distribution subclass (especially its abstract methods). Nothing more should be necessary.

    I see that certain APIs return Path-like objects (which I will need to implement)

    Are you sure about that? The only code I see in the Distribution class that references a Path object is .at(), a static method that we decided to add to the Distribution class even though it would be more appropriate in the PathDistribution class in order to make that method readily available to projects that wished to construct the Path-based distribution objects from a file system (or zipfile) path. I'm pretty sure everything else in the Distribution class relies on the two abstract methods. If you disregard at (and I recommend you do) and focus on implementing the abstract methods, I think things will work. Let me know if you find otherwise.

    why the Context is optional and how Context could be used?

    The interface is intentionally vague in order not to be too prescriptive, because as you point out, name or path may not be relevant in some contexts. It's meant to narrow the scope of any search.

    So if a path is present, that means the query is looking in a specific 'sys.path' entry. And if the name is present, that means it's looking for a distribution having a specific name. But basically, you can solicit any properties you like. You could expect a size='max(100)' parameter and only return distributions smaller than 100 (for whatever interpretation of size you wish to implement. Your DistributionFinder should do its best to honor whatever context might be relevant to the Distributions you provide.

    Does PyOxidizer interact with sys.path at all? If not, it can disregard Context.path.

    @jaraco
    Copy link
    Member

    jaraco commented Dec 8, 2019

    Please have a look at https://gitlab.com/python-devs/importlib_metadata/merge_requests/104/diffs, which attempts to clarify the documentation to indicate how one would implement a custom finder. If you have a prototype implementation, I'd be happy to have a look.

    The use-case you present is exactly the type of use-case this project wishes to enable, so I'm grateful that you're working on it and I'd like to do what I can to support the effort.

    @jaraco
    Copy link
    Member

    jaraco commented Dec 10, 2019

    I've merged the recommended changes into importlib_metadata 1.3 and I'm including those changes in bpo-39022.

    @jaraco
    Copy link
    Member

    jaraco commented Dec 11, 2019

    New changeset b7a0109 by Jason R. Coombs in branch 'master':
    bpo-39022, bpo-38594: Sync with importlib_metadata 1.3 (GH-17568)
    b7a0109

    @jaraco
    Copy link
    Member

    jaraco commented Dec 11, 2019

    New changeset b738237 by Jason R. Coombs (Miss Islington (bot)) in branch '3.8':
    bpo-39022, bpo-38594: Sync with importlib_metadata 1.3 (GH-17568) (GH-17569)
    b738237

    @jaraco
    Copy link
    Member

    jaraco commented Dec 11, 2019

    I'm hoping those documentation edits address the deficiencies, but if not, we can take another stab at it. Feel free to re-open as needed.

    @jaraco jaraco closed this as completed Dec 11, 2019
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes 3.9 only security fixes docs Documentation in the Doc dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant