Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recent importlib change breaks most recent certifi == 2020.4.5.2 #85096

Closed
ned-deily opened this issue Jun 9, 2020 · 13 comments
Closed

Recent importlib change breaks most recent certifi == 2020.4.5.2 #85096

ned-deily opened this issue Jun 9, 2020 · 13 comments
Assignees
Labels
3.9 only security fixes 3.10 only security fixes release-blocker stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@ned-deily
Copy link
Member

BPO 40924
Nosy @jaraco, @ned-deily, @ambv, @zooba, @dstufft
PRs
  • [3.9] bpo-40924: Revert "bpo-39791 native hooks for importlib.resources.files (GH-20576)" #20760
  • [3.9] bpo-40924: Remove protocol for supplying Traversable objects from loaders #20820
  • bpo-40924: Ensure importlib.resources.path returns an extant path #20857
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/jaraco'
    closed_at = <Date 2020-06-29.21:10:55.215>
    created_at = <Date 2020-06-09.05:59:48.615>
    labels = ['type-bug', 'library', '3.9', '3.10', 'release-blocker']
    title = 'Recent importlib change breaks most recent certifi == 2020.4.5.2'
    updated_at = <Date 2020-06-29.21:10:55.214>
    user = 'https://github.com/ned-deily'

    bugs.python.org fields:

    activity = <Date 2020-06-29.21:10:55.214>
    actor = 'lukasz.langa'
    assignee = 'jaraco'
    closed = True
    closed_date = <Date 2020-06-29.21:10:55.215>
    closer = 'lukasz.langa'
    components = ['Library (Lib)']
    creation = <Date 2020-06-09.05:59:48.615>
    creator = 'ned.deily'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 40924
    keywords = ['patch']
    message_count = 13.0
    messages = ['371073', '371084', '371094', '371095', '371132', '371138', '371143', '371144', '371252', '371256', '371502', '372631', '372634']
    nosy_count = 5.0
    nosy_names = ['jaraco', 'ned.deily', 'lukasz.langa', 'steve.dower', 'dstufft']
    pr_nums = ['20760', '20820', '20857']
    priority = 'release blocker'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue40924'
    versions = ['Python 3.9', 'Python 3.10']

    @ned-deily
    Copy link
    Member Author

    The very recent latest commits for bpo-39791, "New files() api from importlib_resources", have broken the popular certifi package, a package which provides a basic set of Root Certificates for TLS secure network connection verification. Among other users of it, the python.org macOS installers encourage its users to run a provided script to install certifi. Alas, we discovered just after v3.9.2b2 was tagged that that script is broken because certifi.where() is returning a bogus path toe the pem file, what appears to be a path in a deleted temp directory.

    The culprit commits are 843c277 (master) and 9cf1be4 (3.9). This is now a critical problem because in its most recent release, certifi 2020.4.5.2, certifi was changed to try to use importlib.resources.path(), if available, to find the path to the installed cacert.pem file. (The previous release, 2020.4.5.1, appears not to use path() and so is not affected by this bug.)

    certifi/python-certifi@3fc8fec

    https://pypi.org/project/certifi/2020.4.5.2/

    Without trying to debug the bug, I was able to bisect the branch and then reduce the problem seen in macOS installer testing to a fairly simple reproducible test case. The problem was reproduced on both Linux and macOS systems.

    The test case:

    # in a current cpython git repo, checkout the commit before the failing one
    git checkout 843c277^
    git log HEAD^..HEAD
    git clean -fdxq
    ./configure --prefix=$PWD/root -q
    make -j4
    ./python -m ensurepip
    ./python -m pip install --upgrade pip # not necessary to reproduce
    ./python -m pip install --force --no-binary certifi certifi==2020.4.5.2
    ./python -c 'import certifi;print(certifi.where())'

    The output tail should be something like:
    [...]
    Successfully installed certifi-2020.4.5.2
    /home/nad/cpython/root/lib/python3.10/site-packages/certifi/cacert.pem

    Now checkout the failing commit and repeat all the other steps:

    git checkout 843c277
    git log HEAD^..HEAD
    [...]

    The output tail is now incorrect:
    [...]
    Successfully installed certifi-2020.4.5.2
    /tmp/tmpqfjnbj5bcacert.pem

    The cacert.pem is installed to the expected (same) location in either case; its just the output from importlib.resources.path that is incorrect:

    ./python
    Python 3.10.0a0 (remotes/upstream/master:0a40849eb9, Jun  9 2020, 00:35:07)
    [GCC 8.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import importlib.resources, os, os.path, certifi
    >>> certifi.__file__
    '/home/nad/cpython/root/lib/python3.10/site-packages/certifi/__init__.py'
    >>> os.listdir(os.path.dirname(certifi.__file__))
    ['__pycache__', 'cacert.pem', '__init__.py', 'core.py', '__main__.py']
    >>> with importlib.resources.path('certifi', 'cacert.pem') as f: print(f)
    ...
    /tmp/tmpxsrjxb8lcacert.pem
    >>> with importlib.resources.path('certifi', 'core.py') as f: print(f)
    ...
    /tmp/tmpjq8h3si5core.py

    No test suite failures were noted. Perhaps there should be a test case for this?

    Presumably any other downstream users of importlib.resources.path() are affected.

    Łukasz as 3.9 release manager is aware there is an issue but was awaiting the tracking down of the problem before making a decision about what to do for 3.9.0b2.

    cc: Donald as the author of the certifi change.

    @ned-deily ned-deily added 3.9 only security fixes 3.10 only security fixes release-blocker labels Jun 9, 2020
    @ned-deily ned-deily added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error 3.9 only security fixes 3.10 only security fixes release-blocker labels Jun 9, 2020
    @ned-deily ned-deily added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jun 9, 2020
    @zooba
    Copy link
    Member

    zooba commented Jun 9, 2020

    That was a *big* patch to importlib.resources... And newly defined and standardised protocols? That seems like a lot to bring in without having gone through a PEP anyway, despite the minor bug that has drawn attention to it.

    I don't think backports should be treated as incubators for standard modules, getting new features that then flow directly into the stdlib. It's fine to have a fork for incubation, but people need to rely on consistency in stdlib and backports.

    All that to say, I think we should revert the importlib changes *anyway*, and also because there's something wrong with them right now.

    @jaraco
    Copy link
    Member

    jaraco commented Jun 9, 2020

    I understand the issue here and can supply more details soon (no later than this weekend).

    @ambv
    Copy link
    Contributor

    ambv commented Jun 9, 2020

    In this case we need to revert the 3.9 backport to release a hotfix tonight. Please respect the beta feature freeze.

    @ambv
    Copy link
    Contributor

    ambv commented Jun 9, 2020

    New changeset ce5e6f0 by Łukasz Langa in branch '3.9':
    [3.9] bpo-40924: Revert "bpo-39791 native hooks for importlib.resources.files (GH-20576)" (bpo-20760)
    ce5e6f0

    @jaraco
    Copy link
    Member

    jaraco commented Jun 9, 2020

    This issue stems from improper reliance on implementation details of importlib.resources.path, which returns a context manager that is designed to clean up the file after the context closes. certifi would have encountered the same problem on older Pythons if certifi had been installed as a zip egg or other non-filesystem-based package.

    That said, it is also undesirable for path no longer to return a reference to an existing path when one exists. That behavior has come about as the importlib.resources API moves from the legacy implementation to the new one based on TraversableResources (files function).

    I encountered a similar issue during the original submission, which I addressed by removing the same assumption from another library (importlib.metadata).

    I believe the best fix here is to restore that assumption while retaining other important changes in this patch. Should this assumed behavior also be tested (guaranteed)? That I'm less sure about.

    Please respect the beta feature freeze.

    I do respect the beta feature freeze. The relevant feature was added prior to b1. The reverted change is an incremental fix addressing underlying implementation details such as how resources are resolved and removing duplicate code paths.

    More importantly, the change also addresses a key interface problem that I identified in b1 - that the previously advertised interface of loaders supplying files() methods is inadequate.

    I've put a lot of effort into pulling this all together for 3.9 (three full days just this past weekend and hundreds of hours leading up to that). It would be a real shame for it to be released in a broken state due to a minor (though admittedly impactful) hiccup.

    I suspect there's a small change that to the submitted patch that will restore the prior expectation and leave the codebase in a healthier state.

    I'll prepare that soon.

    @ambv
    Copy link
    Contributor

    ambv commented Jun 9, 2020

    I do respect the beta feature freeze. The relevant feature was added prior to b1. The reverted change is an incremental fix addressing underlying implementation details such as how resources are resolved and removing duplicate code paths.

    Using the revert hammer is never an easy call and I've only done that a handful times so far. In this particular case the following helped me make the decision:

    • the change was over 2,000 lines big;
    • as you say, a non-trivial amount of it was refactoring;
    • the change was committed without review;
    • the backport was done three weeks after the beta freeze without consulting the RM or anybody else in that matter.

    Sure, it's worth fixing the problems you identified after Beta 1, maybe even bring the fixes back to 3.9, but not how this was done this time. We'd like to keep our release cadence stable and avoid hotfixes in the future.

    We are all grateful for your work, Jason, and we're happy to have somebody around who is invested in making Python better. To reiterate, I'm less worried about the introduced bug itself, and more about the change management attitude. Yes, it would indeed be a shame if your feature had to wait for another Python release. But risking widespread breakage isn't a good trade off.

    It sucks for me to be this boring beaurocrat, believe me, but the purest solution to a deep design issue identified after Beta 1 in a new feature is to... revert that feature altogether and go back to the drawing board for the next release. It's not a theoretical situation. I went through this in the Python 3.8 cycle (see BPO-38242). It sucked but it was the right thing to do.

    @jaraco
    Copy link
    Member

    jaraco commented Jun 9, 2020

    I feel terrible and really regret that this was able to break things so badly.

    What options are available at this point? I'd at the very least like to remove the loader.files() behavior to avoid encouraging other loaders to implement it. Should we also revert the resources.files() feature altogether, as you suggested?

    @ambv
    Copy link
    Contributor

    ambv commented Jun 10, 2020

    I thought about this all day. Given the scope of the change you made I think no option looks ideal. If we revert the entire feature, it's a big bummer for everybody involved and it might be an overreaction. If we keep the feature at the state of Beta 1, it might become a maintenance nightmare for third-party users. If we agree to a the big change you now see necessary, we are reintroducing the same risk that I highlighted in previous comments.

    Please note that this is in no way intended to make you feel terrible. Don't beat yourself up about it. Now, think as the core developer you are: if you are 100% sure the change is worth pushing to 3.9, the best course of action at this point seems to be: find a champion within core who will work with you to quickly reassess the situation and give us an action plan as a team.

    Say, Barry seems like the perfect candidate. If such champion shares your opinion about it being important for 3.9, and is willing to work with you on finishing what you need changing (incl. review), *and* to ensure the existing users of importlib.resources don't see any breakage not documented as of Beta 1, then we can talk about an exceptional late inclusion in 3.9.

    @jaraco
    Copy link
    Member

    jaraco commented Jun 11, 2020

    Thanks for the thorough and considerate response.

    I do think your original recommendation of reverting the broken feature is the best approach at this point. That is, keep resources.files with the fallback shim and eliminate support for loaders supplying that behavior. That will avoid users relying on that protocol but enable the files feature for Source and Zip importers.

    That will buy time for the remaining functionality, mainly the provider interface, to mature and possibly evolve further, for eventual adoption in a future Python release.

    Some planning will need to be dialed back, but I don’t have confidence in the implementation to say that it’s the best one. Better to defer that effort.

    I’ll put together a patch for 3.9 to remove the loader support (backward incompatible with b1/2 but compatible with 3.8 but only for custom loaders), and put together a hot fix for master so it’s no longer broken.

    @ambv
    Copy link
    Contributor

    ambv commented Jun 14, 2020

    New changeset 8a34690 by Jason R. Coombs in branch '3.9':
    [3.9] bpo-40924: Remove protocol for supplying Traversable objects from loaders (GH-20820)
    8a34690

    @ambv
    Copy link
    Contributor

    ambv commented Jun 29, 2020

    New changeset 2fb5f03 by Jason R. Coombs in branch 'master':
    bpo-40924: Ensure importlib.resources.path returns an extant path (GH-20857)
    2fb5f03

    @ambv
    Copy link
    Contributor

    ambv commented Jun 29, 2020

    This can now be closed. Thank you for all hard work here.

    @ambv ambv closed this as completed Jun 29, 2020
    @ambv ambv closed this as completed Jun 29, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes 3.10 only security fixes release-blocker stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants