Replace bundled pip and setuptools with a downloader in the ensurepip module #80789

Closed
webknjaz mannequin opened this issue Apr 11, 2019 · 11 comments
Labels
stdlib Python modules in the Lib dir topic-ensurepip type-feature A feature request or enhancement

Comments

webknjaz mannequin commented Apr 11, 2019

BPO 36608
Nosy @ericvsmith, @ned-deily, @serhiy-storchaka, @dstufft, @webknjaz, @pradyunsg
PRs
  • gh-80789: Implement build-time pip bundling in ensurepip #12791
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-04-11.22:13:02.045>
    labels = ['3.8', 'type-feature', 'library', '3.9']
    title = 'Replace bundled pip and setuptools with a downloader in the ensurepip module'
    updated_at = <Date 2021-07-05.23:28:08.628>
    user = 'https://github.com/webknjaz'

    bugs.python.org fields:

    activity = <Date 2021-07-05.23:28:08.628>
    actor = 'ned.deily'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2019-04-11.22:13:02.045>
    creator = 'webknjaz'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 36608
    keywords = ['patch']
    message_count = 6.0
    messages = ['339998', '340039', '340060', '340169', '340218', '347836']
    nosy_count = 6.0
    nosy_names = ['eric.smith', 'ned.deily', 'serhiy.storchaka', 'dstufft', 'webknjaz', 'pradyunsg']
    pr_nums = ['12791']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue36608'
    versions = ['Python 3.8', 'Python 3.9']

    webknjaz mannequin commented Apr 11, 2019

    Hi,

    I've noticed that there's an idea to not pollute Git tree with vendored blobs. In particular, ensurepip is one of the components doing this.

    Such a wish was expressed here: https://bugs.python.org/issue35277#msg330098

    So I thought I'd take a stab at it...

    webknjaz mannequin added the 3.8 (only security fixes), 3.9 (only security fixes), stdlib (Python modules in the Lib dir), and type-feature (A feature request or enhancement) labels on Apr 11, 2019
    @ericvsmith
    Member

    ensurepip does not access the network, by design. We do not want it to start accessing the network without a lot of discussion.

    And if it does access the network, it will need to be able to use alternate URLs. For example: where I deploy Python, it would not have access to the URLs in your PR, but instead would need to specify a different (internal) location. This is the same reason that pip install has --find-links, --no-index, --extra-index-url, etc.

    I think this would need a lot of discussion (probably on distutils-sig), and probably a PEP.
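The alternate-location requirement raised above can be illustrated with a minimal sketch. It assumes a hypothetical environment variable, `ENSUREPIP_BUNDLE_BASE_URL`, and a hypothetical helper, `resolve_wheel_url`; neither exists in ensurepip or in the PR, and the default base URL is only a placeholder for "wherever the wheels normally live".

```python
import os
from urllib.parse import urljoin

# Placeholder default; an assumption for illustration only.
DEFAULT_BASE_URL = "https://files.pythonhosted.org/"


def resolve_wheel_url(relative_path, environ=None):
    """Build the download URL for a wheel, honouring an optional override.

    `ENSUREPIP_BUNDLE_BASE_URL` is a hypothetical variable that lets a
    deployment point the downloader at an internal mirror.
    """
    environ = os.environ if environ is None else environ
    base = environ.get("ENSUREPIP_BUNDLE_BASE_URL", DEFAULT_BASE_URL)
    # urljoin drops the last path segment of a base without a trailing slash.
    if not base.endswith("/"):
        base += "/"
    return urljoin(base, relative_path.lstrip("/"))
```

A deployment without access to the public index would set the variable to its internal mirror, much as `--index-url` or `--find-links` redirect `pip install`.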

    @ericvsmith
    Member

    And I don't mean to sound like a total downer. I just think it's important that we recognize all of the use cases.

    Thanks for your work on this.

    @pradyunsg
    Member

    (Not sure how Roundup handles email replies, but I'm hoping this goes to
    the right place)

    I think it would be better if the downloading were invoked during the
    interpreter build process, to download the wheels and add them to the
    final distribution. This lets us remove the wheels from the source
    tree/version control and avoids having to amend the PEP.

    Functionally, I imagine putting all the download logic in some sort of
    ensurepip._bootstrap that gets invoked in the build process.
    python -m ensurepip not hitting the internet is a good invariant to keep.

    @serhiy-storchaka
    Member

    I proposed moving the bundled pip and setuptools to an external repository and downloading them at build time, as is done for Tcl and other dependencies on Windows.

    webknjaz mannequin commented Jul 13, 2019

    Thanks for the feedback!

    I've changed it a bit to have a separate command for downloading bundles to the source tree. It would be invoked as python -m ensurepip.bundle (it probably needs a better name and CLI args).

    Does it sound better now?

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    webknjaz added a commit to webknjaz/cpython that referenced this issue Aug 26, 2023
    Prior to this patch, Pip wheels were stored in the Git repository of
    CPython. Git is optimized for text but these artifacts are binary. So
    the unpleasant side effect of doing this is that the bare Git
    repository grows by the size of the zip archive every time one
    is added, removed or modified. It's time to put a stop to this.
    
    The patch implements an `ensurepip.bundle` module that is meant to be
    called through `runpy` to download the Pip wheel and place it into the
    same location as before. It removes the wheel file from the Git
    repository and prevents re-adding it by defining a new `.gitignore`
    configuration file.
    
    The idea is that the builders of CPython are supposed to invoke the
    following command during the build time:
    
    ```console
    $ python -m ensurepip.bundle
    ```
    
    This command will verify the existing wheel's SHA-256 hash and, if the
    wheel is missing or the hash does not match, it will download the
    artifact from PyPI. It will confirm the download's SHA-256 hash before
    placing it into the `Lib/ensurepip/_bundled/` directory.
    
    Every single line added or modified as a part of this change is also
    covered with tests. Every new module has 100% coverage. The only
    uncovered lines under `Lib/ensurepip/` are the ones that are
    absolutely unrelated to this effort.
    
    Resolves python#80789.
    
    Ref: https://bugs.python.org/issue36608.
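The verify-then-download behaviour described in the commit message above can be sketched roughly as follows. This is not the code from the PR: `sha256_of`, `ensure_wheel`, and the `fetch` hook are hypothetical names introduced for illustration, and the URL and expected hash are assumed to be supplied by the caller.

```python
import hashlib
import urllib.request
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ensure_wheel(target: Path, url: str, expected_sha256: str, fetch=None) -> bool:
    """Make sure `target` holds a wheel with the expected hash.

    Returns True if a download happened, False if the existing file was
    reused. `fetch` is a hypothetical hook (URL -> bytes) so the sketch
    can be exercised without network access.
    """
    # Reuse the existing wheel when it is present and its hash matches.
    if target.is_file() and sha256_of(target) == expected_sha256:
        return False

    if fetch is None:
        def fetch(u):
            with urllib.request.urlopen(u) as resp:
                return resp.read()

    payload = fetch(url)
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"SHA-256 mismatch for {url}: got {actual}")

    # Write the file only after the downloaded bytes have been verified.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    return True
```

Checking the hash of the downloaded bytes before writing means a corrupted or tampered download never lands in the bundle directory, and a valid existing wheel makes the command a no-op that needs no network access.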
    @webknjaz
    Contributor

    Folks, I revamped the PR and need some help with the build process integration here: https://github.com/python/cpython/pull/12791/files#r1306308587. Could anybody take a look?

    webknjaz added a commit to webknjaz/cpython that referenced this issue Aug 26, 2023
    Prior to this patch, Pip wheels were stored in the Git repository of
    CPython. Git is optimized for text but these artifacts are binary. So
    the unpleasant side effect of doing this is that the bare Git
    repository size is being increased by the zip archive side every time
    it is added, removed or modified. It's time to put a stop to this.
    
    The patch implements an `ensurepip.bundle` module that is meant to be
    called through `runpy` to download the Pip wheel and place it into the
    same location as before. It removes the wheel file from the Git
    repository and prevents re-adding it by defining a new `.gitignore`
    configuration file.
    
    The idea is that the builders of CPython are supposed to invoke the
    following command during the build time:
    
    ```console
    $ python -m ensurepip.bundle
    ```
    
    This command will verify the existing wheel's SHA-256 hash and, if it
    does not match, or doesn't exist, it will proceed to download the
    artifact from PyPI. It will confirm its SHA-256 hash before placing it
    into the `Lib/ensurepip/_bundled/` directory.
    
    Every single line added or modified as a part of this change is also
    covered with tests. Every new module has 100% coverage. The only
    uncovered lines under `Lib/ensurepip/` are the ones that are
    absolutely unrelated to this effort.
    
    Resolves python#80789.
    
    Ref: https://bugs.python.org/issue36608.
    webknjaz added a commit to webknjaz/cpython that referenced this issue Aug 26, 2023
    Prior to this patch, Pip wheels were stored in the Git repository of
    CPython. Git is optimized for text but these artifacts are binary. So
    the unpleasant side effect of doing this is that the bare Git
    repository size is being increased by the zip archive side every time
    it is added, removed or modified. It's time to put a stop to this.
    
    The patch implements an `ensurepip.bundle` module that is meant to be
    called through `runpy` to download the Pip wheel and place it into the
    same location as before. It removes the wheel file from the Git
    repository and prevents re-adding it by defining a new `.gitignore`
    configuration file.
    
    The idea is that the builders of CPython are supposed to invoke the
    following command during the build time:
    
    ```console
    $ python -m ensurepip.bundle
    ```
    
    This command will verify the existing wheel's SHA-256 hash and, if it
    does not match, or doesn't exist, it will proceed to download the
    artifact from PyPI. It will confirm its SHA-256 hash before placing it
    into the `Lib/ensurepip/_bundled/` directory.
    
    Every single line added or modified as a part of this change is also
    covered with tests. Every new module has 100% coverage. The only
    uncovered lines under `Lib/ensurepip/` are the ones that are
    absolutely unrelated to this effort.
    
    Resolves python#80789.
    
    Ref: https://bugs.python.org/issue36608.
    webknjaz added a commit to webknjaz/cpython that referenced this issue Aug 26, 2023
    Prior to this patch, Pip wheels were stored in the Git repository of
    CPython. Git is optimized for text but these artifacts are binary. So
    the unpleasant side effect of doing this is that the bare Git
    repository size is being increased by the zip archive side every time
    it is added, removed or modified. It's time to put a stop to this.
    
    The patch implements an `ensurepip.bundle` module that is meant to be
    called through `runpy` to download the Pip wheel and place it into the
    same location as before. It removes the wheel file from the Git
    repository and prevents re-adding it by defining a new `.gitignore`
    configuration file.
    
    The idea is that the builders of CPython are supposed to invoke the
    following command during the build time:
    
    ```console
    $ python -m ensurepip.bundle
    ```
    
    This command will verify the existing wheel's SHA-256 hash and, if it
    does not match, or doesn't exist, it will proceed to download the
    artifact from PyPI. It will confirm its SHA-256 hash before placing it
    into the `Lib/ensurepip/_bundled/` directory.
    
    Every single line added or modified as a part of this change is also
    covered with tests. Every new module has 100% coverage. The only
    uncovered lines under `Lib/ensurepip/` are the ones that are
    absolutely unrelated to this effort.
    
    Resolves python#80789.
    
    Ref: https://bugs.python.org/issue36608.
    webknjaz added a commit to webknjaz/cpython that referenced this issue Aug 26, 2023
    Prior to this patch, Pip wheels were stored in the Git repository of
    CPython. Git is optimized for text but these artifacts are binary. So
    the unpleasant side effect of doing this is that the bare Git
    repository size is being increased by the zip archive side every time
    it is added, removed or modified. It's time to put a stop to this.
    
    The patch implements an `ensurepip.bundle` module that is meant to be
    called through `runpy` to download the Pip wheel and place it into the
    same location as before. It removes the wheel file from the Git
    repository and prevents re-adding it by defining a new `.gitignore`
    configuration file.
    
    The idea is that the builders of CPython are supposed to invoke the
    following command during the build time:
    
    ```console
    $ python -m ensurepip.bundle
    ```
    
    This command will verify the existing wheel's SHA-256 hash and, if it
    does not match, or doesn't exist, it will proceed to download the
    artifact from PyPI. It will confirm its SHA-256 hash before placing it
    into the `Lib/ensurepip/_bundled/` directory.
    
    Every single line added or modified as a part of this change is also
    covered with tests. Every new module has 100% coverage. The only
    uncovered lines under `Lib/ensurepip/` are the ones that are
    absolutely unrelated to this effort.
    
    Resolves python#80789.
    
    Ref: https://bugs.python.org/issue36608.
    webknjaz added a commit to webknjaz/cpython that referenced this issue Aug 26, 2023
    Prior to this patch, Pip wheels were stored in the Git repository of
    CPython. Git is optimized for text but these artifacts are binary. So
    the unpleasant side effect of doing this is that the bare Git
    repository size is being increased by the zip archive side every time
    it is added, removed or modified. It's time to put a stop to this.
    
    The patch implements an `ensurepip.bundle` module that is meant to be
    called through `runpy` to download the Pip wheel and place it into the
    same location as before. It removes the wheel file from the Git
    repository and prevents re-adding it by defining a new `.gitignore`
    configuration file.
    
    The idea is that the builders of CPython are supposed to invoke the
    following command during the build time:
    
    ```console
    $ python -m ensurepip.bundle
    ```
    
    This command will verify the existing wheel's SHA-256 hash and, if it
    does not match, or doesn't exist, it will proceed to download the
    artifact from PyPI. It will confirm its SHA-256 hash before placing it
    into the `Lib/ensurepip/_bundled/` directory.
    
    Every single line added or modified as a part of this change is also
    covered with tests. Every new module has 100% coverage. The only
    uncovered lines under `Lib/ensurepip/` are the ones that are
    absolutely unrelated to this effort.
    
    Resolves python#80789.
    
    Ref: https://bugs.python.org/issue36608.
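
The verify-then-download behaviour described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the helper names `sha256_of` and `ensure_wheel`, the `fetch` parameter, and the way the URL and expected hash are passed in are all assumptions made for the example.

```python
import hashlib
import urllib.request
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as stream:
        for chunk in iter(lambda: stream.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ensure_wheel(target: Path, url: str, expected_sha256: str, fetch=None) -> bool:
    """Make sure `target` exists and matches `expected_sha256`.

    Returns True if a download happened, False if the existing file was
    reused. `fetch` is a hypothetical hook (URL -> bytes) so the network
    call can be replaced in tests.
    """
    # Step 1: an existing file with the right hash needs no download.
    if target.is_file() and sha256_of(target) == expected_sha256:
        return False

    if fetch is None:
        def fetch(source_url: str) -> bytes:
            with urllib.request.urlopen(source_url) as response:
                return response.read()

    # Step 2: download, then confirm the hash *before* writing to disk.
    payload = fetch(url)
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"SHA-256 mismatch for {url}: got {actual}")

    # Step 3: place the verified artifact into the bundle directory.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    return True
```

Checking the hash of the downloaded bytes before writing them means a corrupted or tampered download never lands in the bundle directory.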
    webknjaz added a commit to webknjaz/cpython that referenced this issue Aug 27, 2023
    Prior to this patch, Pip wheels were stored in the Git repository of
    CPython. Git is optimized for text but these artifacts are binary. So
    the unpleasant side effect of doing this is that the bare Git
    repository size is being increased by the zip archive size every time
    it is added, removed or modified. It's time to put a stop to this.
    
    The patch implements an `ensurepip.bundle` module that is meant to be
    called through `runpy` to download the Pip wheel and place it into the
    same location as before. It removes the wheel file from the Git
    repository and prevents re-adding it by defining a new `.gitignore`
    configuration file.
    
    The idea is that the builders of CPython are supposed to invoke the
    following command during the build time:
    
    ```console
    $ python -m ensurepip.bundle
    ```
    
    This command will verify the existing wheel's SHA-256 hash and, if it
    does not match, or doesn't exist, it will proceed to download the
    artifact from PyPI. It will confirm its SHA-256 hash before placing it
    into the `Lib/ensurepip/_bundled/` directory.
    
    Every single line added or modified as a part of this change is also
    covered with tests. Every new module has 100% coverage. The only
    uncovered lines under `Lib/ensurepip/` are the ones that are
    absolutely unrelated to this effort.
    
    Resolves python#80789.
    
    Ref: https://bugs.python.org/issue36608.
    @webknjaz
    Contributor

    @pradyunsg should this issue get newer labels for 3.10, 3.11, 3.12 and 3.13?

    @pradyunsg removed the `3.9` (only security fixes) and `3.8` (only security fixes) labels on Oct 12, 2023
    @pradyunsg
    Member

    I discussed this at some length at the core developers sprint this week (including with some of the current RMs).

    The main benefit of doing this is that we can avoid committing a ~2 MB wheel [1] on a quarterly basis for pip releases, which bloats the repository due to a limitation of git + binary blobs. Given that we no longer vendor a setuptools wheel, which evolves at a faster cadence than pip, the size growth rate as well as the rate of change aren't particularly high.

    The costs of this are (i) everyone who tries to build CPython for the first time and run tests would need to perform this download, (ii) the CPython release process needs to be amended to include a downloaded wheel as part of the release tarball, (iii) it's unclear how/when these downloaded wheels should be deleted/removed and how that interacts with the existing `make clean`/`make distclean` targets etc., and (iv) it introduces an additional point of interaction with PyPI as part of the development/release cycle for CPython.

    Based on the additional side-effects and work that this would cause, I'm gonna say that this isn't a big deal for now. We can revisit this if there is a strong(er) argument against committing the wheels into the repository to justify the additional side-effects on the development and release processes that it would have.

    Footnotes

    1. Or a few more, depending on bugfixes and whether pip's RM decides to place every wheel.

    @AA-Turner closed this as not planned on Oct 12, 2023
    @SpecLad
    Contributor

    SpecLad commented Oct 12, 2023

    Would it perhaps be viable to include pip as an (unpacked) sdist in the ensurepip source tree, and build it during the CPython build process? The repo bloat should be much smaller that way.

    (You'd also need to include pip's build dependencies, but those don't need to be updated frequently.)

    @webknjaz
    Contributor

    > Would it perhaps be viable to include pip as an (unpacked) sdist in the ensurepip source tree, and build it during the CPython build process? The repo bloat should be much smaller that way.

    @AA-Turner @pradyunsg WDYT about the suggestion above? It could also be an unpacked wheel (the output of `wheel unpack pip-*.whl`), which feels like a slimmer solution than an sdist.
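
For context on the unpacked-wheel idea: a wheel is a zip archive, so its contents can be extracted into plain files that Git can diff as text. The sketch below shows only that extraction step using the standard library; it is not a reimplementation of `wheel unpack`, and the `unpack_wheel` helper name is hypothetical.

```python
import zipfile
from pathlib import Path


def unpack_wheel(wheel_path: Path, dest_dir: Path) -> list[str]:
    """Extract every member of a wheel (a zip archive) into `dest_dir`.

    Returns the sorted member names so the caller can see which files
    would end up tracked in the source tree.
    """
    with zipfile.ZipFile(wheel_path) as archive:
        archive.extractall(dest_dir)
        return sorted(archive.namelist())
```

A call such as `unpack_wheel(Path("pip-X.Y-py3-none-any.whl"), Path("unpacked"))` (placeholder file name) would leave the package modules and the `.dist-info` metadata as individual files under `unpacked/`.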

    pradyunsg added a commit that referenced this issue Jan 30, 2024
    Co-authored-by: vstinner@python.org
    Co-authored-by: Pradyun Gedam <pradyunsg@gmail.com>
    Co-authored-by: Adam Turner <9087854+aa-turner@users.noreply.github.com>
    aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024
    …ython#109245)
    
    Co-authored-by: vstinner@python.org
    Co-authored-by: Pradyun Gedam <pradyunsg@gmail.com>
    Co-authored-by: Adam Turner <9087854+aa-turner@users.noreply.github.com>