Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make import machinery explicit #58810

Closed
brettcannon opened this issue Apr 17, 2012 · 33 comments
Closed

Make import machinery explicit #58810

brettcannon opened this issue Apr 17, 2012 · 33 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker type-bug An unexpected behavior, bug, or error

Comments

@brettcannon
Copy link
Member

BPO 14605
Nosy @malemburg, @brettcannon, @ncoghlan, @ericvsmith, @merwok, @bitdancer, @ericsnowcurrently
Files
  • explicit.diff
  • explicit_path_hooks.diff
  • explicit_path_hooks.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/brettcannon'
    closed_at = <Date 2012-04-27.20:03:12.259>
    created_at = <Date 2012-04-17.15:58:04.465>
    labels = ['interpreter-core', 'type-bug', 'release-blocker']
    title = 'Make import machinery explicit'
    updated_at = <Date 2012-04-27.20:03:12.259>
    user = 'https://github.com/brettcannon'

    bugs.python.org fields:

    activity = <Date 2012-04-27.20:03:12.259>
    actor = 'brett.cannon'
    assignee = 'brett.cannon'
    closed = True
    closed_date = <Date 2012-04-27.20:03:12.259>
    closer = 'brett.cannon'
    components = ['Interpreter Core']
    creation = <Date 2012-04-17.15:58:04.465>
    creator = 'brett.cannon'
    dependencies = []
    files = ['25307', '25334', '25359']
    hgrepos = []
    issue_num = 14605
    keywords = ['patch']
    message_count = 33.0
    messages = ['158552', '158996', '158997', '159107', '159114', '159115', '159145', '159149', '159157', '159177', '159195', '159234', '159236', '159237', '159240', '159242', '159244', '159245', '159246', '159248', '159253', '159262', '159263', '159327', '159334', '159337', '159343', '159346', '159347', '159482', '159486', '159487', '159488']
    nosy_count = 9.0
    nosy_names = ['lemburg', 'brett.cannon', 'ncoghlan', 'eric.smith', 'eric.araujo', 'Arfrever', 'r.david.murray', 'python-dev', 'eric.snow']
    pr_nums = []
    priority = 'release blocker'
    resolution = None
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue14605'
    versions = ['Python 3.3']

    @brettcannon
    Copy link
    Member Author

    There should no longer be any implicit part of import when there doesn't have to be. To make import fully explicit, some things need to happen (in context to importlib):

    • Expose FileLoader
    • Expose SourceFileLoader
    • Expose PathFinder
    • Expose the extension loader
    • Expose the default path hook
    • Explicitly set sys.meta_path with PathFinder
    • Explicitly set sys.path_hooks with the default path hook and zipimport
    • Make sys.path_importer_cache view None as no finder discovered
    • Insert None into sys.path_importer_cache instead of NullImporter

    I am not exposing SourcelessFileLoader because importlib publicly tries to discourage the shipping of .pyc files w/o their corresponding source files. Otherwise all objects as used by importlib for performing imports will become public.

    @brettcannon brettcannon self-assigned this Apr 17, 2012
    @brettcannon brettcannon added release-blocker interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Apr 17, 2012
    @brettcannon
    Copy link
    Member Author

    Attached is my (failing) attempt at making import explicit. Unfortunately I have four failing tests, 3 of which revolve around __main__ (which is why I added Nick to see if he had any ideas):

    test_cmd_line_script test_runpy test_threaded_import test_zipimport_support

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 23, 2012

    New changeset 1da623513b26 by Brett Cannon in branch 'default':
    Issue bpo-14605: Expose importlib.abc.FileLoader and
    http://hg.python.org/cpython/rev/1da623513b26

    @brettcannon
    Copy link
    Member Author

    Updated patch for an explicit sys.path_hooks that simply tweaks the test_cmd_line_script tests to stop requiring an absolute path on the files and instead verifies that the relative paths are legit. It also adds warnings when None is found in sys.path_importer_cache and then subsequently just tries sys.path_hooks again. It also warnings when sys.path_hooks is empty.

    @ncoghlan
    Copy link
    Contributor

    I put those explicit path checks in there deliberately.

    I really don't want to go back to having any __file__ attributes that are sensitive to sys.path changes.

    @ncoghlan
    Copy link
    Contributor

    Oops: s/sensitive to sys.path changes/sensitive to current working directory changes/

    @brettcannon
    Copy link
    Member Author

    Of course you did because you just like making my life a living hell when it comes to the __file__ attribute. =)

    OK, I will have another look when I get home, but last time I dealt with this the issue is some tests expect a relative path while others expect an absolute one. And allowing an empty string for the current directory is just evil.

    @malemburg
    Copy link
    Member

    Brett Cannon wrote:

    I am not exposing SourcelessFileLoader because importlib publicly tries to discourage the shipping of .pyc files w/o their corresponding source files. Otherwise all objects as used by importlib for performing imports will become public.

    What's the reasoning behind this idea ? Is Python 3.3 no longer meant to
    be used for closed source applications ?

    @brettcannon
    Copy link
    Member Author

    That initial comment is out-of-date. If you look that the commit I made I documented importlib.machinery._SourcelessFileLoader. I am continuing the discouragement of using bytecode files as an obfuscation technique (because it's a bad one), but I decided to at least document the class so people can use it at their own peril and know about it if they happen to come across the object during execution.

    @malemburg
    Copy link
    Member

    Brett Cannon wrote:

    That initial comment is out-of-date. If you look that the commit I made I documented importlib.machinery._SourcelessFileLoader. I am continuing the discouragement of using bytecode files as an obfuscation technique (because it's a bad one), but I decided to at least document the class so people can use it at their own peril and know about it if they happen to come across the object during execution.

    It's not a perfect obfuscation technique, but a pretty simple and
    (legally) effective one to use.

    FWIW, I don't think the comment in the check-in is appropriate:

    """
    1.127 + It is strongly suggested you do not rely on this loader (hence the
    1.128 + leading underscore of the class). Direct use of bytecode files (and thus not
    1.129 + source code files) inhibits your modules from being usable by all Python
    1.130 + implementations. It also runs the risk of your bytecode files not being
    1.131 + usable by new versions of Python which change the bytecode format. This
    1.132 + class is only documented as it is directly used by import and thus can
    1.133 + potentially have instances show up as a module's __loader__ attribute.
    """

    The "risks" you mention there are really up to the application developers
    to decide how to handle, not the Python developers. Python has a long
    tradition of being friendly to commercial applications and I don't see
    any reason why we should stop that.

    If you do want this to change, please write a PEP. This may appear
    to be a small change in direction, but it does in fact have quite
    some impact on the usefulness of CPython in commercial settings.

    I also think that the SourcelessFileLoader loader should be first class
    citizen without the leading underscore if the importlib is to completely
    replace the current import mechanism. Why force developers to write their
    own loader instead of using the standard one just because of the leading
    underscore, when it's only 20 lines of code ?

    Thanks,

    Marc-Andre Lemburg
    eGenix.com


    2012-04-28: PythonCamp 2012, Cologne, Germany 4 days to go

    ::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

    eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
    Registered at Amtsgericht Duesseldorf: HRB 46611
    http://www.egenix.com/company/contact/

    @brettcannon
    Copy link
    Member Author

    I documented it explicitly so people can use it if they so choose (e.g. look at sys._getframe()). If you want to change this that's fine, but I am personally not going to put the effort in to rename the class, update the tests, and change the docs for this (we almost stopped allowing the importation of bytecode directly not so long ago but got push-back so we backed off).

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 24, 2012

    New changeset 8aa4737d67d2 by Marc-Andre Lemburg in branch 'default':
    Issue bpo-14605: Rename _SourcelessFileLoader to SourcelessFileLoader
    http://hg.python.org/cpython/rev/8aa4737d67d2

    @malemburg
    Copy link
    Member

    Brett Cannon wrote:

    I documented it explicitly so people can use it if they so choose (e.g. look at sys._getframe()). If you want to change this that's fine, but I am personally not going to put the effort in to rename the class, update the tests, and change the docs for this (we almost stopped allowing the importation of bytecode directly not so long ago but got push-back so we backed off).

    I renamed the loader and reworded the notice in the docs.

    Thanks,

    Marc-Andre Lemburg
    eGenix.com


    2012-04-28: PythonCamp 2012, Cologne, Germany 3 days to go

    ::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

    eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
    Registered at Amtsgericht Duesseldorf: HRB 46611
    http://www.egenix.com/company/contact/

    @bitdancer
    Copy link
    Member

    Hmm. Some at least of the buildbots have failed to build after that patch:

    ./python ./Python/freeze_importlib.py \
    ./Lib/importlib/_bootstrap.py Python/importlib.h
    make: ./python: Command not found
    make: *** [Python/importlib.h] Error 127
    program finished with exit code 2

    (http://www.python.org/dev/buildbot/all/builders/AMD64%20Gentoo%20Wide%203.x/builds/3771)

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 25, 2012

    New changeset e30196bfc11d by Marc-Andre Lemburg in branch 'default':
    Issue bpo-14605: Revert renaming of _SourcelessFileLoader, since it caused
    http://hg.python.org/cpython/rev/e30196bfc11d

    @malemburg
    Copy link
    Member

    R. David Murray wrote:

    R. David Murray <rdmurray@bitdance.com> added the comment:

    Hmm. Some at least of the buildbots have failed to build after that patch:

    ./python ./Python/freeze_importlib.py \
    ./Lib/importlib/_bootstrap.py Python/importlib.h
    make: ./python: Command not found
    make: *** [Python/importlib.h] Error 127
    program finished with exit code 2

    (http://www.python.org/dev/buildbot/all/builders/AMD64%20Gentoo%20Wide%203.x/builds/3771)

    Thanks for mentioning this. I've reverted the change for now and
    will have a look tomorrow.

    The logs of the failing bots are not very informative about what
    is going on:

    gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include
    -DPy_BUILD_CORE -o Python/dynamic_annotations.o Python/dynamic_annotations.c
    gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include
    -DPy_BUILD_CORE -o Python/errors.o Python/errors.c
    ./python ./Python/freeze_importlib.py \
    ./Lib/importlib/_bootstrap.py Python/importlib.h
    make: ./python: Command not found
    make: *** [Python/importlib.h] Error 127
    program finished with exit code 2

    vs.

    gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include
    -DPy_BUILD_CORE -o Python/dynamic_annotations.o Python/dynamic_annotations.c
    gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include
    -DPy_BUILD_CORE -o Python/errors.o Python/errors.c
    gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include
    -DPy_BUILD_CORE -o Python/frozen.o Python/frozen.c
    gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include
    -DPy_BUILD_CORE -o Python/frozenmain.o Python/frozenmain.c
    gcc -pthread -c -Wno-unused-result -g -O0 -Wall -Wstrict-prototypes -I. -I./Include
    -DPy_BUILD_CORE -o Python/future.o Python/future.c

    I guess some commands are not printed to stdout.

    Looking at the buildbots again: reverting the patch has not caused
    the lights to go green again. Very strange indeed.

    Looking further I found this line in the Makefile:

    ############################################################################
    # Importlib

    Python/importlib.h: $(srcdir)/Lib/importlib/_bootstrap.py $(srcdir)/Python/freeze_importlib.py
    ./$(BUILDPYTHON) $(srcdir)/Python/freeze_importlib.py \
    $(srcdir)/Lib/importlib/_bootstrap.py Python/importlib.h

    Since the patch modified _bootstrap.py, make wants to recreate importlib.h,
    but at that time $(BUILDPYTHON) doesn't yet exist.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 25, 2012

    New changeset a2cf07135e4f by Marc-Andre Lemburg in branch 'default':
    Issue bpo-14605: Rename _SourcelessFileLoader to SourcelessFileLoader.
    http://hg.python.org/cpython/rev/a2cf07135e4f

    @malemburg
    Copy link
    Member

    Marc-Andre Lemburg wrote:

    Looking further I found this line in the Makefile:

    ############################################################################

    Importlib

    Python/importlib.h: $(srcdir)/Lib/importlib/_bootstrap.py $(srcdir)/Python/freeze_importlib.py
    ./$(BUILDPYTHON) $(srcdir)/Python/freeze_importlib.py \
    $(srcdir)/Lib/importlib/_bootstrap.py Python/importlib.h

    Since the patch modified _bootstrap.py, make wants to recreate importlib.h,
    but at that time $(BUILDPYTHON) doesn't yet exist.

    I now ran 'make' after applying the patches to have the importlib.h
    recreated.

    This setup looks a bit fragile to me.

    I think it would be better to make creation of the importlib.h an
    explicit operation that has to be done in case the Python code
    changes (e.g. by creating a make target build-importlib.h),
    with the Makefile only warning about a needed update instead
    of failing completely.

    @brettcannon
    Copy link
    Member Author

    You can see a little discussion in http://bugs.python.org/issue14642, but it has been discussed elsewhere and the automatic rebuilding was preferred (but it is not a requirement to build as importlib.h is in hg).

    @brettcannon
    Copy link
    Member Author

    I found out why runpy was giving back an absolute file path for __file__; because pkgutil was doing that through its ImpLoader, unlike what import does by default which is just taking what path it has and appending file names (and thus not making anything absolute).

    So I have simply forced runpy to make absolute paths of the files it is given. Nick, can you have a look and let me know if this fix is reasonable, or if another os.path.abspath() call is needed somewhere in runpy?

    @brettcannon brettcannon assigned ncoghlan and unassigned brettcannon Apr 25, 2012
    @ncoghlan
    Copy link
    Contributor

    Yeah, I was actually going to suggest forcing an absolute path for __main__.__file__ in runpy if you didn't want to do it in importlib itself.

    I'm much happier with that approach than changing the tests, so the updated patch looks good to me.

    @ncoghlan ncoghlan assigned brettcannon and unassigned ncoghlan Apr 25, 2012
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 25, 2012

    New changeset acfdf46b8de1 by Marc-Andre Lemburg in branch 'default':
    Issue bpo-14605 and bpo-14642:
    http://hg.python.org/cpython/rev/acfdf46b8de1

    @malemburg
    Copy link
    Member

    Brett Cannon wrote:

    You can see a little discussion in http://bugs.python.org/issue14642, but it has been discussed elsewhere and the automatic rebuilding was preferred (but it is not a requirement to build as importlib.h is in hg).

    An automatic rebuild is fine, but only as long as the local ./python
    actually exists.

    I was unaware of make rule, so did not run make to check things before
    the checkin. As a result, the bootstrap module received a more recent
    timestamp than importlib.h and this caused all the buildbots to
    force a rebuild of importlib.h - which failed, since they didn't
    have a built ./python at that stage.

    I checked in a fix and added a warning to the bootstrap script.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 25, 2012

    New changeset 5fea362b92fc by Marc-Andre Lemburg in branch 'default':
    Issue bpo-14605 and bpo-14642: Issue a warning in case Python\importlib.h needs to
    http://hg.python.org/cpython/rev/5fea362b92fc

    @ericsnowcurrently
    Copy link
    Member

    While not in the initial list, _find_module() would be really handy. Perhaps we could call it "get_loader" instead. "find_module" is a misleading name and I don't see any parallel with imp.find_module as something to aspire to.

    @brettcannon
    Copy link
    Member Author

    importlib.find_module() (or get_loader() as it would replace pkgutil.get_loader() as well) is definitely planned. So is importlib.util.resolve_name() (although maybe that is basic enough to want top-level?).

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 26, 2012

    New changeset 8dab93ec19de by Brett Cannon in branch 'default':
    Issue bpo-14605: Insert to the front of sys.path_hooks instead of appending.
    http://hg.python.org/cpython/rev/8dab93ec19de

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 26, 2012

    New changeset 57d558f1904d by Brett Cannon in branch 'default':
    Issue bpo-14605: Make explicit the entries on sys.path_hooks that used to
    http://hg.python.org/cpython/rev/57d558f1904d

    @brettcannon
    Copy link
    Member Author

    Just to document why my explicit sys.path_hooks patch didn't quite change the meaning of None in sys.path_importer_cache, I found a bunch of places in the stdlib and in Modules/main.c where NullImporter is explicitly expected to be returned, so I wanted to get this initial patch in before I saw what would happen if None didn't change its meaning.

    I think I will try a patch to have None mean "no finder" instead of the current "retry sys.path_hooks" and see how that goes. The real trick is whether stopping the use of NullImporter will be easy or not. That will be what really decides if it stops being on sys.path_hooks by default.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 27, 2012

    New changeset c18256de00bb by Brett Cannon in branch 'default':
    Issue bpo-14605: Insert to the front of sys.meta_path, don't append.
    http://hg.python.org/cpython/rev/c18256de00bb

    New changeset 3bd60cc27664 by Brett Cannon in branch 'default':
    Issue bpo-14605: Stop having implicit entries for sys.meta_path.
    http://hg.python.org/cpython/rev/3bd60cc27664

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 27, 2012

    New changeset 7025ee00dbf6 by Brett Cannon in branch 'default':
    Issue bpo-14605: Use None in sys.path_importer_cache to represent no
    http://hg.python.org/cpython/rev/7025ee00dbf6

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 27, 2012

    New changeset 141ed4b426e1 by Brett Cannon in branch 'default':
    Issue bpo-14605: Don't error out if get_importer() returns None.
    http://hg.python.org/cpython/rev/141ed4b426e1

    @brettcannon
    Copy link
    Member Author

    OK, so the todo of this issue is now finished. I am just waiting for the buildbots to come back green before I close this issue fully.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants