Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a lazy import loader mixin #61821

Closed
brettcannon opened this issue Apr 2, 2013 · 26 comments
Closed

Create a lazy import loader mixin #61821

brettcannon opened this issue Apr 2, 2013 · 26 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@brettcannon
Copy link
Member

BPO 17621
Nosy @Yhg1s, @warsaw, @brettcannon, @ericvsmith, @jwilk, @ericsnowcurrently
Files
  • test_lazy_loader.py: tests
  • lazy_mixin.py: Solution using a mixin
  • lazy_proxy.py: Solution using a proxy
  • lazy_test.py
  • lazy_loader.diff
  • lazy_loader.diff
  • lazy_loader.diff
  • lazy_loader.diff: Detect sys.modules swap
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/brettcannon'
    closed_at = <Date 2014-04-04.17:55:06.543>
    created_at = <Date 2013-04-02.18:00:47.615>
    labels = ['type-feature', 'library']
    title = 'Create a lazy import loader mixin'
    updated_at = <Date 2014-04-04.22:10:33.832>
    user = 'https://github.com/brettcannon'

    bugs.python.org fields:

    activity = <Date 2014-04-04.22:10:33.832>
    actor = 'eric.snow'
    assignee = 'brett.cannon'
    closed = True
    closed_date = <Date 2014-04-04.17:55:06.543>
    closer = 'brett.cannon'
    components = ['Library (Lib)']
    creation = <Date 2013-04-02.18:00:47.615>
    creator = 'brett.cannon'
    dependencies = []
    files = ['30670', '30671', '30672', '33136', '33947', '34556', '34576', '34591']
    hgrepos = []
    issue_num = 17621
    keywords = ['patch']
    message_count = 26.0
    messages = ['185852', '191551', '191616', '191623', '191651', '191660', '191667', '191675', '191772', '191776', '191779', '201930', '202445', '206205', '206211', '210410', '214410', '214508', '214536', '214627', '214921', '214922', '214954', '215542', '215543', '215572']
    nosy_count = 10.0
    nosy_names = ['twouters', 'barry', 'brett.cannon', 'eric.smith', 'jwilk', 'durin42', 'Arfrever', 'python-dev', 'sbt', 'eric.snow']
    pr_nums = []
    priority = 'low'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue17621'
    versions = ['Python 3.5']

    @brettcannon
    Copy link
    Member Author

    People keep asking and I keep promising to get a lazy loader into Python 3.4. This issue is for me to track what I have done towards meeting that promise.

    To start, I have something working at https://code.google.com/p/importers/, but I need to make sure that the code copies any newly assigned objects post-import but before completing an attribute read::

      import lazy_mod
      lazy_mod.attr = True  # Does not have to trigger import, although there is nothing wrong if it did.
      lazy_mod.func()  # Since it might depend on 'attr', don't return attr until after import and 'attr' with a value of True has been set.

    Also need to see if anything can be done about isinstance/issubclass checks as super() is used for assigning to __loader__ and thus might break checks for ABCs. Maybe create a class from scratch w/o the mixin somehow (which I don't see from looking at http://docs.python.org/3.4/library/types.html#module-types w/o re-initializing everything)? Somehow get __subclasscheck__() on the super() class? Some other crazy solution that avoids having to call __init__() a second time?

    @brettcannon brettcannon self-assigned this Apr 2, 2013
    @brettcannon brettcannon added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Apr 2, 2013
    @brettcannon
    Copy link
    Member Author

    One problem with the importers code is it doesn't properly clean up after the module if the load failed and it wasn't a reload. Should store an attribute on the module signaling whether it was a reload or not to know whether an exception raised during loading should lead to the module being removed from sys.modules.

    @brettcannon
    Copy link
    Member Author

    While seing if it was worth making isinstance work with super(), I came up with this at Antoine's suggestion of using a proxy instead of a mixin:

    class LazyProxy(importlib.abc.Loader):
    
      def __init__(self, loader):
        self.loader = loader
    
      def __call__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs
        return self
    
      def load_module(self, fullname):
        # XXX ignoring sys.modules details, e.g. if module already existed.
        lazy_module = LazyModule(fullname, proxy=self, name=fullname)
        sys.modules[fullname] = lazy_module
        return lazy_module
    
    class LazyModule(types.ModuleType):
    
        def __init__(*args, proxy, name, **kwargs):
          self.__proxy = proxy
          self.__name = name
          super().__init__(*args, **kwargs)
    
        def __getattribute__(self, attr):
          self.__class__ = Module
          state = self.__dict__.copy()
          loader = self.__proxy.loader(*self.proxy.args, **self.proxy.kwargs)
          # XXX ignoring sys.modules details, e.g. removing module on load
    failure.
          loader.load_module(self.__name)
          self.__dict__.update(state)
          return getattr(module, attr)

    @brettcannon
    Copy link
    Member Author

    I think the first step for this bug, now that I have two possible approaches, is to write the unit tests. That way both approaches can equally be validated based on their merits of complexity, etc. while verifying the work properly.

    @sbt
    Copy link
    Mannequin

    sbt mannequin commented Jun 22, 2013

    Apologies for being dense, but how would you actually use such a loader?

    Would you need to install something in sys.meta_path/sys.path_hooks? Would it make all imports lazy or only imports of specified modules?

    @brettcannon
    Copy link
    Member Author

    So the approaches I have been using make a loader lazy, so what you have to change in terms of sys.meta_path, sys.path_hooks, etc. would very from loader to loader.

    I have realized one tricky thing with all of this is that importlib itself inspects modules post-import to verify that __loader__ and __package__ have been set. That typically triggers an immediate load and so might need to be special-cased.

    @brettcannon
    Copy link
    Member Author

    I have attached the test suite and two versions: one using a mixin and one using a proxy. Neither solve the issue of import touching the module in any way to check __loader__ and __package__. The mixin version is also failing one test which could quite possibly be a pain to fix and so it might cause me to prefer the proxy solution.

    @sbt
    Copy link
    Mannequin

    sbt mannequin commented Jun 22, 2013

    Shouldn't the import lock be held to make it threadsafe?

    @brettcannon
    Copy link
    Member Author

    Import manages the lock, not loaders.

    @sbt
    Copy link
    Mannequin

    sbt mannequin commented Jun 24, 2013

    I was thinking about the line

          self.__dict__.update(state)

    overwriting new data with stale data.

    @brettcannon
    Copy link
    Member Author

    It still falls under the purview of import to manage that lock. It's just the design of the API from PEP-302. Otherwise it's just like any other reload.

    @brettcannon
    Copy link
    Member Author

    Won't work until PEP-451 code lands and lets us drop __package__/loader checking/setting post-load.

    @brettcannon
    Copy link
    Member Author

    Shifting this to Python 3.5 to rework using the forthcoming PEP-451 APIs which have been designed to explicitly allow for a lazy loader.

    @brettcannon
    Copy link
    Member Author

    Need to quickly test that this will work with PEP-451 works with the design in my head before we get farther into 3.4.

    @brettcannon
    Copy link
    Member Author

    Attached is a script to verify that PEP-451 works as desired, so Python 3.5 doesn't have any technical blockers for doing a lazy loader for PEP-451 loaders.

    And with __spec__.loader_state it might be possible to work things out through a common API to work around issue bpo-18275 so that relying on super() and doing this as a mixin goes away and instead just somehow store the final loader on the spec (e.g. loader's constructor just takes a spec so that you can instantiate the actual loader, reset __loader__ & __spec__.loader, and then proceed with exec_module()).

    @brettcannon
    Copy link
    Member Author

    Here is a patch which implements a lazy loader for loaders that define exec_module().

    @brettcannon
    Copy link
    Member Author

    New patch that includes docs and integrates the tests. If someone who understands import can look it over and give me an LGTM that would be appreciated.

    @ericsnowcurrently
    Copy link
    Member

    Review posted. Thanks for working on this, Brett.

    @brettcannon
    Copy link
    Member Author

    Here is a new patch that addresses Eric's comments and also fills in some holes that I realized I had while fixing things up. PTAL.

    @brettcannon
    Copy link
    Member Author

    Another update to trigger loading on attribute deletions as well as detecting when an import swapped the module in sys.modules, raising ValueError if detected since it won't have the affect that's expected (could be convinced to make that ImportError instead).

    @ericsnowcurrently
    Copy link
    Member

    I wonder if there would be any benefit to using this for some of the modules that get loaded during startup. I seem to remember there being a few for which lazy loading would have an effect.

    @ericsnowcurrently
    Copy link
    Member

    New review posted. Basically LGTM.

    @brettcannon
    Copy link
    Member Author

    So as-is, this won't help with startup as we already make sure that no unnecessary modules are loaded during startup. But one way we do that is through local imports in key modules, e.g. os.get_exec_path(). So what we could consider is instead of doing a local import we use lazy imports. We could introduce importlib._lazy_import() which could be (roughly):

      def _lazy_import(fullname):
        try:
          return sys.modules[fullname]
        except KeyError:
          spec = find_spec(fullname)
          loader = LazyLoader(spec.loader)
          # Make module with proper locking and get it inserted into sys.modules.
          loader.exec_module(module)
          return module

    I don't know if that simplifies things, though, compared to a local import. It might help once a module is identified as on the critical path of startup since all global imports in that module could be lazy, but we would still need to identify those critical modules.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 4, 2014

    New changeset 52b58618199c by Brett Cannon in branch 'default':
    Issue bpo-17621: Introduce importlib.util.LazyLoader.
    http://hg.python.org/cpython/rev/52b58618199c

    @brettcannon
    Copy link
    Member Author

    I went ahead and committed. I realized I could loosen the "no create_module()" requirement, but I think it could lead to more trouble than it's worth so I left it as-is for now. If people says it's an issue we can revisit it.

    @ericsnowcurrently
    Copy link
    Member

    Sweet!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants