classification
Title: Create a lazy import loader mixin
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.5
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: brett.cannon
Nosy List: Arfrever, barry, brett.cannon, durin42, eric.smith, eric.snow, jwilk, python-dev, sbt, twouters
Priority: low
Keywords: patch

Created on 2013-04-02 18:00 by brett.cannon, last changed 2014-04-04 22:10 by eric.snow. This issue is now closed.

Files
File name Uploaded Description
test_lazy_loader.py brett.cannon, 2013-06-22 19:45 tests
lazy_mixin.py brett.cannon, 2013-06-22 19:46 Solution using a mixin
lazy_proxy.py brett.cannon, 2013-06-22 19:48 Solution using a proxy
lazy_test.py brett.cannon, 2013-12-15 02:25
lazy_loader.diff brett.cannon, 2014-02-06 19:02
lazy_loader.diff brett.cannon, 2014-03-21 19:40
lazy_loader.diff brett.cannon, 2014-03-23 02:15
lazy_loader.diff brett.cannon, 2014-03-23 19:01 Detect sys.modules swap
Messages (26)
msg185852 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-04-02 18:00
People keep asking and I keep promising to get a lazy loader into Python 3.4. This issue is for me to track what I have done towards meeting that promise.

To start, I have something working at https://code.google.com/p/importers/, but I need to make sure that the code copies any newly assigned objects post-import but before completing an attribute read::

  import lazy_mod
  lazy_mod.attr = True  # Does not have to trigger import, although there is nothing wrong if it did.
  lazy_mod.func()  # Since it might depend on 'attr', don't return attr until after import and 'attr' with a value of True has been set.

Also need to see if anything can be done about isinstance/issubclass checks as super() is used for assigning to __loader__ and thus might break checks for ABCs. Maybe create a class from scratch w/o the mixin somehow (which I don't see from looking at http://docs.python.org/3.4/library/types.html#module-types w/o re-initializing everything)? Somehow get __subclasscheck__() on the super() class? Some other crazy solution that avoids having to call __init__() a second time?
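As an illustration of the requirement above (attributes assigned before the load must still be visible once the module has really been imported), here is a minimal sketch using importlib.util.LazyLoader, the class this issue eventually introduced. The module name lazy_mod_demo, its source text, and the helper lazy_import_from() are assumptions made up for the demonstration; they are not part of the importers project or of any attached patch.

```python
import importlib.util
import os
import sys
import tempfile

# Hypothetical module source used only for this demonstration.
SOURCE = "attr = False\ndef func():\n    return attr\n"


def lazy_import_from(path, name):
    """Create a lazily loaded module for the source file at `path`."""
    spec = importlib.util.spec_from_file_location(name, path)
    spec.loader = importlib.util.LazyLoader(spec.loader)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # This only arms the lazy behaviour; the module body has not run yet.
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lazy_mod_demo.py")
    with open(path, "w") as f:
        f.write(SOURCE)
    lazy_mod = lazy_import_from(path, "lazy_mod_demo")
    lazy_mod.attr = True      # Assigned before the module body has executed.
    result = lazy_mod.func()  # First attribute read triggers the real load.
    del sys.modules["lazy_mod_demo"]

print(result)
```

The module body sets attr to False, but the value assigned before the load is re-applied afterwards, so func() sees the caller's value.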
msg191551 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-06-21 01:38
One problem with the importers code is it doesn't properly clean up after the module if the load failed and it wasn't a reload. Should store an attribute on the module signaling whether it was a reload or not to know whether an exception raised during loading should lead to the module being removed from sys.modules.
msg191616 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-06-21 22:35
While seeing if it was worth making isinstance work with super(), I came up with this at Antoine's suggestion of using a proxy instead of a mixin:


import importlib.abc
import sys
import types


class LazyProxy(importlib.abc.Loader):

    def __init__(self, loader):
        self.loader = loader
        self.args = ()
        self.kwargs = {}

    def __call__(self, *args, **kwargs):
        # Remember the arguments the real loader should be constructed with.
        self.args = args
        self.kwargs = kwargs
        return self

    def load_module(self, fullname):
        # XXX ignoring sys.modules details, e.g. if module already existed.
        lazy_module = LazyModule(fullname, proxy=self, name=fullname)
        sys.modules[fullname] = lazy_module
        return lazy_module


class LazyModule(types.ModuleType):

    def __init__(self, *args, proxy, name, **kwargs):
        super().__init__(*args, **kwargs)
        self.__proxy = proxy
        self.__name = name

    def __getattribute__(self, attr):
        # Stop being lazy first so the attribute reads below do not recurse.
        self.__class__ = types.ModuleType
        state = self.__dict__.copy()
        proxy = self.__proxy
        loader = proxy.loader(*proxy.args, **proxy.kwargs)
        # XXX ignoring sys.modules details, e.g. removing module on load failure.
        module = loader.load_module(self.__name)
        self.__dict__.update(state)
        return getattr(module, attr)
msg191623 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-06-22 00:33
I think the first step for this bug, now that I have two possible approaches, is to write the unit tests. That way both approaches can be judged equally on their merits (complexity, etc.) while verifying that they work properly.
msg191651 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-06-22 18:24
Apologies for being dense, but how would you actually use such a loader?

Would you need to install something in sys.meta_path/sys.path_hooks?  Would it make all imports lazy or only imports of specified modules?
msg191660 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-06-22 19:12
So the approaches I have been using make a loader lazy, so what you have to change in terms of sys.meta_path, sys.path_hooks, etc. would vary from loader to loader.

I have realized one tricky thing with all of this is that importlib itself inspects modules post-import to verify that __loader__ and __package__ have been set. That typically triggers an immediate load and so might need to be special-cased.
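To make the answer above concrete, here is a hedged sketch of how a lazy loader can be plugged into a finder, written against the importlib.util.LazyLoader API that this issue eventually produced rather than the 2013 prototypes. Only modules found by this particular finder become lazy. The helper make_lazy_finder() and the module name lazy_finder_demo are assumptions for illustration.

```python
import importlib.machinery
import importlib.util
import os
import sys
import tempfile
import types


def make_lazy_finder(directory):
    """Return a FileFinder for `directory` whose source loader is lazy."""
    lazy_loader = importlib.util.LazyLoader.factory(
        importlib.machinery.SourceFileLoader)
    return importlib.machinery.FileFinder(
        directory, (lazy_loader, importlib.machinery.SOURCE_SUFFIXES))


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "lazy_finder_demo.py"), "w") as f:
        f.write("value = 42\n")
    finder = make_lazy_finder(tmp)
    spec = finder.find_spec("lazy_finder_demo")
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module
    spec.loader.exec_module(module)
    # type() does not go through attribute access, so it does not load.
    was_lazy = type(module) is not types.ModuleType
    value = module.value  # First attribute read runs the module body.
    is_plain = type(module) is types.ModuleType
    del sys.modules[spec.name]

print(was_lazy, value, is_plain)
```

The same (loader, suffixes) pair could presumably be passed to importlib.machinery.FileFinder.path_hook() and the result added to sys.path_hooks to affect ordinary import statements, but which entry to replace depends on the loaders already installed.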
msg191667 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-06-22 19:48
I have attached the test suite and two versions: one using a mixin and one using a proxy. Neither solves the issue of import touching the module in any way to check __loader__ and __package__. The mixin version is also failing one test which could quite possibly be a pain to fix and so it might cause me to prefer the proxy solution.
msg191675 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-06-22 21:41
Shouldn't the import lock be held to make it threadsafe?
msg191772 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-06-24 14:38
Import manages the lock, not loaders.
msg191776 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-06-24 15:06
I was thinking about the line

      self.__dict__.update(state)

overwriting new data with stale data.
msg191779 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-06-24 15:23
It still falls under the purview of import to manage that lock. It's just the design of the API from PEP 302. Otherwise it's just like any other reload.
msg201930 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-11-01 18:12
Won't work until PEP 451 code lands and lets us drop __package__/__loader__ checking/setting post-load.
msg202445 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-11-08 20:32
Shifting this to Python 3.5 to rework using the forthcoming PEP 451 APIs which have been designed to explicitly allow for a lazy loader.
msg206205 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-12-15 00:28
Need to quickly test that PEP 451 works with the design in my head before we get farther into 3.4.
msg206211 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-12-15 02:25
Attached is a script to verify that PEP 451 works as desired, so Python 3.5 doesn't have any technical blockers for doing a lazy loader for PEP 451 loaders.

And with __spec__.loader_state it might be possible to work around issue #18275 through a common API, so that relying on super() and doing this as a mixin goes away. Instead, the final loader would somehow be stored on the spec (e.g. the loader's constructor just takes a spec so that you can instantiate the actual loader, reset __loader__ & __spec__.loader, and then proceed with exec_module()).
msg210410 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2014-02-06 19:02
Here is a patch which implements a lazy loader for loaders that define exec_module().
msg214410 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2014-03-21 19:40
New patch that includes docs and integrates the tests. If someone who understands import can look it over and give me an LGTM that would be appreciated.
msg214508 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2014-03-22 19:12
Review posted.  Thanks for working on this, Brett.
msg214536 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2014-03-23 02:09
Here is a new patch that addresses Eric's comments and also fills in some holes that I realized I had while fixing things up. PTAL.
msg214627 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2014-03-23 19:01
Another update to trigger loading on attribute deletions as well as detecting when an import swapped the module in sys.modules, raising ValueError if detected since it won't have the effect that's expected (could be convinced to make that ImportError instead).
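Both behaviours described above can be observed with the importlib.util.LazyLoader that was later committed. The sketch below is only an illustration under stated assumptions: the helper lazy_module() and the two throwaway modules lazy_del_demo and lazy_swap_demo are hypothetical and do not come from the attached patch.

```python
import importlib.util
import os
import sys
import tempfile
import types


def lazy_module(path, name):
    """Create a lazily loaded module for the source file at `path`."""
    spec = importlib.util.spec_from_file_location(name, path)
    spec.loader = importlib.util.LazyLoader(spec.loader)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    # Case 1: deleting an attribute forces the load before the deletion.
    plain = os.path.join(tmp, "lazy_del_demo.py")
    with open(plain, "w") as f:
        f.write("x = 1\ny = 2\n")
    mod = lazy_module(plain, "lazy_del_demo")
    del mod.x
    loaded_after_del = type(mod) is types.ModuleType
    y_value = mod.y
    x_gone = not hasattr(mod, "x")

    # Case 2: a module that replaces itself in sys.modules while executing.
    swap = os.path.join(tmp, "lazy_swap_demo.py")
    with open(swap, "w") as f:
        f.write("import sys\nsys.modules[__name__] = object()\n")
    swapped = lazy_module(swap, "lazy_swap_demo")
    try:
        swapped.anything  # Triggers the load, which detects the swap.
    except ValueError:
        swap_detected = True
    else:
        swap_detected = False

    for name in ("lazy_del_demo", "lazy_swap_demo"):
        sys.modules.pop(name, None)

print(loaded_after_del, y_value, x_gone, swap_detected)
```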
msg214921 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2014-03-27 00:09
I wonder if there would be any benefit to using this for some of the modules that get loaded during startup.  I seem to remember there being a few for which lazy loading would have an effect.
msg214922 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2014-03-27 00:12
New review posted.  Basically LGTM.
msg214954 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2014-03-27 14:17
So as-is, this won't help with startup as we already make sure that no unnecessary modules are loaded during startup. But one way we do that is through local imports in key modules, e.g. os.get_exec_path(). So what we could consider is instead of doing a local import we use lazy imports. We could introduce importlib._lazy_import() which could be (roughly):

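  def _lazy_import(fullname):
    try:
      return sys.modules[fullname]
    except KeyError:
      spec = importlib.util.find_spec(fullname)
      spec.loader = importlib.util.LazyLoader(spec.loader)
      # Make module (proper locking still needed) and get it inserted into sys.modules.
      module = importlib.util.module_from_spec(spec)
      sys.modules[fullname] = module
      spec.loader.exec_module(module)
      return module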

I don't know if that simplifies things, though, compared to a local import. It might help once a module is identified as on the critical path of startup since all global imports in that module could be lazy, but we would still need to identify those critical modules.
msg215542 - (view) Author: Roundup Robot (python-dev) Date: 2014-04-04 17:53
New changeset 52b58618199c by Brett Cannon in branch 'default':
Issue #17621: Introduce importlib.util.LazyLoader.
http://hg.python.org/cpython/rev/52b58618199c
msg215543 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2014-04-04 17:55
I went ahead and committed. I realized I could loosen the "no create_module()" requirement, but I think it could lead to more trouble than it's worth so I left it as-is for now. If people say it's an issue we can revisit it.
msg215572 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2014-04-04 22:10
Sweet!
History
Date User Action Args
2014-04-04 22:10:33 eric.snow set messages: + msg215572
2014-04-04 17:55:06 brett.cannon set status: open -> closed; resolution: fixed; messages: + msg215543; stage: patch review -> resolved
2014-04-04 17:53:46 python-dev set nosy: + python-dev; messages: + msg215542
2014-03-27 14:17:10 brett.cannon set messages: + msg214954
2014-03-27 00:12:39 eric.snow set messages: + msg214922
2014-03-27 00:09:55 eric.snow set messages: + msg214921
2014-03-26 18:35:24 durin42 set nosy: + durin42
2014-03-26 18:01:08 twouters set nosy: + twouters
2014-03-23 19:01:18 brett.cannon set files: + lazy_loader.diff; messages: + msg214627
2014-03-23 02:15:40 brett.cannon set files: - lazy_loader.diff
2014-03-23 02:15:13 brett.cannon set files: + lazy_loader.diff
2014-03-23 02:09:29 brett.cannon set files: + lazy_loader.diff; messages: + msg214536
2014-03-22 22:37:24 jwilk set nosy: + jwilk
2014-03-22 19:12:03 eric.snow set nosy: + eric.snow; messages: + msg214508
2014-03-21 19:40:58 brett.cannon set files: + lazy_loader.diff; messages: + msg214410
2014-02-06 19:02:54 brett.cannon set dependencies: - Make isinstance() work with super type instances
2014-02-06 19:02:42 brett.cannon set stage: test needed -> patch review
2014-02-06 19:02:27 brett.cannon set files: + lazy_loader.diff; keywords: + patch; messages: + msg210410
2013-12-15 02:25:41 brett.cannon set files: + lazy_test.py; dependencies: - Implementation for PEP 451 (importlib.machinery.ModuleSpec); messages: + msg206211
2013-12-15 00:28:29 brett.cannon set messages: + msg206205
2013-11-08 20:32:09 brett.cannon set messages: + msg202445; versions: + Python 3.5, - Python 3.4
2013-11-01 18:12:25 brett.cannon set dependencies: + Implementation for PEP 451 (importlib.machinery.ModuleSpec); messages: + msg201930
2013-06-24 15:23:52 brett.cannon set messages: + msg191779
2013-06-24 15:06:30 sbt set messages: + msg191776
2013-06-24 14:38:37 brett.cannon set messages: + msg191772
2013-06-22 21:41:23 sbt set messages: + msg191675
2013-06-22 19:48:23 brett.cannon set files: + lazy_proxy.py; messages: + msg191667
2013-06-22 19:46:00 brett.cannon set files: + lazy_mixin.py
2013-06-22 19:45:44 brett.cannon set files: + test_lazy_loader.py
2013-06-22 19:12:11 brett.cannon set messages: + msg191660
2013-06-22 18:24:46 sbt set nosy: + sbt; messages: + msg191651
2013-06-22 09:43:05 Arfrever set nosy: + Arfrever
2013-06-22 00:33:16 brett.cannon set messages: + msg191623
2013-06-21 22:35:08 brett.cannon set dependencies: + Make isinstance() work with super type instances; messages: + msg191616
2013-06-21 01:38:30 brett.cannon set messages: + msg191551
2013-04-02 19:03:54 eric.smith set nosy: + eric.smith
2013-04-02 18:24:44 barry set nosy: + barry
2013-04-02 18:00:47 brett.cannon create