diff --git a/Doc/reference/import.rst b/Doc/reference/import.rst --- a/Doc/reference/import.rst +++ b/Doc/reference/import.rst @@ -210,6 +210,7 @@ .. index:: single: finder single: loader + single: module spec If the named module is not found in :data:`sys.modules`, then Python's import protocol is invoked to find and load the module. This protocol consists of @@ -230,13 +231,17 @@ range and scope of module searching. Finders do not actually load modules. If they can find the named module, they -return a :term:`loader`, which the import machinery then invokes to load the -module and create the corresponding module object. +return a :term:`module spec`, an encapsulation of the module's import-related +information, which the import machinery then uses when loading the module. The following sections describe the protocol for finders and loaders in more detail, including how you can create and register new ones to extend the import machinery. +.. versionchanged:: 3.4 + In previous versions of Python, finders returned :term:`loaders <loader>` + directly, whereas now they return module specs which *contain* loaders. + Loaders are still used during import but have fewer responsibilities. Import hooks ------------ @@ -270,24 +275,23 @@ .. index:: single: sys.meta_path - pair: finder; find_module - pair: finder; find_loader + pair: finder; find_spec When the named module is not found in :data:`sys.modules`, Python next searches :data:`sys.meta_path`, which contains a list of meta path finder objects. These finders are queried in order to see if they know how to handle the named module. Meta path finders must implement a method called -:meth:`find_module()` which takes two arguments, a name and an import path. +:meth:`find_spec()` which takes two arguments, a name and an import path. The meta path finder can use any strategy it wants to determine whether it can handle the named module or not. If the meta path finder knows how to handle the named module, it returns a -loader object. 
If it cannot handle the named module, it returns ``None``. If +spec object. If it cannot handle the named module, it returns ``None``. If :data:`sys.meta_path` processing reaches the end of its list without returning -a loader, then an :exc:`ImportError` is raised. Any other exceptions raised +a spec, then an :exc:`ImportError` is raised. Any other exceptions raised are simply propagated up, aborting the import process. -The :meth:`find_module()` method of meta path finders is called with two +The :meth:`find_spec()` method of meta path finders is called with two arguments. The first is the fully qualified name of the module being imported, for example ``foo.bar.baz``. The second argument is the path entries to use for the module search. For top-level modules, the second @@ -296,15 +300,22 @@ the appropriate ``__path__`` attribute cannot be accessed, an :exc:`ImportError` is raised. +Meta path finders in earlier versions of Python implemented a +:meth:`find_module()` method instead of find_spec(). The method takes +the same arguments but returns just the loader (instead of the whole spec). +However, find_module() is deprecated. While it will continue to work +without change, the import machinery will try it only if the finder does +not implement find_spec(). + The meta path may be traversed multiple times for a single import request. For example, assuming none of the modules involved has already been cached, importing ``foo.bar.baz`` will first perform a top level import, calling -``mpf.find_module("foo", None)`` on each meta path finder (``mpf``). After +``mpf.find_spec("foo", None)`` on each meta path finder (``mpf``). After ``foo`` has been imported, ``foo.bar`` will be imported by traversing the meta path a second time, calling -``mpf.find_module("foo.bar", foo.__path__)``. Once ``foo.bar`` has been +``mpf.find_spec("foo.bar", foo.__path__)``. 
Once ``foo.bar`` has been imported, the final traversal will call -``mpf.find_module("foo.bar.baz", foo.bar.__path__)``. +``mpf.find_spec("foo.bar.baz", foo.bar.__path__)``. Some meta path finders only support top level imports. These importers will always return ``None`` when anything other than ``None`` is passed as the @@ -315,22 +326,98 @@ modules, and one that knows how to import modules from an :term:`import path` (i.e. the :term:`path based finder`). +.. versionchanged:: 3.4 + The find_spec() method of meta path finders replaced find_module(). + + +Loading +======= + +If and when a module spec is found, the import machinery will use it (and +the loader it contains) when loading the module. Here is an approximation +of what happens during the loading portion of import:: + + module = None + if hasattr(spec.loader, 'create_module'): + module = spec.loader.create_module(spec) + if module is None: + module = ModuleType(spec.name) + init_module_attrs(module) + + sys.modules[spec.name] = module + try: + spec.loader.exec_module(module) + except BaseException: + try: + del sys.modules[spec.name] + except KeyError: + pass + raise + +Note the following details: + + * If there is an existing module object with the given name in + :data:`sys.modules`, import will have already returned it. + + * The module will exist in :data:`sys.modules` before the loader + executes the module code. This is crucial because the module code may + (directly or indirectly) import itself; adding it to :data:`sys.modules` + beforehand prevents unbounded recursion in the worst case and multiple + loading in the best. + + * If loading fails, the failing module--and only the failing module--gets + removed from :data:`sys.modules`. Any module already in the + :data:`sys.modules` cache, and any module that was successfully loaded + as a side-effect, must remain in the cache. This contrasts with + reloading where even the failing module is left in :data:`sys.modules`. 
+ + * After the module is created but before execution, the import machinery + sets the import-related module attributes ("init_module_attrs"), as + summarized in a `later section <Import-related module attributes_>`_. + + * Module execution is the key moment of loading in which the module's + namespace gets populated. Execution is entirely delegated to the + loader, which gets to decide what gets populated and how. + + * The module created during loading and passed to exec_module() may + not be the one returned at the end of import [#fnlo]_. + +.. versionchanged:: 3.4 + The import system has taken over the boilerplate responsibilities of + loaders. These were previously performed by the :meth:`load_module()` + method. Loaders -======= +------- -If and when a module loader is found its -:meth:`~importlib.abc.Loader.load_module` method is called, with a single -argument, the fully qualified name of the module being imported. This method -has several responsibilities, and should return the module object it has -loaded [#fnlo]_. If it cannot load the module, it should raise an -:exc:`ImportError`, although any other exception raised during -:meth:`load_module()` will be propagated. +Module loaders provide the critical function of loading: module execution. +The import machinery calls the :meth:`~importlib.abc.Loader.exec_module()` +method with a single argument, the module object to execute. Any value +returned from exec_module() is ignored. + +Loaders must satisfy the following requirements: + + * If the module is a Python module (as opposed to a built-in module or a + dynamically loaded extension), the loader should execute the module's code + in the module's global name space (``module.__dict__``). + + * If the loader cannot execute the module, it should raise an + :exc:`ImportError`, although any other exception raised during + :meth:`exec_module()` will be propagated. 
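The finder and loader requirements described so far can be illustrated with a small, self-contained sketch. It is not part of the patch: the class name ``ConstantsFinderLoader``, the module name ``demo_constants`` and the ``ANSWER`` attribute are all hypothetical, and the optional ``target`` parameter of ``find_spec()`` follows the current ``importlib`` API rather than the two-argument form described in this text.

```python
import importlib
import importlib.abc
import importlib.machinery
import sys


class ConstantsFinderLoader(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical meta path finder and loader combined in one object."""

    MODULE_NAME = "demo_constants"  # hypothetical module name, for illustration

    def find_spec(self, fullname, path, target=None):
        # Return None for anything we do not handle so that the remaining
        # meta path finders are queried.
        if fullname != self.MODULE_NAME:
            return None
        # The spec carries the loader; here the loader is this same object.
        return importlib.machinery.ModuleSpec(fullname, self)

    def create_module(self, spec):
        # Returning None asks the import machinery to create the module.
        return None

    def exec_module(self, module):
        # Populate the namespace of the module object we were handed.
        module.ANSWER = 42


finder = ConstantsFinderLoader()
sys.meta_path.insert(0, finder)
try:
    demo = importlib.import_module(ConstantsFinderLoader.MODULE_NAME)
finally:
    # Unregister the finder so the sketch does not affect later imports.
    sys.meta_path.remove(finder)
```

The loader only fills in the module namespace; adding the module to :data:`sys.modules` and setting attributes such as ``__spec__`` and ``__loader__`` is left to the import machinery.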
In many cases, the finder and loader can be the same object; in such cases the -:meth:`finder.find_module()` would just return ``self``. +:meth:`finder.find_spec()` would just return a spec with the loader set +to ``self``. -Loaders must satisfy the following requirements: +For compatibility with existing loaders, the import machinery will use +the :meth:`~importlib.abc.Loader.load_module()` method of loaders if it +exists and the loader does not also implement exec_module(). However, +load_module() has been deprecated and loaders should implement +exec_module() instead. + +The load_module() method must implement all the boilerplate loading +functionality described above in addition to executing the module. All +the same constraints apply, with some additional clarification: * If there is an existing module object with the given name in :data:`sys.modules`, the loader must use that existing module. (Otherwise, @@ -338,108 +425,105 @@ not exist in :data:`sys.modules`, the loader must create a new module object and add it to :data:`sys.modules`. - Note that the module *must* exist in :data:`sys.modules` before the loader - executes the module code. This is crucial because the module code may - (directly or indirectly) import itself; adding it to :data:`sys.modules` - beforehand prevents unbounded recursion in the worst case and multiple - loading in the best. + * The module *must* exist in :data:`sys.modules` before the loader + executes the module code, to prevent unbounded recursion or multiple + loading. - If loading fails, the loader must remove any modules it has inserted into + * If loading fails, the loader must remove any modules it has inserted into :data:`sys.modules`, but it must remove **only** the failing module, and - only if the loader itself has loaded it explicitly. Any module already in - the :data:`sys.modules` cache, and any module that was successfully loaded - as a side-effect, must remain in the cache. 
+ only if the loader itself has loaded it explicitly. - * The loader may set the ``__file__`` attribute of the module. If set, this - attribute's value must be a string. The loader may opt to leave - ``__file__`` unset if it has no semantic meaning (e.g. a module loaded from - a database). If ``__file__`` is set, it may also be appropriate to set the - ``__cached__`` attribute which is the path to any compiled version of the - code (e.g. byte-compiled file). The file does not need to exist to set this - attribute; the path can simply point to whether the compiled file would - exist (see :pep:`3147`). +Module loaders may opt in to creating the module object during loading +by implementing a :meth:`create_module()` method. It takes one argument, +the module spec, and returns the new module object to use during loading. +create_module() does not need to set any attributes on the module object. +If the loader does not define create_module(), the import machinery will +create the new module itself. - * The loader may set the ``__name__`` attribute of the module. While not - required, setting this attribute is highly recommended so that the - :meth:`repr()` of the module is more informative. +.. versionadded:: 3.4 + The create_module() method of loaders. - * If the module is a package (either regular or namespace), the loader must +.. versionchanged:: 3.4 + The load_module() method was replaced by exec_module() and the import + machinery assumed all the boilerplate responsibilities of loading. + +Module spec +----------- + +The import machinery uses a variety of information about each module +during import, especially before loading. Most of the information is +common to all modules. The purpose of a module's spec is to encapsulate +this import-related information on a per-module basis. + +Using a spec during import allows state to be transferred between import +system components, e.g. between the finder that creates the module spec +and the loader that executes it. 
Most importantly, it allows the +import machinery to perform the boilerplate operations of loading, +whereas without a module spec the loader had that responsibility. + +See :class:`~importlib.machinery.ModuleSpec` for more specifics on what +information a module's spec may hold. + +.. versionadded:: 3.4 + +Import-related module attributes +-------------------------------- + +.. attribute:: __name__ + + The import machinery may set the ``__name__`` attribute of the + module. While not required, setting this attribute is highly + recommended so that the :meth:`repr()` of the module is more + informative. + +.. attribute:: __loader__ + + The ``__loader__`` attribute must be set to the loader object that + loaded the module. This is mostly for introspection and reloading, + but can be used for additional loader-specific functionality, for + example getting data associated with a loader. If the attribute is + missing or set to ``None`` then the import machinery will + automatically set it **after** the module has been imported. + +.. attribute:: __package__ + + The module's ``__package__`` attribute must be set. Its value must + be a string, but it can be the same value as its ``__name__``. If + the attribute is set to ``None`` or is missing, the import system + will fill it in with a more appropriate value **after** the module + has been imported. When the module is a package, its ``__package__`` + value should be set to its ``__name__``. When the module is not a + package, ``__package__`` should be set to the empty string for + top-level modules, or for submodules, to the parent package's name. + See :pep:`366` for further details. + + This attribute is used instead of ``__name__`` to calculate explicit + relative imports for main modules, as defined in :pep:`366`. + +.. attribute:: __spec__ + + The module spec that was used when importing the module. + +.. 
attribute:: __path__ + + If the module is a package (either regular or namespace), the loader must set the module object's ``__path__`` attribute. The value must be iterable, but may be empty if ``__path__`` has no further significance to the loader. If ``__path__`` is not empty, it must produce strings when iterated over. More details on the semantics of ``__path__`` are given :ref:`below <package-path-rules>`. - * The ``__loader__`` attribute must be set to the loader object that loaded - the module. This is mostly for introspection and reloading, but can be - used for additional loader-specific functionality, for example getting - data associated with a loader. If the attribute is missing or set to ``None`` - then the import machinery will automatically set it **after** the module has - been imported. +.. attribute:: __file__ +.. attribute:: __cached__ - * The module's ``__package__`` attribute must be set. Its value must be a - string, but it can be the same value as its ``__name__``. If the attribute - is set to ``None`` or is missing, the import system will fill it in with a - more appropriate value **after** the module has been imported. - When the module is a package, its ``__package__`` value should be set to its - ``__name__``. When the module is not a package, ``__package__`` should be - set to the empty string for top-level modules, or for submodules, to the - parent package's name. See :pep:`366` for further details. - - This attribute is used instead of ``__name__`` to calculate explicit - relative imports for main modules, as defined in :pep:`366`. - - * If the module is a Python module (as opposed to a built-in module or a - dynamically loaded extension), the loader should execute the module's code - in the module's global name space (``module.__dict__``). - - -Module reprs ------------- - -By default, all modules have a usable repr, however depending on the -attributes set above, and hooks in the loader, you can more explicitly control -the repr of module objects. 
- -Loaders may implement a :meth:`module_repr()` method which takes a single -argument, the module object. When ``repr(module)`` is called for a module -with a loader supporting this protocol, whatever is returned from -``module.__loader__.module_repr(module)`` is returned as the module's repr -without further processing. This return value must be a string. - -If the module has no ``__loader__`` attribute, or the loader has no -:meth:`module_repr()` method, then the module object implementation itself -will craft a default repr using whatever information is available. It will -try to use the ``module.__name__``, ``module.__file__``, and -``module.__loader__`` as input into the repr, with defaults for whatever -information is missing. - -Here are the exact rules used: - - * If the module has a ``__loader__`` and that loader has a - :meth:`module_repr()` method, call it with a single argument, which is the - module object. The value returned is used as the module's repr. - - * If an exception occurs in :meth:`module_repr()`, the exception is caught - and discarded, and the calculation of the module's repr continues as if - :meth:`module_repr()` did not exist. - - * If the module has a ``__file__`` attribute, this is used as part of the - module's repr. - - * If the module has no ``__file__`` but does have a ``__loader__`` that is not - ``None``, then the loader's repr is used as part of the module's repr. - - * Otherwise, just use the module's ``__name__`` in the repr. - -This example, from :pep:`420` shows how a loader can craft its own module -repr:: - - class NamespaceLoader: - @classmethod - def module_repr(cls, module): - return "".format(module.__name__) - + ``__file__`` is optional. If set, this attribute's value must be a + string. The import system may opt to leave ``__file__`` unset if it + has no semantic meaning (e.g. a module loaded from a database). 
If + ``__file__`` is set, it may also be appropriate to set the + ``__cached__`` attribute which is the path to any compiled version of + the code (e.g. byte-compiled file). The file does not need to exist + to set this attribute; the path can simply point to whether the + compiled file would exist (see :pep:`3147`). .. _package-path-rules: @@ -464,9 +548,54 @@ attribute, and this was typically the way namespace packages were implemented prior to :pep:`420`. With the adoption of :pep:`420`, namespace packages no longer need to supply ``__init__.py`` files containing only ``__path__`` -manipulation code; the namespace loader automatically sets ``__path__`` +manipulation code; the import machinery automatically sets ``__path__`` correctly for the namespace package. +Module reprs +------------ + +By default, all modules have a usable repr, however depending on the +attributes set above, and in the module's spec, you can more explicitly +control the repr of module objects. + +If the module has a spec (``__spec__``), the import machinery will try +to generate a repr from it. If that fails or there is no spec, the import +system will craft a default repr using whatever information is available +on the module. It will try to use the ``module.__name__``, +``module.__file__``, and ``module.__loader__`` as input into the repr, +with defaults for whatever information is missing. + +For backward compatibility with Python 3.3, the module repr will be +generated by calling the loader's :meth:`module_repr()` method, if +defined, before trying either approach described above. However, the +method is deprecated. + +Here are the exact rules used: + + * If the module has a ``__loader__`` and that loader has a + :meth:`module_repr()` method, call it with a single argument, which is the + module object. The value returned is used as the module's repr. 
+ + * If an exception occurs in :meth:`module_repr()`, the exception is caught + and discarded, and the calculation of the module's repr continues as if + :meth:`module_repr()` did not exist. + + * If the module has a ``__spec__`` attribute, the information in the spec + is used to generate the repr. The "name", "loader", "origin", and + "has_location" attributes are consulted. + + * If the module has a ``__file__`` attribute, this is used as part of the + module's repr. + + * If the module has no ``__file__`` but does have a ``__loader__`` that is not + ``None``, then the loader's repr is used as part of the module's repr. + + * Otherwise, just use the module's ``__name__`` in the repr. + +.. versionchanged:: 3.4 + Use of loader.module_repr() has been deprecated and the module spec + is now used by the import machinery to generate a module repr. + The Path Based Finder ===================== @@ -531,7 +660,7 @@ not be limited to this. As a meta path finder, the :term:`path based finder` implements the -:meth:`find_module()` protocol previously described, however it exposes +:meth:`find_spec()` protocol previously described, however it exposes additional hooks that can be used to customize how modules are found and loaded from the :term:`import path`. @@ -553,8 +682,8 @@ The :term:`path based finder` is a :term:`meta path finder`, so the import machinery begins the :term:`import path` search by calling the path -based finder's :meth:`find_module()` method as described previously. When -the ``path`` argument to :meth:`find_module()` is given, it will be a +based finder's :meth:`find_spec()` method as described previously. When +the ``path`` argument to :meth:`find_spec()` is given, it will be a list of string paths to traverse - typically a package's ``__path__`` attribute for an import within that package. If the ``path`` argument is ``None``, this indicates a top level import and :data:`sys.path` is used. 
@@ -585,22 +714,41 @@ argument, it should raise :exc:`ImportError`. If :data:`sys.path_hooks` iteration ends with no :term:`path entry finder` -being returned, then the path based finder's :meth:`find_module()` method +being returned, then the path based finder's :meth:`find_spec()` method will store ``None`` in :data:`sys.path_importer_cache` (to indicate that there is no finder for this path entry) and return ``None``, indicating that this :term:`meta path finder` could not find the module. If a :term:`path entry finder` *is* returned by one of the :term:`path entry hook` callables on :data:`sys.path_hooks`, then the following protocol is used -to ask the finder for a module loader, which is then used to load the module. - +to ask the finder for a module spec, which is then used when loading the +module. Path entry finder protocol -------------------------- In order to support imports of modules and initialized packages and also to contribute portions to namespace packages, path entry finders must implement -the :meth:`find_loader()` method. +the :meth:`find_spec()` method. + +:meth:`find_spec()` takes one argument, the fully qualified name of the +module being imported. :meth:`find_spec()` returns a fully populated +spec for the module. This spec will always have "loader" set (with one +exception). + +To indicate to the import machinery that the spec represents a namespace +:term:`portion`, the path entry finder sets "loader" on the spec to +``None`` and "submodule_search_locations" to a list containing the +portion. + +Legacy path entry finder protocols +---------------------------------- + +Older path entry finders may implement one of two deprecated methods +instead of :meth:`find_spec()`: :meth:`find_loader()` and +:meth:`find_module()`. The methods are still respected for the sake of +backward compatibility. However, if find_spec() is implemented on the +path entry finder, the legacy methods are ignored. 
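A path entry finder that implements only ``find_spec()``, so that none of the legacy methods come into play, might look like the following sketch. It is illustrative only: the path entry string ``demo-entry://``, the module name ``demo_from_entry`` and the classes ``DemoLoader`` and ``DemoPathEntryFinder`` are hypothetical, and the optional ``target`` parameter follows the current ``importlib`` API rather than the one-argument form described in this text.

```python
import importlib
import importlib.abc
import importlib.machinery
import sys

MAGIC_ENTRY = "demo-entry://"  # hypothetical path entry, not a real directory


class DemoLoader(importlib.abc.Loader):
    def exec_module(self, module):
        # Record where the module came from; purely illustrative.
        module.SOURCE = MAGIC_ENTRY


class DemoPathEntryFinder(importlib.abc.PathEntryFinder):
    def find_spec(self, fullname, target=None):
        # Only one (hypothetical) module lives under this path entry.
        if fullname != "demo_from_entry":
            return None
        return importlib.machinery.ModuleSpec(
            fullname, DemoLoader(), origin=MAGIC_ENTRY)


def demo_hook(path_entry):
    # A path entry hook raises ImportError for entries it cannot handle.
    if path_entry != MAGIC_ENTRY:
        raise ImportError("not a demo entry")
    return DemoPathEntryFinder()


sys.path_hooks.append(demo_hook)
sys.path.append(MAGIC_ENTRY)
try:
    mod = importlib.import_module("demo_from_entry")
finally:
    # Undo the registration so later imports are unaffected.
    sys.path.remove(MAGIC_ENTRY)
    sys.path_hooks.remove(demo_hook)
    sys.path_importer_cache.pop(MAGIC_ENTRY, None)
```

The hook is appended to :data:`sys.path_hooks`, so the standard hooks see the entry first and are expected to reject it because it is neither a directory nor a zip archive.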
:meth:`find_loader()` takes one argument, the fully qualified name of the module being imported. :meth:`find_loader()` returns a 2-tuple where the @@ -626,10 +774,9 @@ The :meth:`find_module()` method on path entry finders is deprecated, as it does not allow the path entry finder to contribute portions to -namespace packages. Instead path entry finders should implement the -:meth:`find_loader()` method as described above. If it exists on the path -entry finder, the import system will always call :meth:`find_loader()` -in preference to :meth:`find_module()`. +namespace packages. If both :meth:`find_loader()` and :meth:`find_module()` +exist on a path entry finder, the import system will always call +:meth:`find_loader()` in preference to :meth:`find_module()`. Replacing the standard import system @@ -648,7 +795,7 @@ To selectively prevent import of some modules from a hook early on the meta path (rather than disabling the standard import system entirely), it is sufficient to raise :exc:`ImportError` directly from -:meth:`find_module` instead of returning ``None``. The latter indicates +:meth:`find_spec` instead of returning ``None``. The latter indicates that the meta path search should continue. while raising an exception terminates it immediately. @@ -690,6 +837,11 @@ :pep:`338` defines executing modules as scripts. +:pep:`451` adds the encapsulation of per-module import state in spec +objects. It also off-loads most of the boilerplate responsibilities of +loaders back onto the import machinery. These changes allow the +deprecation of several APIs in the import system and also the addition of new +methods to finders and loaders. .. rubric:: Footnotes