classification
Title: Three inconsistent module attributes
Type: behavior
Stage:
Components: Interpreter Core, Library (Lib)
Versions: Python 3.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, brett.cannon, eric.smith, eric.snow, maggyero, ncoghlan
Priority: normal
Keywords:

Created on 2019-07-07 12:30 by maggyero, last changed 2019-07-15 18:59 by eric.smith.

Messages (4)
msg347470 - Author: Géry (maggyero) Date: 2019-07-07 12:30
Analysis
========

The next two sections show the module attributes and corresponding spec attributes of imported modules and run modules. From them we notice the following rule (which is in accordance with this `PEP 451 section <https://www.python.org/dev/peps/pep-0451/#attributes>`_):

    If a *spec* attribute is either ``None``, ``'built-in'`` or ``'frozen'``, then the corresponding *module* attribute is not set.

However, we also notice three exceptions to this rule, which I think are unintended inconsistencies that should be corrected:

- ``module.__file__ is None`` (instead of being not set) for imported namespace packages;
- ``module.__cached__ is None`` (instead of being not set) for non-package modules run from the file system and run from standard input;
- ``module.__package__ is None`` (instead of being ``''``) for non-package modules run from the file system and run from standard input.
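
The rule can be checked mechanically. Here is a minimal sketch, assuming only the three file-related attribute pairs are compared; ``PAIRS`` and ``rule_violations`` are hypothetical names introduced for illustration, and the module is a synthetic one built with ``types.ModuleType`` rather than a real import:

```python
import types
from importlib.machinery import ModuleSpec

# (module attribute, spec attribute) pairs covered by this sketch (an assumption).
PAIRS = [
    ("__file__", "origin"),
    ("__cached__", "cached"),
    ("__path__", "submodule_search_locations"),
]
_NOT_SET = object()  # sentinel distinguishing "not set" from None


def rule_violations(module):
    """Return (attribute, value) pairs that are set although the rule says they should not be."""
    spec = getattr(module, "__spec__", None)
    if spec is None:
        return []  # no spec to compare against
    violations = []
    for module_attr, spec_attr in PAIRS:
        spec_value = getattr(spec, spec_attr, None)
        placeholder = spec_value is None or (
            isinstance(spec_value, str) and spec_value in ("built-in", "frozen")
        )
        module_value = getattr(module, module_attr, _NOT_SET)
        if placeholder and module_value is not _NOT_SET:
            violations.append((module_attr, module_value))
    return violations


# Synthetic module imitating the first exception (namespace package).
namespace_like = types.ModuleType("foobar")
namespace_like.__spec__ = ModuleSpec("foobar", None)  # origin and cached are None
namespace_like.__file__ = None
print(rule_violations(namespace_like))  # [('__file__', None)]
```

A module whose ``__file__`` is simply absent yields an empty list, which is the behavior the rule describes.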

The first exception was introduced recently (26 February 2018) by this `pull request <https://github.com/python/cpython/pull/5481>`_, which made two changes. It changed the ``module.__spec__.origin`` attribute from ``'namespace'`` to ``None``, which I agree with, as it avoids confusing non-namespace-package modules named ``namespace`` with namespace packages. It also changed the ``module.__file__`` attribute from being unset to ``None``, which I disagree with, as it introduces an inconsistency and contradicts PEP 451.

Environment: CPython 3.7, macOS 10.14.


Imported modules
================

Running the following code::

    import module

    print("MODULE")

    for attr in ["__name__", "__file__", "__cached__", "__path__", "__package__", "__loader__"]:
        print(f"{attr}:", repr(getattr(module, attr, "not set")))

    print("SPEC")

    if hasattr(module, "__spec__"):
        if module.__spec__ is None:
            print("__spec__:", repr(module.__spec__))
        else:
            for attr in ["name", "origin", "cached", "submodule_search_locations", "parent", "loader"]:
                print(f"__spec__.{attr}:", repr(getattr(module.__spec__, attr)))
    else:
        print("__spec__: not set")

where ``module`` refers to:

- a non-package module (e.g., ``pathlib``);
- a regular package (e.g., ``json``);
- a namespace package;
- a built-in module (e.g., ``time``);
- a frozen module (e.g., ``_frozen_importlib_external``)

prints the following module attributes and corresponding spec attributes of the imported ``module``:

- for a non-package module:

| MODULE
| __name__: 'pathlib'
| __file__: '/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py'
| __cached__: '/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/__pycache__/pathlib.cpython-37.pyc'
| __path__: 'not set'
| __package__: ''
| __loader__: <_frozen_importlib_external.SourceFileLoader object at 0x1018896d8>
| SPEC
| __spec__.name: 'pathlib'
| __spec__.origin: '/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py'
| __spec__.cached: '/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/__pycache__/pathlib.cpython-37.pyc'
| __spec__.submodule_search_locations: None
| __spec__.parent: ''
| __spec__.loader: <_frozen_importlib_external.SourceFileLoader object at 0x1018896d8>

- for a regular package:

| MODULE
| __name__: 'json'
| __file__: '/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py'
| __cached__: '/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__pycache__/__init__.cpython-37.pyc'
| __path__: ['/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json']
| __package__: 'json'
| __loader__: <_frozen_importlib_external.SourceFileLoader object at 0x10f9aa6d8>
| SPEC
| __spec__.name: 'json'
| __spec__.origin: '/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py'
| __spec__.cached: '/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__pycache__/__init__.cpython-37.pyc'
| __spec__.submodule_search_locations: ['/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json']
| __spec__.parent: 'json'
| __spec__.loader: <_frozen_importlib_external.SourceFileLoader object at 0x10f9aa6d8>

- for a namespace package:

| MODULE
| __name__: 'foobar'
| __file__: None
| __cached__: 'not set'
| __path__: _NamespacePath(['/Users/maggyero/foobar'])
| __package__: 'foobar'
| __loader__: <_frozen_importlib_external._NamespaceLoader object at 0x1074564a8>
| SPEC
| __spec__.name: 'foobar'
| __spec__.origin: None
| __spec__.cached: None
| __spec__.submodule_search_locations: _NamespacePath(['/Users/maggyero/foobar'])
| __spec__.parent: 'foobar'
| __spec__.loader: <_frozen_importlib_external._NamespaceLoader object at 0x1074564a8>

- for a built-in module:

| MODULE
| __name__: 'time'
| __file__: 'not set'
| __cached__: 'not set'
| __path__: 'not set'
| __package__: ''
| __loader__: <class '_frozen_importlib.BuiltinImporter'>
| SPEC
| __spec__.name: 'time'
| __spec__.origin: 'built-in'
| __spec__.cached: None
| __spec__.submodule_search_locations: None
| __spec__.parent: ''
| __spec__.loader: <class '_frozen_importlib.BuiltinImporter'>

- for a frozen module:

| MODULE
| __name__: '_frozen_importlib_external'
| __file__: 'not set'
| __cached__: 'not set'
| __path__: 'not set'
| __package__: ''
| __loader__: <class '_frozen_importlib.FrozenImporter'>
| SPEC
| __spec__.name: '_frozen_importlib_external'
| __spec__.origin: 'frozen'
| __spec__.cached: None
| __spec__.submodule_search_locations: None
| __spec__.parent: ''
| __spec__.loader: <class '_frozen_importlib.FrozenImporter'>
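
The namespace package case can be reproduced without touching the real file system layout. The following is a sketch only: the package name ``nspkg_demo`` and the helper ``namespace_package_attributes`` are hypothetical, and it assumes that an empty directory on ``sys.path`` is imported as a namespace package (which holds when no regular module or package of that name is found first). The value reported for ``__file__`` may differ between Python versions, so no output is shown:

```python
import importlib
import os
import sys
import tempfile


def namespace_package_attributes(name="nspkg_demo"):
    """Import an empty directory as a namespace package and report its attributes."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, name))  # no __init__.py: namespace package
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(name)
            return {
                "__file__": getattr(module, "__file__", "not set"),
                "__cached__": getattr(module, "__cached__", "not set"),
                "__spec__.origin": module.__spec__.origin,
                "__path__": list(module.__path__),
            }
        finally:
            # Leave the interpreter state as it was found.
            sys.path.remove(root)
            sys.modules.pop(name, None)


for key, value in namespace_package_attributes().items():
    print(f"{key}:", repr(value))
```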


Run modules
===========

Putting the following code::

    import sys

    print("MODULE")

    for attr in ["__name__", "__file__", "__cached__", "__path__", "__package__", "__loader__"]:
        print(f"{attr}:", repr(getattr(sys.modules[__name__], attr, "not set")))

    print("SPEC")

    if hasattr(sys.modules[__name__], "__spec__"):
        if sys.modules[__name__].__spec__ is None:
            print("__spec__:", repr(sys.modules[__name__].__spec__))
        else:
            for attr in ["name", "origin", "cached", "submodule_search_locations", "parent", "loader"]:
                print(f"__spec__.{attr}:", repr(getattr(sys.modules[__name__].__spec__, attr)))
    else:
        print("__spec__: not set")

in:

- a ``module.py`` file, to get a non-package module named ``module``;
- a ``__main__.py`` file in a ``module`` directory with an ``__init__.py`` file, to get a regular package named ``module``;
- a ``__main__.py`` file in a ``module`` directory without an ``__init__.py`` file, to get a namespace package named ``module``

and running the code:

- from the file system (``python3 module.py`` for a non-package module, ``python3 module/`` for a package);
- from standard input (``cat module.py | python3`` for a non-package module);
- from the module namespace (``python3 -m module``)

prints the following module attributes and corresponding spec attributes of the run ``module``:

- for a non-package module:

| $ python3 module.py
| MODULE
| __name__: '__main__'
| __file__: 'module.py'
| __cached__: None
| __path__: 'not set'
| __package__: None
| __loader__: <_frozen_importlib_external.SourceFileLoader object at 0x1051970b8>
| SPEC
| __spec__: None
|
| $ cat module.py | python3
| MODULE
| __name__: '__main__'
| __file__: '<stdin>'
| __cached__: None
| __path__: 'not set'
| __package__: None
| __loader__: <class '_frozen_importlib.BuiltinImporter'>
| SPEC
| __spec__: None
|
| $ python3 -m module
| MODULE
| __name__: '__main__'
| __file__: '/Users/maggyero/module.py'
| __cached__: '/Users/maggyero/__pycache__/module.cpython-37.pyc'
| __path__: 'not set'
| __package__: ''
| __loader__: <_frozen_importlib_external.SourceFileLoader object at 0x1056b16d8>
| SPEC
| __spec__.name: 'module'
| __spec__.origin: '/Users/maggyero/module.py'
| __spec__.cached: '/Users/maggyero/__pycache__/module.cpython-37.pyc'
| __spec__.submodule_search_locations: None
| __spec__.parent: ''
| __spec__.loader: <_frozen_importlib_external.SourceFileLoader object at 0x1056b16d8>

- for a regular package:

| $ python3 module/
| MODULE
| __name__: '__main__'
| __file__: 'module/__main__.py'
| __cached__: 'module/__pycache__/__main__.cpython-37.pyc'
| __path__: 'not set'
| __package__: ''
| __loader__: <_frozen_importlib_external.SourceFileLoader object at 0x10826a550>
| SPEC
| __spec__.name: '__main__'
| __spec__.origin: 'module/__main__.py'
| __spec__.cached: 'module/__pycache__/__main__.cpython-37.pyc'
| __spec__.submodule_search_locations: None
| __spec__.parent: ''
| __spec__.loader: <_frozen_importlib_external.SourceFileLoader object at 0x10826a550>
|
| $ python3 -m module
| MODULE
| __name__: '__main__'
| __file__: '/Users/maggyero/module/__main__.py'
| __cached__: '/Users/maggyero/module/__pycache__/__main__.cpython-37.pyc'
| __path__: 'not set'
| __package__: 'module'
| __loader__: <_frozen_importlib_external.SourceFileLoader object at 0x10832d278>
| SPEC
| __spec__.name: 'module.__main__'
| __spec__.origin: '/Users/maggyero/module/__main__.py'
| __spec__.cached: '/Users/maggyero/module/__pycache__/__main__.cpython-37.pyc'
| __spec__.submodule_search_locations: None
| __spec__.parent: 'module'
| __spec__.loader: <_frozen_importlib_external.SourceFileLoader object at 0x10832d278>

- for a namespace package:

| $ python3 module/
| MODULE
| __name__: '__main__'
| __file__: 'module/__main__.py'
| __cached__: 'module/__pycache__/__main__.cpython-37.pyc'
| __path__: 'not set'
| __package__: ''
| __loader__: <_frozen_importlib_external.SourceFileLoader object at 0x107a06518>
| SPEC
| __spec__.name: '__main__'
| __spec__.origin: 'module/__main__.py'
| __spec__.cached: 'module/__pycache__/__main__.cpython-37.pyc'
| __spec__.submodule_search_locations: None
| __spec__.parent: ''
| __spec__.loader: <_frozen_importlib_external.SourceFileLoader object at 0x107a06518>
|
| $ python3 -m module
| MODULE
| __name__: '__main__'
| __file__: '/Users/maggyero/module/__main__.py'
| __cached__: '/Users/maggyero/module/__pycache__/__main__.cpython-37.pyc'
| __path__: 'not set'
| __package__: 'module'
| __loader__: <_frozen_importlib_external.SourceFileLoader object at 0x10fb69240>
| SPEC
| __spec__.name: 'module.__main__'
| __spec__.origin: '/Users/maggyero/module/__main__.py'
| __spec__.cached: '/Users/maggyero/module/__pycache__/__main__.cpython-37.pyc'
| __spec__.submodule_search_locations: None
| __spec__.parent: 'module'
| __spec__.loader: <_frozen_importlib_external.SourceFileLoader object at 0x10fb69240>
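
The difference between running a file and running with ``-m`` can also be observed programmatically. This is a minimal sketch, assuming the interpreter is reachable through ``sys.executable`` and that the working directory is searched for ``-m`` (the default unless safe-path mode is enabled); ``SCRIPT`` and ``run_both_ways`` are hypothetical names, and only ``__name__`` and the spec name are reported:

```python
import os
import subprocess
import sys
import tempfile

# Script body: prints __name__ and the spec name (or None) of the running module.
SCRIPT = (
    "import sys\n"
    "main = sys.modules[__name__]\n"
    "spec = getattr(main, '__spec__', None)\n"
    "print(main.__name__, None if spec is None else spec.name)\n"
)


def run_both_ways():
    """Run a temporary module.py as a script and with -m; return both outputs."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "module.py"), "w") as handle:
            handle.write(SCRIPT)
        results = {}
        for label, args in [("script", ["module.py"]), ("-m", ["-m", "module"])]:
            completed = subprocess.run(
                [sys.executable, *args],
                cwd=root, capture_output=True, text=True, check=True,
            )
            results[label] = completed.stdout.strip()
    return results


print(run_both_ways())
```

Run as a script, the module has no spec; run with ``-m``, the spec carries the real module name even though ``__name__`` is ``'__main__'`` in both cases.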
msg347507 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2019-07-08 19:12
PEPs actually become historical documents once they are implemented, so could you please check what the official docs say about this to see if there is an inconsistency in the semantics?
msg347960 - Author: Géry (maggyero) Date: 2019-07-15 14:09
@Brett Cannon

> PEPs actually become historical documents once they are implemented

Actually, the values of the three module attributes (``__file__``, ``__cached__`` and ``__package__``) are inconsistent with the other values within the current implementation, not only with the values specified in PEP 451. Sorry if I did not explain this well. Let me go into detail:

For ``__file__``, if you look at the current output of the above "Imported modules" section, you have:

- __file__: None, for an imported namespace package;
- __file__: 'not set', for an imported built-in module;
- __file__: 'not set', for an imported frozen module,

which is inconsistent: it should always be 'not set' when ``__file__`` has no meaning.

For ``__cached__``, if you look at the current output of the above "Run modules" section, you have:

- __cached__: None, for a non-package module run from the file system (``python3 module.py``) or run from standard input (``cat module.py | python3``);
- __path__: 'not set', for a non-package module run from the file system (``python3 module.py``) or run from standard input (``cat module.py | python3``),

which is inconsistent: it should always be 'not set' when ``__cached__`` has no meaning, as is already the case for ``__path__`` and other module attributes.

For ``__package__``, if you look at the current output of the above "Run modules" section, you have:

- __package__: None, for a non-package module run from the file system (``python3 module.py``) or run from standard input (``cat module.py | python3``);
- __package__: '', for a non-package module run from the module namespace (``python3 -m module``) or a package run from the file system (``python3 module/``),

which is inconsistent: it should always be ``''`` when there is no parent package for ``__package__`` to refer to.
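
To make the three states discussed above explicit, here is a small sketch that classifies a module attribute as not set, ``None``, or set. The helper ``attribute_state`` is a hypothetical name, and the module is a synthetic one created with ``types.ModuleType``, not one of the modules from the outputs above:

```python
import types


def attribute_state(module, name):
    """Classify a module attribute as 'not set', 'None', or 'set'."""
    try:
        value = getattr(module, name)
    except AttributeError:
        return "not set"
    return "None" if value is None else "set"


demo = types.ModuleType("demo")
print(attribute_state(demo, "__file__"))  # not set

demo.__file__ = None
print(attribute_state(demo, "__file__"))  # None

demo.__package__ = ""
print(attribute_state(demo, "__package__"))  # set (an empty string is still a value)
```

Code that reads these attributes with ``getattr(module, name, None)`` cannot tell the first two states apart, whereas code that uses ``hasattr`` can.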
msg347988 - Author: Eric V. Smith (eric.smith) (Python committer) Date: 2019-07-15 18:59
While some of these might be inconsistent (I haven't really looked at it thoroughly yet), I think it might be problematic to change them at this point, since there's no doubt code out there that depends on the current behavior.
History
Date                 User          Action  Args
2019-07-15 18:59:48  eric.smith    set     messages: + msg347988
2019-07-15 18:57:38  brett.cannon  set     nosy: + eric.smith
2019-07-15 14:10:45  maggyero      set     nosy: + eric.snow, - eric.smith
2019-07-15 14:09:52  maggyero      set     messages: + msg347960
2019-07-08 19:12:53  brett.cannon  set     nosy: + brett.cannon; messages: + msg347507
2019-07-07 12:30:52  maggyero      create