This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: typing module adds objects to sys.modules that don't look like modules
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.8
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: barry, brett.cannon, eric.snow, gvanrossum, ncoghlan, ronaldoussoren
Priority: low Keywords:

Created on 2019-01-22 17:16 by eric.snow, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (10)
msg334222 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-01-22 17:16
tl;dr Should all objects in sys.modules look like module objects?

In #35791 Ronald described a problem with a "module" added to sys.modules that does not have all the attributes a module should have.  He also mentioned a similar problem with typing.io [1]:

    BTW. Typing.io is a namespace added to sys.modules by
    the typing module that also does not have __spec__, and
    causes similar problems. I have an simple workaround for
    that on my side.

I've verified the missing module attributes (using 3.8):

    >>> old = sorted(sys.modules)
    >>> import typing
    >>> new = sorted(sys.modules)
    >>> assert sorted(set(old) - set(new)) == []
    >>> sorted(set(new) - set(old))
    ['_collections', '_functools', '_heapq', '_locale',
     '_operator', '_sre', 'collections', 'collections.abc',
     'contextlib', 'copyreg', 'enum', 'functools', 'heapq',
     'itertools', 'keyword', 'operator', 're', 'reprlib',
     'sre_compile', 'sre_constants', 'sre_parse', 'types',
     'typing', 'typing.io', 'typing.re']
    >>> [name for name in vars(sys.modules['typing.io']) if name.startswith('__')]
    ['__module__', '__doc__', '__all__', '__dict__', '__weakref__']
    >>> [name for name in vars(sys.modules['typing.re']) if name.startswith('__')]
    ['__module__', '__doc__', '__all__', '__dict__', '__weakref__']

Per the language reference [2], modules should have the following attributes:

__name__
__loader__
__package__
__spec__

Modules imported from files also should have __file__ and __cached__.  (For the sake of completeness, packages also should have a __path__ attribute.)

As seen above, typing.io and typing.re don't have any of the import-related attributes.

So, should those two "modules" have all those attributes added?  I'm in favor of saying that every sys.modules entry must have all the appropriate import-related attributes (but doesn't have to be an actual module object).  Otherwise tools (e.g. importlib.reload(), Ronald's) making that (arguably valid) assumption break.  The right place for the change in the language reference is probably the "module cache" section. [3]  The actual entry for sys.modules [4] is probably fine as-is.


[1] https://bugs.python.org/issue35791#msg334212
[2] https://docs.python.org/3/reference/import.html#module-spec
[3] https://docs.python.org/3/reference/import.html#the-module-cache
[4] https://docs.python.org/3/library/sys.html#sys.modules
msg334224 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-01-22 17:39
It's been a somewhat well-known idiom for modules to replace themselves in sys.modules with an object that implements some special behaviors (e.g. dynamic loading). This "just works" and AFAIK people have been doing this for ages. Typically such classes just implement enough machinery so that "import foo; print(foo.bar)" works -- everything else (__file__, __doc__ etc.) is optional.

So I think tools should be robust when they inspect sys.modules and not crash or whine loudly when they find something that's missing a few advanced attributes.

That said, if there's something *useful* we could put in the special attributes for e.g. typing.io, I'm not against it. But I don't think this is a bug.
msg334231 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-01-22 19:51
The reason I filed #35791 is that I'm rewriting modulegraph[1] using importlib and run into some problems due to importlib.util.find_spec() failing for names that are already imported. 

Working around that in general probably will require reimplementing bits of importlib in my own code and that's something I'd prefer to avoid. The alternative appears to be messing with sys.modules when I run into this problem, which might cause other problem.

BTW. The lack of __spec__ on typing.io and typing.re is not a problem for me, I can use the machinery I already have to insert edges for C extensions to do the right thing for these modules as well. 

[1] https://modulegraph.readthedocs.io/en/latest/
msg334232 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-01-22 20:05
In response to my previous message: "Messing" with sys.modules appears to be good enough work around for me, at least in the limited testing I've done so far. The new version of modulegraph hasn't seen any testing beyond a largish set of unit tests though.
msg334238 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-01-22 20:44
OK, I don't see a bug here. The docs for sys.modules are pretty vague -- I don't believe they imply that all values in sys.modules must have every attribute of a Module object.
msg334429 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-01-27 14:50
Closing this without any changes contradicts the answer we gave Ronald on #35791 that it's expected behaviour for importlib.find_spec() to throw an exception for already loaded modules without a __spec__ attribute.

So if this stays closed, then we should reopen #35791, and treat it as a feature request to either:

1. add a "ignore_module_cache" option to bypass sys.modules; or
2. revert to searching for the original spec in cases where the sys.modules entry has no __spec__ attribute (which has the virtue of just working for cases of the "replace yourself in sys.modules" idiom)

That said, the typing pseudo submodules *can* populate their __spec__ with useful information by copying most of their attributes from  `typing.__spec__`, but setting their __spec__.loader attribute to one that throws an ImportError with a message saying to import `typing` instead of attempting to reload the submodule directly.
msg334436 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-01-27 20:20
Option 2 sounds best. I am also not against adding __spec__ but I think we
should support the idiom regardless, and I don’t consider this a bug in the
typing module — at best there’s a slight improvement.

On Sun, Jan 27, 2019 at 6:50 AM Nick Coghlan <report@bugs.python.org> wrote:

>
> Nick Coghlan <ncoghlan@gmail.com> added the comment:
>
> Closing this without any changes contradicts the answer we gave Ronald on
> #35791 that it's expected behaviour for importlib.find_spec() to throw an
> exception for already loaded modules without a __spec__ attribute.
>
> So if this stays closed, then we should reopen #35791, and treat it as a
> feature request to either:
>
> 1. add a "ignore_module_cache" option to bypass sys.modules; or
> 2. revert to searching for the original spec in cases where the
> sys.modules entry has no __spec__ attribute (which has the virtue of just
> working for cases of the "replace yourself in sys.modules" idiom)
>
> That said, the typing pseudo submodules *can* populate their __spec__ with
> useful information by copying most of their attributes from
> `typing.__spec__`, but setting their __spec__.loader attribute to one that
> throws an ImportError with a message saying to import `typing` instead of
> attempting to reload the submodule directly.
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <https://bugs.python.org/issue35806>
> _______________________________________
>
-- 
--Guido (mobile)
msg334437 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-01-27 21:49
Note that I will have to special case namespaces like typing.re anyway in my code, being a namespace that does not correspond to a "real" module whose code I can analyse.

#35791 was mostly about 3th-party code like apipkg that are "real" modules, but for some reason effectively erase there __spec__ attribute. In those cases find_spec could return valid value.

As I mentioned in the other issue I have a simple workaround for this, and that's something I'll have to keep around for a while anyway. 

I'm not convinced at this point that anything needs to be changed, as long as the documentation is clear on the requirement that modules should have a __spec__ attribute (which makes it easier to convince 3th-party authors to update their code). 

Special cases like typing.re will always be special, and adding __spec__ to them won't change a lot there (the only vaguely useful attribute in the ModuleSpec for typing.re would be "name" and "parent") and IMHO would be needless code churn.
msg334447 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-01-28 06:56
OK, I've filed #35839 to propose falling back to a regular search when the __spec__ attribute is missing, while treating "obj.__spec__ is None" as a true negative cache entry.
msg334489 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-01-28 20:00
Thanks!
History
Date User Action Args
2022-04-11 14:59:10adminsetgithub: 79987
2019-01-28 20:00:47gvanrossumsetmessages: + msg334489
2019-01-28 06:56:23ncoghlansetmessages: + msg334447
2019-01-27 21:49:49ronaldoussorensetmessages: + msg334437
2019-01-27 20:20:46gvanrossumsetmessages: + msg334436
2019-01-27 14:50:21ncoghlansetmessages: + msg334429
2019-01-22 20:44:57gvanrossumsetstatus: open -> closed
resolution: wont fix
messages: + msg334238

stage: test needed -> resolved
2019-01-22 20:05:15ronaldoussorensetmessages: + msg334232
2019-01-22 19:51:32ronaldoussorensetmessages: + msg334231
2019-01-22 17:39:23gvanrossumsetpriority: normal -> low

messages: + msg334224
2019-01-22 17:16:42eric.snowcreate