classification
Title: "from .__init__ import ..." syntax imports a duplicate module
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.9
process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: brett.cannon, eric.snow, indygreg, ncoghlan, serhiy.storchaka
Priority: normal
Keywords:

Created on 2020-12-04 02:11 by indygreg, last changed 2020-12-08 02:12 by indygreg. This issue is now closed.

Messages (7)
msg382464 - Author: Gregory Szorc (indygreg) * Date: 2020-12-04 02:11
(Rereporting from https://github.com/indygreg/PyOxidizer/issues/317.)

$ mkdir foo
$ cat > foo/__init__.py <<EOF
> test = True
> EOF
$ cat > foo/bar.py <<EOF
> from .__init__ import test
> EOF
$ python3.9
Python 3.9.0 (default, Nov  1 2020, 22:40:00)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import foo.bar
>>> import sys
>>> sys.modules['foo']
<module 'foo' from '/home/gps/tmp/pyinit-test/foo/__init__.py'>
>>> sys.modules['foo.__init__']
<module 'foo.__init__' from '/home/gps/tmp/pyinit-test/foo/__init__.py'>

I am surprised that `from .__init__` even works, as `__init__` isn't a valid module name.

What appears to be happening is that the path-based importer doesn't recognize `__init__` as special, so it falls back to its regular file probing technique to locate a module derived from the path. It finds the `__init__.py[c]` file and imports it.

A consequence of this is that the explicit `__init__` import/module exists as a separate module object under `sys.modules`. So you can effectively have the same file imported as 2 module objects living under 2 names. This could of course result in subtle software bugs, like module-level variables not updating when you expect them to. (This could also be a feature for code relying on this behavior, of course.)

I only attempted to reproduce with 3.9. But this behavior has likely existed for years.
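
The consequence described above can be reproduced in a single script. This is a minimal sketch rather than part of the original report: the package name `dupdemo_pkg` and the helper `import_twice` are made up for illustration, and the package is written to a temporary directory so nothing on the real import path is touched.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PKG = "dupdemo_pkg"  # hypothetical package name, used only for this demo


def import_twice():
    """Create a throwaway package whose submodule imports from .__init__."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / PKG
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("test = True\n")
        (pkg_dir / "bar.py").write_text("from .__init__ import test\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module(f"{PKG}.bar")
        finally:
            sys.path.remove(tmp)
    # Both entries were created from the same __init__.py file.
    return sys.modules[PKG], sys.modules[f"{PKG}.__init__"]


pkg, dup = import_twice()
print(pkg is dup)    # False: two distinct module objects
pkg.test = False     # rebind the module-level variable on the package
print(dup.test)      # True: the duplicate never sees the update
```

The last two lines are the "module-level variables not updating" hazard: code that reads `test` through `foo.__init__` keeps seeing the value from its own copy.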
msg382468 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-12-04 07:46
Another example:

>>> import sys
>>> import xml
>>> import xml.__init__
>>> sys.modules['xml']
<module 'xml' from '/home/serhiy/py/cpython/Lib/xml/__init__.py'>
>>> sys.modules['xml.__init__']
<module 'xml.__init__' from '/home/serhiy/py/cpython/Lib/xml/__init__.py'>
>>> sys.modules['xml'] is sys.modules['xml.__init__']
False

I'm not sure we should do anything about it other than say "Don't do this."
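
For anyone who wants to check whether a running process has already hit this, one possible approach is to group `sys.modules` entries by `__file__` and report files backed by more than one distinct module object. This is only a sketch: the function name `files_loaded_twice` is hypothetical, and it assumes every entry of interest exposes a plain `__file__` attribute.

```python
import sys
from collections import defaultdict


def files_loaded_twice(modules=None):
    """Map each source file to the names of distinct modules loaded from it."""
    modules = dict(sys.modules if modules is None else modules)
    by_file = defaultdict(dict)
    for name, mod in modules.items():
        path = getattr(mod, "__file__", None)
        if path:
            # Key on object identity so plain aliases of one module
            # (the same object under two names) are not reported.
            by_file[path].setdefault(id(mod), name)
    return {
        path: sorted(objs.values())
        for path, objs in by_file.items()
        if len(objs) > 1
    }


import xml
import xml.__init__  # deliberately the discouraged form from the example above

# Restrict the scan to the xml package to keep the output short.
xml_only = {n: m for n, m in dict(sys.modules).items() if n.split(".")[0] == "xml"}
report = files_loaded_twice(xml_only)
for path, names in report.items():
    print(path, names)
```

Calling `files_loaded_twice()` with no argument scans all of `sys.modules` instead.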
msg382518 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2020-12-04 18:06
I agree with Serhiy; don't do this. The only way we could fix this would be to always set an `__init__` module for every package implicitly, but then that would break anyone who wanted to clear out a package in sys.modules, as the `__init__` reference in sys.modules becomes a dangling reference.
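
The dangling-reference concern can be illustrated with a simulation. Nothing below is actual import-system behavior: the alias entry is inserted by hand to mimic what an implicit `__init__` registration might look like, and `aliasdemo_pkg` is a made-up name.

```python
import sys
import types

# Stand-in for an imported package (hypothetical name).
pkg = types.ModuleType("aliasdemo_pkg")
pkg.__path__ = []  # an empty __path__ marks it as a package
sys.modules["aliasdemo_pkg"] = pkg

# Assumption: the import system also registered the package under
# "<package>.__init__". This line simulates that hypothetical fix.
sys.modules["aliasdemo_pkg.__init__"] = pkg

# Existing code that "clears out" the package only knows its real name.
del sys.modules["aliasdemo_pkg"]

# The alias survives and still points at the old module object.
dangling = sys.modules.get("aliasdemo_pkg.__init__")
print(dangling is pkg)  # True

# Clean up the demo entry.
sys.modules.pop("aliasdemo_pkg.__init__", None)
```

Any code that deletes only the package's own key would leave the stale entry behind in this scenario.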
msg382519 - Author: Gregory Szorc (indygreg) * Date: 2020-12-04 18:31
I worked around this in PyOxidizer by stripping a trailing `.__init__` from module names when resolving the indexed resource data. This allows the import to work since it can find the data now, but it also preserves the double module object, which isn't ideal IMO.

My preferred solution would be to either ban `__init__` in module name components or strip trailing `.__init__` from the name in `find_spec()`, effectively normalizing it away. Either of these would be backwards incompatible. Could either of these be considered for 3.10?

It's worth noting that `__init__` could potentially occur in the interior of the module name. e.g. `foo.__init__.bar`. This would require filenames like `foo/__init__/bar.py`. I wouldn't be surprised if this exists somewhere in the wild.
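
The normalization discussed in this message could look roughly like the following. It is a sketch of the idea only, not PyOxidizer's actual code or a proposed `find_spec()` patch; the helper name `strip_trailing_init` is hypothetical. It removes a single trailing `.__init__` component and leaves interior occurrences alone, matching the `foo.__init__.bar` caveat.

```python
def strip_trailing_init(fullname: str) -> str:
    """Return fullname without one trailing '.__init__' component."""
    suffix = ".__init__"
    # Require a non-empty parent so a bare "__init__" or ".__init__"
    # is returned unchanged.
    if fullname.endswith(suffix) and len(fullname) > len(suffix):
        return fullname[: -len(suffix)]
    return fullname


print(strip_trailing_init("foo.__init__"))      # foo
print(strip_trailing_init("foo.__init__.bar"))  # foo.__init__.bar
print(strip_trailing_init("foo.bar"))           # foo.bar
```

Normalizing the name only at lookup time, as in the workaround, lets the resource be found but does not merge the two `sys.modules` entries.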
msg382522 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-12-04 20:02
What is the problem? What real code imports __init__?
msg382684 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2020-12-07 21:29
You could propose your backwards-incompatible proposals on python-ideas, Greg, and see if you get any uptake.
msg382706 - Author: Gregory Szorc (indygreg) * Date: 2020-12-08 02:12
Who uses this syntax? https://github.com/search?l=Python&q=%22from+.__init__+import%22&type=Code says a lot of random code, surprisingly/sadly.

As for python-ideas, thanks for the suggestion: I may very well craft an email!
History
Date                 User              Action  Args
2020-12-08 02:12:26  indygreg          set     messages: + msg382706
2020-12-07 21:29:09  brett.cannon      set     messages: + msg382684
2020-12-04 20:02:40  serhiy.storchaka  set     messages: + msg382522
2020-12-04 18:31:08  indygreg          set     messages: + msg382519
2020-12-04 18:06:31  brett.cannon      set     status: open -> closed; resolution: wont fix; messages: + msg382518; stage: resolved
2020-12-04 07:46:13  serhiy.storchaka  set     nosy: + eric.snow, serhiy.storchaka, brett.cannon, ncoghlan; messages: + msg382468
2020-12-04 02:11:52  indygreg          create