Title: Treat `import a.b.c as m` as `m = sys.modules['a.b.c']`
Type: enhancement Stage: patch review
Components: Interpreter Core Versions: Python 3.7
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Pavol Lisy, Victor.Varvariuc, barry, brett.cannon, eric.snow, irdb, ncoghlan, serhiy.storchaka
Priority: normal Keywords: easy (C)

Created on 2017-04-09 11:11 by Victor.Varvariuc, last changed 2017-04-23 14:51 by serhiy.storchaka.

File name Uploaded Description Edit Victor.Varvariuc, 2017-04-09 16:36
Pull Requests
URL Status Linked Edit
PR 1264 open serhiy.storchaka, 2017-04-23 14:47
Messages (16)
msg291376 - (view) Author: Victor Varvariuc (Victor.Varvariuc) Date: 2017-04-09 11:11

Hi there.

I asked a question <> on Stackoverflow:

> (Pdb) import brain.utils.mail
> (Pdb) import brain.utils.mail as mail_utils
> *** AttributeError: module 'brain.utils' has no attribute 'mail'
> I always thought that import a.b.c as m is roughly equivalent to m = sys.modules['a.b.c']. Why AttributeError? Python 3.6

I was pointed out <> that this is a somewhat weird behavior of Python:

> The statement is not quite true, as evidenced by the corner case you met, namely if the required modules already exist in sys.modules but are yet uninitialized. The import ... as requires that the module is injected in foo namespace as the attribute bar, in addition to being in sys.modules, whereas the from ... import ... as looks for in sys.modules.

Why would `import a.b.c` work when `a.b.c` is not yet fully imported, but `import a.b.c as my_c` would not? I though it would be vice versa.

Using `import a.b.c as my_c` allows avoiding a number of circular import issues. Right now I have to use `from a.b import c as my_c` as a workaround, but this doesn't look right.

The enhancement is to treat `import x.y.z as m` as `m = importlib.import_module('x.y.z')`. I don't see how this will break anything.
msg291377 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-04-09 12:11
The background here is the change in that allows IMPORT_FROM to fall back to sys.modules when written as "from a.b import c as m", while the plain LOAD_ATTR generated for "import a.b.c as m" fails.

One potential resolution to the discrepancy that occurred to me in the python-ideas discussion was to add a new IMPORT_ATTR opcode specifically for use in the latter case.

The main downside I can see is the usual one with adding any new opcode for a niche use case like this one: it risks slowing the eval loop down in general for a marginal improvement in language consistency.

However, I don't think we should reject the possibility out of hand on that basis.
msg291380 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2017-04-09 15:48
Hm, the stackoverflow point to another stackoverflow ( which is due to an import cycle. Is there also an import cycle here? Because I cannot seem repro this with just empty modules (and I'm too lazy to find out if brain is a real package).
msg291384 - (view) Author: Victor Varvariuc (Victor.Varvariuc) Date: 2017-04-09 16:36
# mytest/

import sys

import mytest.mod1
assert 'mytest.mod1' in sys.modules
import mytest.mod1 as mod  # this fails, though the prev. lines work perfectly
msg291400 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2017-04-10 00:32
So to save others the bother to check what's in the zip file, the contents of mytest/ is as follows:

import mytest.mod2 as mod

def func():
    print('mod1.func called')

There's no (so mytest is a namespace package, PEP 420) but adding an empty makes no difference. The problem occurs with both Python 2 and 3.

The root cause is the import cycle. I played around with dis and found the following (I suspect others have already found this but the thread was hard to follow for me initially):

>>> dis.dis('import a.b')
  1           0 LOAD_CONST               0 (0)
              2 LOAD_CONST               1 (None)
              4 IMPORT_NAME              0 (a.b)
              6 STORE_NAME               1 (a)
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE

compared to

>>> dis.dis('import a.b as c')
  1           0 LOAD_CONST               0 (0)
              2 LOAD_CONST               1 (None)
              4 IMPORT_NAME              0 (a.b)
              6 LOAD_ATTR                1 (b)      <-- error here
              8 STORE_NAME               2 (c)
             10 LOAD_CONST               1 (None)
             12 RETURN_VALUE

What this shows is that the implementation of "import a.b" and "import a.b as c" are different. The former calls __import__('a.b', ...) which returns the module 'a' and stores that in the variable 'a'. In the OP's case, because of the import cycle, while sys.modules['a.b'] exists, module 'a' does not yet have the attribute 'b'. That's the reason that in the latter example, the LOAD_ATTR opcode fails.

I'm inclined not to attempt to fix this. The reason that we don't pull 'a.b' out of sys.modules at this point is that if later in the execution of a/ we get an exception, the import will be cancelled, and sys.modules['a.b'] will be *deleted*, and then the variable 'c' would have a reference to a half-loaded module.

The semantics of imports in the case of cycles are somewhat complex but clearly defined and there are only a few rules to consider, and from these rules it is possible to reason out whether any particular case is valid or not. I would prefer to keep it this way rather than add more special cases. There's a good reason why, *in general* (regardless of import cycles), "import a.b as c" is implemented as a getattr operation on a, not as an index operation on sys.modules (it is possible for module a to override its attribute b without updating sys.modules) and I'd rather keep those semantics than give them up for this particular edge case. Cyclic imports are hard. If they don't work for you, avoid them.
msg291411 - (view) Author: Victor Varvariuc (Victor.Varvariuc) Date: 2017-04-10 05:29
> I'm inclined not to attempt to fix this. The reason that we don't pull 'a.b' out of sys.modules at this point is that if later in the execution of a/ we get an exception, the import will be cancelled, and sys.modules['a.b'] will be *deleted*, and then the variable 'c' would have a reference to a half-loaded module.

I think this is solvable -- track and delete the references too.

It's your decision in the end, but I think it's not good to make the language inconsistent because of technical limitations -- the interface should be the primary source for decisions.
msg291414 - (view) Author: Victor Varvariuc (Victor.Varvariuc) Date: 2017-04-10 05:58
> it is possible for module a to override its attribute b without updating sys.modules

This sounds like a really strange application.

Even if someone overrides the attribute, when you do in other place `import a.b.c as m` you expect `a.b.c` to be path to a module, otherwise it would be hard to debug. 

The docs say `import_stmt     ::=  "import" module ["as" name]` meaning everywhere that `module` is a path to a module.

> To begin the search, Python needs the fully qualified name of the module (or package, but for the purposes of this discussion, the difference is immaterial) being imported. This name may come from various arguments to the import statement, or from the parameters to the importlib.import_module() or __import__() functions.

So if you treat `import a.b.c as m` as `import a; m = a.b.c` -- it go go in some cases against the docs.
msg291416 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-04-10 06:25
Right, this problem only arises with import cycles, and that's why we resisted making eager submodule resolution work *at all* for so long (issue 992389 was filed way back in 2004).

We only conceded the point (with issue 17636 being implemented for 3.5) specifically to address a barrier to adoption for explicit relative imports, as it turned out that "from . import bar" could fail in cases where "import" previously worked.

The best explanation I could find for that rationale in the related python-dev thread is PJE's post here:

What Victor's python-ideas thread pointed out is that there are actually *3* flavours of import where this particular circular reference problem can come up:

    # Has worked as long as Python has had packages,
    # as long as you only lazily resolve in
    # function and method implementations

    # Has worked since 3.5 due to the IMPORT_FROM
    # change that falls back to a sys.modules lookup
    from foo import bar

    # Still gives AttributeError since it
    # eagerly resolves the attribute lookup
    import as bar

While I think the architectural case for allowing this kind of circular dependency between different top level namespaces is *much* weaker than that for allowing it within packages, I do think there's a reasonable consistency argument to be made in favour of ensuring that `from foo import bar` and `import as bar` are functionally equivalent when `bar` is a submodule of `foo`, especially since the latter form makes it clearer to the reader that `bar` *is* a submodule, rather than any arbitrary attribute.

I don't think it's a big problem in practice (so I wouldn't spend any time on implementing it myself), but the notion of an IMPORT_ATTR opcode for the "import x.y.z as m" case that parallels IMPORT_FROM seems architecturally clean to me in a way that the proposed resolutions to issue 992389 weren't.
msg291431 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2017-04-10 17:16
OK I won't close it but I'm withdrawing from the issue. If there are devs who want to implement this or mentor some contributor into implementing it feel free, but it sounds like Nick doesn't think it's important enough to fix (i.e. not worth his time) and I feel the same.
msg291437 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-10 17:59
It is easy to implement this (just replace LOAD_ATTR with IMPORT_FROM in the compiler). The hardest part is writing tests.
msg291486 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-04-11 12:09
Huh, interesting - I'd missed that the only part of the "from a.b import c" that IMPORT_FROM implements is the LOAD_ATTR variant that falls back to sys.modules. The prior adjustment to get IMPORT_NAME to return "a.b" instead of "a" happens inside that opcode based on the fact that a non-empty from_list was passed in.

So indeed, there's no new opcode needed - as Serhiy points out, the compiler just needs to emit IMPORT_FROM instead of LOAD_ATTR for this case.
msg291926 - (view) Author: Pavol Lisy (Pavol Lisy) Date: 2017-04-20 00:41
Maybe I am wrong but don't we get another weird behavior? 

   import sys.path  # this is error now
   ModuleNotFoundError: No module named 'sys.path'; 'sys' is not a package

So we probably expect same behavior here:

   import sys.path as foo

But if we just replace LOAD_ATTR with IMPORT_FROM in the compiler then wouldn't it works like:

   from sys import path as foo

msg291947 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-20 05:02
No, because the compiler generates different arguments for the IMPORT_NAME opcode.
msg292111 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-04-22 07:52
Expanding on Serhiy's answer: the ModuleNotFoundError for "import sys.path" comes from IMPORT_NAME, not from any of the subsequent attribute lookups.

That difference is visible in the bytecode:

>>> dis("import sys.path as path")
  1           0 LOAD_CONST               0 (0)
              3 LOAD_CONST               1 (None)
              6 IMPORT_NAME              0 (sys.path)
              9 LOAD_ATTR                1 (path)
             12 STORE_NAME               1 (path)
             15 LOAD_CONST               1 (None)
             18 RETURN_VALUE

For "import sys.path as path", the given module name is "sys.path", and the "from list" entry on the stack is None. This fails before it even reaches the `LOAD_ATTR` line, since "sys.path" isn't an importable module. Changing LOAD_ATTR to IMPORT_FROM would thus have no effect on its behaviour.

>>> dis("from sys import path")
  1           0 LOAD_CONST               0 (0)
              3 LOAD_CONST               1 (('path',))
              6 IMPORT_NAME              0 (sys)
              9 IMPORT_FROM              1 (path)
             12 STORE_NAME               1 (path)
             15 POP_TOP
             16 LOAD_CONST               2 (None)
             19 RETURN_VALUE

For "from sys import path", the given module name is "sys", and the "from list" entry on the stack is a tuple containing the string "path". This works, since "sys" is importable *and* it has a "path" attribute.

Hence why Serhiy's suggestion is such an elegant fix, since the only cases where the LOAD_ATTR vs IMPORT_FROM distinction matters will be those where:

1. The submodule exists, so both "import a.b" and "from a import b" will work for the IMPORT_NAME step
2. It's a circular import, so even though "a.b" exists in sys.modules, it won't be accessible as an attribute of "a" yet

Getting the tests and any related documentation updates right will actually be harder than the code change.
msg292119 - (view) Author: Victor Varvariuc (Victor.Varvariuc) Date: 2017-04-22 13:03
How can I facilitate/contribute to get the changes suggested by Serhiy into Python?
msg292167 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-23 14:51
PR 1264 implements this idea.

Seems this is a duplicate of issue23203 which contains the same idea.
Date User Action Args
2017-04-23 14:51:38serhiy.storchakasetmessages: + msg292167
stage: needs patch -> patch review
2017-04-23 14:47:00serhiy.storchakasetpull_requests: + pull_request1377
2017-04-22 13:03:01Victor.Varvariucsetmessages: + msg292119
2017-04-22 07:52:42ncoghlansetmessages: + msg292111
2017-04-20 05:02:06serhiy.storchakasetmessages: + msg291947
2017-04-20 00:41:32Pavol Lisysetnosy: + Pavol Lisy
messages: + msg291926
2017-04-11 12:09:13ncoghlansetmessages: + msg291486
2017-04-10 17:59:21serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg291437

keywords: + easy (C)
stage: needs patch
2017-04-10 17:16:15gvanrossumsetnosy: - gvanrossum
2017-04-10 17:16:06gvanrossumsetmessages: + msg291431
2017-04-10 06:25:26ncoghlansetmessages: + msg291416
2017-04-10 05:58:03Victor.Varvariucsetmessages: + msg291414
2017-04-10 05:29:39Victor.Varvariucsetmessages: + msg291411
2017-04-10 00:32:43gvanrossumsetmessages: + msg291400
2017-04-09 16:36:16Victor.Varvariucsetfiles: +

messages: + msg291384
2017-04-09 16:13:00barrysetnosy: + barry
2017-04-09 15:48:57gvanrossumsetnosy: + gvanrossum
messages: + msg291380
2017-04-09 12:11:13ncoghlansetnosy: + brett.cannon, ncoghlan, eric.snow
messages: + msg291377
2017-04-09 11:20:44irdbsetnosy: + irdb
2017-04-09 11:11:54Victor.Varvariuccreate