
classification
Title: from x import * behavior inconsistent between module types.
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.9
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: brett.cannon, eric.smith, kaorihinata
Priority: normal Keywords:

Created on 2021-03-12 00:40 by kaorihinata, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description
example.txz kaorihinata, 2021-03-12 00:40 An example of the mentioned behaviors.
Messages (10)
msg388530 - (view) Author: Thomas J. Gallen (kaorihinata) Date: 2021-03-12 00:40
I'm looking for clarification as to how `from x import *` should operate when importing file/directory-based modules versus when importing a sub-module from within a directory-based module.

While looking into a somewhat related issue with pylint, I noticed that `from x import *` appears to behave inconsistently when called from within a directory-based module on a sub-module. Whereas normally `from x import *` intentionally does not cause `x` to be added to the current namespace, when called within a directory-based module to import from a sub-module (so, `from .y import *` in an `__init__.py`, for example), the sub-module (let's say, `y`) *does* end up getting added to the importing namespace. From what I can tell, this should not be happening. If this oddity has been documented somewhere, I may have just missed it, so please let me know if it has been.

This inconsistency is actually setting off pylint (and confusing its AST handling code) when you use the full path to reference any member of the `asyncio.subprocess` submodule (for example, `asyncio.subprocess.Process`) because, according to `asyncio`'s `__init__.py` file, no explicit import of the `subprocess` sub-module ever occurs, and yet you can draw the entire path all the way to it, and its members. I've attached a generic example of the different behaviors (tested with Python 3.9) using simple modules, including a demonstration of the sub-module import.

Thomas
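
As a rough illustration of the `asyncio` case described above (a minimal sketch, separate from the attached example.txz, and assuming a CPython build where `asyncio.subprocess` is available), the following checks that the sub-module is reachable as an attribute of the package after nothing more than `import asyncio`:

```python
import asyncio
import sys

# Nothing in this script imports `asyncio.subprocess` explicitly; in Python 3.9
# the package's __init__.py only performs `from .subprocess import *`.
submodule = sys.modules.get("asyncio.subprocess")
reachable = hasattr(asyncio, "subprocess")

print("sub-module cached in sys.modules:", submodule is not None)
print("reachable as asyncio.subprocess:", reachable)
print("full path resolves to:", asyncio.subprocess.Process)
```

The point of interest is that the full path `asyncio.subprocess.Process` resolves at runtime, which is what a static tool following only the explicit imports would not expect.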
msg388646 - (view) Author: Thomas J. Gallen (kaorihinata) Date: 2021-03-14 02:05
I've spent a bit of time building (and rebuilding) Python 3.9 with a modified `Lib/importlib/_bootstrap.py`/regenerated `importlib.h` to give me some extra logging, and believe the answer I was looking for is `_find_and_load_unlocked`. `_find_and_load_unlocked` appears to load the module in question, and always attach it to the parent regardless of the contents of `fromlist` (`_find_and_load_unlocked` isn't even aware of `fromlist`.) The only real condition seems to be "is there a parent/are we in a package?". `Lib/importlib/_bootstrap.py` is pretty sparsely documented so it's not immediately obvious whether or not some other piece of `importlib` depends on this behavior. If the author is known, then they may be able to give some insight into why the decision was made, and what the best solution would be?
msg388650 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-03-14 02:49
"git blame" will help you identify the authors. It looks like there are 5 people involved: Brett, Antoine, Nick, Eric Snow, and Dino.
msg388658 - (view) Author: Thomas J. Gallen (kaorihinata) Date: 2021-03-14 04:31
Ahh, I always forget about blame.

Though the form was different, the initial commit of `importlib` (authored by Brett, so the nosy list seems fine for the moment) behaved the same way, and had an additional comment noting that the section in question was included to maintain backwards compatibility. I checked with Python 2.x and can confirm that this was how Python 2.x behaved as well (so I assume that's what the comment was for.)

I've tested simply commenting out that section (as, at a glance, I don't believe it will have any effect on explicit imports), and for the few scripts I tested with, the backtraces were actually pretty clear: a lot of places in the standard library are accidentally relying on this quirk. `collections` doesn't import `abc`, `importlib` doesn't import `machinery`, `concurrent` doesn't import `futures`, and so on.

The easy, temporary fix would be to just add the necessary imports, then worry about `importlib`'s innards when the time comes to cross that bridge. That said, I know of only a few of the modules which will need imports added (the ones above, essentially), so I can't really say what the full scale of the work will be.
msg388789 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2021-03-15 23:24
Sorry, I'm having a hard time following what you've written and I unfortunately don't have time to examine your (I assume) .tar.xz file. When you say "directory-based-module", do you mean a package (e.g. `__init__.py` in a directory)? It's important to be clear because not all imports come from a file system, and so a package versus a module has important distinctions.

Could you write out what you're seeing and what you're expecting? E.g.
```
# pkg/__init__.py
from pkg.submodule import *  # Expecting `submodule_attr`, getting ...
```

```
# pkg/submodule.py
submodule_attr = 0
```

I will also say you really shouldn't be using `import *`. It primarily exists to make it easier to work in the REPL, and otherwise is rather archaic and has very odd import semantics.

As for how Python 2 did things, that (luckily) doesn't matter anymore. :) To show that this is a change in semantics you would need to check Python 3 versions to see where it shifted. If this changed in 3.9 then turning it back may work. But if it's more like 3.4 then I'm afraid these are now the semantics and the risk of code breakage is possibly too high.
msg388793 - (view) Author: Thomas J. Gallen (kaorihinata) Date: 2021-03-16 00:28
Yes, a package. There isn't actually that much in the txz. Most of the files are ostensibly empty.

As an example, let's say we have the following files:

test.py
test_module/__init__.py
test_module/test_submodule.py

test.py contains:
```python
import test_module
print(test_module.test_submodule)
```

test_module/__init__.py
```python
from .test_submodule import *
```

test_module/test_submodule.py is completely empty.

Assuming `from x import *` acts like it does elsewhere, `dir()` before and after the import (in `test_module/__init__.py`) should return the same result (as there's nothing to import, and I haven't made an explicit import of the module itself.) In this case, `test_module.test_submodule` is being added to the parent class anyway. That's what I was referring to above regarding `_find_and_load_unlocked`.

I was looking into this, not because I want to use it (I don't), but because the *standard library* is using it in multiple places, and not performing explicit imports. In my case, this results in a slightly more complicated series of events where pylint thinks that attempting to access any member of `asyncio.subprocess` (for example, `Process`) isn't possible because `subprocess` hasn't actually been imported into `asyncio`'s module namespace. I have an issue open for that with PyCQA/pylint already.

Based on the documentation, and how `*` imports behave most of the time, pylint would appear to be correct in that `subprocess` has not been imported into `asyncio`'s module namespace, and as such should not be accessible. Above is a generic example of that behavior. I would assume that `test.py` should fail with an AttributeError, but in this case it does not.
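
The layout above can be reproduced in a single script. This is a minimal sketch, not the contents of the attached archive: it writes the two package files into a temporary directory, and the `before`/`new_names` bookkeeping in `__init__.py` is added here purely to compare `dir()` before and after the star import.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source for test_module/__init__.py. The `before` and `new_names` variables
# are illustrative additions used to diff dir() around the star import.
INIT_SOURCE = (
    "before = set(dir())\n"
    "from .test_submodule import *\n"
    "new_names = set(dir()) - before - {'before'}\n"
)

with tempfile.TemporaryDirectory() as root:
    pkg_dir = Path(root) / "test_module"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text(INIT_SOURCE)
    (pkg_dir / "test_submodule.py").write_text("")  # completely empty

    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        test_module = importlib.import_module("test_module")
    finally:
        sys.path.remove(root)

# Names that appeared in the package namespace during the star import.
print(test_module.new_names)
print(hasattr(test_module, "test_submodule"))
```

If the star import added nothing to the namespace, `new_names` would be empty. With the behavior described in this report, it contains `test_submodule` even though the sub-module defines no names at all.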
msg388798 - (view) Author: Thomas J. Gallen (kaorihinata) Date: 2021-03-16 02:25
parent module* rather. Just saw that typo.
msg388873 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2021-03-16 19:51
Thanks for the clarification! I think I understand what's going on now, and the logic is actually expected.

When you do `from .test_submodule import *`, Python must first import `test_pkg.test_submodule` in order to get you the object for the `import *` part (or frankly anything that comes after `import`). As part of importing `test_pkg.test_submodule`, it automatically gets attached to `test_pkg`, otherwise we wouldn't be able to cache the module in `sys.modules` and prevent redundant/duplicate imports.

As such, when you do `import test_pkg` in `test.py`, the fact that `test_pkg.__init__` has to import `test_pkg.test_submodule` means `test_pkg` will automatically end up with a `test_submodule` attribute. That's why your `print()` function call succeeds.

If I'm still misunderstanding, can you please use an `assert` statement that fails because the logic doesn't work the way you expect it to?
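
The side effect described in this message is not specific to `import *`. The sketch below (an illustration using a throwaway `test_pkg` written to a temporary directory, with an empty `__init__.py`) imports a single name from the sub-module and then inspects `sys.modules` and the package object:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    pkg_dir = Path(root) / "test_pkg"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "test_submodule.py").write_text("submodule_attr = 0\n")

    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        # Only `submodule_attr` is requested by name here.
        from test_pkg.test_submodule import submodule_attr
    finally:
        sys.path.remove(root)

package = sys.modules["test_pkg"]
child = sys.modules["test_pkg.test_submodule"]

# Both the package and the sub-module were imported and cached, and the
# sub-module was bound as an attribute on the package along the way.
print(submodule_attr)
print(package.test_submodule is child)
```

Any form of import that has to load `test_pkg.test_submodule` first goes through the same path, so the attribute on the package appears regardless of what follows the `import` keyword.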
msg388901 - (view) Author: Thomas J. Gallen (kaorihinata) Date: 2021-03-17 00:26
Given the previous example, in test.py, replace:

```
print(test_module.test_submodule)
```

...with:

```
assert not hasattr(test_module, "test_submodule")
```

...because the issue is only the bottom half of `_find_and_load_unlocked`. Specifically, the chunk starting at line 1006:

```
    if parent:
        # Set the module as an attribute on its parent.
        parent_module = sys.modules[parent]
        child = name.rpartition('.')[2]
        try:
            setattr(parent_module, child, module)
        except AttributeError:
            msg = f"Cannot set an attribute on {parent!r} for child module {child!r}"
            _warnings.warn(msg, ImportWarning)
```

The issue with these lines is that nothing here was requested by the user, and the actions you mentioned (preventing redundant/duplicate imports) are not handled by anything in, or relating to, this code (at least, not that I've seen.) The module and all dependencies would still be loaded into `sys.modules` despite this code. If the module has already been loaded, then we'll never make it past `_find_and_load` to `_find_and_load_unlocked` anyway, as `_NEEDS_LOADING` will no longer match.

Does that make more sense?
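
To make the caching claim concrete, here is a small sketch (the package name `cache_demo_pkg` and its `child` sub-module are hypothetical, created only for this illustration). It imports the sub-module, deletes the attribute from the parent, and imports again through `importlib.import_module`:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    pkg_dir = Path(root) / "cache_demo_pkg"  # hypothetical package name
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "child.py").write_text("")

    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        first = importlib.import_module("cache_demo_pkg.child")
        package = sys.modules["cache_demo_pkg"]
        attached_after_first_import = hasattr(package, "child")

        # Remove the attribute that the first import set on the parent.
        del package.child

        # The sub-module is still in sys.modules, so this is a cache hit.
        second = importlib.import_module("cache_demo_pkg.child")
        attached_after_cached_import = hasattr(package, "child")
    finally:
        sys.path.remove(root)

print("same module object:", second is first)
print("attribute after first import:", attached_after_first_import)
print("attribute after cached import:", attached_after_cached_import)
```

The second import returns the same module object from `sys.modules` without relying on the attribute, and, as far as I can tell from `_find_and_load`, the cached path does not set the attribute again.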
msg388943 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2021-03-17 17:19
Having `test_pkg.test_submodule` be set after your import based on the sequence of imports your example executes is entirely expected and a side-effect of how import is (at least now) designed. So I disagree with the assessment "that nothing here was requested by the user", since imports have side-effects, and one of those is setting submodules as attributes on packages post-import.

Thanks for the report and all the details, Thomas, but I am still going to close this as not a bug.
History
Date User Action Args
2022-04-11 14:59:42 admin set github: 87643
2021-03-17 17:19:26 brett.cannon set status: open -> closed; resolution: not a bug; messages: + msg388943
2021-03-17 00:26:13 kaorihinata set status: closed -> open; resolution: not a bug -> (no value); messages: + msg388901
2021-03-16 19:51:34 brett.cannon set status: open -> closed; resolution: not a bug; messages: + msg388873; stage: resolved
2021-03-16 02:25:42 kaorihinata set messages: + msg388798
2021-03-16 00:28:36 kaorihinata set messages: + msg388793
2021-03-15 23:24:08 brett.cannon set messages: + msg388789
2021-03-14 04:31:33 kaorihinata set messages: + msg388658
2021-03-14 02:49:17 eric.smith set messages: + msg388650
2021-03-14 02:05:39 kaorihinata set messages: + msg388646
2021-03-13 04:00:45 eric.smith set nosy: + brett.cannon, eric.smith
2021-03-12 00:40:38 kaorihinata create