
classification
Title: Overlapping PYTHONPATH may cause import already imported module
Type: behavior Stage:
Components: Interpreter Core Versions: Python 3.10, Python 3.9, Python 3.8, Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: aklajnert, docs@python, eric.snow
Priority: normal Keywords:

Created on 2022-02-20 12:34 by aklajnert, last changed 2022-04-11 14:59 by admin.

Messages (7)
msg413584 - (view) Author: (aklajnert) Date: 2022-02-20 12:34
I'm not 100% sure whether this is a bug or intentional behavior, but it looks like a bug to me. I couldn't find anything about it here or anywhere else.

Sample project structure:
```
.
├── main.py
└── src
    ├── __init__.py
    ├── common_object.py
    ├── user_1.py
    ├── user_2.py
    └── user_3.py
```

`__init__.py` is an empty file.


```
# src/common_object.py
OBJECT = object()
```

```
# src/user_1.py
from .common_object import OBJECT
```

```
# src/user_2.py
from src.common_object import OBJECT
```

```
# src/user_3.py
from common_object import OBJECT
```

```
# main.py
import sys

sys.path.append("src")

from src import user_1, user_2, user_3


if __name__ == '__main__':
    print(user_1.OBJECT is user_2.OBJECT) # True
    print(user_1.OBJECT is user_3.OBJECT) # False
```

Since the `src` package directory is added to `sys.path`, it is possible to import `common_object` with either `from src.common_object` or `from common_object`.
Both forms work, but importing without the `src.` prefix makes Python load the same module again instead of using the already loaded one.

If you extend `main.py` with the following code, you'll see a bit more:
```
modules = [
    module
    for name, module in sys.modules.items()
    if "common_object" in name
]
print(len(modules)) # 2
print(modules[0].__file__ == modules[1].__file__) # True
```

In the `sys.modules` dict there will be two separate modules: one named `common_object` and another named `src.common_object`.
If you compare the `__file__` value for both modules you'll see that they are the same. It seems that Python gets the module name wrong.
msg413713 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2022-02-22 16:23
When you run a Python script, the directory the script is in is automatically added to the beginning of sys.path.  This is the fundamental issue you've run into.

Basically, "src.common_object" is imported relative to the "src" that was imported relative to that automatically added sys.path entry.  However, "common_object" is a distinct module imported relative to the sys.path entry you explicitly added.

In general, adding a package's directory to sys.path is a bad idea.
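
One way to spot this situation early is to scan `sys.path` for entries that are themselves package directories. The following is only a minimal sketch: the helper name `package_dirs_on_path` is hypothetical, and the heuristic (a directory containing `__init__.py` counts as a package directory) is an assumption that will miss namespace packages.

```python
import os
import sys


def package_dirs_on_path(paths=None):
    """Return the sys.path entries that look like package directories.

    Heuristic (an assumption of this sketch): a directory that contains
    an __init__.py file is treated as a regular package directory.
    """
    paths = sys.path if paths is None else paths
    suspicious = []
    for entry in paths:
        if not isinstance(entry, str):
            continue
        # An empty string on sys.path stands for the current directory.
        directory = entry or os.getcwd()
        if os.path.isfile(os.path.join(directory, "__init__.py")):
            suspicious.append(entry)
    return suspicious
```

Called without arguments it inspects the live `sys.path`; in the sample project it would be expected to report the `"src"` entry appended by main.py.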
msg413714 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2022-02-22 16:23
Here is more detail on what happens when "from src import user_1, user_2, user_3" is executed in main.py:

1. the "src" module is imported
   a. not found in sys.modules
   b. file found on a sys.path entry (the directory main.py is in)
   c. the "src" module is created with __path__ set to the src directory
   d. the module is added to sys.modules
   e. src/__init__.py is executed
2. the "src.user_1" module is imported
   a. not found in sys.modules
   b. file found relative to src.__path__
   c. module created
   d. added to sys.modules
   e. executed
3. "from .common_object import OBJECT" is resolved to "from src.common_object import OBJECT"
4. "src.common_object" is imported (see 2)
5. src.common_object.OBJECT is created
6. "src.user_2" is imported (see 2)
7. "src.common_object" is already found in sys.modules and used
8. "src.user_3" is imported (see 2)
9. "common_object" is imported
   a. not found in sys.modules
   b. file found on a sys.path entry (the one you added in main.py)
   c. module created
   d. added to sys.modules
   e. executed
10. common_object.OBJECT is created

So the module created at (4) is different than the one at (9), even though they are imported from the same file.  Consequently, the OBJECT in each is likewise distinct.
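
The steps above can be reproduced in a single self-contained script. This is a sketch under stated assumptions: it builds a throwaway copy of the project in a temporary directory, uses the hypothetical package name `demo_pkg` instead of `src` to avoid clashing with anything already importable, and adds the project root to `sys.path` by hand to stand in for the entry Python adds automatically when running main.py.

```python
import importlib
import os
import sys
import tempfile

# Minimal copy of the sample project; "demo_pkg" is a hypothetical name.
FILES = {
    "__init__.py": "",
    "common_object.py": "OBJECT = object()\n",
    "user_1.py": "from .common_object import OBJECT\n",
    "user_3.py": "from common_object import OBJECT\n",
}


def reproduce():
    """Import one file under two module names and report what happened."""
    with tempfile.TemporaryDirectory() as root:
        pkg = os.path.join(root, "demo_pkg")
        os.makedirs(pkg)
        for name, body in FILES.items():
            with open(os.path.join(pkg, name), "w") as fh:
                fh.write(body)

        sys.path.insert(0, root)  # stands in for the script directory
        sys.path.append(pkg)      # the problematic package-directory entry
        importlib.invalidate_caches()
        try:
            user_1 = importlib.import_module("demo_pkg.user_1")
            user_3 = importlib.import_module("demo_pkg.user_3")
            qualified = sys.modules["demo_pkg.common_object"]
            bare = sys.modules["common_object"]
            return {
                "same_object": user_1.OBJECT is user_3.OBJECT,
                "same_module": qualified is bare,
                "same_file": qualified.__file__ == bare.__file__,
            }
        finally:
            # Undo the global changes so the sketch leaves no residue.
            sys.path.remove(root)
            sys.path.remove(pkg)
            for name in list(sys.modules):
                if name == "common_object" or name.startswith("demo_pkg"):
                    del sys.modules[name]


if __name__ == "__main__":
    print(reproduce())
```

The result shows two distinct module objects, and therefore two distinct `OBJECT` instances, backed by the same file.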
msg413716 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2022-02-22 16:25
I'm leaving this "pending" in case there may be some improvement we can make to the documentation to address this.
msg413772 - (view) Author: (aklajnert) Date: 2022-02-23 05:30
I agree that adding a package directory to PYTHONPATH is not a good idea; however, I've stumbled on it in the codebases of two completely independent companies. That makes me suspect this may not be such a rare case.

At my previous company we got bitten by this problem, and debugging took quite some time because the issue usually doesn't reveal itself immediately.

If a relative import resolves to the same module as the equivalent absolute import, then it seems at least inconsistent that the same file, referenced in a slightly different way, does not. Note that the absolute path to the module is exactly the same; the only thing that differs is how you reference it.

I'm happy to attempt a fix if this gets acknowledged as a bug (a little guidance would also be helpful).
msg413813 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2022-02-23 15:42
FYI, a technical solution has been discussed before: bpo-13475 and PEP 395.  However, that does not help so much if the default behavior isn't changed.  That would require a PEP but I expect it would be rejected because so many scripts already rely on the current behavior and the current behavior is useful in some cases.

PEP 395 also has a good discussion of the various pitfalls related to sys.path[0] initialization.  Furthermore, the topic is discussed in quite a few issues, such as bpo-44132 and bpo-29929.

Probably the best use of your time on this would be to improve the documentation so people will more easily avoid the problem, or at least more easily diagnose the situation when they stumble on it.  Again, PEP 395 is a good guide for this.
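
For the diagnosis side, one possible aid is a check for files that have been loaded as more than one module object. This is a rough sketch rather than an established tool: the helper name `find_duplicate_module_files` is hypothetical, and it assumes that comparing `__file__` after `os.path.realpath` is a good enough notion of "the same file".

```python
import os
import sys
from collections import defaultdict


def find_duplicate_module_files(modules=None):
    """Map each source file to the names of distinct modules loaded from it.

    Only files backing more than one module object are returned, so plain
    aliases (one module object stored under two names) are not flagged.
    """
    modules = sys.modules if modules is None else modules
    by_file = defaultdict(dict)
    for name, module in list(modules.items()):
        path = getattr(module, "__file__", None)
        if not isinstance(path, str):
            continue  # built-in, namespace, or otherwise file-less module
        # Key by object identity so aliases collapse into one entry.
        by_file[os.path.realpath(path)].setdefault(id(module), []).append(name)
    return {
        path: sorted(name for names in objects.values() for name in names)
        for path, objects in by_file.items()
        if len(objects) > 1
    }
```

Run at the end of main.py from the original report, it would be expected to list the `common_object.py` path with both `common_object` and `src.common_object`.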
msg414146 - (view) Author: (aklajnert) Date: 2022-02-27 10:45
Honestly, it seems to me that the documents that you mentioned are discussing different problems, that may be related, but are not the same as the one I've described here.

I'm not arguing - you're clearly more experienced and that's not my area of expertise, but out of curiosity - can you mention some example use cases where the behavior I've described is useful?
History
Date | User | Action | Args
2022-04-11 14:59:56 | admin | set | github: 90962
2022-02-27 10:45:32 | aklajnert | set | messages: + msg414146
2022-02-23 15:42:48 | eric.snow | set | messages: + msg413813
2022-02-23 05:30:12 | aklajnert | set | status: pending -> open; messages: + msg413772
2022-02-22 16:25:44 | eric.snow | set | status: open -> pending; nosy: + docs@python; messages: + msg413716; components: + Interpreter Core
2022-02-22 16:23:54 | eric.snow | set | messages: + msg413714
2022-02-22 16:23:46 | eric.snow | set | nosy: + eric.snow; messages: + msg413713
2022-02-20 12:52:05 | aklajnert | set | title: Overlapping PYTHONPATH may cause -> Overlapping PYTHONPATH may cause import already imported module
2022-02-20 12:34:47 | aklajnert | create |