Author Riccardo Coccioli
Recipients Riccardo Coccioli
Date 2019-03-13.23:47:08
Message-id <1552520828.65.0.116670090756.issue36284@roundup.psfhosted.org>
Content
It seems that importlib.import_module() is not thread-safe on Python 3.4 and 3.5 when the module being loaded raises an exception. I couldn't find anything in Python's documentation saying that it is not thread-safe.
The failure is intermittent and appears to occur at random.

This is the setup to reproduce the issue:

#----- FILES STRUCTURE
├── fail.py
└── test.py
#-----

#----- CONTENT OF fail.py
ACCESSIBLE = 'accessible'

import nonexistent  # raise RuntimeError('failed') is basically the same

NOT_ACCESSIBLE = 'not accessible'
#-----

#----- CONTENT OF test.py
import importlib
import concurrent.futures


def f():
    try:
        mod = importlib.import_module('fail')
        # importlib.reload(mod)  # WORKAROUND

        try:
            val = mod.NOT_ACCESSIBLE
        except AttributeError as e:
            val = str(e)

        return (mod.__name__, type(mod), mod.ACCESSIBLE, val)
    except ImportError as e:
        return str(e)


with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(f) for i in range(5)]
    for future in concurrent.futures.as_completed(futures):
        print(future.result())
#-----

Expected result:
#-----
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
#-----

Actual result:
#-----
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
('fail', <class 'module'>, 'accessible', "'module' object has no attribute 'NOT_ACCESSIBLE'")
('fail', <class 'module'>, 'accessible', "'module' object has no attribute 'NOT_ACCESSIBLE'")
#-----

In the unexpected output lines, the module has been only "partially" imported. The 'mod' variable holds a module object: accessing an attribute defined before the import that raises the exception works fine, but accessing an attribute defined after the failing import fails.
It seems that the exception was not propagated to the caller at module load time, while the module itself was executed only up to the failing import.
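
For comparison, the behaviour one would expect without any threads can be checked with a minimal sketch like the one below. The module name 'fail_demo', the helper check_failed_import() and the use of a temporary directory are assumptions made only for this illustration; they are not part of the setup described above. The sketch writes a module that fails half-way through, imports it once, and reports whether ImportError was raised and whether anything was left in sys.modules.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module name, chosen so it does not clash with a real module.
MODULE_NAME = 'fail_demo'

# Same shape as fail.py: one attribute, a failing import, another attribute.
SOURCE = (
    "ACCESSIBLE = 'accessible'\n"
    "import nonexistent_module_for_demo\n"
    "NOT_ACCESSIBLE = 'not accessible'\n"
)


def check_failed_import():
    """Import a module that fails half-way; report what is left behind."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, MODULE_NAME + '.py'), 'w') as fh:
            fh.write(SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module(MODULE_NAME)
            raised = False
        except ImportError:
            raised = True
        finally:
            sys.path.remove(tmp)
        # A failed import is expected to leave no entry in sys.modules.
        return raised, MODULE_NAME in sys.modules


raised, still_cached = check_failed_import()
print(raised, still_cached)
```

In a single thread this should report that ImportError was raised and that no module object was left cached, which is the opposite of what the half-imported results in the threaded run suggest.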

The number of half-imported modules varies between runs and with the values chosen for max_workers and range(); it can also be zero (normal behaviour), so the frequency of the issue varies as well.
Using multiprocessing.pool.ThreadPool() and apply_async() instead of concurrent.futures.ThreadPoolExecutor has the same effect.
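
A minimal sketch of that ThreadPool variant is shown here. It reuses f() from test.py unchanged; the run() wrapper and its 'workers' and 'calls' parameters are names introduced only for this illustration.

```python
import importlib
from multiprocessing.pool import ThreadPool


def f():
    # Same body as f() in test.py (without the WORKAROUND line).
    try:
        mod = importlib.import_module('fail')
        try:
            val = mod.NOT_ACCESSIBLE
        except AttributeError as e:
            val = str(e)
        return (mod.__name__, type(mod), mod.ACCESSIBLE, val)
    except ImportError as e:
        return str(e)


def run(workers=3, calls=5):
    """Submit f() 'calls' times to a pool of 'workers' threads."""
    with ThreadPool(processes=workers) as pool:
        pending = [pool.apply_async(f) for _ in range(calls)]
        # Collect the results before the pool is torn down.
        return [p.get() for p in pending]


for result in run():
    print(result)
```

Unlike as_completed(), this collects the results in submission order, which makes no difference to whether half-imported modules show up.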

I was able to reproduce the issue with the following Python versions and platforms:
- 3.4.2 and 3.5.3 on Linux Debian
- 3.4.9 and 3.5.6 on macOS High Sierra 10.13.6

To the best of my knowledge, the issue doesn't show up on:
- 3.6.7 and 3.7.2 on macOS High Sierra 10.13.6

Thanks to a colleague's suggestion I also found a hacky workaround. Uncommenting the line marked as 'WORKAROUND' in test.py forces a reload of the module. With that modification the actual result is:
#-----
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
module fail not in sys.modules
module fail not in sys.modules
#-----

While this doesn't solve the issue per se, it does raise the ImportError that the import was supposed to raise in the first place, just with a different message, allowing the code to continue its normal execution.
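
Another possible mitigation, sketched here under stated assumptions, is to serialize the import calls behind a lock so that no thread can pick up the module while another thread is still executing it. The helper import_module_serialized() and the _import_lock object are hypothetical names, not part of importlib, and this sketch has not been verified against the affected 3.4 and 3.5 versions. It also only protects callers that go through the helper.

```python
import importlib
import sys
import threading

# Hypothetical process-wide lock; reentrant so that a module which itself
# calls the helper during its own import does not deadlock.
_import_lock = threading.RLock()


def import_module_serialized(name):
    """Import 'name' while holding a process-wide lock.

    Assumption: letting only one thread at a time run the import keeps
    other threads from seeing the module while it is still executing.
    """
    with _import_lock:
        try:
            return importlib.import_module(name)
        except BaseException:
            # Drop any partially initialised module left behind, then
            # let the original exception propagate to the caller.
            sys.modules.pop(name, None)
            raise
```

In test.py this would replace the importlib.import_module('fail') call inside f(); the existing except ImportError branch would then handle the failure as before.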
History
Date                 User               Action  Args
2019-03-13 23:47:08  Riccardo Coccioli  set     recipients: + Riccardo Coccioli
2019-03-13 23:47:08  Riccardo Coccioli  set     messageid: <1552520828.65.0.116670090756.issue36284@roundup.psfhosted.org>
2019-03-13 23:47:08  Riccardo Coccioli  link    issue36284 messages
2019-03-13 23:47:08  Riccardo Coccioli  create