Issue 20321: ImportError when a module is created after a catched ImportError

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/64520

classification

Title:	ImportError when a module is created after a catched ImportError
Type:	behavior	Stage:	test needed
Components:	Interpreter Core	Versions:	Python 3.3

process

Status:	closed	Resolution:	not a bug
Dependencies:		Superseder:
Assigned To:	brett.cannon	Nosy List:	berdario, brett.cannon, eric.snow, ezio.melotti, ncoghlan
Priority:	normal	Keywords:

Created on 2014-01-21 04:03 by berdario, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description
bbug.py	berdario, 2014-01-21 16:13
bug_with_invalidatecaches.py	berdario, 2014-01-21 16:53
issue20321.py	ezio.melotti, 2014-01-22 11:43

Messages (7)
msg208608 - (view)	Author: (berdario)	Date: 2014-01-21 04:03
This small script fails on Python 2.7, Python 3.3, and PyPy 2.0 alike (I just reduced the test case supplied to me by ezio).
msg208645 - (view)	Author: Brett Cannon (brett.cannon) *	Date: 2014-01-21 13:45
I'm not sure why you think the example code as-is should work. The first entry on sys.path is the current directory ('' or the absolute path, depending on whether you are running from the interpreter prompt or specifying a file on the command line). Stripping off sys.path[0] guarantees the example code will not work. As for why adding in '.' works on PyPy and not Python 3.3, it's because you didn't call importlib.invalidate_caches() to clear out the directory modification. Python didn't notice that the file was added because the mtime granularity for directories is larger than the time it took to have the import of it_does_not_exist fail, write the impfile.py file, and try importing again.
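The caching behavior described above can be sketched as follows. This is a minimal illustration, not the script attached to the issue: the temporary directory and the module name `impfile_demo` are hypothetical, and the directory already exists when it is added to sys.path. Whether the second import would fail without `importlib.invalidate_caches()` depends on the filesystem's mtime granularity, so the call is made unconditionally.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical directory and module names, used only for this sketch.
libdir = tempfile.mkdtemp()  # the directory exists before it goes on sys.path
sys.path.insert(0, libdir)

try:
    # The failed import makes the finder for libdir cache its directory listing.
    import it_does_not_exist
except ImportError:
    pass

# Write a new module into the directory that was just scanned.
with open(os.path.join(libdir, "impfile_demo.py"), "w") as f:
    f.write("VALUE = 42\n")

# Tell the finders to drop their cached directory listings so that the
# new file is seen regardless of mtime granularity.
importlib.invalidate_caches()

import impfile_demo

print(impfile_demo.VALUE)
```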
msg208663 - (view)	Author: (berdario)	Date: 2014-01-21 16:13
Yes, sorry... I tried to "simplify" and generalize it too much (I tried to avoid creating a new directory in the test, assuming that the same behavior would show up when only creating a new module in the current directory). I'll re-upload the correct version of the failing code, and I'll try to understand importlib.invalidate_caches() before possibly reopening the issue.
msg208666 - (view)	Author: (berdario)	Date: 2014-01-21 16:25
OK, the bug is apparently unrelated to timing and the finder caches.
msg208672 - (view)	Author: Brett Cannon (brett.cannon) *	Date: 2014-01-21 16:57
It actually is a caching issue, but not with the caches in the finder: it is the cache of finders. Because you inserted LIBDIR before it existed, import noticed it didn't exist and so put None into sys.path_importer_cache[LIBDIR] (or imp.NullImporter prior to Python 3.3). If you del sys.path_importer_cache[LIBDIR] just before trying to import impfile then it works. If you had left the directory around but cleared out its contents, then importlib.invalidate_caches() would have been needed.

As you have noticed, dynamically mucking around with import is rather delicate. There are various caches and tricks used in order to speed it up since it is such a common operation. If you are trying to just load a single file that you dynamically wrote, you can load the file directly using http://docs.python.org/3/library/importlib.html#importlib.machinery.SourceFileLoader (or if you need to support Python 2.7 as well, http://docs.python.org/2.7/library/imp.html#imp.load_module).

Do let me know if you are trying to just load a single file. I'm contemplating adding a utility function to help with that use-case in importlib.
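A minimal sketch of the cache-of-finders behavior described above, assuming CPython 3.3 or later. The directory name `libdir_demo` and the module name `impfile_demo2` are hypothetical stand-ins for LIBDIR and impfile, and the helper variables exist only to make the intermediate states visible.

```python
import os
import sys
import tempfile

# Hypothetical stand-in for LIBDIR: a path that does not exist yet.
base = tempfile.mkdtemp()
libdir = os.path.join(base, "libdir_demo")
sys.path.insert(0, libdir)

try:
    # A failing import walks sys.path and caches a finder for each entry.
    import it_does_not_exist
except ImportError:
    pass

# For the missing directory, the cached "finder" is None.
cached_entry = sys.path_importer_cache.get(libdir, "no entry")

# Now create the directory and a module inside it.
os.mkdir(libdir)
with open(os.path.join(libdir, "impfile_demo2.py"), "w") as f:
    f.write("VALUE = 'loaded'\n")

try:
    import impfile_demo2
    failed_before_fix = False
except ImportError:
    # The stale None entry makes import skip libdir entirely.
    failed_before_fix = True

# Drop the stale entry so that import looks at the directory again.
sys.path_importer_cache.pop(libdir, None)
import impfile_demo2

print(cached_entry, failed_before_fix, impfile_demo2.VALUE)
```

Adding the directory to sys.path only after it has been created avoids the stale entry in the first place.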
msg208784 - (view)	Author: Ezio Melotti (ezio.melotti) *	Date: 2014-01-22 11:43
> Because you inserted LIBDIR before it existed, import noticed it didn't
> exist and so put None into sys.path_importer_cache[LIBDIR]

I'm not familiar with sys.path_importer_cache, but what happens if instead of putting None, sys.path_importer_cache[LIBDIR] is not created at all? Will it check for LIBDIR next time an import is performed (possibly finding it if the dir has been created in the meantime)? If this check happens every time, I guess it will defeat the purpose of the cache in case of missing dirs (especially if the dir is never created -- but maybe this is an uncommon case and the optimization can be removed?).

Another option could be to recheck missing dirs in the cache in case of ImportError, and then update the cache entry and retry the import if a dir that didn't exist before is found. This would still have an overhead, but even in this case the slowdown might be acceptable.

If we can't find a compromise I guess we could add a note to the documentation, even though I'm not sure where.

FTR, in the original report the user was adding a dir to sys.path before extracting the content of a tar file (thus creating the added dir), and adding the dir after the extraction fixes the problem. I'm also attaching my version of the test case, which shows two possible "fixes" to the problem and is closer to the original report.
msg208828 - (view)	Author: Brett Cannon (brett.cannon) *	Date: 2014-01-22 16:55
So those semantics have existed as long as PEP 302 has been around, which is Python 2.3 (that PEP itself is over a decade old), so changing them now would break code. And honestly I wouldn't change it anyway. On some filesystems, stat calls are extremely costly (e.g. NFS). Even if it was only on failed imports it would still have a cost. And considering that one way to stay compatible between Python 2 and 3 is to catch ImportError and then import a module whose name changed, it would still be a costly change.

Now if you personally really want the semantics you are after, you could have a sys.meta_path importer which cleared out sys.path_importer_cache and tried the import again.

As for documentation, it's explained in the language reference: http://docs.python.org/3/reference/import.html#path-entry-finders . But otherwise there isn't another place unless someone writes a HOWTO on this, and that probably isn't a good thing, as import is something you really should be wary of mucking with.

For someone trying to import the contents of a tarfile, they would be better served by a tarfile importer than by unpacking the tarfile and then adding a path to sys.path. But someone has to write that tarfile importer first. =) Maybe some day: issue #17630.
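One possible shape for such a sys.meta_path importer is sketched below, assuming Python 3.4 or later (for `find_spec`). The class name `RetryAfterCacheClear` and the demo directory and module names are hypothetical. Because the finder is appended to the end of sys.meta_path, it runs only after every regular finder has failed, and it pays the cost of rebuilding the path entry finders on each such failure, which is exactly the overhead warned about above.

```python
import importlib.machinery
import os
import sys
import tempfile


class RetryAfterCacheClear:
    """Hypothetical last-resort finder: clear the cache of finders, then retry."""

    @classmethod
    def find_spec(cls, fullname, path=None, target=None):
        # Forget every cached path entry finder, including stale None entries.
        sys.path_importer_cache.clear()
        # Ask the standard path-based finder to search again from scratch.
        return importlib.machinery.PathFinder.find_spec(fullname, path, target)


# Appended last, so it is consulted only when the normal finders fail.
sys.meta_path.append(RetryAfterCacheClear)

# Demo: put a directory on sys.path before it exists.
base = tempfile.mkdtemp()
libdir = os.path.join(base, "libdir_demo3")
sys.path.insert(0, libdir)

try:
    import it_does_not_exist  # caches None for the missing directory
except ImportError:
    pass

os.mkdir(libdir)
with open(os.path.join(libdir, "impfile_demo3.py"), "w") as f:
    f.write("VALUE = 'retried'\n")

# No manual cache manipulation is needed: the retry finder handles it.
import impfile_demo3

print(impfile_demo3.VALUE)
```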

History
Date	User	Action	Args
2022-04-11 14:57:57	admin	set	github: 64520
2014-01-22 16:55:08	brett.cannon	set	status: open -> closed resolution: not a bug messages: + msg208828
2014-01-22 11:43:54	ezio.melotti	set	files: + issue20321.py messages: + msg208784
2014-01-21 16:57:25	brett.cannon	set	messages: + msg208672
2014-01-21 16:53:25	berdario	set	files: + bug_with_invalidatecaches.py
2014-01-21 16:25:47	berdario	set	status: closed -> open resolution: not a bug -> (no value) messages: + msg208666 versions: - 3rd party, Python 2.7
2014-01-21 16:13:30	berdario	set	files: + bbug.py messages: + msg208663
2014-01-21 16:12:01	berdario	set	files: - bbug.py
2014-01-21 13:45:38	brett.cannon	set	status: open -> closed assignee: brett.cannon resolution: not a bug messages: + msg208645
2014-01-21 04:05:26	ezio.melotti	set	nosy: + brett.cannon, ncoghlan, eric.snow type: behavior stage: test needed
2014-01-21 04:03:08	berdario	create