
Author Pascal.Chambon
Recipients Pascal.Chambon
Date 2013-04-13.16:04:55
Message-id <1365869097.02.0.582677403017.issue17716@psf.upfronthosting.co.za>
In-reply-to
Content
Hello,

We have repeatedly hit a very nasty bug in our framework: tests, and even production code (served by mod_wsgi), ended up in a broken state in which imports like "from . import processing_exceptions", which were NOT part of circular imports and referred to submodules that definitely existed, raised exceptions like "ImportError: cannot import name processing_exceptions". Restarting the test run or the server fixed it, and we never found out what had happened.

I've come across several forum threads on similar issues, but only recently did I find one that gave a way to reproduce the bug:
http://stackoverflow.com/questions/12830901/why-does-import-error-change-to-cannot-import-name-on-the-second-import

Attached is a Python 2 sample (Python 3 has the same problem) that shows the bug; just run its test_import.py.

What happens here is that a package "mypkg" fails to get imported because of an exception (e.g. a temporary DB failure), but only AFTER it has successfully imported a submodule, mypkg.module_a.
Thus "mypkg.module_a" IS loaded and stays in sys.modules, but "mypkg" is removed from sys.modules (as the documentation on Python imports describes).
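
The state described above can be checked with a small script. This is only a sketch, not the attached sample: it assumes Python 3.3+, builds a throwaway "mypkg" in a temporary directory, and simulates the failure with a RuntimeError. It only inspects sys.modules after the first failed import; what a second import attempt does may differ between Python versions.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk (hypothetical layout mirroring the report):
# mypkg/__init__.py imports a submodule successfully, then raises.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "mypkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("from mypkg import module_a\n")
    f.write("raise RuntimeError('simulated temporary DB failure')\n")
with open(os.path.join(pkg_dir, "module_a.py"), "w") as f:
    f.write("VALUE = 1\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

first_error = None
try:
    import mypkg
except RuntimeError as exc:
    first_error = exc

# Only the failing module is dropped from the cache; the submodule that
# was loaded as a side effect stays behind.
package_cached = "mypkg" in sys.modules
submodule_cached = "mypkg.module_a" in sys.modules
print("mypkg cached:", package_cached)
print("mypkg.module_a cached:", submodule_cached)
```

The two printed flags show the asymmetry that the rest of this report is about: the parent package is gone from the cache while its submodule is still there.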

The next time we try, from within the same application, to import "mypkg", and we reach "from mypkg import module_a" in mypkg's __init__.py, it SEEMS that the import system checks sys.modules and, seeing "mypkg.module_a" in it, ASSUMES that mypkg is already initialized and has a name "module_a" in its global namespace. Hence the "cannot import name ..." error (e.g. "cannot import name processing_exceptions" in our framework).

Writing the import of "module_a" as an absolute or a relative import changes nothing. However, doing "import mypkg.module_a" instead solves the problem (I don't know why).

Another workaround is to clean up sys.modules at the top of mypkg/__init__.py, so that a previously failed attempt at importing the package does not get in our way.

    # at the top of "mypkg/__init__.py"
    import sys
    # drop submodules left over from an earlier, failed import of mypkg
    exceeding_modules = [k for k in sys.modules.keys() if k.startswith("mypkg.")]
    for k in exceeding_modules:
        del sys.modules[k]

Anyway, I don't know enough about Python's import internals to understand why exactly, on the second import attempt, the system tries a kind of faulty getattr(mypkg, "module_a") instead of simply returning sys.modules["mypkg.module_a"], which exists.
Could anyone help with that?
It's a very damaging issue, imo, since webserver workers can end up in a completely broken state because of it.

PS: more generally, I guess Python users lack insight into the behaviour of "from xxx import yyy", especially when yyy is both a real submodule of xxx and a variable initialized in xxx/__init__.py (it seems the real module overrides the variable), or when the __all__ list of xxx could prevent the import of a submodule of xxx by not including it.
Provided I get a better understanding of how all this works (it has changed quite a bit recently, I heard), I'd be willing to summarize it for the Python docs.
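
To make the first case in the PS concrete, here is a small sketch, assuming Python 3.3+ and a throwaway package "xxx" created in a temporary directory (the names xxx, yyy and MARKER are placeholders, not taken from any real project). In this setup, which object "from xxx import yyy" returns appears to depend on whether the submodule has already been imported explicitly.

```python
import importlib
import os
import sys
import tempfile
import types

# Throwaway package "xxx" in which "yyy" is both a variable assigned in
# __init__.py and a real submodule file.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "xxx")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("yyy = 'variable from __init__'\n")
with open(os.path.join(pkg_dir, "yyy.py"), "w") as f:
    f.write("MARKER = 'real submodule'\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

# The package already has an attribute named yyy, so the from-import
# returns it and does not load the submodule.
from xxx import yyy as first
submodule_loaded_early = "xxx.yyy" in sys.modules

# Importing the submodule explicitly rebinds the attribute xxx.yyy
# to the module object.
import xxx.yyy
from xxx import yyy as second

print(repr(first))
print("submodule loaded by the first from-import:", submodule_loaded_early)
print("second result is a module:", isinstance(second, types.ModuleType))
```

So the variable wins until something imports xxx.yyy as a module; after that the submodule replaces the variable on the package.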
History
Date                 User            Action  Args
2013-04-13 16:04:57  Pascal.Chambon  set     recipients: + Pascal.Chambon
2013-04-13 16:04:57  Pascal.Chambon  set     messageid: <1365869097.02.0.582677403017.issue17716@psf.upfronthosting.co.za>
2013-04-13 16:04:56  Pascal.Chambon  link    issue17716 messages
2013-04-13 16:04:55  Pascal.Chambon  create