classification
Title: import crashes on circular imports in ext modules
Type: behavior Stage: needs patch
Components: Documentation, Interpreter Core Versions: Python 3.4, Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Arfrever, Omer.Katz, asvetlov, docs@python, eric.snow, loewis, pitrou, r.david.murray, scoder
Priority: normal Keywords: patch

Created on 2012-11-03 12:42 by scoder, last changed 2020-01-29 00:38 by brett.cannon.

Files
File name Uploaded Description Edit
cystdlib.py scoder, 2012-11-03 12:42 setup.py file to compile the stdlib with Cython (post-0.17 github master)
reimport.c scoder, 2012-11-03 20:11 single C module that triggers the import crash
cystdlibbug.py scoder, 2012-11-03 20:13 simplified setup.py script that only compiles os.py and posixpath.py
modulecreate.patch pitrou, 2012-11-03 22:40 review
Messages (21)
msg174612 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2012-11-03 12:42
After compiling the stdlib with Cython with the attached script, modules that use circular imports fail to initialise. That includes os and posixpath, as well as shutil and tarfile. Example:

$ ./python -c 'import shutil'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "tarfile.py", line 44, in init tarfile (tarfile.c:44135)
    import shutil
  File "shutil.py", line 14, in init shutil (shutil.c:22497)
    import tarfile
  File "<frozen importlib._bootstrap>", line 1556, in _find_and_load
RuntimeError: maximum recursion depth exceeded

I've tried this with the latest CPython 3.4 hg version, but I'm pretty sure it fails in Py3.3 as well.
msg174649 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-11-03 16:46
I don't mean for this to sound rude, but this seems like a Cython issue and not one for the stdlib. Can you reproduce the problem without using Cython? The new per-module locking mechanism Antoine prevents this from being a problem normally, so it makes me think Cython is at fault here.
msg174651 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-03 16:58
Well, it might be a legitimate issue, but due to the setup needed to reproduce, I would hope a Cython developer could do the diagnosis and possibly submit a patch.
msg174684 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2012-11-03 20:11
Well, it's not like the setup was all that difficult. 1) Install the latest github master of Cython (they provide one-click archives that pip can install for you), 2) change into the CPython stdlib directory and run the script I attached, 3) execute "import os" in Python. You need to install Cython rather than just unpacking it because it uses 2to3 for installation in Py3.

Anyway, I ran gdb on it and it turns out that the exception is correct, there is an infinite recursion taking place. According to the (otherwise not very interesting) stack trace at the point where it raises the RuntimeError, the module init function of the first Cython module (say, "os") calls the builtin "__import__()" to import "posixpath". That triggers the load of that shared library and the execution of its module init function. Fine so far. However, that module init function then executes an import of "os" through "__import__()", which then runs the module init function of the "os" module again. Bug right here. It shouldn't try to reimport a module that it is already importing.

I could reduce the test case down to one line:

  # reimport.py
  import reimport

Compiling that with Cython gives the C code I attached. Build it, import it, see it fail. However, remember that fixing only this isn't enough, the import cycle might be nested arbitrarily deep.
msg174696 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2012-11-03 21:11
Since it's quite possible that this has nothing to do with the frozen part of the importlib specifically, I'm removing that bit from the ticket title.
msg174704 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-03 21:50
Are you sure this worked before 3.3?

Regardless, it seems you could fix this issue by adding a single line just after the PyModule_Create() call:

  __pyx_m = PyModule_Create(&__pyx_moduledef);
  #endif
  if (PyDict_SetItemString(PyImport_GetModuleDict(),
                           __Pyx_NAMESTR("reimport"), __pyx_m))
    goto __pyx_L1_error;


The fundamental issue is that an extension is inserted into sys.modules 
only after its initialization function returns, so importing it recursively won't detect that it already exists.

(by contrast, a Python module is inserted into sys.modules before its code is executed inside the module's global namespace)
msg174706 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-03 21:53
A simple test seems to confirm the problem already existed in 3.2.
msg174709 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-03 22:01
Note that Py_InitModule4 in Python 2 did add the module object to sys.modules (by calling PyImport_AddModule) before returning, but PyModule_Create in Python 3 doesn't.

(PEP 3121 doesn't seem to mention PyModule_Create at all, strangely; it seems the PEP doesn't follow the implementation)
msg174713 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2012-11-03 22:18
Hmm, we already do that for packages (i.e. when compiling __init__.py). Looks like this just needs to be done for all modules in Py3. And unless there is a compelling reason for Py_InitModule4() not to do it, I think it should be reverted to its Py2 behaviour.
msg174717 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-03 22:40
I have detected a compatibility issue when reverting to the 2.x behaviour: extension modules which lie about their name in their PyModuleDef are broken. There are two such modules in 3.3: _io and _decimal.

Patch attached, anyway.
msg174719 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2012-11-03 22:49
Now that you mention it - wouldn't that suffer from not actually knowing the FQMN? There's that hack in the dynlib loader that stores the package context in a global variable so that the module creation function can figure it out. That might be a reason not to register the module automatically. Only the loader would know the correct package, not the module creation code.

Also see issue 13429.
msg174734 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-04 01:18
> Now that you mention it - wouldn't that suffer from not actually
> knowing the FQMN? There's that hack in the dynlib loader that stores
> the package context in a global variable so that the module creation
> function can figure it out. That might be a reason not to register the 
> module automatically.

That's possible (although I would expect a module to know in which package it's supposed to live).
Another solution would have been to pass the module object to the init function, instead of letting the init function create it, but it's a bit too late (or too early :-)) to change the extension module init signature again.
msg174764 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2012-11-04 08:30
The problem is a) that the module does not necessarily know to which place it eventually gets installed (Cython relies on the distutils Extension not lying to it, for example, which people do from time to time), and b) that the call to Py_InitModule() only receives the plain module name, not the package path. And yes, as mentioned in the other issue, passing a pointer to a context description struct into the module init function would have been the right thing to change for Py3 and still is the right thing to change for Py4.

BTW, I can confirm that registering the module in sys.modules explicitly right after creation works around this issue. Given that Cython needs to know the FQMN at compile time anyway, this works for us. It still leaves the problem open for other extension code.
msg174811 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-11-04 15:06
It sounds like Cython has its fix and CPython knows what should eventually be changed in Python 4 to bring extension module initialization closer to how Python module code is initialized.

Maybe we should add a warning in some documentation somewhere about this and then close the issue?
msg174939 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-05 20:33
> Maybe we should add a warning in some documentation somewhere about
> this and then close the issue?

I don't really know where to add it, in the C API docs perhaps,
msg174966 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-11-06 14:20
I was thinking somewhere in http://docs.python.org/3/c-api/import.html since this only comes up when you try to execute an import during extension module initialization that involves a circular import.
msg175136 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2012-11-08 07:25
Agreed. Since it doesn't really fit into any specific function documentation, I would place it right at the top. Something like this:

"""
The following functions can be used by C code to call into Python's import machinery.

Note that Python 3 does not automatically register an extension module in sys.modules on creation (see module.html#initializing-c-modules). It is only added after running through the whole module init function. This means that a request to import the current module while its init function is still running (either directly or transitively by other modules) will try to reimport the module. If you cannot be sure that this will not happen, you have to register the newly created module yourself as follows, using the fully qualified module name::

    PyObject *the_module = PyModule_Create(py_module_def);
    if (the_module == NULL) { /* failure ! */ };

    PyObject *sys_modules = PyImport_GetModuleDict();
    if (sys_modules == NULL) { /* failure ! */ };

    if (PyDict_SetItemString(modules, "the.package.and.module", the_module) < 0) { /* failure ! */ };
"""

Maybe it should add another comment that this is a major quirk in the runtime and say "sorry for being stupid here - I hope you can do better". Requiring the user to know the FQMN at build time because the import machinery fails to set it automatically is just embarrissing.

Then, after the first sentence in the module.html#initializing-c-modules section, I'd add this:

"""
Note that, starting with Python 3.0, the module creation functions no longer register the module in sys.modules. The import machinery will only do this after the module init function has run. If you need to run imports as part of your module init function and happen to know the fully qualified module name in your code, it is best to register the module yourself after creating it.
"""

I wonder if the code example shouldn't go on the "module" page.
msg175170 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-11-08 14:56
The fully qualified name requirement is definitely a design flaw where init functions should just be given the module object with everything already set, just like what @importlib.util.module_for_loader does. Hopefully we can come up with a solution through your current discussion on python-dev.

As for the comment, yes at the top of the page makes sense. Whether example code should go somewhere else I don't know.
msg242755 - (view) Author: Omer Katz (Omer.Katz) Date: 2015-05-08 11:28
Is this issue resolved in any way?
Has there been a decision made on how to resolve it?
What's the status here?
msg242761 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2015-05-08 13:38
https://www.python.org/dev/peps/pep-0489/ is trying to come up with a redesign of extension module loading and no one has submitted a patch for the documentation (although Stefan has inlined proposed wording in a comment).
msg242764 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-05-08 14:03
Yes, the resolution of this issue will be to add documentation.  Someone should turn Stefan's comment into a patch.
History
Date User Action Args
2020-01-29 00:38:30brett.cannonsetnosy: - brett.cannon
2015-05-08 14:03:49r.david.murraysetassignee: docs@python
components: + Documentation
versions: + Python 3.5, - Python 3.2, Python 3.3
nosy: + r.david.murray, docs@python

messages: + msg242764
stage: needs patch
2015-05-08 13:38:19brett.cannonsetmessages: + msg242761
2015-05-08 11:28:05Omer.Katzsetnosy: + Omer.Katz
messages: + msg242755
2013-08-23 09:13:45Arfreversetnosy: + Arfrever
2012-11-15 15:39:53asvetlovsetnosy: + asvetlov
2012-11-13 02:52:49eric.snowsetnosy: + eric.snow
2012-11-08 14:56:12brett.cannonsetmessages: + msg175170
2012-11-08 07:25:59scodersetmessages: + msg175136
2012-11-06 14:20:45brett.cannonsetmessages: + msg174966
2012-11-05 20:33:07pitrousetmessages: + msg174939
2012-11-04 15:06:52brett.cannonsetmessages: + msg174811
2012-11-04 08:30:46scodersetmessages: + msg174764
2012-11-04 01:18:11pitrousetmessages: + msg174734
2012-11-03 22:49:12scodersetmessages: + msg174719
2012-11-03 22:40:51pitrousetfiles: + modulecreate.patch
keywords: + patch
messages: + msg174717
2012-11-03 22:18:45scodersetmessages: + msg174713
2012-11-03 22:01:06pitrousetnosy: + loewis

messages: + msg174709
versions: + Python 3.2
2012-11-03 21:53:42pitrousetmessages: + msg174706
2012-11-03 21:50:42pitrousetmessages: + msg174704
2012-11-03 21:11:28scodersetmessages: + msg174696
title: frozen importlib crashes on circular imports in ext modules -> import crashes on circular imports in ext modules
2012-11-03 20:13:25scodersetfiles: + cystdlibbug.py
2012-11-03 20:11:51scodersetfiles: + reimport.c

messages: + msg174684
versions: + Python 3.3
2012-11-03 16:58:18pitrousetmessages: + msg174651
2012-11-03 16:46:25brett.cannonsetnosy: + brett.cannon, pitrou
messages: + msg174649
2012-11-03 12:42:34scodercreate