classification
Title: imp.load_dynamic imports wrong module when called several times on a multi-module .so
Type: behavior
Stage: resolved
Components: Documentation
Versions: Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: Arfrever, amaury.forgeotdarc, asvetlov, brett.cannon, docs@python, eric.snow, eudoxos, ncoghlan, python-dev, r.david.murray
Priority: normal
Keywords: easy, patch

Created on 2012-10-11 11:11 by eudoxos, last changed 2012-11-29 17:54 by asvetlov. This issue is now closed.

Files
File name | Uploaded | Description
load_dynamic-test.zip | eudoxos, 2012-10-11 11:11 | Minimal code for reproducing the bug; use "make py2" or "make py3"
py2_many-modules-in-one-so_1.diff | eudoxos, 2012-11-06 15:13 | Patch against hg tip for the 2.7 branch
issue16194.diff | asvetlov, 2012-11-29 15:27 |
Messages (21)
msg172632 - (view) Author: Václav Šmilauer (eudoxos) * Date: 2012-10-11 11:11
I have several compiled modules linked into one .so file and import them using imp.load_dynamic.

Only the first module imported with load_dynamic is imported properly; all subsequent calls of load_dynamic on the same file ignore the first argument (name) and return the first module again. The init function is also called only for the first module imported by load_dynamic.

The bug is reproducible with Python 2.7.3 and 3.2.2. A test case is attached.

Here is the simplified source for 2.7, inline:

foo.c:

	#include <Python.h>  /* included first, before any standard headers */
	#include <stdio.h>

	/* Three Python 2 module init functions linked into one foo.so. */
	PyMODINIT_FUNC initfoo(void){
		(void) Py_InitModule("foo",NULL);
		printf("initfoo()\n");
	}
	PyMODINIT_FUNC initbar(void){
		(void) Py_InitModule("bar",NULL);
		printf("initbar()\n");
	}
	PyMODINIT_FUNC initbaz(void){
		(void) Py_InitModule("baz",NULL);
		printf("initbaz()\n");
	}

test.py:

	import sys,imp
	# import foo using the normal machinery
	sys.path.append('.')
	import foo
	# this is OK
	print imp.load_dynamic('bar','foo.so')
	# this imports *bar* again, but should import baz
	print imp.load_dynamic('baz','foo.so')
	# this imports *bar* again, although the module is not defined at all
	print imp.load_dynamic('nonsense','foo.so')

Compiled with

         gcc -shared -fPIC foo.c -o foo.so `pkg-config python --cflags --libs`

Running "python test.py" gives the following output:

        initfoo()
        initbar()
        <module 'bar' from 'foo.so'>
        <module 'bar' from 'foo.so'>
        <module 'bar' from 'foo.so'>

The module 'bar' is imported 3 times, although the 2nd import should import *baz* and the third import should fail ("nonsense" module does not exist).
msg172637 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-10-11 12:33
Did this actually work in a previous version of Python, and if so what version?
msg172647 - (view) Author: Václav Šmilauer (eudoxos) * Date: 2012-10-11 15:04
I tried with Python 2.4.5 and 2.5.2 in a chroot (using Ubuntu Hardy, which packaged both of them) and the result is exactly the same for both. I doubt I am able to install anything older in a sensible manner.
msg172649 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-10-11 15:10
This is an enhancement request, then.
msg172650 - (view) Author: Václav Šmilauer (eudoxos) * Date: 2012-10-11 15:12
No, it is an old bug, since the function does something other than what is documented (http://docs.python.org/library/imp.html#imp.load_dynamic) and reasonably expected: under some circumstances, imp.load_dynamic("baz","foo.so") imports the "foo" module.
msg172651 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-10-11 15:21
It's actually a documentation bug.
msg172653 - (view) Author: Václav Šmilauer (eudoxos) * Date: 2012-10-11 15:32
While I understand that this behavior went unnoticed for ages and can therefore be seen as unimportant, designating this as a documentation bug is quite absurd; perhaps the following wording would be appropriate:

.. note::

    If this function is called multiple times on the same file (in terms of inode; a symlink pointing to the same file is fine), it will return the module which was first imported via `load_dynamic` instead of the requested module, without reporting any error. The previous call to `load_dynamic` need not be in the same part of the code, but it must happen within the same interpreter instance.
msg172654 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-10-11 16:11
Before this gets out of control I want to clarify that it is not "quite absurd" to label this a documentation bug and that is the proper classification for this bug. The documentation was not clear enough for you to understand what the behavior would be, so it should be clarified. But the semantics of the function are not going to change at this point since Python 2.7 is only accepting bug fixes and imp.load_dynamic() is no longer documented as of Python 3.2 and thus not strictly considered a public API any longer.
msg172656 - (view) Author: Václav Šmilauer (eudoxos) * Date: 2012-10-11 16:19
I found the cause of the behavior (perhaps it is common knowledge, but I am new to the Python source); imp.load_dynamic calls the following functions:

     Python/import.c: imp_load_dynamic (http://hg.python.org/cpython/file/ad51ed93377c/Python/import.c#l1777)
     Python/importdl.c: _PyImport_LoadDynamicModule (http://hg.python.org/cpython/file/ad51ed93377c/Python/importdl.c#l23)
     Python/import.c: _PyImport_FindExtensionObject (http://hg.python.org/cpython/file/ad51ed93377c/Python/import.c#l525)

where the last one uses the extensions object (http://hg.python.org/cpython/file/ad51ed93377c/Python/import.c#l32), which is explained at http://hg.python.org/cpython/file/ad51ed93377c/Python/import.c#l449

       Magic for extension modules (built-in as well as dynamically
       loaded).  To prevent initializing an extension module more than
       once, we keep a static dictionary 'extensions' keyed by module name
       (for built-in modules) or by filename (for dynamically loaded
       modules), containing these modules. A copy of the module's
       dictionary is stored by calling _PyImport_FixupExtensionObject()
       immediately after the module initialization function succeeds.  A
       copy can be retrieved from there by calling
       _PyImport_FindExtensionObject().
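
The caching behavior that comment describes can be illustrated with a toy Python model. This is only a sketch and not the C code in Python/import.c: the names ToyModule, toy_load_dynamic and init_calls are made up for illustration, and the "init function" is simulated by recording the requested name.

```python
# Toy model of a filename-keyed 'extensions' cache; NOT the real C code.
# Once any module from a file is cached, the requested name is ignored.

class ToyModule:
    """Stand-in for an extension module object."""

    def __init__(self, name, filename):
        self.name = name
        self.filename = filename

    def __repr__(self):
        return "<module %r from %r>" % (self.name, self.filename)


extensions = {}   # filename -> module (dynamically loaded modules)
init_calls = []   # names whose simulated init function actually ran


def toy_load_dynamic(name, filename):
    """Return the cached module for filename, initializing on first use."""
    if filename in extensions:
        # Cache hit: 'name' is never consulted.
        return extensions[filename]
    init_calls.append(name)
    module = ToyModule(name, filename)
    extensions[filename] = module
    return module


bar = toy_load_dynamic("bar", "foo.so")
baz = toy_load_dynamic("baz", "foo.so")
nonsense = toy_load_dynamic("nonsense", "foo.so")
print(bar)
print(baz)
print(nonsense)
```

In this model all three calls return the same object and only one simulated init call is recorded, which matches the output reported in the first message.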



The fact that extensions are keyed by file name explains why opening the .so through a symlink does not return the old extension object:

     # foo.so
     # bar.so -> foo.so (works for both symlinks and hardlinks)
     imp.load_dynamic("foo","foo.so")
     imp.load_dynamic("bar","bar.so") # will return the bar module
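
Why a link sidesteps the cache can be shown without any extension module at all. The sketch below assumes a platform where os.symlink is permitted and uses a placeholder file instead of a real shared library; the variable names and the toy cache are illustrative only.

```python
import os
import tempfile

# Two paths can name the same file on disk while being different strings,
# and a filename-keyed cache only ever compares the strings.
with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "foo.so")
    alias = os.path.join(tmp, "bar.so")
    with open(real, "wb") as f:
        f.write(b"placeholder, not a real shared library")
    os.symlink(real, alias)  # may need extra privileges on Windows

    same_file = os.path.samefile(real, alias)  # same file on disk?
    cache = {real: "foo module"}               # toy cache keyed by path
    alias_is_cached = alias in cache           # lookup by the other name

print(same_file, alias_is_cached)
```

The two names refer to one file, yet the lookup under the second name misses the toy cache, so a loader keyed this way would run a fresh initialization for it.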

I will investigate whether marking the module as capable of multiple initialization could be a workaround for the issue -- since the quoted comment further says (http://hg.python.org/cpython/file/ad51ed93377c/Python/import.c#l459):

       Modules which do support multiple initialization set their m_size
       field to a non-negative number (indicating the size of the
       module-specific state). They are still recorded in the extensions
       dictionary, to avoid loading shared libraries twice.

To fix the issue, I suggest that the *extensions* dict be keyed by a (filename, modulename) tuple for dynamically loaded modules. This would avoid any ambiguity. Grepping through the code shows that the *extensions* object is only accessed from Python/import.c, so regressions should be unlikely. What do you think?
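
A rough sketch of the proposed keying, again as a toy Python model and not a patch to Python/import.c. The function toy_load_dynamic, its `exported` parameter (a stand-in for the set of init functions the file actually provides) and the error text are all hypothetical.

```python
# Toy model: cache keyed by (filename, modulename) instead of filename.

extensions = {}  # (filename, modulename) -> module stand-in


def toy_load_dynamic(name, filename, exported):
    """Look up or 'initialize' module `name` from `filename`.

    `exported` is a hypothetical set of module names for which the file
    provides an init function.
    """
    key = (filename, name)
    if key in extensions:
        return extensions[key]
    if name not in exported:
        raise ImportError("no init function for %r in %r" % (name, filename))
    module = {"name": name, "file": filename}  # stand-in for a module
    extensions[key] = module
    return module


exported = {"foo", "bar", "baz"}
bar = toy_load_dynamic("bar", "foo.so", exported)
baz = toy_load_dynamic("baz", "foo.so", exported)
bar_again = toy_load_dynamic("bar", "foo.so", exported)

try:
    toy_load_dynamic("nonsense", "foo.so", exported)
    nonsense_failed = False
except ImportError:
    nonsense_failed = True
```

With the composite key, each name gets its own cache entry, repeated requests for the same name still hit the cache, and a name without an init function fails instead of silently returning another module.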
msg172659 - (view) Author: Václav Šmilauer (eudoxos) * Date: 2012-10-11 16:27
I did not notice it was not documented in Python 3.3 anymore -- my fault, sorry.

In case there is no functional replacement for it, I will try to raise it on the mailing list. I am currently writing some code in 2.7 which relies on it, and I don't want to throw it away with a future migration to Python 3. I don't see another way of packing multiple compiled modules into one file without using symlinks, which won't work under Windows. Packing them this way saves me lots of trouble with cross-module symbol dependencies and avoids RTLD_GLOBAL, rpath and similar nasty stuff.
msg172662 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-10-11 16:45
The new functional equivalent is importlib.machinery.ExtensionFileLoader (http://docs.python.org/dev/py3k/library/importlib.html#importlib.machinery.ExtensionFileLoader), but under the hood it uses the same code as imp.load_dynamic() did.
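
For orientation, a minimal sketch of loading an extension module by explicit name and path with ExtensionFileLoader. It assumes Python 3.5 or later (for importlib.util.module_from_spec and exec_module), which is newer than the versions discussed in this thread; the helper names extension_spec and load_extension are made up here, and whether a second name from the same file loads correctly depends on the interpreter version.

```python
import importlib.machinery
import importlib.util


def extension_spec(name, path):
    """Build a module spec that loads `name` from the extension file `path`.

    Building the spec does not open or load the file.
    """
    loader = importlib.machinery.ExtensionFileLoader(name, path)
    return importlib.util.spec_from_file_location(name, path, loader=loader)


def load_extension(name, path):
    """Load and return extension module `name` from the file `path`.

    The module is not added to sys.modules; callers can do that
    themselves if they need it.
    """
    spec = extension_spec(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

A call would look like `bar = load_extension("bar", "./foo.so")`, given a foo.so that actually provides an init function for "bar".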
msg173105 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2012-10-16 22:12
"""To prevent initializing an extension module more than
   once, we keep a static dictionary 'extensions' keyed by module name
   (for built-in modules) or by filename (for dynamically loaded
   modules), containing these modules.
"""
So there can be only one module per filename.
But what if this dictionary was keyed by tuple(name, filename) instead?
msg173142 - (view) Author: Václav Šmilauer (eudoxos) * Date: 2012-10-17 08:22
Yes, that's what I suggested at the end of msg172656 - including modulename in the key.

Brett, would it be OK if I make a patch against 3.3 (or head, if you prefer) to key by (modulename,filename) for compiled modules?

I had a look at importlib.machinery.ExtensionFileLoader and it will suffer from the same issue.

Should I open a separate bug for the post-3.3 patch, and keep this as a documentation bug for 2.7?
msg173161 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-10-17 12:05
Yes, I think keeping this bug as the doc bug and opening a new one for the enhancement is the best way to go.
msg174979 - (view) Author: Václav Šmilauer (eudoxos) * Date: 2012-11-06 15:13
issue16421 was opened for py3k. Just for the sport of writing, I fixed that in Python 2.7 (tip) as well, though others seemed to defend the view that it was not a bug, hence not fixable in 2.7.
msg176652 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2012-11-29 14:32
I think it should not be fixed in 2.7, so I suggest closing the issue as wontfix.
msg176653 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-11-29 15:08
The behaviour won't change in 2.7, but the docs at http://docs.python.org/2/library/imp.html#imp.load_dynamic still need to be clarified.

e.g. add a note like:

Note: the import internals identify extension modules by filename, so doing ``foo = load_dynamic("foo", "mod.so")`` and ``bar = load_dynamic("bar", "mod.so")`` will result in both foo and bar referring to the same module, regardless of whether or not ``mod.so`` exports an ``initbar`` function. On systems which support them, symlinks can be used to import multiple modules from the same shared library, as each reference to the module will use a different file name.

(probably flagged as a CPython implementation detail, since it's really an accident of the implementation rather than a deliberately considered language feature)
msg176655 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2012-11-29 15:27
Pushed doc patch.
Nick, does it look good to you?
msg176663 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-11-29 16:56
The doc patch LGTM.
msg176668 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-11-29 17:52
New changeset 94de77bd0b4b by Andrew Svetlov in branch '2.7':
Issue #16194: document imp.load_dynamic problems
http://hg.python.org/cpython/rev/94de77bd0b4b
msg176669 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2012-11-29 17:54
The documentation is fixed; the behavior cannot be changed.
Closing the issue.
History
Date | User | Action | Args
2012-11-29 17:54:11 | asvetlov | set | status: open -> closed; resolution: fixed; messages: + msg176669; stage: needs patch -> resolved
2012-11-29 17:52:31 | python-dev | set | nosy: + python-dev; messages: + msg176668
2012-11-29 16:56:34 | brett.cannon | set | messages: + msg176663
2012-11-29 15:27:21 | asvetlov | set | files: + issue16194.diff; messages: + msg176655
2012-11-29 15:08:52 | ncoghlan | set | messages: + msg176653
2012-11-29 14:32:28 | asvetlov | set | nosy: + asvetlov; messages: + msg176652
2012-11-13 07:23:28 | eric.snow | set | nosy: + eric.snow
2012-11-06 15:13:44 | eudoxos | set | files: + py2_many-modules-in-one-so_1.diff; keywords: + patch; messages: + msg174979; versions: - Python 3.2, Python 3.3
2012-10-17 20:15:12 | Arfrever | set | nosy: + Arfrever
2012-10-17 12:05:48 | r.david.murray | set | messages: + msg173161; versions: + Python 3.2, Python 3.3
2012-10-17 08:22:46 | eudoxos | set | messages: + msg173142
2012-10-16 22:12:05 | amaury.forgeotdarc | set | nosy: + amaury.forgeotdarc; messages: + msg173105
2012-10-11 16:45:47 | brett.cannon | set | messages: + msg172662
2012-10-11 16:27:40 | eudoxos | set | messages: + msg172659
2012-10-11 16:19:24 | eudoxos | set | messages: + msg172656
2012-10-11 16:11:10 | brett.cannon | set | messages: + msg172654; components: + Documentation, - Library (Lib)
2012-10-11 15:32:19 | eudoxos | set | nosy: + ncoghlan; messages: + msg172653; components: + Library (Lib), - Documentation
2012-10-11 15:21:38 | brett.cannon | set | nosy: + docs@python; messages: + msg172651; assignee: docs@python; components: + Documentation, - Library (Lib); keywords: + easy
2012-10-11 15:12:50 | eudoxos | set | type: enhancement -> behavior; messages: + msg172650; versions: + Python 2.7, - Python 3.4
2012-10-11 15:10:02 | r.david.murray | set | versions: + Python 3.4, - Python 2.7; nosy: + r.david.murray; messages: + msg172649; type: behavior -> enhancement; stage: needs patch
2012-10-11 15:04:56 | eudoxos | set | status: pending -> open; messages: + msg172647
2012-10-11 12:33:28 | brett.cannon | set | status: open -> pending; nosy: + brett.cannon; messages: + msg172637
2012-10-11 11:11:10 | eudoxos | create |