classification
Title: modulefinder no longer finds all required modules for Python itself, due to use of __import__ in sysconfig
Type: Stage:
Components: Library (Lib) Versions: Python 3.7, Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: adamwill, brett.cannon, doko
Priority: normal Keywords:

Created on 2016-12-30 02:16 by adamwill, last changed 2017-01-10 22:52 by doko.

Messages (3)
msg284304 - Author: Adam Williamson (adamwill) Date: 2016-12-30 02:16
I'm not sure if this is really considered a bug or just an unavoidable limitation, but as it involves part of the stdlib operating on Python itself, I figured it was at least worth reporting.

In Fedora we have a fairly simple little script called python-deps:

https://github.com/rhinstaller/anaconda/blob/master/dracut/python-deps

which is used to figure out the dependencies of a couple of Python scripts used in the installer's initramfs environment, so the necessary bits of Python (but not the rest of it) can be included in the installer's initramfs.

Unfortunately, with Python 3.6, this seems to be broken for the core of Python itself, because of this change:

https://github.com/python/cpython/commit/a6431f2c8cf4783c2fd522b2f6ee04c3c204237f

which changed sysconfig.py from doing "from _sysconfigdata import build_time_vars" to using __import__. I *think* that modulefinder can't cope with this use of __import__ and so misses that sysconfig requires "_sysconfigdata_m_linux_x86_64-linux-gnu" (or whatever the actual name is on your particular platform and arch).

This results in us not including the platform-specific module in the installer initramfs, so Python blows up on startup when the 'site' module tries to import the 'sysconfig' module.

We could work around this one way or another in the python-deps script, but I figured the issue was at least worth an upstream report to see if it's considered a significant issue or not.
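One possible shape for such a workaround, as a minimal sketch and not what python-deps actually does: after the normal scan, hand ModuleFinder the names of modules known to be imported dynamically via its import_hook() method. The helper name add_dynamic_imports and the module names in the demo are hypothetical, chosen only for illustration.

```python
from modulefinder import ModuleFinder


def add_dynamic_imports(finder, names):
    """Register modules that are only imported dynamically.

    Hypothetical helper: returns the names that could not be located.
    """
    missing = []
    for name in names:
        try:
            # import_hook() resolves the name on finder.path and records it
            # in finder.modules, as if an import statement had been scanned.
            finder.import_hook(name)
        except ImportError:
            missing.append(name)
    return missing


finder = ModuleFinder()
# 'colorsys' stands in for the platform-specific module; the second name is
# deliberately bogus to show how failures are reported.
missing = add_dynamic_imports(finder, ["colorsys", "no_such_module_for_demo"])
print(sorted(finder.modules))
print(missing)
```

For the sysconfig case the caller would still need to work out the platform-specific name. CPython 3.6 has a private helper, sysconfig._get_sysconfigdata_name(), that computes it, but as a private function it may change without notice.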

You can reproduce the problem quite trivially by writing a test script which just does, e.g., "import site", and then running the example code from the ModuleFinder docs on it:

from modulefinder import ModuleFinder

finder = ModuleFinder()
finder.run_script('test.py')

print('Loaded modules:')
for name, mod in finder.modules.items():
    print('%s: ' % name, end='')
    print(','.join(list(mod.globalnames.keys())[:3]))

If you examine the output, you'll see that the 'sysconfig' module is included, but the platform-specific _sysconfigdata module is not.
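The same behaviour can be shown without depending on the platform-specific module name. The sketch below writes a throwaway script that imports one module statically and another through __import__, then checks which of them ModuleFinder reports. The modules 'keyword' and 'colorsys' are arbitrary stand-ins picked because they have no imports of their own.

```python
import os
import tempfile
from modulefinder import ModuleFinder

# A tiny script with one static import and one dynamic import.
SCRIPT = (
    "import keyword\n"            # static: compiled to an import opcode
    "name = 'color' + 'sys'\n"    # name only known at run time
    "mod = __import__(name)\n"    # dynamic: just a function call
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.py")
    with open(path, "w") as fh:
        fh.write(SCRIPT)
    finder = ModuleFinder()
    finder.run_script(path)  # scans the script, does not execute it

found = set(finder.modules)
print("keyword" in found)    # True: static import is detected
print("colorsys" in found)   # False: __import__ call is not detected
```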
msg285145 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2017-01-10 19:19
The limitation is unavoidable: modulefinder infers imports by inspecting bytecode, so any module loaded through a call to __import__() or importlib.import_module() will simply not be detected. So unless sysconfig can reasonably be changed back to a statically defined import, this is just how it will be (I'll let doko comment on whether that is possible, and thus whether to close this issue).
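To illustrate the bytecode difference described above, this small sketch compiles a static import and an equivalent __import__() call and compares the opcodes. Only the import statement produces an IMPORT_NAME instruction; the call is compiled like any other function call. The helper name opnames is hypothetical.

```python
import dis


def opnames(source):
    """Return the set of opcode names in the compiled source (not executed)."""
    code = compile(source, "<demo>", "exec")
    return {ins.opname for ins in dis.get_instructions(code)}


static_ops = opnames("import colorsys")
dynamic_ops = opnames("mod = __import__('colorsys')")

print("IMPORT_NAME" in static_ops)   # True
print("IMPORT_NAME" in dynamic_ops)  # False
```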
msg285162 - Author: Matthias Klose (doko) (Python committer) Date: 2017-01-10 22:52
The idea is that we load a different _sysconfigdata module when we are cross-building packages, so we don't know the name in advance. An ugly alternative would be a big if statement with conditional imports for all known cross-build targets. I'm not sure that would be a better solution.
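As a rough sketch of why the conditional-import alternative would be visible to modulefinder: the scanner does not evaluate conditions, so a static import in any branch is recorded. The platform strings and the stand-in modules 'colorsys' and 'keyword' below are hypothetical placeholders, not real cross-build targets.

```python
import os
import tempfile
from modulefinder import ModuleFinder

# Each branch holds a static import; neither condition is ever true here.
SCRIPT = (
    "import sys\n"
    "if sys.platform == 'hypothetical-target-a':\n"
    "    import colorsys\n"
    "elif sys.platform == 'hypothetical-target-b':\n"
    "    import keyword\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "conditional_demo.py")
    with open(path, "w") as fh:
        fh.write(SCRIPT)
    finder = ModuleFinder()
    finder.run_script(path)

found = set(finder.modules)
print("colorsys" in found)  # True: found even though the branch is not taken
print("keyword" in found)   # True
```

The trade-off is that every target's module gets reported, including those that would never be loaded on the current platform.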
History
Date                 User          Action  Args
2017-01-10 22:52:29  doko          set     messages: + msg285162
2017-01-10 19:19:23  brett.cannon  set     nosy: + brett.cannon; messages: + msg285145
2016-12-30 03:38:29  ned.deily     set     nosy: + doko; versions: + Python 3.7
2016-12-30 02:16:05  adamwill      create