Title: Allow built-in module in package
Type: enhancement Stage:
Components: Build, Interpreter Core Versions: Python 3.6
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Daniel Shaulov, brett.cannon, eric.snow, ncoghlan
Priority: normal Keywords: patch

Created on 2016-03-26 20:20 by Daniel Shaulov, last changed 2016-04-01 22:14 by brett.cannon.

File name Uploaded Description Edit
builtin_package.patch Daniel Shaulov, 2016-03-26 20:20 Patch for everything except the importlib.h changes review
importlib_h.patch Daniel Shaulov, 2016-03-26 20:26 The importlib.h changes review Daniel Shaulov, 2016-04-01 21:58
Messages (4)
msg262495 - (view) Author: Daniel Shaulov (Daniel Shaulov) * Date: 2016-03-26 20:20

I was trying to build a python interpreter that has the cpp part of msgpack as a built-in module. I thought that I could just copy the msgpack folder to the Modules folder and add this 2 lines to Modules/Setup.local:

msgpack._packer msgpack/_packer.cpp
msgpack._unpacker msgpack/_unpacker.cpp

I had a few obstacles, the attached patch fixes them all.

The first - makesetup has a list of regexes to match and it has *.* after going through all known extensions to throw a "bad word" error.
I removed the check. All those things will now be assumed to be a module with a package.
Now the actual init function name is PyInit__packer and doesn't have msgpack in it, so I also added code in makesetup to use the full name as the name and only the module name for the PyInit function.

The second is that in Lib/importlib/ in BuiltinImporter.find_spec there is a specific case to ignore modules that are part of a package.
Is there a reason to forbid it? I removed that check.
There were also unit tests that checked this behavior which I deleted.
I added tests that check module in package instead of them.
Changing also changes Python/importlib.h (it is the frozen importlib), I added a separate patch with that change, to not clutter the main patch.

The third - the __name__ didn't have the package prefix.
Digging around I found a comment in PyModule_Create2 that says that the shared library loader stores the full name _Py_PackageContext before loading the module, so I did the same in _imp_create_builtin that is done in _PyImport_LoadDynamicModuleWithSpec to set _Py_PackageContext and the __name__ was fixed too.

(If anyone tries to do this with msgpack and wants to see that it works - you also need to copy the msgpack directory to the Lib directory, and in, where it catches the import error and goes to the fallback, just reraise the exception instead of letting it go to the fallback)

Do note that this does not allow for built-in packages, only build-it module in package. If we want to allow built-in packages, we will need to decide on some syntax to distinguish it in the Setup files and some way to distinguish them in PyImport_Inittab (for example - an asterix before the name of the package?)

Thanks, Daniel.
msg262529 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-03-27 17:59 is also related to this.
msg262775 - (view) Author: Daniel Shaulov (Daniel Shaulov) * Date: 2016-04-01 21:58
Hi Brett,

I don't think that the patch from that issue is relevant anymore.
I did take the test case that was proposed in that issue and I am attaching a fixed version here.

I did realize from the discussion that my patch probably doesn't work on Windows (I think the change itself is fine - It's just won't have the test module), I will try to get a working Windows environment and make the appropriate changes tomorrow.

Also, the other issue was also asking for built-in packages and not just built-in submodules. I already have a note about that in my original message.
Can we move forward as-is or do you want me to add support for built-in packages as well?
msg262777 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-04-01 22:14
Referring to the other issue was more about tying the two issues together than necessarily expecting the other patch to work.

As for supporting packages as well as submodules, I'm not sure. It seems a little odd not to support the idea, but then again if something is built-in then there really isn't a need to support the concept of a directory containing more modules. But then again since there is a difference in terms of how a module vs package looks it should probably be supported.

And if you do try and support it, there might be nothing more needed than to have the module be named pkg.__init__ and have some special handling to strip out the '__init__' part of the name and to set `__path__ = []` in importlib. Then importlib can simply try for `name` and `name.__init__` and then use that to figure out if the module should be considered a package or not.

This whole issue is unfortunately one of those things where it's uncommon enough to have to think about all edge cases and the overall impact on other users if the suggested changes are made and to not have a clear-cut answer.
Date User Action Args
2016-04-01 22:14:54brett.cannonsetmessages: + msg262777
2016-04-01 21:58:08Daniel Shaulovsetfiles: +

messages: + msg262775
2016-03-27 17:59:20brett.cannonsetmessages: + msg262529
2016-03-26 20:26:37Daniel Shaulovsetfiles: + importlib_h.patch
2016-03-26 20:20:28Daniel Shaulovcreate