classification
Title: make importlib documentation easier to use
Type: enhancement
Stage:
Components: Documentation
Versions: Python 3.3, Python 3.4
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: Julian, barry, brett.cannon, chris.jerdonek, docs@python, eric.snow, ncoghlan, paul.moore
Priority: normal
Keywords:

Created on 2012-09-06 01:22 by Julian, last changed 2013-03-29 03:27 by barry.

Messages (5)
msg169901 - Author: Julian Berman (Julian) * Date: 2012-09-06 01:22
I find that the importlib documentation is a bit too low level. Import hooks are not that common, so requiring a bit of reading is OK with me, but I somewhat *understand* PEP 302, so I have a general idea of *what* I want to do and what kind of objects can accomplish that, and still find it hard to figure out how to take the pieces provided by importlib and use them to do what I want.

If I provide a specific example, I want to create a path hook that "just" does the usual import behavior, but before executing the module, does some transformation on the source code, say. In trying to figure out the best way to do that I had a hard time using the docs to figure out which pieces I should assemble to do that. I'm going to just describe the rough series of steps I went through, and hopefully that will help give a picture of where in the docs I had trouble.

`importlib.abc` has a few things that would appear to help. None of the things seem like an exact match, so immediately I'm confused -- after reading PEP 302 I'd have expected to need to find an object to implement one or both of `get_code` or `get_source` on, or one that has that implemented that I can subclass and extend. The closest thing there that I find is PyPycLoader, which seems to be saying it implements the standard import behavior, but the docs say it's deprecated and to use SourceLoader. At this point, after checking out `SourceLoader` and seeing that it has me implementing two methods whose purpose I don't quite understand, even after reading the short descriptions of them, at least not in the context of what I want to do, I begin to suspect that what I really want is to combine SourceLoader with some things from the `imp` module, or maybe `importlib.__import__`, but am left wondering again how much I need to implement before I can just use that. I then notice `importlib.util.module_for_loader`, and add that to the simple loader I've written which I'm still waiting to plug `imp` into, before realizing that that decorator eats the `fullname` argument and *only* passes along the module, which confuses me, since now I don't know how to retrieve the source for the module object that I'm being passed -- so I save the path name in `__init__` for the class, and assume that's what I should be doing, despite not seeing an example doing that. Assuming that's even correct as-is, it took me quite a bit to put those pieces together.

So I apologize for rambling -- I think essentially what'd improve things is providing more examples, or perhaps a HOWTO entry, that targeted assembling the pieces provided in the module into a few clear, complete examples of finders, loaders and importers.
msg169923 - Author: Eric Snow (eric.snow) * (Python committer) Date: 2012-09-06 14:44
As far as the import system goes, Barry Warsaw added a really nice page to the language reference[1].  However, it sounds like your concern is with taking advantage of the tools that importlib provides.

First of all, a good thing to recognize is that importlib in Python 3.3 exposes a much richer public API, including exposing the "default" import machinery.  While importlib.abc is meaningful, importlib.machinery is more useful in your case.  In versions prior to 3.3, things aren't as easy.
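
To make the point about importlib.machinery concrete, here is a minimal sketch (assuming Python 3.3 or later) that rebuilds a path hook from the same loader classes and suffix lists the default file-based import uses. The variable names `default_loaders`, `hook` and `finder` are chosen for illustration only, and nothing is installed into sys.path_hooks.

```python
import importlib.machinery
import os

# Loader classes paired with the file suffixes they handle. These are the
# public pieces of the "default" path-based import machinery.
default_loaders = [
    (importlib.machinery.ExtensionFileLoader,
     importlib.machinery.EXTENSION_SUFFIXES),
    (importlib.machinery.SourceFileLoader,
     importlib.machinery.SOURCE_SUFFIXES),
    (importlib.machinery.SourcelessFileLoader,
     importlib.machinery.BYTECODE_SUFFIXES),
]

# FileFinder.path_hook returns a callable suitable for sys.path_hooks.
hook = importlib.machinery.FileFinder.path_hook(*default_loaders)

# Calling the hook with a directory yields a finder for that directory.
finder = hook(os.getcwd())
print(type(finder).__name__)
```

Swapping SourceFileLoader for a subclass in that list is one way a custom loader could be slotted into otherwise standard behavior.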

In other Python 3 versions one approach you could take is to **carefully** use the private importlib APIs to get what you want.  If you do that, I'd recommend that your code target the public 3.3 APIs and then write a wrapper around the earlier private APIs to get them to be compatible.  There really shouldn't be a lot of difference.  The key is to target Python 3.3's importlib.

For Python 2, I'd still recommend targeting 3.3's importlib API and writing wrappers to make that work.  This would likely involve more effort to backport whole chunks of the 3.3 importlib implementation.  Better to simply port your code to Python 3.  :)

Secondly, the import system is a complex piece of machinery.  The new reference that Barry did helps pull it all together, but there are simply a lot of moving parts in various layers.  Adding examples for the importlib API may help for working with that API, but any activities in hooking into the import system should be done with a firm understanding of how it works since it's easy to break things.  Currently there isn't any easy way around that and I doubt any of that will change for a long time (due to the effort required).

Lastly, take a look at pylt[2].  It's a project I'm working on for source-to-source translation that hooks into the import system.  Though it isn't finished yet, the import side of things is mostly done.  Hopefully I'll have all the tests done in the next few days.

For pylt I've made use of the 3.3 importlib API along with a couple of things that we should see in 3.4 that I've "backported" [3].  The code there should give you an idea of how I've done essentially what you are looking to do.

Ultimately, any recommendations you can give on making the import system more approachable would be awesome.  Though it will take a lot of thought, discussion, and effort to make the import system "easy", there is still a lot of room for improvement in making it understandable.  Your perspective would be meaningful in working toward a sensible improvement.


[1] http://docs.python.org/dev/reference/import.html
[2] https://bitbucket.org/ericsnowcurrently/pylt/
[3] http://bugs.python.org/issue15627
msg169926 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-09-06 15:54
First off, what you want to do isn't easy to begin with. =) You are right that you want get_code() and that SourceLoader is what you want. The problem is that importlib inherited PEP 302's APIs, and there are so many that the docs don't repeat themselves in terms of what methods each ABC implements in order to keep the docs readable. That makes it a little difficult to realize what ABCs implement what without reading the class description and/or looking at the class hierarchy layout to realize that SourceLoader inherits from ExecutionLoader, and through it InspectLoader, which specifies get_code().
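
One way to see which class contributes which method, without reading every class description, is to walk the method resolution order of importlib.abc.SourceLoader. This is only an exploratory sketch using the standard library; `public_methods_by_class` and `get_code_definers` are names made up for this example, and the exact list of classes varies between Python versions.

```python
import importlib.abc
import inspect

mro = inspect.getmro(importlib.abc.SourceLoader)

# Map each class in the hierarchy to the public callables it defines itself.
public_methods_by_class = {}
for cls in mro:
    names = sorted(
        name for name, value in vars(cls).items()
        if callable(value) and not name.startswith("_")
    )
    public_methods_by_class[cls] = names
    print(cls.__module__, cls.__qualname__, names)

# Classes in the hierarchy that define get_code() in their own body.
get_code_definers = [cls.__qualname__ for cls in mro if "get_code" in vars(cls)]
print(get_code_definers)
```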

Second, import is just plain hard. It took me over 5 years to write importlib and get it to where it is now, and most of that work was just trying to keep it all straight in my head. This also makes writing an example or two difficult as it becomes a massive undertaking very quickly. And there is the simple issue that everyone wants something different, e.g. you want to transform source while others want an alternative back-end storage solution. That means coming up with the right examples is hard in and of itself.

Third, in Python 3.4 your desire to transform source will be much easier to achieve thanks to http://bugs.python.org/issue15627 .
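
For reference, a sketch of the source-transforming loader discussed in this thread, assuming Python 3.5 or later and assuming the hook that came out of that issue is the source_to_code() method on source loaders. `transform`, `TransformingLoader` and `demo_mod` are hypothetical names for illustration, and the transformation itself (replacing a marker string) is a stand-in for whatever rewriting is actually wanted.

```python
import importlib.machinery
import importlib.util
import os
import tempfile


def transform(source_text):
    """Hypothetical transformation: substitute a marker in the source."""
    return source_text.replace("__GREETING__", "'hello'")


class TransformingLoader(importlib.machinery.SourceFileLoader):
    """Usual file-based loading, but rewrite the source before compiling."""

    def source_to_code(self, data, path, *args, **kwargs):
        # data arrives as bytes here; decode_source honors any encoding
        # declaration in the file.
        text = importlib.util.decode_source(data)
        return super().source_to_code(transform(text), path, *args, **kwargs)


# Demonstration on a throwaway file, without touching sys.path_hooks.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_mod.py")
    with open(path, "w") as f:
        f.write("greeting = __GREETING__\n")
    loader = TransformingLoader("demo_mod", path)
    spec = importlib.util.spec_from_file_location("demo_mod", path, loader=loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)

print(module.greeting)
```

Note that the inherited bytecode caching still applies: once a cached .pyc exists for the file, source_to_code() is not called again until the source changes, so a loader like this would need extra care if the transformation itself can change.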

IOW, I understand your pain but solving the problem is hard without writing a book on the subject (which who knows, maybe I'll do someday as a $1 ebook or something).
msg169964 - Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-09-07 01:29
One specific change we should make is that the "see also" at the start of the 3.3 importlib docs should link to the new section of the language reference, rather than Guido's packaging essay. We can probably also cull that long list of PEPs, moving it to the end of the language reference section.

Other than that, yeah, we've been working away at this for years, trying to bring it down to a more manageable level of complexity. Brett's efforts in finally purging the last remnants of the old pre-PEP 302 import systems are what made it possible for Barry to finally create a coherent spec for the import system (previously, none of us really wanted to write such a thing, as there would have been *way* too many caveats needed in attempting to grandfather in the legacy import mechanics).

Now that that is done, 3.4 will likely include a number of improvements to importlib to make it easier to use as a basis for custom import hooks (as Brett and Eric noted, better support for customisation of the source -> bytecode compilation step will definitely be one of them).

As far as books go, I think the evolution of Python's import system might make an interesting entry if they do a third edition of The Architecture of Open Source Applications [1] :)

[1] http://www.aosabook.org/en/index.html
msg169987 - Author: Julian Berman (Julian) * Date: 2012-09-07 16:22
Eric: Yeah I've seen that, it's the one thing that I kept open as I was turning back and forth through the various parts of importlib. So yeah I like that document certainly at least a bit :). Also thanks to both you and Brett for linking that issue, that's certainly something I'll keep an eye on.

Just to repeat the specific things that perhaps we could work on -- I strongly agree that the top of importlib's docs would benefit from reworking the See Also at the top. Also, perhaps that monumental undertaking would be a thing that could be wrangled :P -- like you said, import hooks seem to have two broad use cases: changing *where* a module comes from away from a simple file containing Python source code on a filesystem, and secondly changing what happens when a module is being imported. So I guess what I would love to have would be an example for each of those. An example of a Loader that loaded from somewhere else other than a file, and an example of an Importer that did something else when executing. I'm sure you'll correct me if I've missed an important one. If that's reasonable sounding and I manage to succeed in my own use case, perhaps I'll take a shot at that.
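
As a rough illustration of the first use case (modules that come from somewhere other than a file), here is a sketch of a combined finder and loader that serves source from an in-memory dict. It assumes Python 3.4 or later, since it uses the find_spec()/exec_module() protocol rather than the find_module()/load_module() methods available in 3.3. `SOURCES`, `DictImporter` and `inmem_demo` are hypothetical names.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "storage": module name -> source text.
SOURCES = {"inmem_demo": "answer = 6 * 7\n"}


class DictImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader in one object (an "importer" in PEP 302 terms)."""

    def __init__(self, sources):
        self.sources = sources

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.sources:
            return None  # let the next finder on sys.meta_path try
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # fall back to the default module creation

    def exec_module(self, module):
        source = self.sources[module.__name__]
        code = compile(source, "<inmem:%s>" % module.__name__, "exec")
        exec(code, module.__dict__)


# Appended (not inserted first) so the standard finders keep priority.
sys.meta_path.append(DictImporter(SOURCES))

import inmem_demo

print(inmem_demo.answer)
```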

One thing I certainly understand here is that usually I (we) wouldn't have this problem since blog posts and other third party documentation and code can provide examples that are helpful enough to help developers get a sense of what they need to write. The thing for me here was that I didn't really find anything helpful in that sector. importlib is new obviously, so maybe what's given me trouble will be mitigated after 3.3 gets released and a few people write up some examples on their own.

I recognize that there was a huge undertaking here, and that it's still being honed, I hope in no way did this sound disparaging :). Also I hope it didn't sound like a misplaced StackOverflow post -- although certainly the confirmation that I was on the right track should help me finish this off quite easily, so thanks for that as well :).
History
Date User Action Args
2013-03-29 03:27:08 barry set nosy: + barry
2013-03-28 14:25:02 paul.moore set nosy: + paul.moore
2012-09-07 16:22:32 Julian set messages: + msg169987
2012-09-07 13:27:50 chris.jerdonek set title: It's hard to decypher how to build off of the provided objects from the importlib docs -> make importlib documentation easier to use
2012-09-07 01:29:27 ncoghlan set messages: + msg169964
2012-09-06 22:51:59 chris.jerdonek set nosy: + chris.jerdonek
2012-09-06 15:54:14 brett.cannon set messages: + msg169926
2012-09-06 14:44:28 eric.snow set nosy: + eric.snow; messages: + msg169923
2012-09-06 13:11:03 pitrou set nosy: + brett.cannon, ncoghlan
2012-09-06 01:24:44 Julian set type: enhancement
2012-09-06 01:22:43 Julian create