msg62510 - (view) |
Author: Douglas Greiman (dgreiman) * |
Date: 2008-02-17 23:11 |
This patch reorganizes import.c to move functionality into two new PEP
302-style Importer objects. I attempted to change as little as
feasible, but the patch is still ~4700 lines long, about 1000 of which
is new tests.
BuiltinImporter: handles C_BUILTIN and PY_FROZEN modules
DirectoryImporter: handles PY_SOURCE, PY_COMPILED, PKG_DIRECTORY,
C_EXTENSION modules
BuiltinImporter is put on sys.meta_path, DirectoryImporter is put on
sys.path_hooks.
To preserve backward compatibility of methods like imp.find_module(),
they use new variables sys.builtin_meta_path and sys.builtin_path_hooks
which are analogous to sys.meta_path and sys.path_hook but contain only
the two importer objects above.
Character encoding issues were substantial. The existing code was
somewhat inscrutable in this regard. The tests disabled in issue 1377
were re-added with more safeguards and harder tests. It is possible to
import modules with non-ascii names from non-ascii paths. Tested on
Windows XP and Linux.
Areas for discussion:
Naming: Names of the importer classes, names of variables, etc
sys: I added three variables to sys. Is there a better alternative?
ModuleInfo: The importers use a somewhat tricky way to store an open
file between calls to importer.find_module() and importer.load_module(),
so that the new code doesn't do any more system calls that the old code.
semantics: There are many import corner cases where the semantics are
unknown or unclear.
Frozen packages: the __path__ of a frozen package is a string rather
than a list, and it is possible to replace that string to change how
imports inside that package are resolved. I have attempted to keep that
functionality intact but it's not clear that it's a feature rather than
a bug.
|
msg62697 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-02-22 20:38 |
I've noticed that the proposed patch adds const qualifier to many C-API
functions' arguments. While at the first glance these changes are
reasonable, they add clutter to the already massive patch.
I would recommend to separate const qualifier changes into a separate
patch that can be reviewed separately from the substantive changes.
I promise, I will make substantive comments soon. :-)
|
msg62698 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-02-22 20:47 |
The patch breaks test_import on Mac OS 10.4:
$ ./python.exe -E -tt -bb ./Lib/test/regrtest.py test_import
test_import
test test_import failed -- Traceback (most recent call last):
File "./Lib/test/regrtest.py", line 1199, in <module>
main()
File "./Lib/test/regrtest.py", line 405, in main
huntrleaks)
File "./Lib/test/regrtest.py", line 562, in runtest
huntrleaks, debug)
File "./Lib/test/regrtest.py", line 614, in runtest_inner
print("test", test, "failed --", msg)
File "/Users/sasha/Work/python-svn/py3k/Lib/io.py", line 1247, in
write
b = encoder.encode(s)
File "/Users/sasha/Work/python-svn/py3k/Lib/encodings/ascii.py", line
22, in encode
return codecs.ascii_encode(input, self.errors)[0]
UnicodeEncodeError: 'ascii' codec can't encode characters in position
211-214: ordinal not in range(128)
|
msg62704 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2008-02-22 21:23 |
Sorry I have not commented on this sooner; been swamped.
First, the error Alexander is seeing is probably caused by a source file
that has an encoding other than ASCII (which is fine as the default
encoding in Python 3.0 is UTF-8). But chances are the file has an
encoding marker at the top and that is not being picked up by Douglas'
code. Look at PyTokenizer_FindEncoding() for a way to find out the encoding.
Second, I am the wrong person to be reviewing this as I have something
under development that competes with this. =) I am trying to get my pure
Python implementation bootstrapped into Python 3.0 to completely replace
the C code. But I don't know if I will pull this off in time so this
work could still be useful. But if I have to choose time between this
patch and my stuff, I am going to be biased. =)
|
msg62706 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-02-22 21:46 |
First, let me state that I like the idea of a uniform approach to
importing, but given the complexity of the task, Brett's approach (which
I understand as implementing the builtin importer classes in Python)
would make sense (as long as he can solve the chicken and egg problem).
This said, here is my first question:
Is there a reason why new importer classes are not subclassable while
zipimporter is?
>>> class x(imp.BuiltinImporter):pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: type 'imp.BuiltinImporter' is not an acceptable base type
>>> class x(imp.DirectoryImporter): pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
>>> class x(zipimport.zipimporter): pass
...
Users might want to overload find_module while reusing load_module from
a built-in importer class.
In fact, I see a great deal of code duplication between the new importer
classes that can be reduced by deriving from a common base class.
|
msg62710 - (view) |
Author: Douglas Greiman (dgreiman) * |
Date: 2008-02-23 01:21 |
Brett,
I wrote my patch thinking that the next step would be to rewrite
DirectoryImporter in Python. If you're already working on that and
just want to skip straight from point A to point C and skip this point
B, I'm fine with that. Basically, tell me if you want me to pursue
this further or drop it. Are you also reimplementing the code for
builtin and frozen modules in Python?
Alexander,
The const thing: I went on a bit of a const rampage after an
especially irritating debugging session involving function side
effects. They could be removed.
The non-subclassable aspect was an oversight. I thought about having
a common base class but ended up creating the ModuleInfo helper class
instead. There's still a few bits of code duplication between the two
importer classes. 16x2 lines in *_find_module() could be pulled into
a common place, 4x2 lines of *_require_module(), and a few lines in
the various methods. I wanted to err on the side of introducing as
few new classes as possible to minimize the general scariness of this
patch.
I don't have a Mac setup, but I think the error you're seeing is
actually an error in my test code.
|
msg62714 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2008-02-23 02:16 |
On Fri, Feb 22, 2008 at 5:21 PM, Douglas Greiman <report@bugs.python.org> wrote:
>
> Douglas Greiman added the comment:
>
> Brett,
>
> I wrote my patch thinking that the next step would be to rewrite
> DirectoryImporter in Python. If you're already working on that and
> just want to skip straight from point A to point C and skip this point
> B, I'm fine with that. Basically, tell me if you want me to pursue
> this further or drop it.
You can drop it if you want, but I just don't know how long it will
take for me to finish the bootstrap work (for instance I am having to
rewrite the warnings module in C). So there is a chance it might still
be useful, but I can't guarantee it.
> Are you also reimplementing the code for
> builtin and frozen modules in Python?
>
I use the imp module to handle the actual loading, but the importers
and loaders are all in Python. You can see the implementation in the
sandbox under import_in_py.
|
msg62716 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2008-02-23 02:23 |
Interesting - I'll try to find the time to take a look at this. (I also
added PJE to the nosy list - hopefully he will get a chance to look at it)
|
msg65667 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2008-04-22 00:58 |
Closed at Doug's request. My importlib work, once it lands, will supplant
all of this.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:30 | admin | set | github: 46388 |
2021-04-03 22:26:07 | brett.cannon | set | pull_requests:
+ pull_request23910 |
2008-04-22 00:58:24 | brett.cannon | set | status: open -> closed resolution: rejected messages:
+ msg65667 |
2008-02-23 02:23:50 | ncoghlan | set | nosy:
+ ncoghlan, pje messages:
+ msg62716 |
2008-02-23 02:16:24 | brett.cannon | set | messages:
+ msg62714 |
2008-02-23 01:21:34 | dgreiman | set | messages:
+ msg62710 |
2008-02-22 21:46:06 | belopolsky | set | messages:
+ msg62706 |
2008-02-22 21:23:44 | brett.cannon | set | assignee: brett.cannon -> messages:
+ msg62704 |
2008-02-22 20:47:19 | belopolsky | set | messages:
+ msg62698 |
2008-02-22 20:38:16 | belopolsky | set | nosy:
+ belopolsky messages:
+ msg62697 |
2008-02-19 09:07:41 | christian.heimes | set | priority: normal assignee: brett.cannon keywords:
+ patch nosy:
+ christian.heimes, brett.cannon |
2008-02-17 23:11:18 | dgreiman | create | |