Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restructure import.c into PEP 302 importer objects #46388

Closed
duggelz mannequin opened this issue Feb 17, 2008 · 9 comments
Closed

Restructure import.c into PEP 302 importer objects #46388

duggelz mannequin opened this issue Feb 17, 2008 · 9 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@duggelz
Copy link
Mannequin

duggelz mannequin commented Feb 17, 2008

BPO 2135
Nosy @brettcannon, @pjeby, @ncoghlan, @abalkin, @tiran, @duggelz
PRs
  • bpo-42135: Deprecate implementations of find_module() and find_loader() #25169
  • Files
  • builtin_importer-20080217.diff: Patch against py3k r60881
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2008-04-22.00:58:24.675>
    created_at = <Date 2008-02-17.23:11:18.817>
    labels = ['interpreter-core', 'type-bug']
    title = 'Restructure import.c into PEP 302 importer objects'
    updated_at = <Date 2021-04-03.22:26:07.839>
    user = 'https://github.com/duggelz'

    bugs.python.org fields:

    activity = <Date 2021-04-03.22:26:07.839>
    actor = 'brett.cannon'
    assignee = 'none'
    closed = True
    closed_date = <Date 2008-04-22.00:58:24.675>
    closer = 'brett.cannon'
    components = ['Interpreter Core']
    creation = <Date 2008-02-17.23:11:18.817>
    creator = 'dgreiman'
    dependencies = []
    files = ['9453']
    hgrepos = []
    issue_num = 2135
    keywords = ['patch']
    message_count = 9.0
    messages = ['62510', '62697', '62698', '62704', '62706', '62710', '62714', '62716', '65667']
    nosy_count = 6.0
    nosy_names = ['brett.cannon', 'pje', 'ncoghlan', 'belopolsky', 'christian.heimes', 'dgreiman']
    pr_nums = ['25169']
    priority = 'normal'
    resolution = 'rejected'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue2135'
    versions = ['Python 3.0']

    @duggelz
    Copy link
    Mannequin Author

    duggelz mannequin commented Feb 17, 2008

    This patch reorganizes import.c to move functionality into two new PEP-302-style Importer objects. I attempted to change as little as
    feasible, but the patch is still ~4700 lines long, about 1000 of which
    is new tests.

    BuiltinImporter: handles C_BUILTIN and PY_FROZEN modules
    DirectoryImporter: handles PY_SOURCE, PY_COMPILED, PKG_DIRECTORY,
    C_EXTENSION modules

    BuiltinImporter is put on sys.meta_path, DirectoryImporter is put on
    sys.path_hooks.

    To preserve backward compatibility of methods like imp.find_module(),
    they use new variables sys.builtin_meta_path and sys.builtin_path_hooks
    which are analogous to sys.meta_path and sys.path_hook but contain only
    the two importer objects above.

    Character encoding issues were substantial. The existing code was
    somewhat inscrutable in this regard. The tests disabled in bpo-1377
    were re-added with more safeguards and harder tests. It is possible to
    import modules with non-ascii names from non-ascii paths. Tested on
    Windows XP and Linux.

    Areas for discussion:

    Naming: Names of the importer classes, names of variables, etc

    sys: I added three variables to sys. Is there a better alternative?

    ModuleInfo: The importers use a somewhat tricky way to store an open
    file between calls to importer.find_module() and importer.load_module(),
    so that the new code doesn't do any more system calls that the old code.

    semantics: There are many import corner cases where the semantics are
    unknown or unclear.

    Frozen packages: the __path__ of a frozen package is a string rather
    than a list, and it is possible to replace that string to change how
    imports inside that package are resolved. I have attempted to keep that
    functionality intact but it's not clear that it's a feature rather than
    a bug.

    @duggelz duggelz mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Feb 17, 2008
    @abalkin
    Copy link
    Member

    abalkin commented Feb 22, 2008

    I've noticed that the proposed patch adds const qualifier to many C-API
    functions' arguments. While at the first glance these changes are
    reasonable, they add clutter to the already massive patch.

    I would recommend to separate const qualifier changes into a separate
    patch that can be reviewed separately from the substantive changes.

    I promise, I will make substantive comments soon. :-)

    @abalkin
    Copy link
    Member

    abalkin commented Feb 22, 2008

    The patch breaks test_import on Mac OS 10.4:

    $ ./python.exe -E -tt -bb ./Lib/test/regrtest.py test_import
    test_import
    test test_import failed -- Traceback (most recent call last):
      File "./Lib/test/regrtest.py", line 1199, in <module>
        main()
      File "./Lib/test/regrtest.py", line 405, in main
        huntrleaks)
      File "./Lib/test/regrtest.py", line 562, in runtest
        huntrleaks, debug)
      File "./Lib/test/regrtest.py", line 614, in runtest_inner
        print("test", test, "failed --", msg)
      File "/Users/sasha/Work/python-svn/py3k/Lib/io.py", line 1247, in 
    write
        b = encoder.encode(s)
      File "/Users/sasha/Work/python-svn/py3k/Lib/encodings/ascii.py", line 
    22, in encode
        return codecs.ascii_encode(input, self.errors)[0]
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 
    211-214: ordinal not in range(128)

    @brettcannon
    Copy link
    Member

    Sorry I have not commented on this sooner; been swamped.

    First, the error Alexander is seeing is probably caused by a source file
    that has an encoding other than ASCII (which is fine as the default
    encoding in Python 3.0 is UTF-8). But chances are the file has an
    encoding marker at the top and that is not being picked up by Douglas'
    code. Look at PyTokenizer_FindEncoding() for a way to find out the encoding.

    Second, I am the wrong person to be reviewing this as I have something
    under development that competes with this. =) I am trying to get my pure
    Python implementation bootstrapped into Python 3.0 to completely replace
    the C code. But I don't know if I will pull this off in time so this
    work could still be useful. But if I have to choose time between this
    patch and my stuff, I am going to be biased. =)

    @brettcannon brettcannon removed their assignment Feb 22, 2008
    @abalkin
    Copy link
    Member

    abalkin commented Feb 22, 2008

    First, let me state that I like the idea of a uniform approach to
    importing, but given the complexity of the task, Brett's approach (which
    I understand as implementing the builtin importer classes in Python)
    would make sense (as long as he can solve the chicken and egg problem).

    This said, here is my first question:

    Is there a reason why new importer classes are not subclassable while
    zipimporter is?

    >>> class x(imp.BuiltinImporter):pass
    ... 
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: type 'imp.BuiltinImporter' is not an acceptable base type
    
    >>> class x(imp.DirectoryImporter): pass
    ... 
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    
    >>> class x(zipimport.zipimporter): pass
    ... 

    Users might want to overload find_module while reusing load_module from
    a built-in importer class.

    In fact, I see a great deal of code duplication between the new importer
    classes that can be reduced by deriving from a common base class.

    @duggelz
    Copy link
    Mannequin Author

    duggelz mannequin commented Feb 23, 2008

    Brett,

    I wrote my patch thinking that the next step would be to rewrite
    DirectoryImporter in Python. If you're already working on that and
    just want to skip straight from point A to point C and skip this point
    B, I'm fine with that. Basically, tell me if you want me to pursue
    this further or drop it. Are you also reimplementing the code for
    builtin and frozen modules in Python?

    Alexander,

    The const thing: I went on a bit of a const rampage after an
    especially irritating debugging session involving function side
    effects. They could be removed.

    The non-subclassable aspect was an oversight. I thought about having
    a common base class but ended up creating the ModuleInfo helper class
    instead. There's still a few bits of code duplication between the two
    importer classes. 16x2 lines in *_find_module() could be pulled into
    a common place, 4x2 lines of *_require_module(), and a few lines in
    the various methods. I wanted to err on the side of introducing as
    few new classes as possible to minimize the general scariness of this
    patch.

    I don't have a Mac setup, but I think the error you're seeing is
    actually an error in my test code.

    @brettcannon
    Copy link
    Member

    On Fri, Feb 22, 2008 at 5:21 PM, Douglas Greiman <report@bugs.python.org> wrote:

    Douglas Greiman added the comment:

    Brett,

    I wrote my patch thinking that the next step would be to rewrite
    DirectoryImporter in Python. If you're already working on that and
    just want to skip straight from point A to point C and skip this point
    B, I'm fine with that. Basically, tell me if you want me to pursue
    this further or drop it.

    You can drop it if you want, but I just don't know how long it will
    take for me to finish the bootstrap work (for instance I am having to
    rewrite the warnings module in C). So there is a chance it might still
    be useful, but I can't guarantee it.

    Are you also reimplementing the code for
    builtin and frozen modules in Python?

    I use the imp module to handle the actual loading, but the importers
    and loaders are all in Python. You can see the implementation in the
    sandbox under import_in_py.

    @ncoghlan
    Copy link
    Contributor

    Interesting - I'll try to find the time to take a look at this. (I also
    added PJE to the nosy list - hopefully he will get a chance to look at it)

    @brettcannon
    Copy link
    Member

    Closed at Doug's request. My importlib work, once it lands, will supplant
    all of this.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants