Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch to add a get_data function to pkgutil #46691

Closed
pfmoore opened this issue Mar 20, 2008 · 21 comments
Closed

Patch to add a get_data function to pkgutil #46691

pfmoore opened this issue Mar 20, 2008 · 21 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@pfmoore
Copy link
Member

pfmoore commented Mar 20, 2008

BPO 2439
Nosy @pfmoore, @pjeby, @ncoghlan
Files
  • pkgutil.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ncoghlan'
    closed_at = <Date 2008-04-15.10:26:34.499>
    created_at = <Date 2008-03-20.20:58:32.897>
    labels = ['type-feature', 'library']
    title = 'Patch to add a get_data function to pkgutil'
    updated_at = <Date 2008-04-15.10:28:51.054>
    user = 'https://github.com/pfmoore'

    bugs.python.org fields:

    activity = <Date 2008-04-15.10:28:51.054>
    actor = 'ncoghlan'
    assignee = 'ncoghlan'
    closed = True
    closed_date = <Date 2008-04-15.10:26:34.499>
    closer = 'ncoghlan'
    components = ['Library (Lib)']
    creation = <Date 2008-03-20.20:58:32.897>
    creator = 'paul.moore'
    dependencies = []
    files = ['9806']
    hgrepos = []
    issue_num = 2439
    keywords = ['patch']
    message_count = 21.0
    messages = ['64208', '64211', '64219', '64224', '64225', '64226', '64231', '64237', '64238', '64240', '64241', '64242', '64243', '64260', '64265', '64266', '64282', '65508', '65509', '65510', '65511']
    nosy_count = 3.0
    nosy_names = ['paul.moore', 'pje', 'ncoghlan']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue2439'
    versions = ['Python 2.6']

    @pfmoore
    Copy link
    Member Author

    pfmoore commented Mar 20, 2008

    This patch adds a new get_data function to the pkgutil module, allowing
    access to data stored in the package directory. It wraps the PEP-302
    loader "get_data" function, so that it works with such loaders (for
    example, zipimport).

    The patch includes documentation and tests (I created a new test_pkgutil
    test module, but did not add tests for the existing pkgutil functionality).

    All tests pass on Windows against svn rev 61679 (except test_tcl, which
    was skipped as I didn't build TCL support).

    @pfmoore pfmoore added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Mar 20, 2008
    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented Mar 20, 2008

    Hi Paul. AFAICT, this doesn't look like it will actually work for
    filesystem data. get_loader() will return None for "regular",
    filesystem-installed modules, at least in Python 2.5. Perhaps you
    should add a test case for that?

    @pfmoore
    Copy link
    Member Author

    pfmoore commented Mar 20, 2008

    I'm not sure I understand. It uses pkgutil.get_loader, which returns a
    wrapper for filesystem modules. You pointed me to it. It seems to work,
    that's what test_getdata_filesys is testing in test_pkgutil.py.

    Can you supply a testcase that fails? (BTW, this is a patch for the
    trunk - ie 2.6-to be. I can't see why 2.5 would be different, but maybe
    that's the problem for you?)

    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented Mar 20, 2008

    Oops, my bad. I'm thinking of an entirely unrelated get_loader()
    function. Meanwhile, I misread your patch entirely, and thought you had
    some dead code for os.path processing that is in fact live. So there is
    "another" problem (really the only one) that I spotted on this re-read.

    Your patch is calling load_module() even if the module is already in
    sys.modules. This will *reload* the module, potentially causing
    problems elsewhere in the system. You can test this by adding an
    assertion to your test's load_module(), that the module isn't already in
    sys.modules. Then call get_data for the same module twice.

    Sorry again for the mixup.

    @pfmoore
    Copy link
    Member Author

    pfmoore commented Mar 20, 2008

    Is that true? loader.load_module(pkg) should return sys.modules[pkg]
    without reloading if it already exists. See PEP-302 "Specification part
    1: The Importer Protocol":

    The load_module() method has a few responsibilities that it must
    fulfil *before* it runs any code:

    • If there is an existing module object named 'fullname' in
      sys.modules, the loader must use that existing module.
      (Otherwise, the reload() builtin will not work correctly.)
      If a module named 'fullname' does not exist in sys.modules,
      the loader must create a new module object and add it to
      sys.modules.

    I've added a test for this, but I'm not 100% sure it's right. It
    certainly works with unchanged code. If you can make it fail, let me
    know and I'll work up a fix. But I think repeated calls to load_module()
    should be safe (assuming loaders work as required by PEP-302).

    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented Mar 21, 2008

    reload() is implemented by calling the PEP-302 load_module() method
    on the existing module object.

    @ncoghlan
    Copy link
    Contributor

    To clarify the intent of the section of PEP-302 Paul quoted: the
    *module* object gets reused, but the contents of mod.__dict__ are
    clobbered and the module code re-executed.

    So it's the same module object, but with brand new contents (this is why
    "from foo import bar" and "reload(foo)" do not play nicely with each
    other, but doing "import foo" and then invoking "foo.bar" picks up the
    new version of "bar" after a reload).

    @pfmoore
    Copy link
    Member Author

    pfmoore commented Mar 21, 2008

    Nick, thanks I now see the issue. I'll work up a test to check for this
    (and then correct it :-)).

    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented Mar 21, 2008

    An easy way to test it: just change your load_module() to raise an
    AssertionError if the module is already in sys.modules, and the main
    body of the test to make two get_data() calls for the same module.

    @pfmoore
    Copy link
    Member Author

    pfmoore commented Mar 21, 2008

    But that's not a valid loader.

    I'm still struggling here. I see what you're trying to get at, but I
    can't see how I can write a valid loader that does this.

    To test the problem you're suggesting (that the code calls load_module
    when the module is already loaded) I need a valid loader which does
    something detectably different of the module is already loaded when it runs.

    Obviously, I can fix the get_data code - that's not even remotely hard.
    But I'd rather create a failing test with the current code, so that I
    can confirm that the "problem" is fixed. At the moment, I can't even
    demonstrate a problem!

    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented Mar 21, 2008

    Why does it need to be a valid loader? It's a mock, not a real
    loader. But if it really bothers you, have it increment a global in
    the module or something, and put the assertion in the test
    proper. Heck, have it update a counter in the module it returns,
    making it into a loader that keeps track of how many times you reload
    the same module. Then it's a useful example loader. :)

    @pfmoore
    Copy link
    Member Author

    pfmoore commented Mar 21, 2008

    It has to be a valid loader, as I see no reason to support loaders that
    aren't valid. In any case, I did try incrementing a counter and it
    doesn't demonstrate the problem. If you try the currently attached
    patch, you should see that. (I assume you've tried or at least read the
    current patch - but the fact that you're suggesting the approach I have
    implemented makes me wonder. I did re-upload the patch after you
    reported the issue - msg 64225 - maybe you didn't notice this, as I
    deleted the old patch?) If you do see what I mean, please tell me where
    my code is wrong. I don't want to add a fix without a test showing why
    the current behaviour is wrong. The test_alreadyloaded test is intended
    to do that, but the current pkgutil code doesn't fail the test - so
    either the test is wrong (and I'd appreciate help fixing the test) or
    the "problem" isn't real, and I can leave the code as is.

    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented Mar 21, 2008

    But I'm getting a failure on that test, and the other one, too:

    $ ./python Lib/test/test_pkgutil.py
    test_alreadyloaded (__main__.PkgutilTests) ... FAIL
    test_getdata_filesys (__main__.PkgutilTests) ... FAIL
    test_getdata_pep302 (__main__.PkgutilTests) ... ok

    ======================================================================
    FAIL: test_alreadyloaded (main.PkgutilTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
       File "Lib/test/test_pkgutil.py", line 45, in test_alreadyloaded
         self.assertEqual(foo.loads, 1)
    AssertionError: 2 != 1

    ======================================================================
    FAIL: test_getdata_filesys (main.PkgutilTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
       File "Lib/test/test_pkgutil.py", line 30, in test_getdata_filesys
         self.assert_('PkgutilTests' in this_file)
    AssertionError

    Ran 3 tests in 0.017s

    FAILED (failures=2)

    I'm rebuilding my entire 2.6 checkout to double-check there's not
    something else going on, but I know my Lib/ tree is up-to-date.

    @pfmoore
    Copy link
    Member Author

    pfmoore commented Mar 21, 2008

    Thanks for the update. Something's seriously screwy here. I am getting
    no failures, so you can probably see why I was confused. Can you tell me
    what platform, etc (anything that might be relevant) and I'll try to see
    what's going on.

    @pfmoore
    Copy link
    Member Author

    pfmoore commented Mar 21, 2008

    By the way, for comparison, I'm running the test (under Windows) using
    rt -q -v test_pkgutil from the PCBuild directory. I can't see how
    test_getdata_filesys can fail, as long as you're running it from the
    correct place - it reads the test_pkgutil.py file relative to the test
    package, so it won't run outside of there (maybe I should change this,
    but that's a separate issue for now).

    Here's my output:

    rt -q -v test_pkgutil
    .\\python -E -tt ../lib/test/regrtest.py -v test_pkgutil
    test_pkgutil
    test_alreadyloaded (test.test_pkgutil.PkgutilTests) ... ok
    test_getdata_filesys (test.test_pkgutil.PkgutilTests) ... ok
    test_getdata_pep302 (test.test_pkgutil.PkgutilTests) ... ok

    ----------------------------------------------------------------------
    Ran 3 tests in 0.000s

    OK
    1 test OK.

    @pfmoore
    Copy link
    Member Author

    pfmoore commented Mar 21, 2008

    Ah, no. I see your command line. I get the same failure as you in that
    case. I can see why test_getdata_filesys fails in that case, I'll fix
    that. I'm confused as to why test_alreadyloaded fails there but not via
    rt.bat, but nevertheless I can fix that now I can see it. Thanks for
    your patience. Leave it with me.

    @pfmoore
    Copy link
    Member Author

    pfmoore commented Mar 21, 2008

    OK, I found it. There were 2 problems, the double-loading one, and a
    problem with adding my importer to sys.meta_path - if the test failed, I
    wasn't removing it (so it was there for the next test, and interfering
    with it).

    Here's a fixed patch. Thanks, Phillip, for persevering!

    @ncoghlan
    Copy link
    Contributor

    I should be able to review/commit this for the next alpha (looking into
    it right now).

    @ncoghlan ncoghlan self-assigned this Apr 15, 2008
    @pfmoore
    Copy link
    Member Author

    pfmoore commented Apr 15, 2008

    Thanks,
    Paul.

    @ncoghlan
    Copy link
    Contributor

    Commited as rev 62350

    @ncoghlan
    Copy link
    Contributor

    Test file added in rev 62351 (forgot to svn add it the first time around)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants