Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify IMPORT_FROM to fallback on sys.modules #61836

Closed
brettcannon opened this issue Apr 5, 2013 · 34 comments
Closed

Modify IMPORT_FROM to fallback on sys.modules #61836

brettcannon opened this issue Apr 5, 2013 · 34 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@brettcannon
Copy link
Member

BPO 17636
Nosy @warsaw, @brettcannon, @pjeby, @ncoghlan, @pitrou, @kristjanvalur, @methane, @ericsnowcurrently, @serhiy-storchaka, @phmc
PRs
  • bpo-39551: mock patch should match behavior of import from when module isn't present in sys.modules #18347
  • Files
  • circular_import_tests.diff
  • import_from_tests.diff
  • import_from_tests2.diff
  • import_from_tests2.diff
  • circrelimports.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2014-10-13.18:22:42.749>
    created_at = <Date 2013-04-05.01:15:02.295>
    labels = ['interpreter-core', 'type-bug']
    title = 'Modify IMPORT_FROM to fallback on sys.modules'
    updated_at = <Date 2020-02-04.20:36:51.966>
    user = 'https://github.com/brettcannon'

    bugs.python.org fields:

    activity = <Date 2020-02-04.20:36:51.966>
    actor = 'dino.viehland'
    assignee = 'none'
    closed = True
    closed_date = <Date 2014-10-13.18:22:42.749>
    closer = 'pitrou'
    components = ['Interpreter Core']
    creation = <Date 2013-04-05.01:15:02.295>
    creator = 'brett.cannon'
    dependencies = []
    files = ['30115', '30132', '36803', '36804', '36805']
    hgrepos = []
    issue_num = 17636
    keywords = ['patch']
    message_count = 34.0
    messages = ['186059', '186073', '186078', '186079', '186080', '186093', '186095', '186096', '186098', '186102', '186103', '186887', '186922', '188319', '188417', '188441', '190667', '190670', '190671', '190673', '190679', '206434', '206476', '228497', '228501', '228502', '228822', '228833', '228915', '228927', '228928', '229257', '229258', '232601']
    nosy_count = 13.0
    nosy_names = ['barry', 'brett.cannon', 'pje', 'ncoghlan', 'pitrou', 'kristjan.jonsson', 'methane', 'python-dev', 'eric.snow', 'serhiy.storchaka', 'pconnell', 'isoschiz', 'Pascal.Chambon']
    pr_nums = ['18347']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue17636'
    versions = ['Python 3.5']

    @brettcannon
    Copy link
    Member Author

    To help with circular imports, IMPORT_FROM should be modified to check sys.modules if a getattr() check fails.

    http://mail.python.org/pipermail/python-dev/2013-April/125123.html

    @brettcannon brettcannon added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Apr 5, 2013
    @kristjanvalur
    Copy link
    Mannequin

    kristjanvalur mannequin commented Apr 5, 2013

    Like I mentioned on python-dev, it worries me a bit that this could be considered "unexpected", i.e. when there is a conflict between a legitimage attribute member of a module, and a submodule of the same name.

    Also, I wonder if this isn't a bigger change to the import mechanism, than simply:
    | Another would be
    | to always require an 'as' clause in this case, so that you would have
    | to write'
    | import .foo as foo
    which would possibly only require a change to the syntax.

    @brettcannon
    Copy link
    Member Author

    It won't conflict with attributes. Only if the attribute does not exist on the module already will it fall back to sys.modules. If the import finished then any attribute created from the import will already be there and thus not be an issue.

    But to make sure this isn't a problem we can make sure there is a test in this instance.

    @kristjanvalur
    Copy link
    Mannequin

    kristjanvalur mannequin commented Apr 5, 2013

    While I'm happy that this is being ackowledged as a problem, I'm not sure changing the "import x from y" semantics is necessarily a good idea. I mean, it is obvious to me that it means "import x, then getattr(x, "y")". I presume that is the meaning most people associate with it. Certainly, "y" can be any old module attribute.

    To change it to actually fall back to a submodule, well. I suppose if you could explain it roughly like "y = getattr(x, 'y', x.y)" then it would be ok.

    But I can think of contrived examples where this could break things:
    #a.py:
    from .b import util

    #b.py
    from . import a
    from .util import util

    Because of the circular import order, b.util will not exist as an attribute on b when a does its import. So a.util will end up being util instead of util.util as one might expect.

    I'm basically saying that it is possible that the fallback to submodule will occur, where the successful getattr would occur later and return something different than the submodule. Possible. But perhaps very much an edge case :)

    @brettcannon
    Copy link
    Member Author

    By declaring what the semantics will be to make the case even possible we are not breaking anything but making something possible. IOW it isn't even an edge case to me since there is no working case to compare against. =)

    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented Apr 5, 2013

    On Fri, Apr 5, 2013 at 9:07 AM, Kristján Valur Jónsson
    <report@bugs.python.org> wrote:

    But I can think of contrived examples where this could break things:
    #a.py:
    from .b import util

    #b.py
    from . import a
    from .util import util

    Because of the circular import order, b.util will not exist as an attribute on b when a does its import. So a.util will end up being util instead of util.util as one might expect.

    Not quite. It will only do this if the '.b.util' module is *in
    sys.modules* at the time that a is being imported, which also means
    that .b.util has to be imported somewhere *before* .a, in order to end
    up with a stale value. As written, your example actually works
    correctly if .a is imported first, and fails with an ImportError if .b
    is imported first.

    In fact, this example is kind of useful for proving the change
    *correct*, not broken. At the very least, it shows that you'll have
    to be more inventive to come up with a breaking case. ;-)

    Consider that for any module x.y, x must be in sys.modules before x.y
    can. Therefore, any "from x import" taking place before x is fully
    loaded will either happen before x.y is fully loaded, during the load,
    or after it, and the following cases apply:

    1. If it happens before, then it fails with an ImportError as is the case today.
    2. If it happens during (i.e. there is a cycle with x.y rather than
      with just x),
      then the import returns the module.
    3. If it happens after, then either the x.y attribute is bound to the submodule,
      or has been rebound to something else, or deleted.
    4. If after and deleted, the import returns the module.
    5. If after and rebound, the import returns the changed attribute
      (just like today)
    6. If after and normally bound, the import returns the module (just like today)

    The only cases in which the behavior changes from today are cases 2
    and 4, which would both fail today with an ImportError. Case 4 also
    doesn't make much sense, since 'import x.y' would still permit access
    to the module -- so it'd have to be an odd situation in which you
    didn't want 'from import' (and *only* from import) to fail.

    So let's consider case 2, which would have to be written something like:

    #a.py
    from .b import util
    
    #b.py
    from .util import util
    
    #b/util.py
    from .. import a
    def util(): pass
    
    #__main__.py
    import b

    So, import b leads to starting the load of b.util, which leads to
    importing a, which succeeds and sees the b.util module instead of the
    b.util:util function.

    But, because of the circularity, this will *also* happen if you import
    a first. So 'a' will *always* see b.util rather than b.util:util,
    unless you remove the circularity. If you remove the circularity,
    then 'a' will always see b.util:util as the value of util.

    So case 2 does not lead to a hard-to-debug ordering dependency, it
    leads to an immediately discoverable change in behavior the moment you
    start importing a from b.util.

    The tl;dr version of the above is that you will only receive a module
    instead of an attribute if the module that's about to be replaced just
    imported *you* during its initial loading, and if it does that, it'll
    do it no matter what order the imports occur in, making the problem
    occur immediately and consistently as soon as you introduce the
    circularity.

    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented Apr 5, 2013

    Actually, after a little reflection, I can see that there are more
    complex conditions to analyze, if 'b' doesn't import 'b.util', but
    some other module imports b and sets b.util. But that's just freaking
    insane and whoever does that probably deserves whatever they get,
    especially since a later 'import b.util' would blow away the modified
    attribute.

    So, this note is just to satisfy myself that the change doesn't
    introduce any *new* weirdness unless you are in a case where the
    parent imports and replaces the child, and the child is involved in an
    import cycle. If the parent doesn't import the child but just assigns
    an attribute, it's already broken the minute somebody else imports the
    child, and the same thing applies if something else sets the attribute
    without importing the child first, and if it imports the child first,
    then the previous analysis (mostly) applies, but either way that code
    is broken and will have plenty of other ordering dependencies to debug
    on its own. ;-)

    (No wonder Nick said nobody wanted to touch this... the analysis
    alone is a killer. ;-) Luckily, it seems we're good to go unless
    somebody can find a case I missed.)

    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented Apr 5, 2013

    ...and I thought of *one* more way to trigger the changed behavior,
    which looks like:

    #b.py
    from .b import util
    import .a
    util = util.util
    
    #b/util.py
    def util(): pass

    (with the other files the same as before).

    I'm including it only for completeness' sake, because my original
    enumeration of cases ignored the possibility that 'a' could be doing
    the import *after* b.util is loaded and bound, but *before* it's
    rebound. However, it doesn't produce any new or problematic effects:
    it's essentially the same as if 'a' were imported from 'b.util'. Once
    again, regardless of the order in which imports happen, 'a' ends up
    with 'b.util' the moment the circularity is introduced, and it stays
    that way.

    It's also hard to argue that a case written this way isn't getting
    exactly what it *says* it wants. In fact, it looks like code that was
    deliberately written to *force* a to end up with the original b.util
    instead of the replaced one. ;-)

    Final (hopefully) conclusion: this change replaces the FAQ of "Don't
    use 'from-import' for circular imports" with the hopefully-less-FA'd Q
    of "when from-import is part of an import cycle, it works *exactly*
    like regular import, so you're going to get a submodule rather than an
    attribute. If you need the attribute instead, move the import so that
    it happens after the attribute is set up." (Which is exactly the same
    advice that would apply in a cycle with any other unitialized
    attribute, whether you were using from-import or not.)

    @kristjanvalur
    Copy link
    Mannequin

    kristjanvalur mannequin commented Apr 5, 2013

    Wow, Good analysis Phillip.
    So, we agree that the fallback still is a sensible fallback? Then everything is fine and I'll put my official +1 stamp of approval on this.

    Now... should we consider the current behavious to be a bug? 2.7 sure could do with some fixing :)

    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented Apr 5, 2013

    I don't care much one way or the other about it being considered a bug
    in 2.x, but it might be worth considering a bug in 3.x.

    Either way, though, the language reference for "from import" should
    reflect the change, and "alternative implementations" should be
    informed of the update.

    @brettcannon
    Copy link
    Member Author

    It is not a bug but a change in semantics to accommodate a use-case so this will only be going into Python 3.4.

    @ncoghlan
    Copy link
    Contributor

    Thanks for working through that Phillip - falling back to sys.modules when the expected attribute was missing is actually something I suggested as a possibility years ago, and Guido's response at the time was "If it was that easy, someone would have done it already".

    Your analysis is one of the pieces that was missing, along with Brett's insight that the code that needs the fallback is the IMPORT_FROM bytecode rather than the import implementation.

    I'm going to close the original circular import bug as being superseded by this one.

    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented Apr 14, 2013

    On Sun, Apr 14, 2013 at 3:51 AM, Nick Coghlan <report@bugs.python.org> wrote:

    Your analysis is one of the pieces that was missing,

    Unfortunately, I just noticed it's actually incorrect in a pretty
    important part In my original example, I said, "because of the
    circularity, this will *also* happen if you import 'a' first." This
    statement is actually false. Importing 'a' first in that example will
    result in a.util == b.util:util, not a.util=b.util. I made the
    mistake because I was for some reason thinking that 'a' was going to
    execute its import while being imported from b.util, and in that
    scenario it will not.

    That means there *is* an ordering dependency, and an ambiguity like
    this one can lie dormant until long after you've introduced the
    circularity. :-(

    @brettcannon
    Copy link
    Member Author

    I have uploaded a patch with failing tests that should work after this is all said and done. Philip, please make sure I covered your tests as expected (I tweaked one because it already was working the way I did it). This way we at least know what we are aiming for in terms of results.

    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented May 4, 2013

    It looks like maybe basic2 should be importing basic, not basic2.
    Otherwise, you might as well just import basic2 directly and be done
    with it. ;-) Likewise, I think indirect should be importing from
    indirect2, not from itself? i.e., I'd expect that test to fail even
    with the change. In addition, even if that is fixed, it's still not
    testing a cycle involving util; it's really just testing the same
    thing as "basic" is supposed to be testing.

    It also looks as though like the rebinding test doesn't actually test
    any rebinding, since nobody ever imports the thing that's rebound.
    It's failing for the same reason as the subpackage test. The
    subpackage test looks like a valid test, though - i.e., it's the
    "basic" case correctly implemented as a parent-child cycle. It's
    actually the only one of the tests that tests what it says it does.

    @brettcannon
    Copy link
    Member Author

    Don't be distracted when trying to write tests is the lesson learned. Fixed basic and rebinding and just deleted indirect.

    @kristjanvalur
    Copy link
    Mannequin

    kristjanvalur mannequin commented Jun 5, 2013

    One question.
    when doing "Programmatic" import, i.e. using the __import__ function, how do we do the same?

    #module foo.B
    A = __import__(".", fromlist=["A"]).A
    
    #module foo.A
    B = __import__(".", fromlist=["B"]).B

    Clearly the above won't work. Can we extend __import__ to allow a full path, including relative? The objection about which name to bind to is no longer valid, since the binding is explicit.
    So, could we do:

    #module foo.B
    A = __import__(".A")
    
    #module foo.A
    B = __import__(".B")

    ?

    @kristjanvalur
    Copy link
    Mannequin

    kristjanvalur mannequin commented Jun 5, 2013

    sorry, the last example really needs to be:
    #module foo.B
    A = __import__(".A", fromlist=["dummy"])

    to invoke the "return final module" behaviour.
    Hm, maybe this simply works... I didn't test....
    Nope, I get
    ValueError: Empty module name (in 2.7)

    @brettcannon
    Copy link
    Member Author

    You use importlib.import_module to do programmatic imports (and FYI dots never go into the first argument of __import__).

    @kristjanvalur
    Copy link
    Mannequin

    kristjanvalur mannequin commented Jun 5, 2013

    Interesting.
    And I can report that it works, in 2.7, with code such as
    b = importlib.import_module("..b", __name__)
    and
    a = importlib.import_module("..a", __name__)

    Still, it seems odd that a whole "importlib" is requried ot resolve the relative import, and that it doesn"t work with __import__, given how "supposedly" the import statement is supposed to translate to a __import__ call internally. One would think that __import__ had access to the relative path machinery somehow.

    @brettcannon
    Copy link
    Member Author

    If you look at the importlib source code in 2.7 it's actually really small and simply translates the call into an __import__ call under the hood. So __import__ can completely handle relative imports, but you were not using the level argument to __import__ nor passing the necessary globals to make the relative name computation work (see the importlib source if you want to see how it works).

    @ncoghlan
    Copy link
    Contributor

    With PEP-451 implemented, note that I have reopened bpo-992389 - the patch to set the parent module attribute at the same time as setting the sys.module attribute is actually pretty trivial for the PEP-451 loader case, and that now includes all the standard loaders.

    I also think that approach will have fewer weird edge cases than this version.

    @pjeby
    Copy link
    Mannequin

    pjeby mannequin commented Dec 17, 2013

    Unfortunately, this is not quite true: the weird edge cases for that approach are simply *different*, and harder to diagnose. ;-)

    The approach here can only affect execution paths that would currently raise ImportError; that one can break execution paths that are *currently working*.

    As Brett said above, "By declaring what the semantics will be to make the case even possible we are not breaking anything but making something possible." That is not the case for the approach proposed in the other issue.

    @pitrou
    Copy link
    Member

    pitrou commented Oct 4, 2014

    This is Brett's tests patch updated against default branch.

    @pitrou
    Copy link
    Member

    pitrou commented Oct 4, 2014

    And this is a full patch.

    @pitrou
    Copy link
    Member

    pitrou commented Oct 4, 2014

    (full patch is actually the latest upload: circrelimports.patch. Sorry for the glitch)

    @pitrou
    Copy link
    Member

    pitrou commented Oct 8, 2014

    By the way, is there any documentation that would need to be updated for this?

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Oct 9, 2014

    The language reference - at least the section on the import statement, and potentially the section on the overall operation of the import system.

    I don't *think* it would affect anywhere in the tutorial or the importlib docs, but they may be worth skimming to see if anything leaps out.

    @pitrou
    Copy link
    Member

    pitrou commented Oct 9, 2014

    By the way, my patch is currently using the package's __name__ attribute to build the fully-qualified name. Should it use the package's __package__ attribute instead?

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Oct 9, 2014

    If I understand your question correctly, then yes, __package__ should be
    used when converting explicit relative imports to absolute ones.

    To write a test, run a submodule that needs the new fallback feature via
    the -m switch (it will fail if using __name__)

    @pitrou
    Copy link
    Member

    pitrou commented Oct 9, 2014

    Le 10/10/2014 00:11, Nick Coghlan a écrit :

    Nick Coghlan added the comment:

    If I understand your question correctly, then yes, __package__ should be
    used when converting explicit relative imports to absolute ones.

    To write a test, run a submodule that needs the new fallback feature via
    the -m switch (it will fail if using __name__)

    I'm not sure it would. __package__ is not looked up on the module (since
    precisely we failed importing it) but on the parent package (which is
    passed to IMPORT_FROM).

    @pitrou
    Copy link
    Member

    pitrou commented Oct 13, 2014

    Actually, looking up __package__ would be wrong.

    Say I have: from pack.module import foo

    and "foo" doesn't exist in pack.module but exists in pack.
    Since pack.module.__package__ == "pack", using __package__ would wrongly find the "foo" in pack.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 13, 2014

    New changeset fded07a2d616 by Antoine Pitrou in branch 'default':
    Issue bpo-17636: Circular imports involving relative imports are now supported.
    https://hg.python.org/cpython/rev/fded07a2d616

    @pitrou pitrou closed this as completed Oct 13, 2014
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 13, 2014

    New changeset 3a35638bce66 by Ned Deily in branch 'default':
    Issue bpo-17636: Install new test directories.
    https://hg.python.org/cpython/rev/3a35638bce66

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants