Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Traceback wrong on ImportError while executing a package #58493

Closed
schlamar mannequin opened this issue Mar 13, 2012 · 27 comments
Closed

Traceback wrong on ImportError while executing a package #58493

schlamar mannequin opened this issue Mar 13, 2012 · 27 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@schlamar
Copy link
Mannequin

schlamar mannequin commented Mar 13, 2012

BPO 14285
Nosy @ncoghlan, @merwok, @schlamar, @ericsnowcurrently, @vadmium
Files
  • runpy-traceback.patch: Tests but no actual fix
  • finding-spec.patch: Separate test for ImportError wrapping
  • internal-error.patch: Fix by using internal _Error
  • init-ancestor.patch
  • init-ancestor.2.patch: Check ImportError.name
  • init-ancestor.3.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2015-12-11.07:41:04.107>
    created_at = <Date 2012-03-13.06:50:35.711>
    labels = ['type-bug', 'library']
    title = 'Traceback wrong on ImportError while executing a package'
    updated_at = <Date 2015-12-11.07:41:04.106>
    user = 'https://github.com/schlamar'

    bugs.python.org fields:

    activity = <Date 2015-12-11.07:41:04.106>
    actor = 'ncoghlan'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-12-11.07:41:04.107>
    closer = 'ncoghlan'
    components = ['Library (Lib)']
    creation = <Date 2012-03-13.06:50:35.711>
    creator = 'schlamar'
    dependencies = []
    files = ['38438', '38439', '41205', '41231', '41254', '41258']
    hgrepos = []
    issue_num = 14285
    keywords = ['patch']
    message_count = 27.0
    messages = ['155574', '223289', '223317', '226945', '236354', '237830', '237849', '237851', '255639', '255690', '255692', '255707', '255716', '255799', '255848', '255849', '255852', '255907', '255957', '255998', '256000', '256010', '256036', '256041', '256187', '256190', '256205']
    nosy_count = 8.0
    nosy_names = ['ncoghlan', 'eric.araujo', 'BreamoreBoy', 'python-dev', 'schlamar', 'eric.snow', 'martin.panter', 'robagar']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue14285'
    versions = ['Python 2.7', 'Python 3.5', 'Python 3.6']

    @schlamar
    Copy link
    Mannequin Author

    schlamar mannequin commented Mar 13, 2012

    It is very simple to reproduce this error.
    There is an executable package:

    package/
    __init__.py
    __main__.py

    The __init__ imports a missing module:

        import missing_module

    And the __main__ imports from it:

        from . import missing_module

    Now I get the following output which is not what I am expecting:

    C:\Python27\python.exe: No module named missing_module; 'package' is 
    a package and cannot be directly executed
    

    @schlamar schlamar mannequin added the type-bug An unexpected behavior, bug, or error label Mar 13, 2012
    @BreamoreBoy
    Copy link
    Mannequin

    BreamoreBoy mannequin commented Jul 16, 2014

    I've no idea what having a file called __main__.py is likely to do so can someone comment please.

    @vadmium
    Copy link
    Member

    vadmium commented Jul 17, 2014

    A file called “package/main.py” is executed as a script by “python -m package”. See <https://docs.python.org/dev/library/__main__.html\>.

    I’ve came across this issue myself. You don’t even need the __main__.py file to be doing anything special, as long as the __init__.py raises an ImportError, I think. On Python 3.4 the report is even more convoluted:

    /sbin/python3: Error while finding spec for 'package.__main__' (<class 'ImportError'>: No module named 'missing_module'); 'package' is a package and cannot be directly executed

    I dunno what “finding spec” means, and packages _can_ be directly executed if they have a __main__ module, so at least the last bit is definitely wrong.

    @robagar
    Copy link
    Mannequin

    robagar mannequin commented Sep 16, 2014

    The message also needs to include the file and line number of the ImportError.

    @vadmium
    Copy link
    Member

    vadmium commented Feb 21, 2015

    The relevant code is in the _get_module_details() function at Lib/runpy.py:101. There are a few of things going on:

    1. The code is calling importlib.util.find_spec("<package>.__main__"), and handles various kinds of exceptions by wrapping them in an ImportError, adding the “Error while finding spec” message. The call apparently causes the package to be imported, so some exceptions triggered by running __init__.py are wrapped. It would be nice if exceptions raised by running __init__.py were not caught, but I have no idea if this is practical or how to do it. It seems the exception handler is there to raise ImportError if you do something like “python -m os.path.something”, which seems rather unlikely to me.

    2. The logic for handling the __main__ module in a package seems to assume that any ImportError (such as raised from step 1) is due to the __main__ module not being present. Then it wraps that exception in another ImportError, adding the “. . . cannot be directly executed” bit. Again, it would be good to separate the __init__.py running from this exception handling.

    3. Finally in _run_module_as_main(), the ImportError is caught and printed, without any traceback. It would be good to only do this for internal errors loading the module, and not if ImportError is raised when the module (or __init__.py) is run.

    I’m not sure what the best way to fix this is. Perhaps add an internal ImportError subclass (or subclasses) that is only used for the “No module named . . .” errors, or only used for errors appropriate for final handler in step 3.

    @vadmium
    Copy link
    Member

    vadmium commented Mar 11, 2015

    Closely related: bpo-19771, which seems to be complaining about a the error message when __main__.py generates an ImportError.

    @vadmium
    Copy link
    Member

    vadmium commented Mar 11, 2015

    The patches at bpo-19771 should remove the part of the message that incorrectly says “. . . is a package and cannot be directly executed”. However that still leaves the problem of the suppressed traceback.

    I am posting runpy-traceback.patch here which adds some tests to check if the traceback is suppressed. The offending test is marked @expectedfailure. It also points out where the exception is being incorrectly caught, but I haven’t thought of a nice way to fix the problem.

    Other changes in the runpy-traceback.patch:

    • Removed the exception message rewriting in _run_module_as_main(), because it seems to be redundant with the _get_main_module_details() function
    • Fixed test_cmd_line_script.CmdLineTest.test_dash_m_error_code_is_one(), which was only checking the Python exit status, and not verifying that the specific failure was the one anticipated

    @vadmium
    Copy link
    Member

    vadmium commented Mar 11, 2015

    Posting finding-spec.patch which just has another test I wrote. It tests if AttributeError/ValueError/TypeError gets wrapped in the “finding spec” ImportError, though I’m not sure if this is a bug or a feature, hence I kept it separate. And again I’m not sure of a good way to fix it either.

    @vadmium
    Copy link
    Member

    vadmium commented Dec 1, 2015

    I suspect a proper solution of this bug would also help with bpo-16217 (provide only the relevant traceback for exceptions from user code).

    @vadmium
    Copy link
    Member

    vadmium commented Dec 2, 2015

    Now I have a deeper understanding I think this can be handled separately to bpo-16217.

    This patch pulls together my previous two patches, and adds a fix. There are two aspects of my fix:

    1. Import the package before calling find_spec() on the __main__ submodule. This means exceptions raised by the initialization code can be differentiated from exceptions from find_spec().

    2. Change all the special ImportError exceptions raised inside runpy [and also one raised by InspectLoader.get_code()] to an internal _Error exception known only to runpy. Now I can be sure that all _Error exceptions are not caused by the initialization code, and I can stop catching ImportError, but still catch _Error and suppress the traceback. When runpy is invoked from the documented run_module() or run_path() APIs, _Error is not used, and it still raises ImportError to maintain backwards compatibility.

    I think my patch should avoid the main problem in bpo-19771 as well.

    Please review my patch. There are so many error possibilities; it is hard to be sure I have got them all right.

    @vadmium vadmium added the stdlib Python modules in the Lib dir label Dec 2, 2015
    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Dec 2, 2015

    Martin, your patch looks good to me, and is at the very least an improvement over the status quo (which clearly traps exceptions that it shouldn't), so I'd say go ahead and apply it.

    Thanks for digging into this and figuring out a clean solution.

    @vadmium
    Copy link
    Member

    vadmium commented Dec 2, 2015

    Thanks for the review Nick. You removed Python 3.4 from the versions; do you think it is not worth the risk committing in 3.4? I understand the deadline for the final release of 3.4 is the end of this week.

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Dec 2, 2015

    Right, while I agree this is a bug fix that makes sense to apply to 2.7 and 3.5, it's also a large enough change to runpy's control flow logic that I'm wary of including it in the final 3.4 binary release.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 3, 2015

    New changeset 784a64a21fd0 by Martin Panter in branch '3.5':
    Issue bpo-14285: Do not catch __init__.py exceptions in runpy
    https://hg.python.org/cpython/rev/784a64a21fd0

    New changeset 01397c11ebb8 by Martin Panter in branch 'default':
    Issue bpo-14285: Merge runpy exception fix from 3.5
    https://hg.python.org/cpython/rev/01397c11ebb8

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 4, 2015

    New changeset c4e950338e79 by Martin Panter in branch '2.7':
    Issue bpo-14285: Do not catch ImportError from __init__.py in runpy
    https://hg.python.org/cpython/rev/c4e950338e79

    @vadmium
    Copy link
    Member

    vadmium commented Dec 4, 2015

    Even with my committed fix, tracebacks triggered by initializing a parent package are still suppressed:

    $ ./python -m hello
    Traceback (most recent call last):
      File "/media/disk/home/proj/python/cpython/Lib/runpy.py", line 158, in _run_module_as_main
        mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
      File "/media/disk/home/proj/python/cpython/Lib/runpy.py", line 116, in _get_module_details
        __import__(mod_name)  # Do not catch exceptions initializing package
      File "/media/disk/home/proj/python/cpython/hello/__init__.py", line 1, in <module>
        import random; random.dsgjdgj
    AttributeError: module 'random' has no attribute 'dsgjdgj'
    [Exit 1]
    $ ./python -m hello.submodule
    /media/disk/home/proj/python/cpython/python: Error while finding spec for 'hello.submodule' (<class 'AttributeError'>: module 'random' has no attribute 'dsgjdgj')
    [Exit 1]

    Fixing this in general might require a loop around the find_spec() call as in init-ancestor.patch. But I’m unsure if it is worth the extra complication.

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Dec 4, 2015

    I'm wondering if there might be a simpler option: use rpartition() to strip off any trailing segment (whether that's "__main__" or not), and then always do a plain dynamic import of that package (if any). Something like the following at the start of _get_module_details():

    pkg_name, is_submodule, submodule = mod_name.rpartition(".")
    if is_submodule:
        __import__(pkg_name)
    

    The key is that we *don't* want to be relying on the fact find_spec() will import parent packages implicitly.

    @vadmium
    Copy link
    Member

    vadmium commented Dec 5, 2015

    I think the problem with doing a blind import of the parent package is it would be hard to differentiate between the ImportError when the package (or an ancestor) is missing, versus a user-generated ImportError. Maybe you could inspect the exception as a workaround (untested):

    try:
    __import__(pkg_name)
    except ImportError as err:
    if err.name != pkg_name and not pkg_name.startswith(err.name + "."):
    raise
    raise error(format(err))

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Dec 5, 2015

    That import will only import the parent package (if there is one), not the module itself. Imports from __main__ will still happen during the exec call later on.

    @vadmium
    Copy link
    Member

    vadmium commented Dec 6, 2015

    Yes, the package is all we need to import. I understand this bug is all about importing the parent package, and not __main__. If you read the original report, it is __init__.py that is trying to import a missing module.

    @vadmium
    Copy link
    Member

    vadmium commented Dec 6, 2015

    init-ancestor.2.patch implements the single __import__ with ImportError.name checking

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Dec 6, 2015

    Ah, I think I understand the problem you're getting at now:

    • if the parent package is missing entirely, we still want to suppress the traceback by throwing the runpy specific exception
    • if the parent package is present, but one of the modules *that* tries to import is missing (or otherwise doesn't load), we want to let the traceback happen normally

    The exception handler in the new code snippet addresses that by checking the name attribute on the ImportError, but could likely use a comment explaining the subtlety.

    The way you've handled checking the result of the rpartition call also highlights a deficiency in the runpy docs, and likely also the test suite as well: runpy.run_module and the -m switch only work with absolute module names, but this isn't stated explicitly in the documentation, nor is there any early check in _get_module_details() to bail out if given a relative module name.

    @vadmium
    Copy link
    Member

    vadmium commented Dec 6, 2015

    The test suite does check run_module() with some relative names; see test_invalid_names() in test_runpy. But there does not appear to be a test for “python -m .relative”.

    Changes in init-ancestor.3.patch:

    • Added a comment explaining the exception handling
    • Added check, tests and documentation for relative module names

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Dec 7, 2015

    +1, latest iteration looks good to me.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 11, 2015

    New changeset 3202d143a194 by Martin Panter in branch '3.5':
    Issue bpo-14285: Do not catch exceptions initializing any ancestor package
    https://hg.python.org/cpython/rev/3202d143a194

    New changeset a526ebcfd31d by Martin Panter in branch 'default':
    Issue bpo-14285: Merge runpy fix from 3.5
    https://hg.python.org/cpython/rev/a526ebcfd31d

    @vadmium
    Copy link
    Member

    vadmium commented Dec 11, 2015

    I committed a slightly modified version to 3.5+. I stopped wrapping the new ImportError, and let the existing find_spec() handler wrap it instead. That way the existing error messages stay the same.

    However I cannot figure out an easy way to do a similar fix for 2.7, where ImportError.name does not exist. Therefore I propose to leave 2.7 as it is, with just my first commit. This means that 2.7 still omits the traceback if a parent package fails to initialize:

    $ ./python -m package.submodule
    [. . .]/python: No module named missing_module

    It also means we see an unnecessary traceback with broken *.pyc files, although the eventual error message is improved. This case was reported in <https://bugs.python.org/issue19771#msg248193\>:

    $ mkdir bad_pyc
    $ : > bad_pyc/__init__.pyc  # Create empty file
    $ python2 -m bad_pyc  # Original behaviour
    /sbin/python2: Bad magic number in bad_pyc/__init__.pyc; 'bad_pyc' is a package and cannot be directly executed
    $ ./python -m bad_pyc  # Current behaviour
    Traceback (most recent call last):
      File "[. . .]/Lib/runpy.py", line 163, in _run_module_as_main
        mod_name, _Error)
      File "[. . .]/Lib/runpy.py", line 111, in _get_module_details
        __import__(mod_name)  # Do not catch exceptions initializing package
    ImportError: Bad magic number in bad_pyc/__init__.pyc

    I propose to leave these two cases as they currently are in 2.7, unless someone has any ideas how to improve them.

    @ncoghlan
    Copy link
    Contributor

    Leaving Python 2.7 alone makes sense to me - runpy is heavily dependent on the import system, so there are going to be limits to the improvements that can be made there.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants