Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move to absolute file paths for module.__file__ #62616

Closed
brettcannon opened this issue Jul 9, 2013 · 22 comments
Closed

Move to absolute file paths for module.__file__ #62616

brettcannon opened this issue Jul 9, 2013 · 22 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@brettcannon
Copy link
Member

BPO 18416
Nosy @brettcannon, @ronaldoussoren, @ncoghlan, @ericsnowcurrently
Files
  • Issue18416.patch: Initial attempt at patch that ensures module.file returns an absolute path
  • Issue18416_v2.patch: Removal of unnecessary decorator from test added to test_import
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/brettcannon'
    closed_at = <Date 2013-10-18.15:41:11.635>
    created_at = <Date 2013-07-09.20:01:47.489>
    labels = ['interpreter-core', 'type-bug']
    title = 'Move to absolute file paths for module.__file__'
    updated_at = <Date 2013-10-18.16:01:35.866>
    user = 'https://github.com/brettcannon'

    bugs.python.org fields:

    activity = <Date 2013-10-18.16:01:35.866>
    actor = 'python-dev'
    assignee = 'brett.cannon'
    closed = True
    closed_date = <Date 2013-10-18.15:41:11.635>
    closer = 'brett.cannon'
    components = ['Interpreter Core']
    creation = <Date 2013-07-09.20:01:47.489>
    creator = 'brett.cannon'
    dependencies = []
    files = ['31236', '31248']
    hgrepos = []
    issue_num = 18416
    keywords = ['patch']
    message_count = 22.0
    messages = ['192772', '192817', '192830', '192841', '194224', '194555', '194560', '194588', '194591', '194596', '194637', '194663', '194778', '194829', '194831', '194840', '194911', '194918', '194978', '200276', '200277', '200280']
    nosy_count = 7.0
    nosy_names = ['brett.cannon', 'ronaldoussoren', 'ncoghlan', 'Arfrever', 'python-dev', 'eric.snow', 'madison.may']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue18416'
    versions = ['Python 3.4']

    @brettcannon
    Copy link
    Member Author

    $ touch blah.py; ./python -c "import blah; print(blah.__file__)"
    ./blah.py

    Should really change that to be an absolute path since the file's location is not accurate if you call os.chdir().

    @brettcannon brettcannon added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Jul 9, 2013
    @ronaldoussoren
    Copy link
    Contributor

    Isn't this because the first entry of sys.path is '' when python is started without a script? All other paths on sys.path are already absolute paths (tested with a relative path in $PYTHONPATH).

    If that's correct, just adding os.getcwd() instead of '' as the first entry of sys.path would fix this issue, and would also make sure that imports of modules next to the script keep working when the script uses os.chdir.

    The disadvantage is that some users might depend on the current behavior :-(

    @brettcannon
    Copy link
    Member Author

    That's exactly why it currently happens. What I would do is change importlib to just associate '' with os.getcwd() in FileFinder. That locks down where the module was found to an absolute path while still allowing '' to function as the cwd and be dynamically calculated.

    @ncoghlan
    Copy link
    Contributor

    Brett's plan sounds good to me. We should also check we make sure __main__.__file__ is always absolute.

    @MadisonMay
    Copy link
    Mannequin

    MadisonMay mannequin commented Aug 3, 2013

    PathFinder or FileFinder? Changing PathFinder._path_importer_cache(cls, path) seems to fix the issue.

    See line 1302 in _bootstrap.py.

             if path == '':
    -            path = '.'
    +            path = _os.getcwd()
    $ touch blah.py; ./python -c "import blah; print(blah.__file__)"
    /home/mmay/cpython/blah.py

    @brettcannon
    Copy link
    Member Author

    I actually meant FileFinder but PathFinder is a good point.

    In FileFinder you need to change self.path = path or '.' to use the cwd.

    @MadisonMay
    Copy link
    Mannequin

    MadisonMay mannequin commented Aug 6, 2013

    A few days ago I tried the change: self.path = path or _os.cwd(), but the problem didn't seem to resolve itself.

    ./python -c "import blah; print(blah.__file__) still returned a relative path on my system. The tie between FileFinder and the file attribute isn't yet obvious to me.

    I'm traveling for the next few days but I'll probably give it a second look when I get some free time.

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Aug 6, 2013

    Make sure to run "make" to rebuild the frozen module - editing
    importlib._bootstrap.py involves a C style "edit, build, test" cycle rather
    than the typical Python "edit, test" cycle.

    @MadisonMay
    Copy link
    Mannequin

    MadisonMay mannequin commented Aug 6, 2013

    Thanks for the heads up, Nick. I've worked with _bootstrap.py before, so I'm familiar with the cycle.

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Aug 7, 2013

    Figured it was worth mentioning, since I've been caught by forgetting
    that myself :)

    I suspect you're right that it's just a case of '.' passing the truth
    test, even though it still results in a relative path.

    @MadisonMay
    Copy link
    Mannequin

    MadisonMay mannequin commented Aug 8, 2013

    Nick, it was definitely a good thing to mention. I had to learn the "edit, build, test" cycle the hard way my first time. It took me a good 15-20 minutes to figure out why none of my edits seemed to change anything :)

    Anyhow, here's how I see the issue. It seems like we have three main options:

    In option one, we only modify PathFinder._path_importer_cache().

             if path == '':
    -            path = '.'
    +            path = _os.getcwd()

    This associates the cwd with FileFinder(cwd) in sys.path_importer_cache

    In option two, we only modify FileFinder.__init__().

    -        self.path = path or '.'
    +        if not path or path == '.':
    +            path = _os.getcwd()
    +        self.path = path

    This associates '.' with FileFinder(cwd) in sys.path_importer_cache.

    In option three, we modify both PathFinder and FileFinder. In PathFinder._path_importer_cache() we remove the line that reassigns path to '.' if path is the empty string.

    •    if path == '':
      
    •        path = '.'
      

    In FileFinder.__init__(), we set path to _os.getcwd() if path is false.

    •    self.path = path or '.'
      

    + self.path = path or _os.getcwd()

    This associates the empty string with FileFinder(cwd) in sys.path_importer_cache.

    What are your thoughts? Which solution would you all support?

    @brettcannon
    Copy link
    Member Author

    So this is bringing up a sticky situation that I ran across when initially implementing all of this: what should sys.path_importer_cache use as a key? '' would be what happens with Madison's option 3, and with option 1 it would be os.getcwd(). Now if you iterate through sys.path you won't find what '' connects to. Then again, it won't be accurate if you change the directory either.

    So the question becomes should sys.path_importer_cache reflect exactly what is in sys.path or what should be cached for a finder based on what import should do (i.e. the full path and change based on the cwd)?

    @MadisonMay
    Copy link
    Mannequin

    MadisonMay mannequin commented Aug 9, 2013

    I quickly ran the tests with each of the above edits to see what bits of the test suite would break.

    With option 1 (edits to PathFinder only):

    1 test failed: test_importlib

    With option 2 (edits to FileFinder only):

    3 tests failed: test_import test_importlib test_support

    With option 3 (edits to PathFinder and FileFinder):

    3 tests failed: test_import test_importlib test_support

    So using the cwd as a key in sys.path_importer_cache seems to break fewer tests than using '' as a key does. Perhaps that counts for something.

    In test_importlib, the only test to fail was test_path_importer_cache_empty_string(), for rather obvious reasons.
    I haven't spent much time looking at failing tests in test_import and test_support yet, though.

    In regard to sys.path_importer_cache behavior -- at first glance, I'd lean toward either:

    1. Creating a new entry in sys.path_importer_cache after changing directories and importing a new module (as is the case in option 1).

    Key: os.getcwd() => Value: FileFinder(os.getcwd())

    1. Keeping the current behavior in sys.path_importer_cache and somehow changing the file attribute a bit further down the line (perhaps in FileFinder._init_file_attr() or FileLoader.get_filename()).

    Key: '.' => Value: FileFinder('.')

    or with a minor tweak for consistency with sys.path:

    Key: '' => Value: FileFinder('.')

    Have you given the issue any more thought, Brett?

    @brettcannon
    Copy link
    Member Author

    I'm currently leaning towards having sys.path_importer_cache store the actual directory name.

    @ericsnowcurrently
    Copy link
    Member

    I'm currently leaning towards having sys.path_importer_cache store
    the actual directory name.

    Makes sense to me. It the problem that sys.path will still have '' in it and it may be hard to figure out what it means (and how it maps to sys.path_importer_cache)?

    @ncoghlan
    Copy link
    Contributor

    It seems to me that by *not* doing that, we may have a bug if the cwd changes and the import machinery returns a stateful importer that remembers the path name. Using the actual filesystem path for everything other than sys.path sounds like a much better option.

    @brettcannon
    Copy link
    Member Author

    To answer Eric's question: yes.

    Since no one seems to be screaming that sys.path be the only place to have the concept of a relative path, then let's use only absolute paths except in sys.path for ''.

    @MadisonMay
    Copy link
    Mannequin

    MadisonMay mannequin commented Aug 12, 2013

    I'm currently leaning towards having sys.path_importer_cache store
    the actual directory name.

    Exactly what I meant by "Creating a new entry in sys.path_importer_cache after changing directories and importing a new module", but expressed much more succinctly :) Good to see we're on pretty much the same page.

    Here's a (very) preliminary patch for the issue that stores the directory name as a key in sys.path_importer_cache and ensures module.__file__ is an absolute path.

    I've also modified test_path_importer_cache_empty_string in test_importlib/import_/test_path.py to use os.getcwd() instead of os.curdir as the key for sys.path_importer_cache.

    Finally, I've added a test to test_import.py that tests that module.__file__ is equivalent to os.path.abspath(module.__file__).

    Look it over when you get a chance and let me know what you would change. Thanks...

    @MadisonMay
    Copy link
    Mannequin

    MadisonMay mannequin commented Aug 12, 2013

    Here's a minor revision to that patch removing an unnecessary @skip_if_dont_write_bytecode decorator from the test I added to test_import.py.

    No docs changes are included in the current patch -- I'm guessing this should probably wait until we have all the other details ironed out.

    @brettcannon brettcannon self-assigned this Oct 17, 2013
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 18, 2013

    New changeset 76184b5339f2 by Brett Cannon in branch 'default':
    Issue bpo-18416: Have importlib.machinery.PathFinder treat '' as the cwd
    http://hg.python.org/cpython/rev/76184b5339f2

    @brettcannon
    Copy link
    Member Author

    I went with option 1 (changed PathFinder to consider '' the cwd) which allowed me to flat-out remove the special-casing for '' in FileFinder.

    Thanks for the initial patch, Madison!

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 18, 2013

    New changeset 33844153cd02 by Brett Cannon in branch 'default':
    Issue bpo-18416: Fix various os calls in importlib.machinery.FileFinder
    http://hg.python.org/cpython/rev/33844153cd02

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants