classification
Title: Move to absolute file paths for module.__file__
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: brett.cannon Nosy List: Arfrever, brett.cannon, eric.snow, madison.may, ncoghlan, python-dev, ronaldoussoren
Priority: normal Keywords: patch

Created on 2013-07-09 20:01 by brett.cannon, last changed 2013-10-18 16:01 by python-dev. This issue is now closed.

Files
File name Uploaded Description Edit
Issue18416.patch madison.may, 2013-08-12 02:23 Initial attempt at patch that ensures module.__file__ returns an absolute path review
Issue18416_v2.patch madison.may, 2013-08-12 18:03 Removal of unnecessary decorator from test added to test_import
Messages (22)
msg192772 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-07-09 20:01
$ touch blah.py; ./python -c "import blah; print(blah.__file__)"
./blah.py

Should really change that to be an absolute path since the file's location is not accurate if you call os.chdir().
msg192817 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-07-10 15:17
Isn't this because the first entry of sys.path is '' when python is started without a script? All other paths on sys.path are already absolute paths (tested with a relative path in $PYTHONPATH).

If that's correct, just adding os.getcwd() instead of '' as the first entry of sys.path would fix this issue, and would also make sure that imports of modules next to the script keep working when the script uses os.chdir.

The disadvantage is that some users might depend on the current behavior :-(
msg192830 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-07-10 18:47
That's exactly why it currently happens. What I would do is change importlib to just associate '' with os.getcwd() in FileFinder. That locks down where the module was found to an absolute path while still allowing '' to function as the cwd and be dynamically calculated.
msg192841 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-07-10 22:35
Brett's plan sounds good to me. We should also check we make sure __main__.__file__ is always absolute.
msg194224 - (view) Author: Madison May (madison.may) * Date: 2013-08-03 00:13
PathFinder or FileFinder?  Changing PathFinder._path_importer_cache(cls, path) seems to fix the issue.

See line 1302 in _bootstrap.py.

         if path == '':
-            path = '.'
+            path = _os.getcwd()


$ touch blah.py; ./python -c "import blah; print(blah.__file__)"
/home/mmay/cpython/blah.py
msg194555 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-08-06 15:16
I actually meant FileFinder but PathFinder is a good point.

In FileFinder you need to change ``self.path = path or '.'`` to use the cwd.
msg194560 - (view) Author: Madison May (madison.may) * Date: 2013-08-06 16:07
A few days ago I tried the change: ``self.path = path or _os.cwd()``, but the problem didn't seem to resolve itself.  

``./python -c "import blah; print(blah.__file__)`` still returned a relative path on my system. The tie between FileFinder and the __file__ attribute isn't yet obvious to me.  

I'm traveling for the next few days but I'll probably give it a second look when I get some free time.
msg194588 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-08-06 22:17
Make sure to run "make" to rebuild the frozen module - editing
importlib._bootstrap.py involves a C style "edit, build, test" cycle rather
than the typical Python "edit, test" cycle.
msg194591 - (view) Author: Madison May (madison.may) * Date: 2013-08-06 23:16
Thanks for the heads up, Nick.  I've worked with _bootstrap.py before, so I'm familiar with the cycle.
msg194596 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-08-07 04:58
Figured it was worth mentioning, since I've been caught by forgetting
that myself :)

I suspect you're right that it's just a case of '.' passing the truth
test, even though it still results in a relative path.
msg194637 - (view) Author: Madison May (madison.may) * Date: 2013-08-08 01:16
Nick, it was definitely a good thing to mention.  I had to learn the "edit, build, test" cycle the hard way my first time.  It took me a good 15-20 minutes to figure out why none of my edits seemed to change anything :)


Anyhow, here's how I see the issue. It seems like we have three main options:

In option one, we only modify PathFinder._path_importer_cache().

         if path == '':
-            path = '.'
+            path = _os.getcwd()

This associates the cwd with FileFinder(cwd) in sys.path_importer_cache

In option two, we only modify FileFinder.__init__().

-        self.path = path or '.'
+        if not path or path == '.':
+            path = _os.getcwd()
+        self.path = path

This associates '.' with FileFinder(cwd) in sys.path_importer_cache.

In option three, we modify both PathFinder and FileFinder.  In PathFinder._path_importer_cache() we remove the line that reassigns path to '.' if path is the empty string.

-        if path == '':
-            path = '.'

In FileFinder.__init__(), we set path to _os.getcwd() if path is false.

-        self.path = path or '.'
+        self.path = path or _os.getcwd() 

This associates the empty string with FileFinder(cwd) in sys.path_importer_cache.

What are your thoughts? Which solution would you all support?
msg194663 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-08-08 12:24
So this is bringing up a sticky situation that I ran across when initially implementing all of this: what should sys.path_importer_cache use as a key? '' would be what happens with Madison's option 3, and with option 1 it would be os.getcwd(). Now if you iterate through sys.path you won't find what '' connects to. Then again, it won't be accurate if you change the directory either.

So the question becomes should sys.path_importer_cache reflect exactly what is in sys.path or what should be cached for a finder based on what import should do (i.e. the full path and change based on the cwd)?
msg194778 - (view) Author: Madison May (madison.may) * Date: 2013-08-09 23:59
I quickly ran the tests with each of the above edits to see what bits of the test suite would break.

With option 1 (edits to PathFinder only):

``1 test failed:
    test_importlib``
 

With option 2 (edits to FileFinder only):

``3 tests failed:
    test_import test_importlib test_support``


With option 3 (edits to PathFinder and FileFinder):

``3 tests failed:
    test_import test_importlib test_support``

So using the cwd as a key in sys.path_importer_cache seems to break fewer tests than using '' as a key does.  Perhaps that counts for something.

In test_importlib, the only test to fail was `test_path_importer_cache_empty_string()`, for rather obvious reasons. 
I haven't spent much time looking at failing tests in test_import and test_support yet, though.


In regard to sys.path_importer_cache behavior -- at first glance, I'd lean toward either:

1) Creating a new entry in sys.path_importer_cache after changing directories and importing a new module (as is the case in option 1).

Key: os.getcwd() => Value: FileFinder(os.getcwd())

2) Keeping the current behavior in sys.path_importer_cache and somehow changing the __file__ attribute a bit further down the line (perhaps in `FileFinder._init_file_attr()` or `FileLoader.get_filename()`).

Key: '.' => Value: FileFinder('.') 

or with a minor tweak for consistency with sys.path:

Key: ''  => Value: FileFinder('.')

Have you given the issue any more thought, Brett?
msg194829 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-08-10 20:03
I'm currently leaning towards having sys.path_importer_cache store the actual directory name.
msg194831 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-08-10 20:15
> I'm currently leaning towards having sys.path_importer_cache store
> the actual directory name.

Makes sense to me.  It the problem that sys.path will still have '' in it and it may be hard to figure out what it means (and how it maps to sys.path_importer_cache)?
msg194840 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-08-10 21:58
It seems to me that by *not* doing that, we may have a bug if the cwd changes and the import machinery returns a stateful importer that remembers the path name. Using the actual filesystem path for everything other than sys.path sounds like a much better option.
msg194911 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-08-11 20:14
To answer Eric's question: yes.

Since no one seems to be screaming that sys.path be the only place to have the concept of a relative path, then let's use only absolute paths except in sys.path for ''.
msg194918 - (view) Author: Madison May (madison.may) * Date: 2013-08-12 02:23
> I'm currently leaning towards having sys.path_importer_cache store 
> the actual directory name.

Exactly what I meant by "Creating a new entry in sys.path_importer_cache after changing directories and importing a new module", but expressed much more succinctly :)  Good to see we're on pretty much the same page.


Here's a (very) preliminary patch for the issue that stores the directory name as a key in sys.path_importer_cache and ensures module.__file__ is an absolute path. 

I've also modified test_path_importer_cache_empty_string in test_importlib/import_/test_path.py to use os.getcwd() instead of os.curdir as the key for sys.path_importer_cache.

Finally, I've added a test to test_import.py that tests that module.__file__ is equivalent to os.path.abspath(module.__file__).

Look it over when you get a chance and let me know what you would change.  Thanks...
msg194978 - (view) Author: Madison May (madison.may) * Date: 2013-08-12 18:02
Here's a minor revision to that patch removing an unnecessary @skip_if_dont_write_bytecode decorator from the test I added to test_import.py.  

No docs changes are included in the current patch -- I'm guessing this should probably wait until we have all the other details ironed out.
msg200276 - (view) Author: Roundup Robot (python-dev) Date: 2013-10-18 15:39
New changeset 76184b5339f2 by Brett Cannon in branch 'default':
Issue #18416: Have importlib.machinery.PathFinder treat '' as the cwd
http://hg.python.org/cpython/rev/76184b5339f2
msg200277 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-10-18 15:41
I went with option 1 (changed PathFinder to consider '' the cwd) which allowed me to flat-out remove the special-casing for '' in FileFinder.

Thanks for the initial patch, Madison!
msg200280 - (view) Author: Roundup Robot (python-dev) Date: 2013-10-18 16:01
New changeset 33844153cd02 by Brett Cannon in branch 'default':
Issue #18416: Fix various os calls in importlib.machinery.FileFinder
http://hg.python.org/cpython/rev/33844153cd02
History
Date User Action Args
2013-10-18 16:01:35python-devsetmessages: + msg200280
2013-10-18 15:41:11brett.cannonsetstatus: open -> closed
resolution: fixed
messages: + msg200277

stage: test needed -> resolved
2013-10-18 15:39:43python-devsetnosy: + python-dev
messages: + msg200276
2013-10-17 18:13:16brett.cannonsetassignee: brett.cannon
2013-09-08 22:21:22pitrouunlinkissue12317 dependencies
2013-09-08 22:13:53brett.cannonlinkissue12317 dependencies
2013-08-12 18:03:57madison.maysetfiles: + Issue18416_v2.patch
2013-08-12 18:02:57madison.maysetmessages: + msg194978
2013-08-12 02:23:44madison.maysetfiles: + Issue18416.patch
keywords: + patch
messages: + msg194918
2013-08-11 20:14:57brett.cannonsetmessages: + msg194911
2013-08-10 21:58:20ncoghlansetmessages: + msg194840
2013-08-10 20:15:43eric.snowsetnosy: + eric.snow
messages: + msg194831
2013-08-10 20:03:12brett.cannonsetmessages: + msg194829
2013-08-09 23:59:55madison.maysetmessages: + msg194778
2013-08-08 12:24:46brett.cannonsetmessages: + msg194663
2013-08-08 01:16:49madison.maysetmessages: + msg194637
2013-08-07 04:58:41ncoghlansetmessages: + msg194596
2013-08-06 23:16:21madison.maysetmessages: + msg194591
2013-08-06 22:17:31ncoghlansetmessages: + msg194588
2013-08-06 16:07:46madison.maysetmessages: + msg194560
2013-08-06 15:16:27brett.cannonsetmessages: + msg194555
2013-08-03 00:13:26madison.maysetnosy: + madison.may
messages: + msg194224
2013-07-10 22:35:31ncoghlansetmessages: + msg192841
2013-07-10 18:47:47brett.cannonsetmessages: + msg192830
2013-07-10 15:17:15ronaldoussorensetnosy: + ronaldoussoren
messages: + msg192817
2013-07-09 20:21:07brett.cannonsetnosy: + ncoghlan
2013-07-09 20:12:20Arfreversetnosy: + Arfrever
2013-07-09 20:01:47brett.cannoncreate