classification
Title: sys.path[0] when executed thru a symbolic link
Type:
Stage:
Components: macOS
Versions: Python 2.4, Python 2.6
process
Status: closed
Resolution: works for me
Dependencies:
Superseder:
Assigned To: jackjansen
Nosy List: jackjansen, kowaltowski, kristjan.jonsson, neologix
Priority: normal
Keywords:

Created on 2005-12-21 20:23 by kowaltowski, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name     Uploaded                       Description
TESTPATH.tgz  kowaltowski, 2005-12-21 20:23
Messages (11)
msg27120 - (view) Author: Tomasz Kowaltowski (kowaltowski) Date: 2005-12-21 20:23
Under certain conditions there is a difference between
Mac OS X and Linux (both 2.4.1) with regard to the
value of the variable sys.path[0] which should contain
the directory from which the script was called.

This difference appears when the script is called
through a symbolic link by a different user. See the
attached example. It should be executed once by the
owner of the TESTPATH directory:

   ~/TESTPATH/sub1/testpath.py
and
   ~/TESTPATH/sub2/testpath.py

In both cases, under Linux and Mac OS X, the result is:

   /home/owner/TESTPATH/sub1

If a different user executes:

   ~owner/TESTPATH/sub1/testpath.py
and
   ~owner/TESTPATH/sub2/testpath.py

he gets the same results under Linux:

   /home/owner/TESTPATH/sub1

but two different results under Mac OS:

   /Users/owner/TESTPATH/sub1
and
   /Users/owner/TESTPATH/sub2

This seems like a minor problem but it breaks my
application because sys.path[0] is the first place to
look for imports!

I am not sure whether this is a Python problem or
something to do with the Mac OS X. My Mac OS X version
is 10.4.3.
msg27121 - (view) Author: Jack Jansen (jackjansen) * (Python committer) Date: 2005-12-21 20:39
I don't see this problem: both users see "sub1" as the working directory.
This is also on 10.4.3.

My guess: there is some problem with the readlink() call that Python uses to
obtain the real pathname of the script (this is how it finds out that
sub2/testpath.py is really sub1/testpath.py). Easy to test: fire up Python as
user 2 and do os.readlink("/Users/owner/TESTPATH/sub2").

I wouldn't be surprised if it is some sort of permission problem (maybe
/Users/owner being mode rwx--x--x, so that readlink can't traverse through
it?).
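
The check suggested above can be scripted. This is a minimal sketch, not part of the original report: `probe_readlink` is a hypothetical helper name, and it assumes Python 3 on a POSIX system. It returns either the link target or the errno raised by os.readlink, which makes a permission failure (errno 13, EACCES) easy to tell apart from a path that simply is not a symlink (EINVAL).

```python
import os


def probe_readlink(path):
    """Return ("ok", target) if os.readlink succeeds, else ("error", errno)."""
    try:
        return ("ok", os.readlink(path))
    except OSError as exc:
        # exc.errno distinguishes EACCES (permissions) from EINVAL (not a link)
        # and ENOENT (no such path).
        return ("error", exc.errno)
```

Run as user 2 against both /Users/owner/TESTPATH/sub2 and /Users/owner/TESTPATH/sub2/testpath.py, this would show which path, if either, is the one that fails.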
msg27122 - (view) Author: Tomasz Kowaltowski (kowaltowski) Date: 2005-12-22 11:40
(1) I think it is a problem because under Mac OS X the user
#2 sees "sub2" (and not "sub1"!) as the working directory,
which is where it is different from Linux.

(2) The problem persists even if all permissions are open.

(3) The implementation of "os.readlink" might be the right
clue about the problem: when user #2 executes
os.readlink("/Users/owner/TESTPATH/sub2/testpath.py"), the
answer under Mac OS X is:

OSError: [Errno 13] Permission denied:
'/Users/owner/TESTPATH/sub2/testpath.py'

even though all permissions are open.

Under Linux I get the expected answer: "../sub1/testpath.py".

Obviously there is a problem under Mac OS X, and this matter
should be reopened.

msg186034 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2013-04-04 14:34
I just came across this when running Hadoop jobs. Hadoop takes your .py script folder, puts each file in its own cache folder, then recreates the script folder, populating it with individual symlinks.
When run like this, the scripts can no longer import each other, because sys.path[0] is set to the "real" location of the file rather than the place it was invoked from.

I just reproduced this with Python 2.7.3 on a new Ubuntu system:

repro:
mkdir pydir
mkdir pydir/lnk
echo 'import sys; print ">", sys.path[0]' >> pydir/lnk/test.py
ln -s lnk/test.py pydir/test.py
python pydir/test.py
> /home/kristjan/pydir/lnk

You would have expected "/home/kristjan/pydir" since this is the apparent directory of the file.  When "pydir" contains many .py files, each residing in their own unique real target directories, then they cannot import each other.
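
The same layout can be rebuilt from Python, which makes the check repeatable. This is a minimal sketch, assuming a POSIX system with symlink support and Python 3; `run_through_symlink` is a hypothetical name, and a temporary directory stands in for /home/kristjan.

```python
import os
import subprocess
import sys
import tempfile


def run_through_symlink():
    """Build pydir/lnk/test.py, link it as pydir/test.py, and run the link.

    Returns (apparent_dir, real_dir, sys.path[0] as reported by the script).
    """
    # Resolve the temp root up front so later comparisons are not confused
    # by a symlinked temp directory (e.g. /var on macOS).
    root = os.path.realpath(tempfile.mkdtemp())
    apparent_dir = os.path.join(root, "pydir")
    real_dir = os.path.join(apparent_dir, "lnk")
    os.makedirs(real_dir)
    with open(os.path.join(real_dir, "test.py"), "w") as f:
        f.write("import sys\nprint(sys.path[0])\n")
    link = os.path.join(apparent_dir, "test.py")
    # Relative target, as in the shell repro: pydir/test.py -> lnk/test.py
    os.symlink(os.path.join("lnk", "test.py"), link)
    out = subprocess.check_output([sys.executable, link])
    return apparent_dir, real_dir, out.decode().strip()
```

If the interpreter resolves the link as described in this message, the third value matches real_dir rather than apparent_dir.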
msg186037 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2013-04-04 14:51
The Hadoop thingie in question is called Cloudera CDH4.
msg186039 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-04-04 15:49
> You would have expected "/home/kristjan/pydir" since this is the
> apparent directory of the file.

That's questionable.
You usually have the libraries along with the binary: that's for example the case when you do a CPython checkout.

Changing this to not resolve the symlink would break some use cases. An alternative would be to add both the original and target directory if they differ, hoping that there's no conflict in the modules.
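
The alternative mentioned above (adding both directories when they differ) could look roughly like the following. This is only a sketch of the idea, not how CPython computes sys.path[0]: `candidate_script_dirs` is a hypothetical name, and putting the resolved directory first is an assumption, since the ordering is not specified in this message.

```python
import os


def candidate_script_dirs(script_path):
    """Return the directories a script could contribute to sys.path.

    The resolved (real) directory comes first; the apparent directory is
    appended only when it differs.
    """
    # abspath() normalizes the path without following symlinks.
    apparent_dir = os.path.dirname(os.path.abspath(script_path))
    # realpath() follows symlinks in every component of the path.
    real_dir = os.path.dirname(os.path.realpath(script_path))
    if apparent_dir == real_dir:
        return [real_dir]
    return [real_dir, apparent_dir]
```

With two entries, a module name that exists in both directories would be taken from whichever comes first, which is the conflict risk noted above.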
msg186042 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2013-04-04 16:33
There is no "binary" here contributing to the problem.  The particular case is the Hadoop runtime engine, which creates a "virtual" folder of your working directory.  We have set up a directory of .py files:

foo/
 a.py
 b.py
 c.py

The Hadoop job is then to run a.py.  It is run simply as "python a.py", in this case by cd-ing into the dir and running the file.  Hadoop knows nothing of Python and merely executes the given file.

Now, what this hadoop implementation does, however, is to create a virtual symlink image of your project directory, and duplicate this in various places, e.g.:

tmp1/
 a.py -> /secret/filecache/0001/a.py
 b.py -> /secret/filecache/0002/b.py
 c.py -> /secret/filecache/0003/c.py

tmp2/
 a.py -> /secret/filecache/0001/a.py
 b.py -> /secret/filecache/0002/b.py
 c.py -> /secret/filecache/0003/c.py

Notice how each file, previously together in a folder, now physically resides in a unique place.
Now, when you run a.py from either tmp1 or tmp2, it will have
sys.path[0] == "/secret/filecache/0001"

This means that a.py can no longer import b.py.  I am unaware of a workaround for this issue which does not involve modifying the individual .py files themselves to set up a path.

Fixing this would mean that rather than
- getting the absolute path of the .py file and extracting the path from it,
you would
- extract the relative path of the .py file and retrieve the absolute path for it.

I am not sure about what use cases could be broken by the above change, do you have examples?
Normal use cases of symbolic links have to do with linking entire folders, not individual files, and that behaviour would not be broken by  such a hypothetical change, I think.
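
The per-file modification referred to above might look like the following. This is a sketch only: `add_apparent_dir` is a hypothetical helper, and it assumes the script is invoked by a path that still points into the symlinked folder (e.g. tmp1/a.py) and that the working directory has not changed since startup.

```python
import os
import sys


def add_apparent_dir(script_path, path_list=None):
    """Prepend the unresolved directory of script_path to path_list.

    path_list defaults to sys.path. Returns the directory that was used.
    """
    if path_list is None:
        path_list = sys.path
    # abspath() does not follow symlinks, so this is the directory the
    # script appears to live in, not the directory of the link target.
    apparent_dir = os.path.dirname(os.path.abspath(script_path))
    if apparent_dir not in path_list:
        path_list.insert(0, apparent_dir)
    return apparent_dir
```

The helper would have to be pasted at the top of a.py itself and called as add_apparent_dir(__file__) before "import b", since it cannot be imported from a sibling file until the path has been fixed.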
msg186063 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-04-05 07:06
> I am not sure about what use cases could be broken by the above change, do you have examples?
> Normal use cases of symbolic links have to do with linking entire folders, not individual files, and that behaviour would not be broken by  such a hypothetical change, I think.

For example:

/tmp/
    foo/
        foo.py
        libfoo.py

    foo.py -> /tmp/foo/foo.py

With foo.py containing "import libfoo".

Now, calling /tmp/foo.py works because sys.path[0] ==
dirname(realpath("/tmp/foo.py")) == dirname("/tmp/foo/foo.py") ==
"/tmp/foo".

If we change sys.path[0] to not dereference the symlink (that's how I
understood your suggestion, maybe I got it wrong), it will now be
/tmp, and importing libfoo will fail.

That's AFAICT exactly the problem reported by the OP on OS X.
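
The layout above can be checked with a short script. This is a minimal sketch, assuming a POSIX system and Python 3.7 or later; `run_linked_foo` is a hypothetical name, and a temporary directory stands in for /tmp.

```python
import os
import subprocess
import sys
import tempfile


def run_linked_foo():
    """Recreate the foo/ layout and run foo.py through the top-level link.

    Returns (return code, stripped stdout) of the child interpreter.
    """
    root = os.path.realpath(tempfile.mkdtemp())
    foo_dir = os.path.join(root, "foo")
    os.mkdir(foo_dir)
    with open(os.path.join(foo_dir, "libfoo.py"), "w") as f:
        f.write("NAME = 'libfoo'\n")
    with open(os.path.join(foo_dir, "foo.py"), "w") as f:
        f.write("import libfoo\nprint(libfoo.NAME)\n")
    # <root>/foo.py -> <root>/foo/foo.py, mirroring /tmp/foo.py above
    link = os.path.join(root, "foo.py")
    os.symlink(os.path.join(foo_dir, "foo.py"), link)
    result = subprocess.run(
        [sys.executable, link], capture_output=True, text=True
    )
    return result.returncode, result.stdout.strip()
```

A zero return code with "libfoo" printed indicates that the import was satisfied from the link target's directory, which is the behaviour this example relies on.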
msg186064 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-04-05 07:09
I've no clue what happened to the issue title (I just replied to the email, and the title changed)...
msg186068 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2013-04-05 09:03
Thanks for your example.

> That's AFAICT exactly the problem reported by the OP on OS X.
You are right, I mis-read the original problem.

IMHO, the example you quote is "unexpected".  The purpose of symbolic links is to create a "virtual" image of a structure.
A structure like the one you describe:
/scripts/
  foo.py -> /otherplace/foo.py

contains only a foo.py in its apparent location (/scripts).  I would not expect the file to be able to import stuff from /otherplace unless that stuff were also present in /scripts.

In other words: symlinking individual files normally works like you are "pulling that file in", not "hopping into that file's real location".

This behaviour is unexpected because I know of no other language tools that behave in this way:

/code/
  myfile.c -> /sources/myfile.c
  mylib.h  -> /libs/mylib.h
  libmylib.so -> /libs/libmylib.so

An '#include "mylib.h"' in myfile.c would look for the file in /code and find it.
A "cc myfile.c -lmylib" would find the libmylib.so in /code.

Since this is not the original problem described, I'll open up a separate defect report.
msg186071 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2013-04-05 09:30
Closing this again.
Opened up a new issue:
http://bugs.python.org/issue17639
History
Date                 User              Action  Args
2022-04-11 14:56:14  admin             set     github: 42713
2013-04-05 09:30:13  kristjan.jonsson  set     status: open -> closed; messages: + msg186071
2013-04-05 09:03:16  kristjan.jonsson  set     messages: + msg186068
2013-04-05 07:09:43  neologix          set     messages: + msg186064; title: sys.path -> sys.path[0] when executed thru a symbolic link
2013-04-05 07:06:09  neologix          set     messages: + msg186063; title: sys.path[0] when executed thru a symbolic link -> sys.path
2013-04-04 16:33:32  kristjan.jonsson  set     messages: + msg186042
2013-04-04 15:49:58  neologix          set     nosy: + neologix; messages: + msg186039
2013-04-04 14:51:04  kristjan.jonsson  set     messages: + msg186037
2013-04-04 14:34:15  kristjan.jonsson  set     status: closed -> open; versions: + Python 2.6; nosy: + kristjan.jonsson; messages: + msg186034
2005-12-21 20:23:44  kowaltowski       create