Issue89

Title: easy_install silently drops symlinks when auto-extracting tarball source distributions
Priority: bug
Status: resolved
Superseder:
Nosy List: maxb, pje
Assigned To:
Keywords:

Created on 2009-11-01.11:00:41 by maxb, last changed 2011-03-23.20:55:51 by pje.

Messages
msg603 Author: pje Date: 2011-03-23.20:55:51
Closing due to no reply by requester in over a year.  (The patch has already been out in the development snapshot releases for almost that long, anyway.)
msg498 Author: pje Date: 2010-02-01.18:23:02
The Python tarfile module normally extracts links as copies, actually, in order to work across platforms.  (And it uses a similar loop approach, although in its case a recursion error will result if the symlinks are looped, rather than an infinite loop.)

Also, pytz is distributed in .egg form on PyPI, and eggs cannot contain symlinks (zipfiles don't support them), so I doubt the copying will cause any problems.

Nonetheless, if you could test the patch, it'd be helpful in getting this into the next release.  Thanks.

(Btw, as far as I know, pytz is the only package with *any* symlinks in its source distribution, and I doubt it's by conscious choice OR requirement.  If you look at their .zip source distribution, you'll see that it also contains copies, rather than symlinks.)
msg494 Author: maxb Date: 2010-02-01.17:57:12
As a further remark:
I think the suggested new code would go into an infinite loop if someone had
self-referential symlinks.  Someone somewhere is likely to be silly enough to do
that, one day.
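
The concern above can be illustrated with a minimal sketch of link resolution that gives up on self-referential chains instead of looping. This is not the patch under discussion: `resolve_link` and `max_hops` are hypothetical names, and it uses the public `TarFile.getmember()` rather than the private `_getmember()` call in the patch. It also assumes symlink targets are relative to the link's own directory and hard-link targets are archive-relative paths.

```python
import posixpath
import tarfile


def resolve_link(tarobj, member, max_hops=32):
    """Follow link members to the file or directory they point at.

    Returns None if the chain is broken, loops back on itself,
    or is longer than max_hops (a hypothetical safety limit).
    """
    seen = set()
    while member.islnk() or member.issym():
        if member.name in seen or len(seen) >= max_hops:
            return None  # self-referential or overly long chain
        seen.add(member.name)
        target = member.linkname
        if member.issym():
            # Assumption: symlink targets are relative to the link's directory.
            target = posixpath.normpath(
                posixpath.join(posixpath.dirname(member.name), target))
        try:
            member = tarobj.getmember(target)
        except KeyError:
            return None  # target is not in the archive
    return member
```

A caller would extract the returned member under the link's original name, or skip the entry when `None` comes back.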
msg492 Author: maxb Date: 2010-02-01.17:54:50
Your proposed fix would convert symlinks into copies - I'll have to go check whether pytz will be adversely affected by this at all. Would it be possible to extract symlinks as real symlinks?

Also, can I recommend that at any point in this routine where it decides not to extract an archive member, it report a user-visible warning? I, and others, were very confused for a while before we tracked down the cause of this behaviour - some warnings on stderr would have saved a lot of time.
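
As a rough sketch of that request, the filtering step could report each member it declines to extract. The generator below is hypothetical (`iter_extractable` is not a setuptools function); it only mirrors the path and type checks quoted in this issue and writes one line per skipped member to a stream that defaults to stderr.

```python
import sys
import tarfile


def iter_extractable(tarobj, stream=sys.stderr):
    """Yield regular files and directories; warn about everything else."""
    for member in tarobj:
        name = member.name
        # Same path check as the code quoted in this issue.
        if name.startswith('/') or '..' in name:
            stream.write("warning: skipping unsafe path %r\n" % name)
            continue
        if not (member.isfile() or member.isdir()):
            stream.write("warning: skipping non-file member %r\n" % name)
            continue
        yield member
```

With something like this in place, a dropped symlink would at least leave a visible trace in the install output.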
msg489 Author: pje Date: 2010-02-01.17:46:06
Would you mind testing the following patch, to see if it solves your problem?  Thanks.

Index: setuptools/archive_util.py
===================================================================
--- setuptools/archive_util.py  (revision 75363)
+++ setuptools/archive_util.py  (working copy)
@@ -180,11 +180,15 @@
     try:
         tarobj.chown = lambda *args: None   # don't do any chowning!
         for member in tarobj:
-            if member.isfile() or member.isdir():
-                name = member.name
-                # don't extract absolute paths or ones with .. in them
-                if not name.startswith('/') and '..' not in name:
-                    dst = os.path.join(extract_dir, *name.split('/'))
+            name = member.name
+            # don't extract absolute paths or ones with .. in them
+            if not name.startswith('/') and '..' not in name:
+                dst = os.path.join(extract_dir, *name.split('/'))
+
+                while member.islnk() or member.issym():
+                    member = tarobj._getmember(member.linkname, member)
+
+                if member.isfile() or member.isdir():
                     dst = progress_filter(name, dst)
                     if dst:
                         if dst.endswith(os.sep):
msg444 Author: maxb Date: 2009-11-01.11:00:41
Trying to determine why pytz installed by easy_install was broken, I located the
following problem:

When setuptools extracts a tarball source dist to install it, it silently drops
any tar members which are neither files nor directories. The problem is in
archive_util.py, which specifically tests "if member.isfile() or
member.isdir():". I am uncertain why it does this, but it is fatally
incorrect when the software being unpacked includes symlinks as a
functional part of its source code, as pytz does.
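
To check whether a given source tarball is affected, a short diagnostic along these lines may help. It is only a sketch: `skipped_by_old_check` is a hypothetical helper that applies the same `isfile()`/`isdir()` test quoted above and returns the names of the members that test would ignore.

```python
import tarfile


def skipped_by_old_check(tarobj):
    """Return names of members the isfile()/isdir() test would ignore."""
    return [m.name for m in tarobj if not (m.isfile() or m.isdir())]
```

Opening the sdist with `tarfile.open(path)` and passing the result to this function gives the list of links (and any other special members) that would be missing after extraction.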
History
Date                 User  Action  Args
2011-03-23 20:55:51  pje   set     status: testing -> resolved; messages: + msg603
2010-02-01 18:23:02  pje   set     messages: + msg498
2010-02-01 17:57:12  maxb  set     messages: + msg494
2010-02-01 17:54:50  maxb  set     messages: + msg492
2010-02-01 17:46:07  pje   set     status: unread -> testing; nosy: + pje; messages: + msg489
2009-11-01 11:00:41  maxb  create