classification
Title: distutils install race condition
Type: behavior Stage:
Components: Distutils Versions: Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: dstufft Nosy List: christian.heimes, dstufft, eric.araujo, illume, tarek
Priority: low Keywords:

Created on 2009-11-30 09:01 by illume, last changed 2016-09-24 22:21 by christian.heimes.

Messages (5)
msg95833 - (view) Author: Rene Dudfield (illume) Date: 2009-11-30 09:01
hello!

Pythons distutils has a race condition where it starts to copy files
into the python path whilst installing.

This is a race condition, since python programs can be importing the
package whilst the package is being installed.

It would be good for distutils to install things in an atomic manner.
Where things can be installed, or not installed. Like, on unix by moving
the files in from a temporary directory. This would also help reduce
breakages. Since if a package breaks half way installing a package then
the broken version will not over write the existing version.

It's not a very serious problem, since most people don't install things
on live important systems(but some do).  It does make hard to diagnose
problems though... which are good ones to fix.  Most packaging tools fix
the issues with the source installs (eg, using .deb installers).

I've only looked at the source install, but I imagine other install
methods might be affected too.  

os.rename
mkdtemp
http://docs.python.org/library/tempfile.html#tempfile.mkdtemp

Perhaps files should be created in the /tmp first, and then moved in.

/tmp/ can be in a different file system so a rename won't work in that
case on some OSes.  If you create the temp directory in the same
directory as the directory to install.  In that case rename doesn't
work, using the temp file system(eg /tmp) and then copying to a temp
directory in the site-packages directory, with finally a move into place.

If copying to the site-packages before moving, then there could be left
over temp files in the site-packages directory.  So these would need to
be considered, and cruft cleaned up.  This is why using the normal temp
directory would be better, since the temp files will be cleaned as
normal policy on the system.
  
setuptools/distribute writes the .egg-info at the end of the install. 
This should also use a move for install, for the same reasons.  However,
there might still be a separate race condition since that lives in a
different directory.  Since you can not atomically move two things at
once( I think? ) the .egg-info will continue to be a race condition,
unless both directories are moved into the same one.



Just as moving packages into place will make the installs more robust,
so will testing the packages before moving them into place.  Some way to
test the install before moving it into place would be good.  Or at least
trying to import the package in a sub process.  Taking advantage of the
test command could be an idea.  Test before moving into place would give
a better chance of good packages being used.  Import before moving into
place could also help.
msg96579 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-12-18 19:38
As you said, this race condition can happen even if you copy the file in
some temp dir, then move them to python. That would reduce the window,
but not remove it.

The best way to avoid it is to stop any running Python process when 
updates occurs I guess, or work on some kind of protected import statement. 

Now I am not sure about potential breakages during installation at
Distutils' level since it just copies files to site-packages only once
everything is built. The only breakage I have in mind is when there's
some kind of I/O error when copying files on the disk.

So I agree that it would be better to save any file that is about to be
overriden when the install command starts to copy files, so it can
rollback if a problem occurs.
msg97107 - (view) Author: Rene Dudfield (illume) Date: 2009-12-31 19:04
Hi Tarek,


moving a package into place right at the end is the best thing to do I
think.

It solves a couple common problems:
    - broken packages after an install is stopped half way for one of
many common reasons.
    - old files left around will not be there, since they would be
moved.  numpy 1.4 breaks for example if installing over an old version
first and not removing the old version.  So did pygame in the past...
and likely other packages.
    - race condition whilst installing (not so common, but happens).

I'm not sure I did say something like 'this race condition can happen
even if you copy the file in some temp dir, then move them to python.'.
 But now I see a problem... if the existing directory has files in it,
you can not move another directory over the top of it atomically.

I don't think there is a way to solve this completely, without changing
the packaging system quite dramatically.  However we can make it much
more likely we will win the race!!!  First trying to install into a temp
directory, then if on the same file system, moving the old directory to
a backup one, then moving the directory into place, finally removing the
old directory.

This would reduce the time of the race condition to a fraction of a
second(0.01 seconds on my box) compared to many seconds on average, and
longer for big packages with big files.

Packages not installing correctly is a big problem for C/C++ packages
more so than simply python packages.

Fixing this will mean distutils becomes more robust, and also will stop
a security issue.


I tried a method similar to this pseudo code as a work around:
python setup.py install --home=/tmp/apackagetmp
move temp package into place...

Here is a script I started on... for pygame installs on ubuntu karmic koala.
http://rene.f0o.com/~rene/stuff/safer_install.py

I thought I'd try it out as a separate thing first to try and get it
right before trying a patch with distutils.  There's still a bunch of
issues with it, and it needs tests written.
- when installing to a root place the permissions are not correct.  They
are the ones of the user it is ran as.  ie, 'chmod -R root:staff dest'
would probably fix it.
- it would be good to extract the paths for various things from
distutils... rather than guessing.  For example the dist-packages
directory, where the include files are etc etc.
- I'm going to put in a test step.  So it installs to the temp directory
first, then tries to run the package tests before installing.  The idea
being that if all the tests past *then* install.  This should make it
more robust still.  It will probably try as a separate process with
"python -m apackage.tests.__main__" or something like that.
- it always copies in all of the files.  Which is good and bad.  It
means that there won't be old files, but it also means that it can't
avoid work... like say only one file changes.
- probably many other issues...


cheers,

ps. happy new year!
msg275211 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-09-08 23:36
Donald, what is your opinion on this issue?
msg277341 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-09-24 22:21
This is not a security problem per se. It's rather a request to chance the behavior of package installation.
History
Date User Action Args
2016-09-24 22:21:59christian.heimessettype: security -> behavior
messages: + msg277341
2016-09-08 23:36:52christian.heimessetversions: + Python 3.6, - Python 2.7, Python 3.2
nosy: + christian.heimes, dstufft

messages: + msg275211

assignee: tarek -> dstufft
2011-11-08 22:41:08ezio.melottisetnosy: + eric.araujo
2009-12-31 19:04:54illumesetmessages: + msg97107
2009-12-18 19:38:17tareksetpriority: low

messages: + msg96579
versions: + Python 2.7, Python 3.2
2009-11-30 09:01:05illumecreate