
classification
Title: make test has horrendous performance on an ecryptfs
Type: performance Stage: needs patch
Components: Versions: Python 3.3
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: barry Nosy List: barry, jderose, ncoghlan, ned.deily, pitrou, skrah
Priority: normal Keywords:

Created on 2011-03-25 22:30 by barry, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (12)
msg132171 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2011-03-25 22:30
When your home directory is on a Linux (e.g. Ubuntu 10.10) ecryptfs, 'make test' and company can be horrendously slow.  Of course, some performance hit should be expected, but depending on which combinations of tests I've run, I can see up to 25000x (!) slower on an ecryptfs than on a normal ext4 file system.

regrtest.py changes its cwd to a TEMPDIR, but actually when you're running the tests from inside the Python build directory, this just becomes $srcdir/build so you don't get any advantage of running the tests out of e.g. a much faster tmpfs.

(Aside: I'm not sure under what cases you would *not* be normally running out of the build dir, but I guess if you 'cd /tmp; /path/to/python/configure' and such, it would put you in a normal temp directory.  OTOH, you're already in a tmpdir by then so what's the point of _make_temp_dir_for_build()?)

I'd like to at least provide the option to create regrtest temporary files elsewhere so that they can live on a fast file system.  There are several ways I can think of doing this and I'm not sure what the best way is:

* Remove the special case from _make_temp_dir_for_build() so that it always sets TESTCWD into a tmpdir.

* Add a boolean option --usetmp/-p which enables this override or the moral equivalent.

* Make TESTFN relative to mkdtemp().  A quick and dirty test of this showed that it did significantly improve test times, but there were test failures too.  

I'm open to other ideas, but I really do want to be able to ./configure && make && make testall in an ecryptfs build dir and get reasonable test times.

You'll need an atexit handler or similar to clean up the tempdir.
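
To make the --usetmp idea concrete, here is a minimal sketch of choosing the test directory and registering cleanup. It is not regrtest's actual code: the function name make_test_cwd and its use_tmp and build_dir parameters are hypothetical, and the "test_python_" prefix is only an illustration.

```python
import atexit
import os
import shutil
import tempfile


def make_test_cwd(use_tmp, build_dir):
    """Return a directory for test scratch files.

    Hypothetical helper for illustration; use_tmp and build_dir are
    assumed names, not part of regrtest's real interface.
    """
    if use_tmp:
        # mkdtemp() creates a fresh directory under the default temp location.
        path = tempfile.mkdtemp(prefix="test_python_")
        # Remove the directory at normal interpreter exit. A hard crash
        # skips atexit handlers, so leftovers then need manual cleanup.
        atexit.register(shutil.rmtree, path, ignore_errors=True)
        return path
    # Default behaviour described above: stay inside the build directory.
    path = os.path.join(build_dir, "build")
    os.makedirs(path, exist_ok=True)
    return path
```

The atexit handler only covers a normal exit, which is why a crashed run would still leave files behind.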

This does affect older Pythons, but probably any solution would be classified as a new feature so could only go in Python 3.3.
msg132190 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2011-03-26 00:15
(Addressing your aside: one case where the tests are not run in a build directory is with binary installers.  For instance, the Mac OS X installers we provide include all of the test modules and it is normal to run them after installation, quite possibly on a system that does not have all the build tools to build a Python much less a build directory.)
msg132311 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-27 11:17
One strong reason for having the test files in the build directory is ease of cleanup, especially on the buildbots where crashes or hangs can lead to progressive disk fillup (and some tests create very large files, e.g. 2GB).

See also 673a5afce4e0.
msg132338 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2011-03-27 15:34
Makes sense.  So, what do you think about adding a --usetmp/-p flag to regrtest to honor mkdtemp's defaults even in a build dir?  I'd add an atexit handler to clean it up but of course if it crashes and you've used the flag, you should know enough to be able to manually clean things up.
msg132343 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-27 15:39
> Makes sense.  So, what do you think about adding a --usetmp/-p flag to
> regrtest to honor mkdtemp's defaults even in a build dir?  I'd add an
> atexit handler to clean it up but of course if it crashes and you've
> used the flag, you should know enough to be able to manually clean
> things up.

Sounds good. It will also help performance on my Windows VM :)

Bikeshedding: since it won't be a widely-used option, perhaps "-P" is
better than "-p"?

Not-so-much-bikeshedding: mkdtemp() could be used inside a (e.g.)
"/tmp/test_python" top dir, to make manual cleanup extra easy.
msg132541 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2011-03-29 22:11
Antoine, -P is fine with me!

Also, since my idea is that --usetmp/-P would just use the mkdtemp() algorithm (which looks for $TMPDIR, $TEMP or $TMP), getting the build into a subdirectory, e.g. /tmp/test_python would be as easy as setting TMP=/tmp/test_python.  So I'm inclined to keep it real simple.  I will add an atexit handler though to clean up under normal situations.  You're right that if Python crashes you'll be left with file cruft, but if you're using --usetmp/-P you should know enough to look for that.
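
A small sketch of the environment-variable route, assuming the standard tempfile behaviour of consulting TMPDIR, then TEMP, then TMP. The helper name mkdtemp_under is hypothetical. Note that tempfile caches its choice in tempfile.tempdir, so the sketch clears that cache; in practice one would simply export the variable before starting the test run.

```python
import os
import tempfile


def mkdtemp_under(top_dir):
    """Create a temp directory below top_dir by steering tempfile via TMPDIR.

    Hypothetical helper for illustration only.
    """
    os.makedirs(top_dir, exist_ok=True)
    saved = os.environ.get("TMPDIR")
    os.environ["TMPDIR"] = top_dir
    # Drop the cached default so the environment is consulted again.
    tempfile.tempdir = None
    try:
        return tempfile.mkdtemp()
    finally:
        # Restore the previous environment and clear the cache again.
        if saved is None:
            del os.environ["TMPDIR"]
        else:
            os.environ["TMPDIR"] = saved
        tempfile.tempdir = None
```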
msg132564 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-03-30 01:31
If support for a top level temporary directory is added, test.support should acquire alternatives to the tempfile module tools to make it easy for tests that create their own temporary files to respect that naming scheme. In particular, test.script_helper.temp_dir() should be moved into test.support.
msg132566 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-03-30 02:08
To control where mkdtemp() puts files, you could just use the "dir" argument (and you can use tempfile.gettempdir() beforehand if you want that location to be inside the normal temp directory)
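
For illustration, the "dir" approach might look like the sketch below. The function name make_scratch_dir and the default top-level name "test_python" are assumptions taken from the discussion above, not an existing test.support API.

```python
import os
import tempfile


def make_scratch_dir(top_name="test_python"):
    """Create a unique directory inside <system tempdir>/<top_name>.

    Hypothetical helper; top_name is an assumed parameter.
    """
    # gettempdir() gives the normal temp location for this platform.
    top = os.path.join(tempfile.gettempdir(), top_name)
    os.makedirs(top, exist_ok=True)
    # The "dir" argument tells mkdtemp() where to create the directory.
    return tempfile.mkdtemp(dir=top)
```

Keeping every run under one known top directory means manual cleanup after a crash is a single directory removal.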
msg143660 - (view) Author: Jason Gerard DeRose (jderose) Date: 2011-09-07 04:25
Barry,

I'm suspicious there might be more to the performance issue than just the ecryptfs overhead.  While experimenting with a read benchmark, I just happened to notice that when reading from an ecryptfs filesystem, the CPU usage is unusually high in the *python3* process.

For example:

./benchmark.py /home/.dmedia
  => 149 MB per second
  => top shows 22-24% CPU usage

./benchmark.py /home/jderose/.dmedia
  => 38.9 MB per second
  => top shows 79-85% CPU usage

It's the same physical drive in both cases, but the one in /home/jderose is ecryptfs.  If it was just ecryptfs overhead, wouldn't there be lower CPU utilization in the python3 process, as there would be a lower throughput coming from the kernel, more time waiting on IO?

In both cases, there were 56 files, for a total of 19.5 GB.  I ran this on 64-bit Ubuntu Oneiric, Python 3.2.2.

Here's the benchmark:

http://bazaar.launchpad.net/~jderose/filestore/multi/view/head:/benchmark.py
msg143661 - (view) Author: Jason Gerard DeRose (jderose) Date: 2011-09-07 04:54
Oops, I think I don't understand the meaning of top CPU usage, as time tells a different story.

Direct ext4:
real	2m14.144s
user	0m0.260s
sys	0m30.350s

ecryptfs over ext4:
real	8m47.130s
user	0m0.080s
sys	7m2.080s
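
For anyone wanting to reproduce the real/user/sys split from inside Python, here is a rough sketch. It is not the linked benchmark.py; the function name timed_read and the 1 MiB chunk size are assumptions. A run where "sys" accounts for most of "real", as in the ecryptfs numbers above, would suggest the time is going to the kernel rather than to the interpreter.

```python
import os
import time


def timed_read(path, chunk_size=1024 * 1024):
    """Read a file sequentially and report wall, user and system time.

    Hypothetical helper for illustration only.
    """
    start_wall = time.perf_counter()
    start_cpu = os.times()
    total = 0
    # buffering=0 gives raw, unbuffered reads straight from the OS.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    end_cpu = os.times()
    return {
        "bytes": total,
        "real": time.perf_counter() - start_wall,
        "user": end_cpu.user - start_cpu.user,
        "sys": end_cpu.system - start_cpu.system,
    }
```

The page cache can distort repeated runs, so results are only comparable when each file is read cold.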
msg202557 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2013-11-10 19:22
I'm going to close this issue as invalid; it hasn't affected me on ecryptfs $HOME on Ubuntu in a long time, so let's chalk it up to better ecryptfs implementations now.

If you disagree, feel free to re-open this and provide more information.
msg248330 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2015-08-09 17:00
I don't know whether this is worth reopening, but the ecryptfs
performance is still very poor on my Lenovo T400 (see #24831).

For most people an extra option for choosing the tmpdir
would not help, since they'd simply blame the hardware
or the test suite.
History
Date                 User       Action  Args
2022-04-11 14:57:15  admin      set     github: 55886
2015-08-09 17:00:24  skrah      set     nosy: + skrah; messages: + msg248330
2013-11-10 19:22:57  barry      set     status: open -> closed; resolution: not a bug; messages: + msg202557
2011-09-07 04:54:47  jderose    set     messages: + msg143661
2011-09-07 04:25:59  jderose    set     nosy: + jderose; messages: + msg143660
2011-03-30 02:08:45  ncoghlan   set     messages: + msg132566
2011-03-30 01:31:25  ncoghlan   set     nosy: + ncoghlan; messages: + msg132564
2011-03-29 22:11:53  barry      set     messages: + msg132541
2011-03-27 15:39:34  pitrou     set     messages: + msg132343
2011-03-27 15:34:13  barry      set     messages: + msg132338
2011-03-27 11:17:30  pitrou     set     nosy: + pitrou; messages: + msg132311
2011-03-26 00:15:44  ned.deily  set     nosy: + ned.deily; messages: + msg132190
2011-03-25 22:30:47  barry      create