classification
Title: shutil.rmtree can have O(n^2) performance on large dirs
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: giampaolo.rodola, nh2, pitrou, serhiy.storchaka
Priority: normal Keywords:

Created on 2017-12-30 05:15 by nh2, last changed 2018-06-12 09:35 by giampaolo.rodola.

Files
File name Uploaded Description Edit
bench_rmtree.py serhiy.storchaka, 2017-12-30 12:06
Messages (11)
msg309217 - (view) Author: Niklas Hambüchen (nh2) Date: 2017-12-30 05:15
See http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=24412edeaf556a for the explanation and equivalent fix in coreutils.

The gist ist that deleting entries in inode order can improve deletion performance dramatically.

To obtain inode numbers and sort by them, one needs to `getdents()` all the entries ahead of time, but rmtree() already gets all dirents ahead of the deletion. https://bugs.python.org/issue28564 recently improved shutil.rmtree() performance by using scandir(), but nevertheless the returned generator is list()ed, so we already have all necessary informtaion in memory and would just have to perform an O(n) integer sort.

I propose we check if the improvements made to `rm -r` in coreutils should be ported to shutil.rmtree().
msg309227 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-12-30 11:22
I don't think filesystem-specific optimizations belong in the Python stdlib.  Python is compatible with multiple operating systems, including Windows, macOS, Android, many POSIX variants.  It would be much better if this were fixed in the kernel...
msg309230 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-12-30 12:06
I have tested shutil.rmtree() with a large number of files using modified benchmark from issue28564. For 400000 files it takes less than 5 seconds. From the comment to the coreutils benchmark (http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=24412edeaf556a):

# Using rm -rf to remove a 400k-entry directory takes:
# - 9 seconds with the patch, on a 2-yr-old system
# - 350 seconds without the patch, on a high-end system (disk 20-30% faster) threshold_seconds=60

Running the coreutils benchmark gives the result 3 seconds on my computer.

It seems to me that this issue have been fixed in the kernel.
msg309300 - (view) Author: Niklas Hambüchen (nh2) Date: 2017-12-31 19:33
Serhiy, did you run your benchmark on an SSD or a spinning disk?

The coreutils bug mentions that the problem is seek times.

My tests on a spinning disk with 400k files suggest that indeed rmtree() is ~30x slower than `rm -r`:

    # time (mkdir dirtest && cd dirtest && seq 1 100000 | xargs touch)

    real  0m0.722s
    user  0m0.032s
    sys 0m0.680s

    # time rm -rf dirtest/

    real  0m0.519s
    user  0m0.074s
    sys 0m0.437s

    # time (mkdir dirtest && cd dirtest && seq 1 100000 | xargs touch)

    real  0m0.693s
    user  0m0.039s
    sys 0m0.659s

    # time python -c 'import shutil; shutil.rmtree("dirtest")'

    real  0m0.756s
    user  0m0.225s
    sys 0m0.499s

    # time (mkdir dirtest && cd dirtest && seq 1 100000 | xargs touch)

    real  0m0.685s
    user  0m0.032s
    sys 0m0.658s

    # time python3 -c 'import shutil; shutil.rmtree("dirtest")'

    real  0m0.965s
    user  0m0.424s
    sys 0m0.528s

    # time (mkdir dirtest && cd dirtest && seq 1 400000 | xargs touch)

    real  0m4.249s
    user  0m0.098s
    sys 0m2.804s

    # time rm -rf dirtest/

    real  0m10.782s
    user  0m0.265s
    sys 0m2.213s

    # time (mkdir dirtest && cd dirtest && seq 1 400000 | xargs touch)

    real  0m5.236s
    user  0m0.107s
    sys 0m2.832s

    # time python -c 'import shutil; shutil.rmtree("dirtest")'

    real  3m8.006s
    user  0m1.323s
    sys 0m3.929s

    # time (mkdir dirtest && cd dirtest && seq 1 400000 | xargs touch)

    real  0m4.671s
    user  0m0.097s
    sys 0m2.832s

    # time python3 -c 'import shutil; shutil.rmtree("dirtest")'

    real  2m49.476s
    user  0m2.196s
    sys 0m3.695s

The tests were done with coreutils rm 8.28, Python 2.7.14, Python 3.6.3,  on ext4 (rw,relatime,data=ordered), on a dmraid RAID1 across 2 WDC_WD4000FYYZ disks (WD 4 TB Enterprise).

Also note how deleting 100k files takes ~0.5 seconds with `rm -r` and the Pythons, but deleting 4x more files takes 20x longer with `rm -r` and ~300x longer with the Pythons.

There is clearly some boundary below which we are hitting some nice cached behaviour.
msg309303 - (view) Author: Niklas Hambüchen (nh2) Date: 2017-12-31 20:11
It turns out I was wrong when saying that there's some cache we're hitting.

In fact, `rm -r` is still precisely O(n^2), even with the coreutils patch I linked.

Quick overview table of the benchmark:

 nfiles     real   user     sys

 100000    0.51s  0.07s   0.43s
 200000    2.46s  0.15s   0.89s
 400000   10.78s  0.26s   2.21s
 800000   44.72s  0.58s   6.03s
1600000  180.37s  1.06s  10.70s

Each 2x increase of number of files results in 4x increased deletion time.


Full benchmark output:

    # time (mkdir dirtest && cd dirtest && seq 1 100000 | xargs touch)

    real  0m0.722s
    user  0m0.032s
    sys 0m0.680s

    # time rm -rf dirtest/

    real  0m0.519s
    user  0m0.074s
    sys 0m0.437s

    # time (mkdir dirtest && cd dirtest && seq 1 200000 | xargs touch)

    real  0m1.576s
    user  0m0.044s
    sys 0m1.275s

    # time rm -r dirtest/

    real  0m2.469s
    user  0m0.150s
    sys 0m0.890s

    # time (mkdir dirtest && cd dirtest && seq 1 400000 | xargs touch)

    real  0m4.249s
    user  0m0.098s
    sys 0m2.804s

    # time rm -rf dirtest/

    real  0m10.782s
    user  0m0.265s
    sys 0m2.213s

    # time (mkdir dirtest && cd dirtest && seq 1 800000 | xargs touch)

    real  0m10.533s
    user  0m0.204s
    sys 0m5.758s

    # time rm -rf dirtest/

    real  0m44.725s
    user  0m0.589s
    sys 0m6.037s

    # time (mkdir dirtest && cd dirtest && seq 1 1600000 | xargs touch)

    real  0m34.480s
    user  0m0.382s
    sys 0m12.057s

    # time rm -r dirtest/

    real  3m0.371s
    user  0m1.069s
    sys 0m10.704s
msg309305 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-12-31 20:28
Did you try to sync and flush caches before running `rm -r`?
msg309308 - (view) Author: Niklas Hambüchen (nh2) Date: 2017-12-31 21:14
> Did you try to sync and flush caches before running `rm -r`?

Yes, it doesn't make a difference for me, I still see the same O(n²) behaviour in `rm -r`.

I've sent an email "O(n^2) performance of rm -r" to bug-coreutils@gnu.org just now, unfortunately I can't link it yet because the mailman archive doesn't show it yet. It should appear soon on http://lists.gnu.org/archive/html/bug-coreutils/2017-12/threads.html.
msg309317 - (view) Author: Niklas Hambüchen (nh2) Date: 2018-01-01 02:44
OK, my coreutils email is at http://lists.gnu.org/archive/html/bug-coreutils/2017-12/msg00054.html
msg309353 - (view) Author: Niklas Hambüchen (nh2) Date: 2018-01-01 23:29
A better location to view the whole coreutils thread is:

https://debbugs.gnu.org/cgi/bugreport.cgi?bug=29921
msg309359 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-01-02 08:22
Thanks for your investigations Niklas. I ran my benchmark on a spinning disk, but significant nonlinearity is observed only for the size of directory around 1000000 and more. And yes, sorting by inode number helps.

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-unpatched -c 'import shutil; shutil.rmtree("x")'; done
1.01 100000
3.80 200000
3.64 300000
4.89 400000
8.72 600000
11.86 800000
56.80 1200000
209.82 1600000

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-patched -c 'import shutil; shutil.rmtree("x")'; done
0.97 100000
2.42 200000
3.84 300000
4.48 400000
7.07 600000
10.01 800000
15.53 1200000
23.24 1600000

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" env rm -rf x; done
0.68 100000
1.34 200000
2.10 300000
3.95 400000
5.95 600000
10.28 800000
47.66 1200000
89.32 1600000

On an SSD:

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-unpatched -c 'import shutil; shutil.rmtree("x")'; done
1.00 100000
1.93 200000
2.90 300000
4.98 400000
7.05 600000
9.87 800000
21.45 1200000
36.19 1600000

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-patched -c 'import shutil; shutil.rmtree("x")'; done
0.96 100000
1.99 200000
3.09 300000
4.85 400000
7.55 600000
9.44 800000
16.03 1200000
21.28 1600000

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" env rm -rf x; done
0.67 100000
1.38 200000
2.41 300000
2.82 400000
5.24 600000
7.02 800000
18.60 1200000
30.58 1600000

Interesting that patched Python is faster than 'rm' (GNU coreutils 8.26) for large numbers.
msg309367 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-01-02 11:26
Yes, so `rm -rf` is quadratic on my SSD too...
History
Date User Action Args
2018-06-12 09:35:52giampaolo.rodolasetnosy: + giampaolo.rodola
2018-01-02 11:26:15pitrousetmessages: + msg309367
2018-01-02 08:22:14serhiy.storchakasetmessages: + msg309359
2018-01-01 23:29:27nh2setmessages: + msg309353
2018-01-01 02:44:53nh2setmessages: + msg309317
2017-12-31 21:14:38nh2setmessages: + msg309308
2017-12-31 20:28:29pitrousetmessages: + msg309305
2017-12-31 20:11:40nh2setmessages: + msg309303
2017-12-31 19:33:57nh2setmessages: + msg309300
2017-12-30 12:06:33serhiy.storchakasetfiles: + bench_rmtree.py
2017-12-30 12:06:12serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg309230
2017-12-30 11:22:40pitrousetnosy: + pitrou
messages: + msg309227
2017-12-30 05:15:09nh2create