shutil.rmtree can have O(n^2) performance on large dirs #76634
Comments
See http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=24412edeaf556a for the explanation and the equivalent fix in coreutils. The gist is that deleting entries in inode order can improve deletion performance dramatically; to obtain the inode numbers and sort by them, one needs to read them from the directory entries before unlinking. I propose we check whether the improvements made to coreutils rm can also be applied to shutil.rmtree().
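As a rough illustration of the idea (not the actual proposed patch; the function name and structure below are mine), one could collect entries with os.scandir(), sort them by the inode numbers that readdir() already provides, and only then unlink:

import os

def rmtree_inode_order(path):
    # Sketch of the coreutils trick: remove entries in inode order so the
    # unlink() calls touch the inode table roughly sequentially instead of
    # in readdir() (hash) order. NOT the stdlib implementation.
    with os.scandir(path) as it:
        # DirEntry.inode() exposes readdir()'s d_ino, so no extra stat()
        # call is needed just to get the sort key.
        entries = sorted(it, key=lambda e: e.inode())
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            rmtree_inode_order(entry.path)
        else:
            os.unlink(entry.path)
    os.rmdir(path)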
I don't think filesystem-specific optimizations belong in the Python stdlib. Python is compatible with multiple operating systems, including Windows, macOS, Android, and many POSIX variants. It would be much better if this were fixed in the kernel...
I have tested shutil.rmtree() with a large number of files, using a modified benchmark from bpo-28564. For 400000 files it takes less than 5 seconds. From the comment in the coreutils benchmark (http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=24412edeaf556a):

# Using rm -rf to remove a 400k-entry directory takes:

Running the coreutils benchmark gives a result of 3 seconds on my computer. It seems to me that this issue has been fixed in the kernel.
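For reference, a minimal sketch of that kind of benchmark (not the bpo-28564 script itself; the file count and flat layout are assumptions): create N empty files in a fresh directory and time shutil.rmtree(). Note that on a warm page cache this mostly exercises the in-memory path; cold-cache behaviour on a spinning disk can look very different.

import os, shutil, tempfile, time

def bench_rmtree(nfiles=400000):
    # Create a flat directory containing nfiles empty files, then time its removal.
    top = tempfile.mkdtemp(prefix="rmtree-bench-")
    for i in range(nfiles):
        open(os.path.join(top, str(i)), "w").close()
    start = time.perf_counter()
    shutil.rmtree(top)
    print("%d files removed in %.2fs" % (nfiles, time.perf_counter() - start))

if __name__ == "__main__":
    bench_rmtree()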
Serhiy, did you run your benchmark on an SSD or a spinning disk? The coreutils bug mentions that the problem is seek times. My tests on a spinning disk with 400k files suggest that indeed rmtree() is ~30x slower than rm -r.
The tests were done with coreutils rm 8.28, Python 2.7.14, and Python 3.6.3, on ext4 (rw,relatime,data=ordered), on a dmraid RAID1 across 2 WDC_WD4000FYYZ disks (WD 4 TB Enterprise).

Also note how deleting 100k files takes ~0.5 seconds with shutil.rmtree(). There is clearly some boundary below which we are hitting some nice cached behaviour.
It turns out I was wrong when saying that there's some cache we're hitting. Quick overview table of the benchmark:

nfiles    real    user    sys
100000    0.51s   0.07s   0.43s

Each 2x increase in the number of files results in a 4x increase in deletion time. Full benchmark output:
Did you try to sync and flush the caches before running the benchmark?
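A minimal sketch of one way to do that on Linux before each run (assumes root and the Linux-specific /proc/sys/vm/drop_caches interface; not part of the benchmarks in this thread):

import os

def drop_caches():
    # Flush dirty pages to disk, then ask the kernel to drop the page cache,
    # dentries and inodes (writing "3" drops all three). Linux-only, needs root.
    os.sync()
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")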
Yes, it doesn't make a difference for me; I still see the same O(n²) behaviour.

I've just sent an email, "O(n^2) performance of rm -r", to bug-coreutils@gnu.org; unfortunately I can't link to it yet because it hasn't appeared in the mailman archive. It should show up soon at http://lists.gnu.org/archive/html/bug-coreutils/2017-12/threads.html.
OK, my coreutils email is at http://lists.gnu.org/archive/html/bug-coreutils/2017-12/msg00054.html |
A better location to view the whole coreutils thread is: |
Thanks for your investigations Niklas. I ran my benchmark on a spinning disk, but significant nonlinearity is observed only for directory sizes of around 1000000 entries and more. And yes, sorting by inode number helps.

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-unpatched -c 'import shutil; shutil.rmtree("x")'; done
1.01 100000
3.80 200000
3.64 300000
4.89 400000
8.72 600000
11.86 800000
56.80 1200000
209.82 1600000
$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-patched -c 'import shutil; shutil.rmtree("x")'; done
0.97 100000
2.42 200000
3.84 300000
4.48 400000
7.07 600000
10.01 800000
15.53 1200000
23.24 1600000
$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" env rm -rf x; done
0.68 100000
1.34 200000
2.10 300000
3.95 400000
5.95 600000
10.28 800000
47.66 1200000
89.32 1600000

On an SSD:

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-unpatched -c 'import shutil; shutil.rmtree("x")'; done
1.00 100000
1.93 200000
2.90 300000
4.98 400000
7.05 600000
9.87 800000
21.45 1200000
36.19 1600000
$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-patched -c 'import shutil; shutil.rmtree("x")'; done
0.96 100000
1.99 200000
3.09 300000
4.85 400000
7.55 600000
9.44 800000
16.03 1200000
21.28 1600000
$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" env rm -rf x; done
0.67 100000
1.38 200000
2.41 300000
2.82 400000
5.24 600000
7.02 800000
18.60 1200000
30.58 1600000

Interesting that patched Python is faster than 'rm' (GNU coreutils 8.26) for large numbers.
Yes, so
A small update / summary so far:

From here this developed into a coreutils discussion:

and finally a
Dave Chinner (an XFS developer) suggests that on XFS there is no quadratic behaviour once the problem becomes bound by the seek time of the spinning disk. Somebody should try to confirm that it becomes linear in even larger tests, e.g. tests going well beyond 21 minutes of deletion time.