Author serhiy.storchaka
Recipients nh2, pitrou, serhiy.storchaka
Date 2018-01-02.08:22:11
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1514881334.16.0.467229070634.issue32453@psf.upfronthosting.co.za>
In-reply-to
Content
Thanks for your investigations Niklas. I ran my benchmark on a spinning disk, but significant nonlinearity is observed only for the size of directory around 1000000 and more. And yes, sorting by inode number helps.

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-unpatched -c 'import shutil; shutil.rmtree("x")'; done
1.01 100000
3.80 200000
3.64 300000
4.89 400000
8.72 600000
11.86 800000
56.80 1200000
209.82 1600000

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-patched -c 'import shutil; shutil.rmtree("x")'; done
0.97 100000
2.42 200000
3.84 300000
4.48 400000
7.07 600000
10.01 800000
15.53 1200000
23.24 1600000

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" env rm -rf x; done
0.68 100000
1.34 200000
2.10 300000
3.95 400000
5.95 600000
10.28 800000
47.66 1200000
89.32 1600000

On an SSD:

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-unpatched -c 'import shutil; shutil.rmtree("x")'; done
1.00 100000
1.93 200000
2.90 300000
4.98 400000
7.05 600000
9.87 800000
21.45 1200000
36.19 1600000

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" python3.7-patched -c 'import shutil; shutil.rmtree("x")'; done
0.96 100000
1.99 200000
3.09 300000
4.85 400000
7.55 600000
9.44 800000
16.03 1200000
21.28 1600000

$ for j in 1 2 3 4 6 8 12 16; do i=$((j*100000)); mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" env rm -rf x; done
0.67 100000
1.38 200000
2.41 300000
2.82 400000
5.24 600000
7.02 800000
18.60 1200000
30.58 1600000

Interesting that patched Python is faster than 'rm' (GNU coreutils 8.26) for large numbers.
History
Date User Action Args
2018-01-02 08:22:14serhiy.storchakasetrecipients: + serhiy.storchaka, pitrou, nh2
2018-01-02 08:22:14serhiy.storchakasetmessageid: <1514881334.16.0.467229070634.issue32453@psf.upfronthosting.co.za>
2018-01-02 08:22:14serhiy.storchakalinkissue32453 messages
2018-01-02 08:22:11serhiy.storchakacreate