
Author vstinner
Recipients abacabadabacaba, akira, benhoyt, giampaolo.rodola, pitrou, socketpair, tim.golden, vstinner
Date 2014-10-09.11:16:25
Message-id <1412853385.73.0.440250288849.issue22524@psf.upfronthosting.co.za>
In-reply-to
Content
My previous benchmarks were run with the system Python 3.3.

Here is a new benchmark with Python 3.5 patched with scandir-1.patch (compiled in release mode: ./configure && make).

"os" (os.scandir) is 2 times faster than "c" (_scandir.scandir_helper): c=0.533 sec, os=0.268 sec.

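Of all the implementations, scandir.walk() is much faster than os.walk() only when "os" (os.scandir) is used: "3.2x as fast" (860 ms => 268 ms). The "c" version reaches only 1.6x, and the "generic" and "python" versions are slower than os.walk().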
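As a quick sanity check, the ratios quoted above can be recomputed from the timings in the benchmark output below. This is only a small sketch: the dictionary and the speedup() helper are names made up for illustration, and the numbers are copied from this message.

```python
# Timings in seconds copied from the benchmark output in this message:
# (os.walk time, scandir.walk time) for each scandir implementation.
timings = {
    "generic": (0.857, 1.627),
    "python": (0.856, 0.915),
    "c": (0.857, 0.533),
    "os": (0.860, 0.268),
}


def speedup(os_walk_time, scandir_walk_time):
    """Return how many times as fast scandir.walk() is compared to os.walk()."""
    return os_walk_time / scandir_walk_time


# Hypothetical helper names; rounded to one decimal like benchmark.py's output.
speedups = {name: round(speedup(*pair), 1) for name, pair in timings.items()}

# Ratio between the "c" and "os" scandir.walk() timings.
c_vs_os = round(timings["c"][1] / timings["os"][1], 1)

for name, value in speedups.items():
    print("%-8s %.1fx as fast" % (name, value))
print("os is %.1fx as fast as c" % c_vs_os)
```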

I guess that on Linux the speedup depends heavily on the number of symbolic links.
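To check that guess on a given tree, one could count how many entries are symlinks. The sketch below assumes a Python where os.scandir() is available and usable as a context manager (3.6 or later); count_entry_kinds() is a hypothetical helper, not part of scandir.py or benchmark.py. The reasoning behind the guess, as I understand it, is that an entry's type is usually known from the directory listing itself, while following a symlink needs an extra stat call.

```python
import os


def count_entry_kinds(top):
    """Walk `top` without following symlinks and count entries by kind.

    Hypothetical helper for illustration only.
    """
    counts = {"dirs": 0, "files": 0, "symlinks": 0}
    stack = [top]
    while stack:
        path = stack.pop()
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_symlink():
                    # Counted separately: resolving these costs extra work.
                    counts["symlinks"] += 1
                elif entry.is_dir(follow_symlinks=False):
                    counts["dirs"] += 1
                    stack.append(entry.path)
                else:
                    counts["files"] += 1
    return counts
```

Calling count_entry_kinds() on the benchmarked directory would give the symlink share for that machine. The numbers will differ from one system to another, which is part of why the results are hard to compare.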

It would help if benchmark.py created the tree itself: the numbers would be more reliable, and we could compare trees without symlinks, with a few symlinks (e.g. 10%), and with a lot of symlinks (e.g. 99%).
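A minimal sketch of what such a tree generator could look like. This is not code from benchmark.py: make_tree(), its parameters, and the file naming are all assumptions made for illustration, and the layout (flat subdirectories, every symlink pointing at one shared regular file) is deliberately simple.

```python
import os
import tempfile


def make_tree(root, dirs=10, entries_per_dir=100, symlink_ratio=0.1):
    """Create a synthetic tree under `root` and return the number of symlinks.

    Hypothetical helper: each of the `dirs` subdirectories gets
    `entries_per_dir` entries, of which round(entries_per_dir * symlink_ratio)
    are symlinks to a shared regular file and the rest are empty files.
    """
    target = os.path.join(root, "target.txt")
    with open(target, "w") as f:
        f.write("x")
    links_per_dir = round(entries_per_dir * symlink_ratio)
    for d in range(dirs):
        subdir = os.path.join(root, "dir%03d" % d)
        os.mkdir(subdir)
        for i in range(entries_per_dir):
            path = os.path.join(subdir, "entry%04d" % i)
            if i < links_per_dir:
                os.symlink(target, path)
            else:
                with open(path, "w"):
                    pass
    return links_per_dir * dirs


if __name__ == "__main__":
    # The three cases suggested above: no symlinks, 10%, and 99%.
    for ratio in (0.0, 0.1, 0.99):
        with tempfile.TemporaryDirectory() as root:
            links = make_tree(root, dirs=5, entries_per_dir=100,
                              symlink_ratio=ratio)
            print("ratio=%.2f: created %d symlinks" % (ratio, links))
```

A fixed layout like this makes runs reproducible across machines, at the cost of being less realistic than a directory such as /usr/share.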

Benchmark results:

haypo@smithers$ ~/prog/python/default/python setup.py build && for scandir in generic python c os; do echo; echo "=== $scandir ==="; PYTHONPATH=build/lib.linux-x86_64-3.5/ ~/prog/python/default/python benchmark.py /usr/share -c $scandir || break; done
running build
running build_py
copying scandir.py -> build/lib.linux-x86_64-3.5
running build_ext

=== generic ===
Using very slow generic version of scandir
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on /usr/share, repeat 1/3...
Benchmarking walks on /usr/share, repeat 2/3...
Benchmarking walks on /usr/share, repeat 3/3...
os.walk took 0.857s, scandir.walk took 1.627s -- 0.5x as fast

=== python ===
Using slower ctypes version of scandir
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on /usr/share, repeat 1/3...
Benchmarking walks on /usr/share, repeat 2/3...
Benchmarking walks on /usr/share, repeat 3/3...
os.walk took 0.856s, scandir.walk took 0.915s -- 0.9x as fast

=== c ===
Using fast C version of scandir
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on /usr/share, repeat 1/3...
Benchmarking walks on /usr/share, repeat 2/3...
Benchmarking walks on /usr/share, repeat 3/3...
os.walk took 0.857s, scandir.walk took 0.533s -- 1.6x as fast

=== os ===
Using Python 3.5's builtin os.scandir()
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on /usr/share, repeat 1/3...
Benchmarking walks on /usr/share, repeat 2/3...
Benchmarking walks on /usr/share, repeat 3/3...
os.walk took 0.860s, scandir.walk took 0.268s -- 3.2x as fast
History
Date                 User      Action  Args
2014-10-09 11:16:25  vstinner  set     recipients: + vstinner, pitrou, giampaolo.rodola, tim.golden, benhoyt, abacabadabacaba, akira, socketpair
2014-10-09 11:16:25  vstinner  set     messageid: <1412853385.73.0.440250288849.issue22524@psf.upfronthosting.co.za>
2014-10-09 11:16:25  vstinner  link    issue22524 messages
2014-10-09 11:16:25  vstinner  create