Author vstinner
Recipients abacabadabacaba, akira, benhoyt, giampaolo.rodola, pitrou, socketpair, tim.golden, vstinner
Date 2014-10-09.11:24:45
Message-id <1412853885.74.0.28035317983.issue22524@psf.upfronthosting.co.za>
Content
On Windows, I guess that "benchmark.py --size" is faster with scandir() than with os.walk(), because os.stat() is never called.

benchmark.py has a bug in do_os_walk() when the --size option is used: the attached do_os_walk_getsize.patch is needed to fix it.

The sizes returned by os.walk() and scandir.walk() are different. I guess that symbolic links to directories are handled differently. Because of that, I'm not sure that the benchmark timings are reliable, but they should give us an idea of the performance.
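To illustrate how symlink handling alone can change a total, here is a minimal sketch. It is not the code from benchmark.py, and total_size() is a hypothetical helper: it sums sizes with os.scandir() and either follows symbolic links when calling stat() or not. It only shows one way two walks over the same tree can disagree; it does not diagnose the mismatch above.

```python
import os
import tempfile


def total_size(path, follow_symlinks):
    """Sum entry sizes under path (hypothetical helper, for illustration).

    Symlinked directories are never descended into; follow_symlinks only
    controls whether a link contributes its own size or its target's size.
    """
    total = 0
    with os.scandir(path) as it:
        for entry in it:
            if entry.is_dir(follow_symlinks=False):
                total += total_size(entry.path, follow_symlinks)
            else:
                try:
                    total += entry.stat(follow_symlinks=follow_symlinks).st_size
                except OSError:
                    pass  # e.g. a dangling symlink when following links
    return total


# Small demo tree: one 1000-byte file plus a symlink pointing to it.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "data.bin")
    with open(target, "wb") as f:
        f.write(b"x" * 1000)
    os.symlink(target, os.path.join(root, "link"))

    print("following links:    ", total_size(root, follow_symlinks=True))
    print("not following links:", total_size(root, follow_symlinks=False))
```

When links are followed, the target file is counted twice; when they are not, the link contributes only its own (platform-dependent) size, so the two totals differ.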

To compute the size of a tree, scandir() is about twice as fast as os.walk() (2.1x): os.walk=1.435 sec, scandir.walk=0.675 sec.
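For reference, the two approaches being compared can be sketched roughly as follows. This is an assumption about the general shape of such code, not a copy of benchmark.py; size_with_walk() and size_with_scandir() are hypothetical names, and both use lstat semantics so that they are comparable.

```python
import os


def size_with_walk(top):
    """Tree size via os.walk(): one explicit os.lstat() call per file."""
    total = 0
    for root, dirs, files in os.walk(top):
        for name in files:
            total += os.lstat(os.path.join(root, name)).st_size
    return total


def size_with_scandir(top):
    """Tree size via os.scandir(): the entry type comes from the directory
    listing where the OS provides it, and the size from DirEntry.stat()."""
    total = 0
    with os.scandir(top) as it:
        for entry in it:
            if entry.is_dir(follow_symlinks=False):
                total += size_with_scandir(entry.path)
            else:
                total += entry.stat(follow_symlinks=False).st_size
    return total
```

On a tree without symlinks to directories, both functions should return the same total. They can differ when such links exist, because os.walk() lists them among the directory names while the scandir version above counts the link itself. Note also that since Python 3.5 os.walk() is itself implemented on top of os.scandir(), so the timing gap measured here is specific to the older os.walk().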

"os" takes 41% less time than "c" (about 1.7x as fast): c=1150 ms, os=675 ms.
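The two ways of expressing this comparison come from the same pair of timings. A short check of the arithmetic, using only the numbers quoted above (the variable names are mine):

```python
# Timings quoted in this message, in milliseconds.
c_ms = 1150    # "c" version of scandir.walk
os_ms = 675    # builtin os.scandir() version
walk_ms = 1435 # os.walk in the same run

time_saved = (c_ms - os_ms) / c_ms  # fraction of time saved by "os" vs "c"
speedup_vs_c = c_ms / os_ms         # how many times as fast "os" is vs "c"
speedup_vs_walk = walk_ms / os_ms   # how many times as fast "os" is vs os.walk

print(f"{time_saved:.0%} less time than c")       # 41% less time than c
print(f"{speedup_vs_c:.1f}x as fast as c")        # 1.7x as fast as c
print(f"{speedup_vs_walk:.1f}x as fast as walk")  # 2.1x as fast as walk
```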


Results of "benchmark.py --size" on my Linux Fedora 20:

haypo@smithers$ ~/prog/python/default/python setup.py build && for scandir in generic python c os; do echo; echo "=== $scandir ==="; PYTHONPATH=build/lib.linux-x86_64-3.5/ ~/prog/python/default/python benchmark.py -s /usr/share -c $scandir || break; done
running build
running build_py
running build_ext

=== generic ===
Using very slow generic version of scandir
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on /usr/share, repeat 1/3...
Benchmarking walks on /usr/share, repeat 2/3...
Benchmarking walks on /usr/share, repeat 3/3...
os.walk size 3064748475, scandir.walk size 2924332540 -- NOT EQUAL!
os.walk took 1.425s, scandir.walk took 1.147s -- 1.2x as fast

=== python ===
Using slower ctypes version of scandir
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on /usr/share, repeat 1/3...
Benchmarking walks on /usr/share, repeat 2/3...
Benchmarking walks on /usr/share, repeat 3/3...
os.walk size 3064748475, scandir.walk size 2924332540 -- NOT EQUAL!
os.walk took 1.421s, scandir.walk took 1.651s -- 0.9x as fast

=== c ===
Using fast C version of scandir
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on /usr/share, repeat 1/3...
Benchmarking walks on /usr/share, repeat 2/3...
Benchmarking walks on /usr/share, repeat 3/3...
os.walk size 3064748475, scandir.walk size 2924332540 -- NOT EQUAL!
os.walk took 1.426s, scandir.walk took 1.150s -- 1.2x as fast

=== os ===
Using Python 3.5's builtin os.scandir()
Comparing against builtin version of os.walk()
Priming the system's cache...
Benchmarking walks on /usr/share, repeat 1/3...
Benchmarking walks on /usr/share, repeat 2/3...
Benchmarking walks on /usr/share, repeat 3/3...
os.walk size 3064748475, scandir.walk size 2924332540 -- NOT EQUAL!
os.walk took 1.435s, scandir.walk took 0.675s -- 2.1x as fast