Author vstinner
Recipients abacabadabacaba, akira, benhoyt, giampaolo.rodola, pitrou, socketpair, tim.golden, vstinner
Date 2014-10-07.21:33:34
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1412717615.29.0.86426587652.issue22524@psf.upfronthosting.co.za>
In-reply-to
Content
Hi Ben,

I started to review your patch, sorry for the delay :-( It's convinient to have the patch on Rietveld to comment it inline.

I didn't expect so much C code. The posixmodule.c is really huge. Well, it's just the longest C file of Python 3.5. Your patch adds 781 more lines to it. It's maybe time to split the file into smaller files.

At the same time, reading C code is boring and it's too easy to "implement bugs". I need to be convinced that the speedup announced in the PEP requires so much C code. In the PEP and https://github.com/benhoyt/scandir#benchmarks it's not clear if the speedup comes from the C code (instead of implementing scandir in pure Python, for example using ctypes) or the lower number of syscalls.

To have a fair benchmark, the best would be to have a C part only implementing opendir/readir and FirstFindFile/FindFileXXX (I'm not sure that ctypes is as fast as C code written with the Python C API), and a Python part which implements the nice Python API.

If the speedup is 10% or lower, I would prefer to not add so much C code and take the compromise of a thin C wrapper and implement scandir in Python.

If we implement os.scandir() in Python, should we still try to share code with os.listdir()? I see that importlib uses os.listdir(), so we need a listdir() function implemented in C at least for importlib.
History
Date User Action Args
2014-10-07 21:33:35vstinnersetrecipients: + vstinner, pitrou, giampaolo.rodola, tim.golden, benhoyt, abacabadabacaba, akira, socketpair
2014-10-07 21:33:35vstinnersetmessageid: <1412717615.29.0.86426587652.issue22524@psf.upfronthosting.co.za>
2014-10-07 21:33:35vstinnerlinkissue22524 messages
2014-10-07 21:33:34vstinnercreate