
Author larry
Recipients larry, neologix, serhiy.storchaka
Date 2012-06-27.10:31:17
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1340793078.97.0.424986247396.issue15200@psf.upfronthosting.co.za>
In-reply-to
Content
> It doesn't have to.
> Right now, it uses O(depth of the directory tree) FDs. 
> It can be changed to only require O(1) FDs

But closing and reopening those file descriptors means extra open()/close() calls per directory, and that seems like it might slow it down; would it still be a performance win?
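For the sake of discussion, here is roughly what the two strategies look like.  This is a sketch of my own, not the actual fwalk code; the helper names are made up, and it assumes a POSIX platform with dir_fd (openat/fstatat) support:

import os
import stat

def walk_keep_fds(topfd):
    # O(depth) fds: every directory on the current path keeps its fd open
    # until its whole subtree has been visited.
    for name in os.listdir(topfd):
        st = os.stat(name, dir_fd=topfd, follow_symlinks=False)
        if stat.S_ISDIR(st.st_mode):
            childfd = os.open(name, os.O_RDONLY, dir_fd=topfd)
            try:
                yield from walk_keep_fds(childfd)
            finally:
                os.close(childfd)
        else:
            yield name

def walk_reopen(top):
    # O(1) fds held at any moment: open, list, close, then re-open each
    # child by path later -- fewer simultaneous fds, but extra open()/close()
    # calls per directory, and the path gets re-resolved every time.
    fd = os.open(top, os.O_RDONLY)
    try:
        names = os.listdir(fd)
    finally:
        os.close(fd)
    for name in names:
        full = os.path.join(top, name)
        if os.path.isdir(full) and not os.path.islink(full):
            yield from walk_reopen(full)
        else:
            yield name

Those extra per-directory syscalls in the second version are exactly what I'd want to see benchmarked.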

Also, I'm not a security expert, but would the closing/reopening allow the possibility of timing attacks?  If so, that might still be okay for walk, which makes no guarantees about safety.  (But obviously it would be unacceptable for fwalk.)
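To make that worry concrete, here is a toy demonstration of the window that close-and-reopen creates.  It's purely my own illustration (the paths and the simulated attacker are made up), run in a single process on a POSIX box:

import os
import tempfile

base = tempfile.mkdtemp()
victim = os.path.join(base, "child")
os.mkdir(victim)

st = os.lstat(victim)                  # step 1: we checked -- "child" is a real directory

# An attacker who can write to `base` swaps the directory for a symlink
# during the window before we re-open it (simulated here in-process):
os.rename(victim, victim + ".orig")
os.symlink("/etc", victim)

fd = os.open(victim, os.O_RDONLY)      # step 2: re-open by path -- this now refers to /etc
print(os.listdir(fd))                  # lists /etc, not the directory we checked
os.close(fd)

Opening with O_NOFOLLOW would catch this particular swap at the final path component, but holding on to the already-open fd, as the current O(depth) approach does, avoids re-resolving the path at all.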


> Anyway, I think that such optimization is useless, because this
> micro-benchmark doesn't make much sense: when you walk a
> directory tree, it's usually to do something with the
> files/directories encountered, and as soon as you do something
> with them - stat(), unlink(), etc - the gain on the walking
> time will become negligible.

I'm not sure that "usually" is true here.  I suggest that "usually" people use os.walk to find *particular files* in a directory tree, generally by filename.  So most of the time os.walk really is quickly iterating over directory trees doing very little.
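Something like this (hypothetical names, but all stock os.walk/fnmatch) is the usage pattern I have in mind; essentially all the work is the traversal itself:

import fnmatch
import os

def find_files(top, pattern):
    # Typical "find particular files by name" loop: just a name comparison
    # per entry; nothing else is done to the files encountered.
    for dirpath, dirnames, filenames in os.walk(top):
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)

# e.g. list(find_files("/usr/include", "*.h"))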

I think 20% is a respectable gain, and it's hard for me to say "no" to functions that make Python faster for free.  (Well, for the possible cost of a slightly more expensive algorithm.)  So I'm +x for now.
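For what it's worth, the shape of the comparison we're arguing about looks something like this -- a rough skeleton of my own (the tree path and repeat counts are arbitrary), not the benchmark from this issue:

import os
import timeit

TOP = "/usr/include"   # any reasonably large, read-only tree

def bare_walk():
    # the "find files by name" case: traversal only
    for dirpath, dirnames, filenames in os.walk(TOP):
        pass

def walk_plus_stat():
    # the "do something with every file" case: one extra stat() per entry
    for dirpath, dirnames, filenames in os.walk(TOP):
        for name in filenames:
            os.stat(os.path.join(dirpath, name))

for fn in (bare_walk, walk_plus_stat):
    print(fn.__name__, min(timeit.repeat(fn, number=5, repeat=3)))

If bare_walk is not dwarfed by walk_plus_stat on real trees, the 20% on the traversal itself is worth having.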
History
Date                 User   Action  Args
2012-06-27 10:31:19  larry  set     recipients: + larry, neologix, serhiy.storchaka
2012-06-27 10:31:18  larry  set     messageid: <1340793078.97.0.424986247396.issue15200@psf.upfronthosting.co.za>
2012-06-27 10:31:18  larry  link    issue15200 messages
2012-06-27 10:31:17  larry  create