Message237912
To Victor and Serhiy:
1) Serhiy's point about not needing to build the symlinks set when followlinks is True is a good one, because it'll never get used. Just a "if not followlinks: ..." around that try/except would do it. Though this is a small optimization, as I expect 95% of people use os.walk() with followlinks=False, which is the default.
2) Much as I don't want to admit it :-), I think Serhiy has a point regarding the change in behaviour. Can we accept this tiny change? This happens because the previous implementation of os.walk() calls path.islink(), which does a real syscall, after the yield returns.
So if the caller modifies "dirnames" and adds a symlink to a directory it won't be in the symlinks set. Or if (as Serhiy's example shows) they change a symlink-to-a-directory to a directory the new implementation doesn't do another syscall to check, so differs from the old one -- in Serhiy's example, the old os.walk() will recurse into the changed symlink-to-a-directory-that's-now-a-directory, whereas the new os.walk() won't recurse.
Arguably the new behaviour is just as good here, but it is different. And Serhiy's function unsymlink() is a reasonable example of how this might play out.
The os.walk() docs explicitly allow modifying "dirnames" -- not just removing and reordering, but also adding entries, which could surface the difference in behaviour: "the caller can modify the dirnames list in-place ... even to inform walk() about directories the caller creates or renames before it resumes walk() again". See here:
https://docs.python.org/3.4/library/os.html#os.walk
So I do think it's a real issue. Will this really be an issue in practice? I don't know; I'm tempted to think not.
Obviously if we have to call stat/islink again, it defeats half of the purpose of scandir. But I don't see a way around it if we want to keep the exact old os.walk() behaviour. We could fix the behaviour if a directory/symlink was added to "dirnames" fairly easily by testing whether it changed, but I don't see a way to fix the unsymlinks() example without a syscall.
Thoughts? |
|
Date |
User |
Action |
Args |
2015-03-12 02:53:50 | benhoyt | set | recipients:
+ benhoyt, scott.dial, vstinner, python-dev, serhiy.storchaka |
2015-03-12 02:53:49 | benhoyt | set | messageid: <1426128829.99.0.763230702833.issue23605@psf.upfronthosting.co.za> |
2015-03-12 02:53:49 | benhoyt | link | issue23605 messages |
2015-03-12 02:53:48 | benhoyt | create | |
|