classification
Title: Provide a namedtuple style interface for os.walk values
Type: enhancement Stage: needs patch
Components: Versions: Python 3.3
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, benjamin.peterson, krisys, ncoghlan, pitrou, rhettinger
Priority: normal Keywords: patch

Created on 2011-11-09 07:22 by ncoghlan, last changed 2011-11-10 03:25 by ncoghlan. This issue is now closed.

Files
File name Uploaded Description
issue13375.diff krisys, 2011-11-09 12:16 Returns a namedtuple for dirpath, dirnames and filenames
issue13375_tests.diff krisys, 2011-11-09 12:59 Tests for namedtuples in os.walk
Messages (10)
msg147334 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-11-09 07:22
The 3-tuple values yielded by os.walk could be made easier to work with in some use cases by offering a namedtuple style interface (similar to what is done with sys.float_info).

for dirinfo in os.walk(base_dir):
    print(dirinfo.path)
    print(dirinfo.subdirs)
    print(dirinfo.files)
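
A minimal sketch of how such an interface could be layered on top of the existing os.walk, without changing os.walk itself. The names DirInfo, walk_named and build_demo_tree are hypothetical and are not taken from any patch on this issue. The field names follow the example above.

```python
import os
import tempfile
from collections import namedtuple

# Hypothetical names: "DirInfo" and "walk_named" are illustrative only.
# The field names follow the example above.
DirInfo = namedtuple("DirInfo", ["path", "subdirs", "files"])


def walk_named(top, **kwargs):
    """Yield a DirInfo for each directory visited by os.walk."""
    for path, subdirs, files in os.walk(top, **kwargs):
        # The same list objects are passed through, so pruning
        # dirinfo.subdirs in place still steers a top-down walk.
        yield DirInfo(path, subdirs, files)


def build_demo_tree(base_dir):
    """Create base_dir/sub/ and base_dir/a.txt for the demo."""
    os.mkdir(os.path.join(base_dir, "sub"))
    with open(os.path.join(base_dir, "a.txt"), "w"):
        pass


with tempfile.TemporaryDirectory() as base_dir:
    build_demo_tree(base_dir)
    for dirinfo in walk_named(base_dir):
        print(dirinfo.path, dirinfo.subdirs, dirinfo.files)
    # Plain unpacking keeps working, because DirInfo is still a tuple.
    for path, subdirs, files in walk_named(base_dir):
        print(path, subdirs, files)
```

Because a namedtuple is a tuple subclass, existing callers that unpack three values would not need to change under this approach.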
msg147351 - (view) Author: Krishna Bharadwaj (krisys) Date: 2011-11-09 12:16
I have included a patch that alters os.walk to yield a namedtuple whose members can be accessed as dirpath, dirnames and filenames.

Got the following results after running the test. 

Ran 61 tests in 0.080s

OK (skipped=4)

Please let me know if the same can be achieved in a better way.
msg147352 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-11-09 12:21
This patch needs a test at least.
Also, the "walktuple" type should be defined only once, at the module level (but then, there may be a bootstrap issue, I don't know).
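
To illustrate the "defined only once" point, here is a sketch that contrasts a module-level type with one rebuilt on every call. This is not the attached patch: the wrapper functions walk and walk_redefining are hypothetical, and the field names are the ones given in the patch description. The sketch does not address the possible bootstrap issue.

```python
import os
from collections import namedtuple

# Created once, when the module is imported.
walktuple = namedtuple("walktuple", ["dirpath", "dirnames", "filenames"])


def walk(top):
    """Illustrative wrapper: every call yields the same walktuple type."""
    for dirpath, dirnames, filenames in os.walk(top):
        yield walktuple(dirpath, dirnames, filenames)


def walk_redefining(top):
    """Counter-example: a fresh class is built on every call."""
    local_type = namedtuple("walktuple", ["dirpath", "dirnames", "filenames"])
    for dirpath, dirnames, filenames in os.walk(top):
        yield local_type(dirpath, dirnames, filenames)


first = next(walk(os.curdir))
print(type(first) is walktuple)

type_a = type(next(walk_redefining(os.curdir)))
type_b = type(next(walk_redefining(os.curdir)))
# Same name, but distinct classes, so isinstance checks across calls fail.
print(type_a is type_b)
```

With the module-level definition, the class is created a single time and results from different calls share one type.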
msg147353 - (view) Author: Krishna Bharadwaj (krisys) Date: 2011-11-09 12:59
Hey Amaury,

Can you tell me if the attached test cases would suffice? If not, I can add something more comprehensive. Also, can you provide some pointers on the bootstrap issue so that I can look into it?
msg147363 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-11-09 19:03
Perhaps Raymond has a different view, but I don't think this patch makes anything clearer. There are only three things to remember and it's convenient to unpack them in the loop like

for path, dirs, files in os.walk(somewhere):
    ...
msg147375 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-11-09 21:25
Like any named tuple, the benefits lie in the better repr, and the fact
that if you only want some fields you don't have to unpack the whole tuple.
It's also easier to write variant APIs that add additional fields
accessible only by name.
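
The three benefits listed in this message can be sketched as follows. DirInfo and DirInfoWithDepth are hypothetical names, and the depth attribute is only an example of an additional field, not something proposed on this issue.

```python
from collections import namedtuple

# Hypothetical type, with the field names from the original proposal.
DirInfo = namedtuple("DirInfo", ["path", "subdirs", "files"])


class DirInfoWithDepth(DirInfo):
    """Still a 3-tuple; depth is reachable only as an attribute."""

    def __new__(cls, path, subdirs, files, depth):
        self = super().__new__(cls, path, subdirs, files)
        self.depth = depth
        return self


plain = ("/tmp/demo", ["sub"], ["a.txt"])
named = DirInfo(*plain)

# 1. The repr labels each field.
print(repr(plain))
print(repr(named))

# 2. A single field can be read without unpacking all three.
print(named.files)

# 3. A variant can carry extra data without breaking 3-way unpacking.
deep = DirInfoWithDepth("/tmp/demo", ["sub"], ["a.txt"], depth=1)
path, subdirs, files = deep
print(deep.depth)
```

The subclass omits `__slots__`, which is what allows the extra attribute to be stored on the instance.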
msg147376 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-11-09 21:49
> Like any named tuple, the benefits lie in the better repr, and the fact
> that if you only want some fields you don't have to unpack the whole
> tuple.

But, given the common idiom shown by Benjamin, how likely is it that you manipulate the tuple as-is?
msg147390 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-11-10 02:45
Why provide any namedtuple interface in any context? After all, you can just unpack them to individual variables.

The point is that the values produced by os.walk() *aren't* just an arbitrary 3-tuple - they have a definite API for describing a directory: the base path, then lists of relative names for any subdirectories and the relative names for any files. Why not make that explicit in the objects produced instead of leaving it as merely implied?

This idea actually came out of the proposal for providing an itertools-inspired toolset for manipulating the output of os.walk() style iteration (#13229 and https://bitbucket.org/ncoghlan/walkdir/overview).

I'll be adding this feature to walkdir regardless, but it seems to make more sense to offer it as standard behaviour.
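
As an illustration of the structure described above (a base path plus names relative to it), here is a small sketch that joins each file name to its base path. The helper iter_file_paths and the demo file names are made up for this example.

```python
import os
import tempfile


def iter_file_paths(top):
    """Yield a joined path for every file found below top."""
    for path, subdirs, files in os.walk(top):
        # files holds bare names, so each one is joined to the base path.
        for name in files:
            yield os.path.join(path, name)


with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "pkg", "data"))
    for rel in ("setup.py", os.path.join("pkg", "data", "x.bin")):
        with open(os.path.join(top, rel), "w"):
            pass
    for full_path in sorted(iter_file_paths(top)):
        print(os.path.relpath(full_path, top))
```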
msg147391 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-11-10 02:52
2011/11/9 Nick Coghlan <report@bugs.python.org>:
>
> Nick Coghlan <ncoghlan@gmail.com> added the comment:
>
> Why provide any namedtuple interface in any context? After all, you can just unpack them to individual variables.
>
> The point is that the values produced by os.walk() *aren't* just an arbitrary 3-tuple - they have a definite API for describing a directory: the base path, then lists of relative names for any subdirectories and the relative names for any files. Why not make that explicit in the objects produced instead of leaving it as merely implied?

You could make this argument for any function that returns a tuple to
return multiple distinct values. I claim that the API in this case is
already simple enough that adding a namedtuple does nothing but add feature
bloat. What does having a "dirinfo" object with attributes tell you
that simply unpacking the tuple doesn't? You have to remember names in
both cases.

>
> This idea actually came out of the proposal for providing an itertools-inspired toolset for manipulating the output of os.walk() style iteration (#13229 and https://bitbucket.org/ncoghlan/walkdir/overview).
>
> I'll be adding this feature to walkdir regardless, but it seems to make more sense to offer it as standard behaviour.

Indeed, I think using a namedtuple seems more appropriate for your
"fancier" api.
msg147393 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-11-10 03:25
I'm persuaded that there's no major gain to be had in building this in at the base layer - it's easy enough to add in a higher level API.
History
Date User Action Args
2011-11-10 03:25:45 ncoghlan set status: open -> closed
    resolution: rejected
    messages: + msg147393
2011-11-10 02:52:34 benjamin.peterson set messages: + msg147391
2011-11-10 02:45:43 ncoghlan set messages: + msg147390
2011-11-09 21:49:59 pitrou set nosy: + pitrou
    messages: + msg147376
2011-11-09 21:25:07 ncoghlan set messages: + msg147375
2011-11-09 19:03:41 benjamin.peterson set nosy: + rhettinger, benjamin.peterson
    messages: + msg147363
2011-11-09 12:59:48 krisys set files: + issue13375_tests.diff
    messages: + msg147353
2011-11-09 12:21:47 amaury.forgeotdarc set nosy: + amaury.forgeotdarc
    messages: + msg147352
2011-11-09 12:16:31 krisys set files: + issue13375.diff
    nosy: + krisys
    messages: + msg147351
    keywords: + patch
2011-11-09 07:23:01 ncoghlan set stage: needs patch
    type: enhancement
    versions: + Python 3.3
2011-11-09 07:22:38 ncoghlan create