Message 264421 - Python tracker

Author	serhiy.storchaka
Recipients	ethan.furman, loewis, palaviv, rhettinger, serhiy.storchaka
Date	2016-04-28.08:00:47
Message-id	<1461830448.06.0.724020923241.issue26860@psf.upfronthosting.co.za>

Content

Sorry, but I disagree with Raymond on many points.

> Classes are normally named with CamelCase.  Also, "walk_result" or "WalkResult" seems like an odd name that doesn't really fit.   DirEntry or DirInfo is a better match (see the OP's example, "for dir_entry in walk_it: ...")

See "stat_result", "statvfs_result", "waitid_result", "uname_result", and "times_result". DirEntry is already used in the os module. And if this feature is accepted, separate types would be needed for walk() and fwalk() results.

> The "versionchanged" should be a "versionadded".

os.walk() is not new. Only its result has changed. The class "walk_result" can be tagged with "versionadded", but I'm not sure there is a need to document it separately. The documentation of the os module is already too large. "uname_result" and "times_result" are not documented.

> The docs and code for fwalk() needs to be harmonized with walk() so the tuple fields use the same names:  change (root, dirs, files) to (dirpath, dirnames, filenames).

(root, dirs, files) is shorter than (dirpath, dirnames, filenames), and these names have been used with os.walk() and os.fwalk() for years.

In general, I have doubts about this feature.

1. There is a small backward incompatibility. At the very least, pickles are not backward compatible, and I guess the same holds for other serialization methods.

2. os.walk() and os.fwalk() are meant to be used in a for loop that immediately unpacks the result tuple:

    for root, dirs, files in os.walk(...):
        ...

Adding a named tuple doesn't add any benefit in the common case.

In the OP's case, you can either use an fwalk-based implementation of walk (issue15200):

    def fwalk_as_walk(*args, **kwargs):
        for x in os.fwalk(*args, **kwargs):
            # Drop the trailing dirfd so each item has the shape of an os.walk() result.
            yield x[:-1]

or just ignore the rest of the tuple items:

    for root, *_ in walk_it:
        ...

3. Using a namedtuple is slower and consumes more memory than using a plain tuple. Even for a filesystem-related operation like os.walk() this can matter. A lot of code is optimized for exact tuples; with a namedtuple this optimization is lost.

4. The new names (dirpath, dirnames, filenames) are questionable. Why not use underscores (dir_names)? The "dir" in dirpath refers to the directory currently being processed, but the "dir" in dirnames refers to its subdirectories. Currently you are free to use the short names (root, dirs, files) from the examples, or whatever you prefer, but with a namedtuple you are stuck with the standard names forever. There are no names that satisfy everybody.

5. Third-party walk-like iterators generate plain tuples, so you can't use attribute access in code that is meant to be fully generic.
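
Points 1, 3, and 5 can be checked with a short sketch. This is only an illustration under stated assumptions: "WalkResult", "named_walk" and "third_party_walk" are hypothetical names made up for the example, not anything in the os module or in the proposed patch. For the pickle check, urllib.parse.SplitResult (an existing stdlib namedtuple) stands in for the proposed type so that the class is importable wherever the snippet runs.

```python
import pickle
from collections import namedtuple
from urllib.parse import urlsplit

# Hypothetical stand-in for the proposed result type; the name and the
# field names are assumptions taken from the discussion.
WalkResult = namedtuple("WalkResult", ["dirpath", "dirnames", "filenames"])

plain = ("top", ["sub"], ["a.txt"])
named = WalkResult(*plain)

# Point 3: a namedtuple is a tuple subclass, so code paths that test for
# the exact tuple type do not apply to it.
is_tuple_subclass = isinstance(named, tuple)
is_exact_tuple = type(named) is tuple

# Point 1: the pickle of a namedtuple refers to its class by name, while
# the pickle of a plain tuple does not. Loading the former requires that
# class to be importable on the reading side.
split = urlsplit("http://example.com/path")
named_payload = pickle.dumps(split)
plain_payload = pickle.dumps(tuple(split))


# Point 5: two walk-like iterators yielding the same data.
def third_party_walk():
    """Hypothetical third-party iterator that yields plain tuples."""
    yield ("top", ["sub"], ["a.txt"])


def named_walk():
    """Hypothetical stand-in for a walk() that yields named tuples."""
    yield WalkResult("top", ["sub"], ["a.txt"])


def count_files_unpacking(walker):
    # Works with both iterators, and the local names are the caller's choice.
    return sum(len(files) for root, dirs, files in walker)


def count_files_attribute(walker):
    # Works only when the items have a "filenames" attribute.
    return sum(len(entry.filenames) for entry in walker)
```

Here count_files_unpacking() accepts either iterator, while count_files_attribute(third_party_walk()) raises AttributeError, which is why attribute access does not help code that has to handle arbitrary walk-like iterators.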

History
Date	User	Action	Args
2016-04-28 08:00:48	serhiy.storchaka	set	recipients: + serhiy.storchaka, loewis, rhettinger, ethan.furman, palaviv
2016-04-28 08:00:48	serhiy.storchaka	set	messageid: <1461830448.06.0.724020923241.issue26860@psf.upfronthosting.co.za>
2016-04-28 08:00:48	serhiy.storchaka	link	issue26860 messages
2016-04-28 08:00:47	serhiy.storchaka	create