Author ncoghlan
Recipients eli.bendersky, eric.araujo, giampaolo.rodola, ncoghlan, pitrou, ubershmekel
Date 2012-02-08.23:57:49
Message-id <1328745470.26.0.684848904942.issue13968@psf.upfronthosting.co.za>
Content
I think it's important to be clear on what the walkdir API aims to be: a composable toolkit of utilities for directory tree processing. Its overall design is inspired directly by the itertools module.

Yes, it started life as a simple proposal to add shutil.filtered_walk (http://bugs.python.org/issue13229), but I soon realised that implementing this solely as a monolithic function would be foolish, since that approach isn't composable. What if you just wanted file filtering? Or depth limiting? Having it as a filtering toolkit lets you choose the exact filters you need for a given use case. walkdir.filtered_walk() is just an API for composing filtering pipelines without needing to pass the same information to multiple pipeline stages.
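
To make the composability point concrete, here is a rough standard-library sketch of two independent pipeline stages wrapped around os.walk(). This is not walkdir's actual implementation: the names include_files_sketch, limit_depth_sketch and demo_pipeline are hypothetical, and the depth calculation assumes paths separated by os.sep.

```python
import fnmatch
import os


def include_files_sketch(walk_iter, *patterns):
    """Keep only file names that match at least one glob pattern."""
    for dirpath, subdirs, files in walk_iter:
        # Filter in place so later stages see the reduced list.
        files[:] = [f for f in files
                    if any(fnmatch.fnmatch(f, p) for p in patterns)]
        yield dirpath, subdirs, files


def limit_depth_sketch(walk_iter, depth):
    """Stop descending more than `depth` levels below the first directory."""
    base_depth = None
    for dirpath, subdirs, files in walk_iter:
        current = dirpath.rstrip(os.sep).count(os.sep)
        if base_depth is None:
            base_depth = current
        if current - base_depth >= depth:
            # Emptying the list in place stops a top-down os.walk()
            # from recursing into these subdirectories.
            del subdirs[:]
        yield dirpath, subdirs, files


def demo_pipeline(top="."):
    # Each stage wraps the previous iterator, itertools style, so either
    # stage can be used alone or combined with the other.
    return limit_depth_sketch(include_files_sketch(os.walk(top), "*.py"), 1)
```

Because each stage accepts and yields ordinary os.walk() triples, dropping one stage or adding another does not require changes to the rest of the pipeline.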

However, along with that itertools-inspired, iterator-pipeline-based design, I've also inherited Raymond's preference that particular *use cases* start life as recipes in the documentation.

A recursive glob is just a basic walkdir pipeline composition:

>>> import os
>>> from walkdir import file_paths, include_files
>>> def globtree(pattern, path='.'):
...     return file_paths(include_files(os.walk(path), pattern))
        
Since filtered_walk() is just a pipeline builder, the composition can also be written:

>>> from walkdir import file_paths, filtered_walk
>>> def globtree(pattern, path='.'):
...     return file_paths(filtered_walk(path, included_files=[pattern]))

That latter approach then suggests an alternative signature for globtree:

def globtree(*patterns, **kwds):
    kwds.setdefault("top", ".")
    return file_paths(filtered_walk(included_files=patterns, **kwds))

>>> print('\n'.join(sorted(globtree('*.rst'))))
./index.rst
./py3k_binary_protocols.rst
./venv_bootstrap.rst

>>> print('\n'.join(sorted(globtree('*.rst', '*.py'))))
./conf.py
./index.rst
./py3k_binary_protocols.rst
./venv_bootstrap.rst
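
For anyone who wants to try the multi-pattern signature without walkdir installed, the recipe can be approximated with the standard library alone. This is only a sketch under stated assumptions: filtered_walk_sketch, file_paths_sketch and globtree_sketch are hypothetical stand-ins, and the stand-in builder supports only the "top" and "included_files" arguments used above.

```python
import fnmatch
import os


def filtered_walk_sketch(top, included_files=None):
    """Hypothetical stand-in for a pipeline builder: os.walk() plus a file filter."""
    for dirpath, subdirs, files in os.walk(top):
        if included_files:
            # Keep a file if any of the glob patterns matches its name.
            files[:] = [f for f in files
                        if any(fnmatch.fnmatch(f, p) for p in included_files)]
        yield dirpath, subdirs, files


def file_paths_sketch(walk_iter):
    """Flatten os.walk() style triples into full file paths."""
    for dirpath, _subdirs, files in walk_iter:
        for name in files:
            yield os.path.join(dirpath, name)


def globtree_sketch(*patterns, **kwds):
    # Same shape as the recipe above: patterns are positional,
    # everything else (here only "top") is passed through by keyword.
    kwds.setdefault("top", ".")
    return file_paths_sketch(filtered_walk_sketch(included_files=patterns, **kwds))
```

Calling globtree_sketch('*.rst', '*.py', top='docs') would yield every matching path under docs. In this sketch, passing no patterns at all leaves the file lists unfiltered.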

On a somewhat related note, I'd also like to see us start concentrating higher-level shell utilities in the shutil namespace, so users don't have to check multiple locations for shell-related functionality quite so often (that is, I'd prefer shutil.globtree over glob.rglob).