classification
Title: shutil.copytree glob-style filtering [patch]
Type: enhancement
Components: Library (Lib)
Versions: Python 3.0, Python 2.6
process
Status: closed
Resolution: fixed
Nosy List: belopolsky, draghuram, georg.brandl, giampaolo.rodola, gustavo, tarek
Priority: normal
Keywords: patch

Created on 2008-04-20 21:28 by tarek, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
copytree.patch tarek, 2008-05-23 21:29
Messages (16)
msg65652 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-04-20 21:28
Here's a first draft of a small addon to shutil.copytree.

This patch allows excluding some folders or files from the copy, given
glob-style patterns. A callable can also be provided instead of the
patterns, for a more complex filtering.

I haven't updated Doc/shutil.rst in this patch yet; that can be done
once the change is accepted and in its final shape, I guess.
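For readers who want to see the kind of call this proposal enables, here is a minimal sketch. It assumes the `ignore` parameter and the `shutil.ignore_patterns` helper as they ended up in the standard library, not the names used in this first draft; the directory layout and file names are invented for illustration.

```python
import os
import shutil
import tempfile

# Build a small throwaway tree (hypothetical names, for illustration only).
root = tempfile.mkdtemp()
src = os.path.join(root, "src")
os.makedirs(os.path.join(src, "pkg", "tmp_cache"))
for rel in ("main.py", "main.pyc", os.path.join("pkg", "mod.py"),
            os.path.join("pkg", "tmp_cache", "data.bin")):
    with open(os.path.join(src, rel), "w") as f:
        f.write("x")

dst = os.path.join(root, "dst")
# Skip compiled files and any entry whose name starts with "tmp".
shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.pyc", "tmp*"))

# Collect what was actually copied, as paths relative to dst.
copied = sorted(
    os.path.relpath(os.path.join(dirpath, name), dst)
    for dirpath, _, files in os.walk(dst)
    for name in files
)
print(copied)
shutil.rmtree(root)  # clean up the temporary tree
```

The patterns are matched against entry names in each visited directory, so `tmp*` excludes `pkg/tmp_cache` even though it is nested.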
msg65663 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-21 13:41
On the interface, I would suggest renaming 'exclude' to 'ignore' for
consistency with filecmp.dircmp. Also consider detecting the file separator
in the patterns and interpreting them as absolute (if
pattern.startswith(pathsep)) or relative to src.

On the implementation, consider making 'exclude_files' a set for a 
faster lookup.   It should also be possible to refactor the code to 
avoid checking the type of 'exclude' on every file and every recursion.
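One way to read both implementation suggestions together is a small factory: the patterns are bound once, and the returned callable produces a set of names per directory, so no type check is needed during recursion. The sketch below is only an illustration of that idea; `make_ignore` and the sample names are hypothetical and not taken from the patch.

```python
import fnmatch

def make_ignore(*patterns):
    """Return a per-directory filter bound to the given glob patterns.

    Hypothetical helper: the patterns are captured once, so the copy loop
    never has to inspect what kind of object it was given.
    """
    def _ignore(directory, names):
        ignored = set()  # set gives constant-time membership tests
        for pattern in patterns:
            ignored.update(fnmatch.filter(names, pattern))
        return ignored
    return _ignore

ignore = make_ignore("*.pyc", "tmp*")
skipped = ignore("some/dir", ["a.py", "a.pyc", "tmpfile", "notes.txt"])
print(sorted(skipped))
```

The caller can then test `name in skipped` for every entry of the directory without rescanning the patterns.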
msg65670 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-04-22 07:39
I changed the patch based on all remarks. For the absolute path, I was
wondering if it would be useful since calls are recursive, relative to
the visited directory.
msg65682 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2008-04-22 19:56
Is there any reason for rmtree not to support this exclusion feature as
well? Both copytree and rmtree explicitly iterate over a list of names,
and as I see it, this exclusion is really about which names to ignore.
copytree and rmtree already have inconsistencies (rmtree has 'onerror'
while copytree doesn't) and it would be nice not to add more.
msg65906 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-04-27 23:00
Agreed, rmtree should have it as well. I'll add that to the patch.
msg65907 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-04-27 23:16
While working on the patch to add the same feature to rmtree, I realized
that it makes no sense, since the root folder itself is removed at the end
of the function once all its contents have been removed.

So, unless we change this behavior, which I doubt is a good idea, it
won't be possible.

Maybe another API could be added to shutil for doing any kind of
processing on a tree, such as removing files, without copying it the way
copytree does.
msg65908 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-04-27 23:32
I have thought of various ways to write this new API for the deletion
use case, but I think nothing makes it easier and shorter than a simple
os.walk call.
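As an illustration of the os.walk approach mentioned here, the following sketch deletes files matching glob patterns while leaving the root directory in place. The function name `remove_matching` and the sample files are assumptions made up for this example.

```python
import fnmatch
import os
import tempfile

def remove_matching(top, *patterns):
    """Delete files under `top` whose names match any glob pattern.

    Hypothetical helper: unlike rmtree, `top` and its subdirectories
    are left in place.
    """
    removed = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            if any(fnmatch.fnmatch(name, p) for p in patterns):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed

with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "sub"))
    for rel in ("a.py", "a.pyc", os.path.join("sub", "b.pyc")):
        open(os.path.join(top, rel), "w").close()
    removed = remove_matching(top, "*.pyc")
    # List the files that survived, relative to the root.
    remaining = sorted(
        os.path.relpath(os.path.join(d, n), top)
        for d, _, files in os.walk(top)
        for n in files
    )
print(len(removed), remaining)
```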
msg65909 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-04-28 00:17
This patch includes the documentation for shutil.rst as well. (I
removed the older patches.)
msg65919 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2008-04-28 14:12
My update with email failed so I am just copying my response here:

>  while working on the patch to add the same feature in rmtree, I realized
>  this is a non sense since the root folder itself is removed at the end
>  of the function when all its content is removed.

Indeed. Sorry about that.

>  So, unless we change this behavior, which I doubt it is a good idea, it
>  won't be possible.

I agree. But in general, it would be nice to separate file list
generation and the actual operation. Something similar to shell where
it resolves the pattern while the actual command itself cares only
about the files passed to it. This is not necessarily a comment on
this patch, which I hope to check out soon.
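The separation described above, with one step that resolves which files are selected and another that operates on them, could look roughly like the sketch below. Both `iter_files` and `copy_files` are hypothetical names, and the design is just one possible reading of the suggestion, not something proposed in the patch.

```python
import fnmatch
import os
import shutil
import tempfile

def iter_files(top, *ignore_globs):
    """Yield file paths relative to `top`, skipping names matching any glob."""
    def keep(name):
        return not any(fnmatch.fnmatch(name, g) for g in ignore_globs)
    for dirpath, dirnames, filenames in os.walk(top):
        # Prune ignored directories in place so os.walk does not descend.
        dirnames[:] = [d for d in dirnames if keep(d)]
        for name in filenames:
            if keep(name):
                yield os.path.relpath(os.path.join(dirpath, name), top)

def copy_files(src, dst, relpaths):
    """Copy the listed files; this step knows nothing about filtering."""
    for rel in relpaths:
        target = os.path.join(dst, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(os.path.join(src, rel), target)

with tempfile.TemporaryDirectory() as root:
    src, dst = os.path.join(root, "src"), os.path.join(root, "dst")
    os.makedirs(os.path.join(src, ".git"))
    for rel in ("a.py", "a.pyc", os.path.join(".git", "config")):
        open(os.path.join(src, rel), "w").close()
    selected = sorted(iter_files(src, "*.pyc", ".git"))
    copy_files(src, dst, selected)
    copied = sorted(os.listdir(dst))
print(selected, copied)
```

The same `iter_files` output could be handed to a deletion or archiving step instead, which is the point of keeping the two concerns apart.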
msg65924 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2008-04-28 17:49
The patch looks good to me.
msg65925 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2008-04-28 17:53
I forgot to add that the example provided in the rst doc is incorrect. The
copytree() call in that example should be given a destination path as well.
In addition, the docstring for copytree mentions "which is a directory
list". "directory list" is a bit vague and should ideally be replaced by
something like "list of elements" (which is what appears in the doc) or
"list of entries".
msg66149 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-05-03 08:33
Right, thanks.

I have corrected the doc and added some examples at the bottom of the
module documentation.
msg67158 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-05-21 15:04
Patch updated against the new trunk.
msg67225 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-05-23 10:02
Hi Tarek,

here's a review:

* The new docs are not very clear about ignore_patterns being a function
factory. E.g.:

"""The callable must return a list of folder and file names relative to
the path, that will be ignored in the copy process.
:func:`ignore_patterns` is an example of such callable."""

Rather, the *return value* of ignore_patterns is an example of such a
callable.

* The new docs should also note that copytree is called recursively, and
therefore the ignore callable will be called once for each directory
that is copied.

* Instead of "path and its elements" the terminology should be
"directory" and "the list of its contents, as returned by os.listdir()".
Likewise, "folder" should be "directory".

* The second new example makes me wonder if *ignore* is the correct name
for the parameter. Is *filter* better?

* A nit; the signature should be "copytree(src, dst[, symlinks[, ignore]])".

* The patch adds a space in the definition of rmtree().
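To make the first two review points concrete, here is a small sketch of a hand-written ignore callable that records each invocation. It assumes the final `copytree(src, dst, ignore=...)` signature; `log_and_ignore` and the file names are made up for the example.

```python
import os
import shutil
import tempfile

calls = []

def log_and_ignore(directory, contents):
    """Record each invocation and ask copytree to skip '*.bak' names."""
    calls.append((os.path.basename(directory), sorted(contents)))
    return [name for name in contents if name.endswith(".bak")]

with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "src")
    os.makedirs(os.path.join(src, "sub"))
    for rel in ("a.txt", "a.bak", os.path.join("sub", "b.txt")):
        open(os.path.join(src, rel), "w").close()
    shutil.copytree(src, os.path.join(root, "dst"), ignore=log_and_ignore)
    copied_top = sorted(os.listdir(os.path.join(root, "dst")))

# ignore_patterns is a factory: its *return value* is a callable with the
# same (directory, contents) signature as log_and_ignore above.
factory_made = shutil.ignore_patterns("*.bak")

print(calls)
print(copied_top)
```

The callable runs once for the top-level directory and once for `sub`, each time receiving only that directory's own entry names.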
msg67269 - (view) Author: Tarek Ziadé (tarek) * (Python committer) Date: 2008-05-23 21:29
Thanks Georg, I have changed the patch accordingly.

There's one issue left: the name of the parameter (ignore).

I renamed it this way at Alexander's suggestion, for consistency
with filecmp.dircmp, which uses ignore.

By the way, I was wondering: do we need to use reStructuredText in
function docstrings as well?
msg69276 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-07-05 10:13
Committed in r64722. Thanks everyone!
History
Date                 User              Action  Args
2022-04-11 14:56:33  admin             set     github: 46915
2008-07-05 10:13:53  georg.brandl      set     status: open -> closed
                                               resolution: fixed
                                               messages: + msg69276
2008-06-07 19:30:15  giampaolo.rodola  set     nosy: + giampaolo.rodola
2008-05-23 21:30:40  tarek             set     files: - copytree2.patch
2008-05-23 21:30:06  tarek             set     files: + copytree.patch
                                               messages: + msg67269
2008-05-23 10:03:09  georg.brandl      set     nosy: + georg.brandl
                                               messages: + msg67225
2008-05-21 15:05:45  tarek             set     files: - copytree.patch
2008-05-21 15:05:25  tarek             set     files: + copytree2.patch
                                               messages: + msg67158
2008-05-03 08:38:15  tarek             set     files: - copytree.patch
2008-05-03 08:38:09  tarek             set     files: + copytree.patch
2008-05-03 08:34:58  tarek             set     files: - copytree.patch
2008-05-03 08:34:51  tarek             set     files: + copytree.patch
2008-05-03 08:33:29  tarek             set     files: - shutil.copytree.patch
2008-05-03 08:33:18  tarek             set     files: + copytree.patch
                                               messages: + msg66149
2008-04-28 17:53:29  draghuram         set     messages: + msg65925
2008-04-28 17:49:18  draghuram         set     messages: + msg65924
2008-04-28 14:12:04  draghuram         set     messages: + msg65919
2008-04-28 00:17:36  tarek             set     files: - shutil.copytree.filtering.patch
2008-04-28 00:17:31  tarek             set     files: - shutil.copytree.filtering.patch
2008-04-28 00:17:15  tarek             set     files: + shutil.copytree.patch
                                               messages: + msg65909
2008-04-27 23:32:34  tarek             set     messages: + msg65908
2008-04-27 23:16:52  tarek             set     messages: + msg65907
2008-04-27 23:00:05  tarek             set     messages: + msg65906
2008-04-22 19:56:34  draghuram         set     messages: + msg65682
2008-04-22 07:40:10  tarek             set     files: + shutil.copytree.filtering.patch
                                               messages: + msg65670
2008-04-21 13:41:59  belopolsky        set     nosy: + belopolsky
                                               messages: + msg65663
2008-04-21 10:51:10  draghuram         set     nosy: + draghuram
2008-04-20 22:27:24  gustavo           set     nosy: + gustavo
2008-04-20 21:28:22  tarek             create