msg164781 - (view) |
Author: Chris Jerdonek (chris.jerdonek) * |
Date: 2012-07-07 04:27 |
The documentation for the filecmp.dircmp class doesn't mention dircmp.left and dircmp.right.
Being aware of this up front would make certain simplifications easier to think of. For example, knowing about these attributes opens up the possibility of passing dircmp instances around without having to pass the two paths separately (e.g. in certain recursive algorithms involving dircmp). Knowing this also means you can recover the two paths if using the subdirs attribute (whose values are dircmp instances).
|
msg166084 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2012-07-21 22:09 |
Given that we have self.left_list and self.left_only ( and self.right_list and self.right_only), I am not sure how adding self.left/self.right is going to add more meaning? It would simply point to the dir1 and dir2 arguments that are being passed.
Also, it may not be a helpful feature. If it is a must, then we can just rename self.left to self._left within the *code*, but otherwise, I do not see any change which is required here. I suggest closing this report as wont-fix.
|
msg166088 - (view) |
Author: Chris Jerdonek (chris.jerdonek) * |
Date: 2012-07-21 23:11 |
> Given that we have self.left_list and self.left_only ( and self.right_list and self.right_only), I am not sure how adding self.left/self.right is going to add more meaning?
It adds more meaning because you can't construct self.left and self.right from self.left_list, self.left_only, etc. The latter are children inside the two directories (expressed relatively), while the former are the parent directories. So it's strictly different information.
As I said, there are cases where being able to access the left and right directories simplifies the calling code in a way that is not otherwise possible.
For example, if you are recursively comparing directories, it is natural to recurse on the dircmp.subdirs attribute, whose values are dircmp instances. The caller did not construct these instances, so there is no other way to obtain the left and right directories that were used to construct those instances. Otherwise, the caller has to construct the dircmp instances manually from common_dirs, which reduces the usefulness of the subdirs attribute and the scenarios in which it can be used.
Secondly, there are cases where it is natural to pass dircmp instances around between functions. If one cannot recover the left and right directories from the dircmp instances, then one would also have to pass the left and right directories to those functions, along with the dircmp instances, or else pass just the left and right directories and then reconstruct the dircmp instances from those arguments, which leads to uglier code.
Thirdly, it is better to reuse existing dircmp instances where possible rather than create new ones, because of the penalty after creating a new instance of needing to recalculate the attributes. As the documentation notes, attributes are computed lazily and cached using __getattr__(), so it's better to hold onto existing instances.
If you are still not convinced, I would like the opportunity to provide actual code examples before you close this issue. Then you can tell me if there is an equally simple alternative that does not involve accessing left and right.
I don't see how it hurts to be able to access left and right as attributes and why you would consider concealing this information from the caller behind private attributes. It's a common idiom for constructor arguments to be stored as public attributes.
In this case, I think it's worth documenting because the names of the attributes ("left" and "right") used to store the constructor arguments don't match the names of the constructor arguments themselves ("a" and "b").
|
msg166305 - (view) |
Author: Eli Bendersky (eli.bendersky) * |
Date: 2012-07-24 16:29 |
Yes, code samples would help clarifying the rationale for this request
|
msg166339 - (view) |
Author: Chris Jerdonek (chris.jerdonek) * |
Date: 2012-07-24 23:21 |
Thanks for taking the time to look at this, Eli.
In response to your question, here is one illustrated rationale.
When recursing through a directory using dircmp, it is simplest and cleanest to be able to recurse on the subdirs attribute without having to pass pairs of paths around or reconstruct dircmp instances. You can see this for example in filecmp's own very concise implementation of dircmp.report_full_closure():
def report_full_closure(self): # Report on self and subdirs recursively
self.report()
for sd in self.subdirs.values():
print()
sd.report_full_closure()
However, dircmp's reporting functionality is self-admittedly "lousy":
def report(self): # Print a report on the differences between a and b
# Output format is purposely lousy
print('diff', self.left, self.right)
...
(Incidentally, observe above that dircmp.report() itself uses the 'left' and 'right' attributes.)
Given the limitations of report_full_closure(), etc, it is natural that one might want to write a custom or replacement reporting function with nicer formatting. When doing this, it would be nice to be able to follow that same clean and concise recursion pattern. For example--
def diff(dcmp):
for sd in dcmp.subdirs.values():
diff(sd)
for name in dcmp.diff_files:
print("%s differs in %s and %s" % (name, dcmp.left, dcmp.right))
dcmp = dircmp('dir1', 'dir2')
diff(dcmp)
If one isn't able to access 'left' and 'right' (or if one simply isn't aware of those attributes, which was the case for me at one point), the alternative would be to do something like the following, which is much uglier and less DRY:
import os
def diff2(dcmp, dir1, dir2):
for name, sd in dcmp.subdirs.items():
subdir1 = os.path.join(dir1, name)
subdir2 = os.path.join(dir2, name)
diff2(sd, subdir1, subdir2)
for name in dcmp.diff_files:
print("%s differs in %s and %s" % (name, dir1, dir2))
dcmp = dircmp('dir1', 'dir2')
diff2(dcmp, dir1='dir1', dir2='dir2')
An example like diff() above might even be worth including in the docs as an example of how subdirs can be used to avoid having to manually call os.path.join(...), etc.
There are also non-recursive situations in which being able to access dircmp.left and dircmp.right makes for cleaner code.
|
msg166349 - (view) |
Author: Eli Bendersky (eli.bendersky) * |
Date: 2012-07-25 02:34 |
Makes sense. I agree that publicly exposing the left/right attributes makes sense. But let's do it properly:
1. Add an example to the documentation
2. Add some tests to Lib/test/test_filecmp.py that verify these attributes behave as expected
In addition, I think it makes a lot of sense to add an optional "stream" argument to the report() and report_*() methods, to at leas allow reporting to some custom channel and not solely stdout. The report() method does a lot more than your simple example demonstrates, and it's not very easy to replace its functionality.
Would you like to submit a full patch for this?
|
msg166418 - (view) |
Author: Chris Jerdonek (chris.jerdonek) * |
Date: 2012-07-25 18:26 |
Thanks a lot, Eli, and for the suggestions. I would be happy to prepare a full patch.
Regarding the stream argument, I think there are other changes to dircmp that would be more useful (e.g. issue 12932), but I agree that some form of your suggestion makes sense. Would you mind if I created a separate issue for it to discuss there? I have some suggestions on it, and I would be happy to work on it there.
|
msg166442 - (view) |
Author: Chris Jerdonek (chris.jerdonek) * |
Date: 2012-07-25 23:49 |
Attaching a patch to address Eli's requests (1) and (2).
Since this patch merely adds documentation and tests for existing functionality, is there any reason why this cannot go into Python 3.3? Thanks.
|
msg166461 - (view) |
Author: Eli Bendersky (eli.bendersky) * |
Date: 2012-07-26 06:35 |
I think it can go into 3.3 but only if it gets reviewed by another core dev (we're in release candidate stage now). Senthil - can you review the patch together with me?
As for customizing the stream, yes, go ahead and open a new issue for it, and add me there as nosy.
|
msg166465 - (view) |
Author: Chris Jerdonek (chris.jerdonek) * |
Date: 2012-07-26 07:51 |
Sounds good. And for the record, new issue created here: issue 15454
|
msg168183 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2012-08-14 09:29 |
Hi Chris & Eli, - Sorry that I missed this issue.
Chris - agree to your rationale. I can see how having self.left and self.right documented can add value, The diff example was useful. Initially, I did have some doubts in terms how it could be useful when the args are sent by the user, your example clarified. Thanks!.
As Eli has looked at this one too, I shall commit the patch. Everything is good.
|
msg168185 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2012-08-14 09:39 |
As this is not adding any feature, but just an additional clarification to the existing attribute together with some useful documentation, I believe this can go in 2.7, 3.2 and 3.3
Please correct me if I am wrong here.
|
msg168186 - (view) |
Author: Chris Jerdonek (chris.jerdonek) * |
Date: 2012-08-14 09:46 |
Thanks a lot, Senthil. I appreciate it.
|
msg168245 - (view) |
Author: Chris Jerdonek (chris.jerdonek) * |
Date: 2012-08-15 00:04 |
Senthil, here is a recent e-mail and response in which I asked about documentation changes and adding tests during feature freeze:
http://mail.python.org/pipermail/python-dev/2012-July/121138.html
Also, here is a recent example of a documentation clarification that went into 2.7, 3.2, and tip:
http://bugs.python.org/issue15554
Thanks again.
|
msg168253 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2012-08-15 01:51 |
New changeset 7590dec388a7 by R David Murray in branch '3.2':
#15269: document dircmp.left and right, and add tests for them.
http://hg.python.org/cpython/rev/7590dec388a7
New changeset c592e5a8fa4f by R David Murray in branch 'default':
Merge #15269: document dircmp.left and right, and add tests for them.
http://hg.python.org/cpython/rev/c592e5a8fa4f
New changeset e64d4518b23c by R David Murray in branch '2.7':
#15269: document dircmp.left and right.
http://hg.python.org/cpython/rev/e64d4518b23c
|
msg168254 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2012-08-15 01:52 |
Thanks, Chris.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:32 | admin | set | github: 59474 |
2012-08-15 01:52:28 | r.david.murray | set | status: open -> closed
nosy:
+ r.david.murray messages:
+ msg168254
resolution: fixed stage: resolved |
2012-08-15 01:51:22 | python-dev | set | nosy:
+ python-dev messages:
+ msg168253
|
2012-08-15 00:04:50 | chris.jerdonek | set | messages:
+ msg168245 |
2012-08-14 09:46:44 | chris.jerdonek | set | messages:
+ msg168186 |
2012-08-14 09:39:30 | orsenthil | set | messages:
+ msg168185 versions:
+ Python 2.7, Python 3.2, Python 3.3, - Python 3.4 |
2012-08-14 09:29:28 | orsenthil | set | messages:
+ msg168183 |
2012-08-13 23:48:48 | chris.jerdonek | set | nosy:
+ asvetlov
|
2012-07-26 07:51:40 | chris.jerdonek | set | messages:
+ msg166465 |
2012-07-26 06:35:34 | eli.bendersky | set | messages:
+ msg166461 |
2012-07-25 23:49:10 | chris.jerdonek | set | files:
+ issue-15269-2.patch
messages:
+ msg166442 |
2012-07-25 18:26:34 | chris.jerdonek | set | messages:
+ msg166418 |
2012-07-25 02:34:10 | eli.bendersky | set | messages:
+ msg166349 |
2012-07-24 23:21:22 | chris.jerdonek | set | messages:
+ msg166339 |
2012-07-24 16:29:29 | eli.bendersky | set | nosy:
+ eli.bendersky
messages:
+ msg166305 versions:
+ Python 3.4, - Python 3.3 |
2012-07-21 23:11:36 | chris.jerdonek | set | messages:
+ msg166088 |
2012-07-21 22:09:28 | orsenthil | set | nosy:
+ orsenthil messages:
+ msg166084
|
2012-07-07 05:44:50 | chris.jerdonek | set | files:
+ issue-15269-1.patch keywords:
+ patch |
2012-07-07 04:27:59 | chris.jerdonek | create | |