classification
Title: Document dircmp.left and dircmp.right
Type: Stage: resolved
Components: Documentation Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: asvetlov, chris.jerdonek, docs@python, eli.bendersky, orsenthil, python-dev, r.david.murray
Priority: normal Keywords: easy, patch

Created on 2012-07-07 04:27 by chris.jerdonek, last changed 2012-08-15 01:52 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description Edit
issue-15269-1.patch chris.jerdonek, 2012-07-07 05:44 review
issue-15269-2.patch chris.jerdonek, 2012-07-25 23:49
Messages (16)
msg164781 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-07 04:27
The documentation for the filecmp.dircmp class doesn't mention dircmp.left and dircmp.right.

Being aware of this up front would make certain simplifications easier to think of.  For example, knowing about these attributes opens up the possibility of passing dircmp instances around without having to pass the two paths separately (e.g. in certain recursive algorithms involving dircmp).  Knowing this also means you can recover the two paths if using the subdirs attribute (whose values are dircmp instances).
msg166084 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-07-21 22:09
Given that we have self.left_list and self.left_only ( and self.right_list and self.right_only), I am not sure how adding self.left/self.right is going to add more meaning? It would simply point to the dir1 and dir2 arguments that are being passed. 

Also, it may not be a helpful feature. If it is a must, then we can just rename self.left to self._left within the *code*, but otherwise, I do not see any change which is required here.  I suggest closing this report as wont-fix.
msg166088 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-21 23:11
> Given that we have self.left_list and self.left_only ( and self.right_list and self.right_only), I am not sure how adding self.left/self.right is going to add more meaning?

It adds more meaning because you can't construct self.left and self.right from self.left_list, self.left_only, etc.  The latter are children inside the two directories (expressed relatively), while the former are the parent directories.  So it's strictly different information.

As I said, there are cases where being able to access the left and right directories simplifies the calling code in a way that is not otherwise possible.

For example, if you are recursively comparing directories, it is natural to recurse on the dircmp.subdirs attribute, whose values are dircmp instances.  The caller did not construct these instances, so there is no other way to obtain the left and right directories that were used to construct those instances.  Otherwise, the caller has to construct the dircmp instances manually from common_dirs, which reduces the usefulness of the subdirs attribute and the scenarios in which it can be used.

Secondly, there are cases where it is natural to pass dircmp instances around between functions.  If one cannot recover the left and right directories from the dircmp instances, then one would also have to pass the left and right directories to those functions, along with the dircmp instances, or else pass just the left and right directories and then reconstruct the dircmp instances from those arguments, which leads to uglier code.

Thirdly, it is better to reuse existing dircmp instances where possible rather than create new ones, because of the penalty after creating a new instance of needing to recalculate the attributes.  As the documentation notes, attributes are computed lazily and cached using __getattr__(), so it's better to hold onto existing instances.

If you are still not convinced, I would like the opportunity to provide actual code examples before you close this issue.  Then you can tell me if there is an equally simple alternative that does not involve accessing left and right.

I don't see how it hurts to be able to access left and right as attributes and why you would consider concealing this information from the caller behind private attributes.  It's a common idiom for constructor arguments to be stored as public attributes.

In this case, I think it's worth documenting because the names of the attributes ("left" and "right") used to store the constructor arguments don't match the names of the constructor arguments themselves ("a" and "b").
msg166305 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2012-07-24 16:29
Yes, code samples would help clarifying the rationale for this request
msg166339 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-24 23:21
Thanks for taking the time to look at this, Eli.

In response to your question, here is one illustrated rationale.

When recursing through a directory using dircmp, it is simplest and cleanest to be able to recurse on the subdirs attribute without having to pass pairs of paths around or reconstruct dircmp instances.  You can see this for example in filecmp's own very concise implementation of dircmp.report_full_closure():

    def report_full_closure(self): # Report on self and subdirs recursively
        self.report()
        for sd in self.subdirs.values():
            print()
            sd.report_full_closure()

However, dircmp's reporting functionality is self-admittedly "lousy":

    def report(self): # Print a report on the differences between a and b
        # Output format is purposely lousy
        print('diff', self.left, self.right)
        ...

(Incidentally, observe above that dircmp.report() itself uses the 'left' and 'right' attributes.)

Given the limitations of report_full_closure(), etc, it is natural that one might want to write a custom or replacement reporting function with nicer formatting.  When doing this, it would be nice to be able to follow that same clean and concise recursion pattern.  For example--

    def diff(dcmp):
        for sd in dcmp.subdirs.values():
            diff(sd)
        for name in dcmp.diff_files:
            print("%s differs in %s and %s" % (name, dcmp.left, dcmp.right))

    dcmp = dircmp('dir1', 'dir2')
    diff(dcmp)

If one isn't able to access 'left' and 'right' (or if one simply isn't aware of those attributes, which was the case for me at one point), the alternative would be to do something like the following, which is much uglier and less DRY:

    import os

    def diff2(dcmp, dir1, dir2):
        for name, sd in dcmp.subdirs.items():
            subdir1 = os.path.join(dir1, name)
            subdir2 = os.path.join(dir2, name)
            diff2(sd, subdir1, subdir2)
        for name in dcmp.diff_files:
            print("%s differs in %s and %s" % (name, dir1, dir2))

    dcmp = dircmp('dir1', 'dir2')
    diff2(dcmp, dir1='dir1', dir2='dir2')

An example like diff() above might even be worth including in the docs as an example of how subdirs can be used to avoid having to manually call os.path.join(...), etc.

There are also non-recursive situations in which being able to access dircmp.left and dircmp.right makes for cleaner code.
msg166349 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2012-07-25 02:34
Makes sense. I agree that publicly exposing the left/right attributes makes sense. But let's do it properly:

1. Add an example to the documentation
2. Add some tests to Lib/test/test_filecmp.py that verify these attributes behave as expected

In addition, I think it makes a lot of sense to add an optional "stream" argument to the report() and report_*() methods, to at leas allow reporting to some custom channel and not solely stdout. The report() method does a lot more than your simple example demonstrates, and it's not very easy to replace its functionality.

Would you like to submit a full patch for this?
msg166418 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-25 18:26
Thanks a lot, Eli, and for the suggestions.  I would be happy to prepare a full patch.

Regarding the stream argument, I think there are other changes to dircmp that would be more useful (e.g. issue 12932), but I agree that some form of your suggestion makes sense.  Would you mind if I created a separate issue for it to discuss there?  I have some suggestions on it, and I would be happy to work on it there.
msg166442 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-25 23:49
Attaching a patch to address Eli's requests (1) and (2).

Since this patch merely adds documentation and tests for existing functionality, is there any reason why this cannot go into Python 3.3?  Thanks.
msg166461 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2012-07-26 06:35
I think it can go into 3.3 but only if it gets reviewed by another core dev (we're in release candidate stage now). Senthil - can you review the patch together with me?

As for customizing the stream, yes, go ahead and open a new issue for it, and add me there as nosy.
msg166465 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-07-26 07:51
Sounds good.  And for the record, new issue created here: issue 15454
msg168183 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-08-14 09:29
Hi Chris & Eli, - Sorry that I missed this issue.

Chris - agree to your rationale. I can see how having self.left and self.right documented can add value, The diff example was useful. Initially, I did have some doubts in terms how it could be useful when the args are sent by the user, your example clarified. Thanks!.

As Eli has looked at this one too, I shall commit the patch. Everything is good.
msg168185 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-08-14 09:39
As this is not adding any feature, but just an additional clarification to the existing attribute together with some useful documentation, I believe this can go in 2.7, 3.2 and 3.3

Please correct me if I am wrong here.
msg168186 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-08-14 09:46
Thanks a lot, Senthil.  I appreciate it.
msg168245 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-08-15 00:04
Senthil, here is a recent e-mail and response in which I asked about documentation changes and adding tests during feature freeze:

http://mail.python.org/pipermail/python-dev/2012-July/121138.html

Also, here is a recent example of a documentation clarification that went into 2.7, 3.2, and tip:

http://bugs.python.org/issue15554

Thanks again.
msg168253 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-08-15 01:51
New changeset 7590dec388a7 by R David Murray in branch '3.2':
#15269: document dircmp.left and right, and add tests for them.
http://hg.python.org/cpython/rev/7590dec388a7

New changeset c592e5a8fa4f by R David Murray in branch 'default':
Merge #15269: document dircmp.left and right, and add tests for them.
http://hg.python.org/cpython/rev/c592e5a8fa4f

New changeset e64d4518b23c by R David Murray in branch '2.7':
#15269: document dircmp.left and right.
http://hg.python.org/cpython/rev/e64d4518b23c
msg168254 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-08-15 01:52
Thanks, Chris.
History
Date User Action Args
2012-08-15 01:52:28r.david.murraysetstatus: open -> closed

nosy: + r.david.murray
messages: + msg168254

resolution: fixed
stage: resolved
2012-08-15 01:51:22python-devsetnosy: + python-dev
messages: + msg168253
2012-08-15 00:04:50chris.jerdoneksetmessages: + msg168245
2012-08-14 09:46:44chris.jerdoneksetmessages: + msg168186
2012-08-14 09:39:30orsenthilsetmessages: + msg168185
versions: + Python 2.7, Python 3.2, Python 3.3, - Python 3.4
2012-08-14 09:29:28orsenthilsetmessages: + msg168183
2012-08-13 23:48:48chris.jerdoneksetnosy: + asvetlov
2012-07-26 07:51:40chris.jerdoneksetmessages: + msg166465
2012-07-26 06:35:34eli.benderskysetmessages: + msg166461
2012-07-25 23:49:10chris.jerdoneksetfiles: + issue-15269-2.patch

messages: + msg166442
2012-07-25 18:26:34chris.jerdoneksetmessages: + msg166418
2012-07-25 02:34:10eli.benderskysetmessages: + msg166349
2012-07-24 23:21:22chris.jerdoneksetmessages: + msg166339
2012-07-24 16:29:29eli.benderskysetnosy: + eli.bendersky

messages: + msg166305
versions: + Python 3.4, - Python 3.3
2012-07-21 23:11:36chris.jerdoneksetmessages: + msg166088
2012-07-21 22:09:28orsenthilsetnosy: + orsenthil
messages: + msg166084
2012-07-07 05:44:50chris.jerdoneksetfiles: + issue-15269-1.patch
keywords: + patch
2012-07-07 04:27:59chris.jerdonekcreate