Author terry.reedy
Recipients Peter.Waller, chipx86, collinwinter, eli.bendersky, ggenellina, loewis, r.david.murray, rhettinger, terry.reedy, tfaing, tim.peters
Date 2011-06-15.18:34:14
Message-id <1308162855.87.0.913392854159.issue1711800@psf.upfronthosting.co.za>
Content
I believe this issue should be closed and have set it to pending.

The original report of a 'bug' and the two 'testcases' were and are invalid, as they are based on an incorrect understanding of SequenceMatcher. It is not a diff program, and in particular not a line diff program, but the class that such programs use internally. It works on sequences of characters, lines, numbers, or any other hashable items. Given the character strings 'abc' and 'abcd\ndef', it correctly reports that the second is a copy of 3 chars from the first plus an insertion of 5 more.
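
As a rough illustration of the character-level behaviour described above, the following sketch runs SequenceMatcher on the two strings. It assumes the second string contains a newline between 'abcd' and 'def'; the variable names are only for illustration.

```python
import difflib

first = 'abc'
second = 'abcd\ndef'  # assumed: a newline separates 'abcd' and 'def'

# Passing None as isjunk means no element is treated as junk.
# Each element of the two sequences is a single character.
matcher = difflib.SequenceMatcher(None, first, second)
opcodes = matcher.get_opcodes()
for tag, i1, i2, j1, j2 in opcodes:
    print(tag, repr(first[i1:i2]), '->', repr(second[j1:j2]))
# equal 'abc' -> 'abc'
# insert '' -> 'd\ndef'
```

The two opcodes correspond to "copy 3 chars, then insert 5 more"; nothing in the result treats the newline specially.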

Gabriel correctly made this point when suggesting that anyone who wants to compare sequences of text lines might use Differ. One could also use SequenceMatcher directly on lines, but this loses the diff-like formatting and the report of within-line differences. I think this issue should have been closed then.
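
A small sketch of what using SequenceMatcher directly on lines looks like, reusing the two line lists from the Differ example in this message. The variable names are illustrative.

```python
import difflib

old_lines = ['abc\n']
new_lines = ['abcd\n', 'def\n']

# Elements are now whole lines, so 'abc\n' and 'abcd\n' simply compare unequal.
line_opcodes = difflib.SequenceMatcher(None, old_lines, new_lines).get_opcodes()
print(line_opcodes)
# [('replace', 0, 1, 0, 2)]
```

The single replace opcode says that one line became two, with no hint that the first new line nearly matches the old one.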

I do not know what functionality Andrew thinks Christian was talking about. Using Differ with

import difflib
a = ['abc\n']
b = ['abcd\n', 'def\n']
for line in difflib.Differ().compare(a, b):
    print(line, end='')

# prints
- abc
+ abcd
?    +
+ def

One line is replaced with two, with the extra info that the first new line is the old line with an extra char. I do not believe that 'any diffing program' will report the latter. The '?' lines are easily filtered out if not wanted.
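
One possible way to filter out the '?' lines, sketched with the same two line lists; the name `filtered` is just for illustration.

```python
import difflib

a = ['abc\n']
b = ['abcd\n', 'def\n']

# Differ marks its intraline hint lines with the two-character prefix '? '.
filtered = [line for line in difflib.Differ().compare(a, b)
            if not line.startswith('? ')]
print(''.join(filtered), end='')
# - abc
# + abcd
# + def
```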

The patch by Peter has no motivation that I can see other than the idea that replacing a subsequence with one of a different length is somehow bad. Tim Peters did not think so and neither do I -- or Guido. Unequal replacement is built into the syntax of Python:

>>> s = [1,2,3]
>>> s[1:2] = [4,5,6]
>>> s
[1, 4, 5, 6, 3]
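
To connect the slice assignment above with SequenceMatcher, here is a minimal sketch that replays opcodes as slice assignments. The helper `apply_opcodes` is hypothetical, written for this illustration, and is not part of difflib.

```python
import difflib

def apply_opcodes(a, b):
    """Rebuild b from a copy of a using SequenceMatcher opcodes (illustrative helper)."""
    result = list(a)
    opcodes = difflib.SequenceMatcher(None, a, b).get_opcodes()
    # Work from the end so earlier indices into result stay valid
    # while the list grows or shrinks.
    for tag, i1, i2, j1, j2 in reversed(opcodes):
        if tag != 'equal':
            # replace, insert and delete are all one slice assignment;
            # the two slices may have different lengths.
            result[i1:i2] = b[j1:j2]
    return result

s = [1, 2, 3]
t = [1, 4, 5, 6, 3]
rebuilt = apply_opcodes(s, t)
print(rebuilt)
# [1, 4, 5, 6, 3]
```

Here the one-element slice s[1:2] is replaced by a three-element slice of t, the same unequal replacement as in the interactive example.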

I would not be surprised if the proposed change broke some existing application or degraded performance a bit.