Author terry.reedy
Recipients Springem Springsbee, terry.reedy, tim.peters, xtreak
Date 2018-10-26.20:25:26
Message-id <1540585526.46.0.788709270274.issue35079@psf.upfronthosting.co.za>
Content
We can assume that "substring 'CA'" was meant to be "substring 'AC'", but as explained, missing 'AC' is not a bug.  (Tim wrote the module.)

I read the doc, and 'non-overlapping' is implied in the SequenceMatcher entry at the top of the file.

"The idea is to find the longest contiguous matching subsequence that contains no “junk” elements; ... The same idea is then applied recursively to the pieces of the sequences to the left and to the right of the matching subsequence."
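The recursion described in that passage can be sketched roughly as follows. This is a simplified illustration, not the module's actual code: the helper name recursive_blocks is hypothetical, junk handling is ignored, and it leans on the public SequenceMatcher.find_longest_match method. It shows why a common substring can go unreported: once the longest match is chosen, only the pieces to its left and to its right are searched again.

```python
from difflib import SequenceMatcher


def recursive_blocks(a, b):
    """Simplified sketch of the 'longest match, then recurse' idea.

    Hypothetical helper for illustration only; the real module also
    handles junk and merges adjacent blocks.
    """
    sm = SequenceMatcher(None, a, b, autojunk=False)
    blocks = []

    def recurse(alo, ahi, blo, bhi):
        # Longest contiguous match within a[alo:ahi] and b[blo:bhi].
        i, j, k = sm.find_longest_match(alo, ahi, blo, bhi)
        if k == 0:
            return
        # Only the pieces left of and right of the match are searched,
        # so reported blocks can never overlap or cross one another.
        recurse(alo, i, blo, j)
        blocks.append((i, j, k))
        recurse(i + k, ahi, j + k, bhi)

    recurse(0, len(a), 0, len(b))
    return blocks


# 'X' occurs in both strings, but it lies on opposite sides of the
# longest match 'AC', so it is not reported.
print(recursive_blocks("XAC", "ACX"))      # [(1, 0, 2)]
print(recursive_blocks("abxcd", "abcd"))   # [(0, 0, 2), (3, 2, 2)]
```

For these inputs the sketch agrees with get_matching_blocks() once the trailing zero-length triple is dropped.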

However, a user of SequenceMatcher could easily miss that, and its implication, as Springem did.  For clarity, I think we should add 'non-overlapping' to the first line of the .get_matching_blocks entry, which is in the middle of the page: "Return list of triples describing non-overlapping matching subsequences."

I also think "i+n != i' or j+n != j'" should be changed to "i+n < i' or j+n < j'".  For adjacent triples, '>' would mean the blocks overlap, so '!=' can only mean '<'.
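A quick way to check the proposed wording against actual behavior is sketched below. The helper name check_adjacent_blocks is hypothetical, and the sample inputs are arbitrary. The strict inequality is only tested between real blocks, on the assumption that the final zero-length dummy triple is excluded from that statement (it can directly follow the last real block).

```python
from difflib import SequenceMatcher


def check_adjacent_blocks(a, b):
    """Assert the ordering properties of get_matching_blocks() for a, b.

    Hypothetical helper for illustration; returns the blocks on success.
    """
    blocks = SequenceMatcher(None, a, b).get_matching_blocks()

    # No overlap anywhere: each block ends at or before the next begins,
    # in both sequences.
    for (i, j, n), (i2, j2, _) in zip(blocks, blocks[1:]):
        assert i + n <= i2 and j + n <= j2

    # Between real blocks (the trailing size-0 dummy is left out), there
    # is a gap in at least one sequence: i+n < i' or j+n < j'.
    real = blocks[:-1]
    for (i, j, n), (i2, j2, _) in zip(real, real[1:]):
        assert i + n < i2 or j + n < j2

    return blocks


for a, b in [("abxcd", "abcd"), ("XAC", "ACX"), ("abc", "abc")]:
    print(a, b, check_adjacent_blocks(a, b))
```

Passing on a few inputs is of course not a proof, only a sanity check on the reading that '!=' means '<' here.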

I will prepare a doc PR later.