Title: wrong result for difflib.SequenceMatcher
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.7
Status: closed Resolution: not a bug
Assigned To: tim.peters Nosy List: Boris Yang, iritkatriel, rhettinger, tim.peters
Created on 2018-11-21 07:13 by Boris Yang, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (3)
msg330172 - Author: (Boris Yang) Date: 2018-11-21 07:13
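How to reproduce:

from difflib import SequenceMatcher
seqMatcher = SequenceMatcher(None, u"德阳孩子", u"孩子德阳")
# The results below are the output of get_matching_blocks()
print(seqMatcher.get_matching_blocks())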

Expected Result:
[Match(a=0, b=3, size=2), Match(a=2, b=0, size=2), Match(a=5, b=5, size=0)]

Current Result:
[Match(a=0, b=3, size=2), Match(a=5, b=5, size=0)]
msg330177 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-11-21 07:57
The "expected result" listed isn't a valid output for get_matching_blocks(), which is documented to return triples that "are monotonically increasing in i and j". In your example, the "a" offsets are increasing (0, 2, 5), but the "b" offsets are not monotonic (3, 0, 5).

SequenceMatcher.get_matching_blocks() isn't designed to locate swapped blocks, such as when "abcd" becomes "cdab".
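
The monotonicity constraint can be checked directly. The following is a minimal sketch using the "abcd" / "cdab" pair mentioned above; the helper name starts_increase is hypothetical (it is not part of difflib), and the proposed list simply copies the "expected result" from the original report as plain tuples.

```python
from difflib import SequenceMatcher


def starts_increase(blocks):
    """Hypothetical helper: True if the a and b start offsets never decrease."""
    a_starts = [a for a, b, size in blocks]
    b_starts = [b for a, b, size in blocks]
    return a_starts == sorted(a_starts) and b_starts == sorted(b_starts)


# "ab" and "cd" both appear in each string, but in swapped order.
matcher = SequenceMatcher(None, "abcd", "cdab")

# Match is a named tuple (a, b, size); convert to plain tuples for comparison.
blocks = [tuple(m) for m in matcher.get_matching_blocks()]
print(blocks)  # [(0, 2, 2), (4, 4, 0)]

# The "expected result" from the report, written as (a, b, size) tuples.
proposed = [(0, 3, 2), (2, 0, 2), (5, 5, 0)]

print(starts_increase(blocks))    # True
print(starts_increase(proposed))  # False: b goes 3, 0, 5
```

Only one of the two swapped blocks is reported, because including both would require the "b" offsets to go backwards. The final zero-size triple is the sentinel that get_matching_blocks() always appends.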
msg377194 - Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2020-09-19 23:17
Can this issue be closed? It looks like Boris simply misunderstood the semantics of difflib, which Raymond has clarified.
