classification
Title: Sequence Matcher from diff lib is not implementing longest common substring problem correctly
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 2.7
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: Syam Mohan, tim.peters
Priority: normal
Keywords:

Created on 2017-07-13 15:05 by Syam Mohan, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
script.py Syam Mohan, 2017-07-13 15:05
Messages (2)
msg298287 - (view) Author: Syam (Syam Mohan) Date: 2017-07-13 15:05
I noticed that SequenceMatcher (from difflib import SequenceMatcher) does not always return the longest common substring.

Try the example in the attachment (script.py).

It returns the wrong value, not the longest common substring.
msg298295 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-07-13 16:02
This is an unfortunate consequence of the default "autojunk" feature.  You can turn that off by passing `autojunk=False`, like so:

match = SequenceMatcher(None, string1, string2, autojunk=False)...
                                              ^^^^^^^^^^^^^^^^

Then it returns a match of size 534.
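
For readers without the attachment, the effect can be reproduced with a small sketch. The inputs below are hypothetical and are not the strings from script.py. As documented for difflib, when the second sequence has at least 200 items, any item whose duplicates make up more than 1% of it is treated as "popular" and handled like junk, so a long run built only from such items may be missed.

```python
from difflib import SequenceMatcher

# Hypothetical inputs (not the attached script.py): only two distinct
# characters, so both are far above the 1% "popular" threshold in string2.
string1 = "ba" * 100   # 200 characters
string2 = "ab" * 150   # 300 characters; string1 occurs in it at offset 1


def longest_match_size(a, b, autojunk):
    """Return the size of the longest match reported by SequenceMatcher."""
    matcher = SequenceMatcher(None, a, b, autojunk=autojunk)
    # Explicit bounds keep this working on versions that require them.
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


with_autojunk = longest_match_size(string1, string2, autojunk=True)
without_autojunk = longest_match_size(string1, string2, autojunk=False)

print("autojunk=True: ", with_autojunk)
print("autojunk=False:", without_autojunk)  # 200, i.e. all of string1
```

With the heuristic disabled, the reported match covers all of string1; with the default setting, the reported match should be shorter.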
History
Date User Action Args
2022-04-11 14:58:49 admin set github: 75103
2017-07-14 20:12:32 terry.reedy set status: open -> closed
resolution: not a bug
components: + Library (Lib), - Build
stage: resolved
2017-07-13 16:02:55 tim.peters set nosy: + tim.peters
messages: + msg298295
2017-07-13 15:05:55 Syam Mohan create