This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: SequenceMatcher's algorithm is not correct
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Contego, orsenthil, rhettinger, tim.peters
Priority: normal Keywords:

Created on 2016-01-19 11:06 by Contego, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (2)
msg258582 - (view) Author: Contego (Contego) Date: 2016-01-19 11:06
For strings 'aaaaaa', 'aabaaa' SequenceMatcher's algorithm finds only common substring 'aaa', while well-known classic LCS algorithm: http://www.geeksforgeeks.org/printing-longest-common-subsequence/ finds 'aa' and 'aaa'.

Is it the price for "best case time is linear", as mentioned in difflib's documentation? Are there any other reasons not to implement classic LCS algorith (e.g. memory limits?)? If no, maybe it will be usefull to create subclass StrictSequenceMatcher?
msg258701 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2016-01-20 17:53
Please read the responses to this older report:

http://bugs.python.org/issue25391

As they say, it's functioning as designed and documented, so this isn't "a bug".  For that reason I'm closing this as "not a bug".

As they also say, there are many other possible algorithms (LCS isn't the only other one in use).  Opening an enhancement request instead (to implement additional algorithms) could make sense, but won't get anywhere unless someone volunteers to do the work.
History
Date User Action Args
2022-04-11 14:58:26adminsetgithub: 70338
2016-01-20 17:53:02tim.peterssetstatus: open -> closed

nosy: + tim.peters
messages: + msg258701

resolution: not a bug
stage: test needed -> resolved
2016-01-20 08:55:46orsenthilsetnosy: + orsenthil
2016-01-19 13:56:10SilentGhostsetnosy: + rhettinger

components: + Library (Lib)
stage: test needed
2016-01-19 11:06:38Contegocreate