
Author tim.peters
Recipients jonathan-lp, tim.peters
Date 2022-02-06.22:42:41
Message-id <1644187361.61.0.218680261016.issue46667@roundup.psfhosted.org>
In-reply-to
Content
SequenceMatcher looks for the longest _contiguous_ match. "UNIQUESTRING" isn't the longest by far when autojunk is False, but is the longest when autojunk is True. All those bpopular characters then effectively prevent finding a longer match than 'QUESTR' (capital 'I' is also in bpopular) directly.

The effects of autojunk can be surprising, and it would have been better if it were False by default. But I don't see anything unexpected here. Learn from experience and force it to False yourself ;-) BTW, it was introduced as a way to greatly speed comparing files of code, viewing them as sequences of lines. In that context, autojunk is rarely surprising and usually helpful. But it more often backfires when comparing strings (viewed as sequences of characters) :-(
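
To make the effect concrete, here is a minimal sketch. The inputs are made up for illustration and are not the strings from the original report. It assumes only the documented difflib API: the autojunk keyword, the bpopular attribute, and find_longest_match. The second sequence is long enough (200 or more items) for the heuristic to kick in, so the two filler characters become "popular" and only the lone 'Q' can anchor a match.

```python
from difflib import SequenceMatcher

# Hypothetical inputs, not taken from the report.
# b has 301 characters, so the autojunk heuristic (active only when
# len(b) >= 200) applies; 'a' and 'b' each occur 150 times in b.
filler = "ab" * 150
a = "Q" + filler
b = filler + "Q"


def longest(a, b, autojunk):
    """Return the matcher and its longest contiguous match over all of a and b."""
    sm = SequenceMatcher(None, a, b, autojunk=autojunk)
    match = sm.find_longest_match(0, len(a), 0, len(b))
    return sm, match


sm_auto, m_auto = longest(a, b, autojunk=True)
sm_exact, m_exact = longest(a, b, autojunk=False)

print(sorted(sm_auto.bpopular))   # ['a', 'b']
print(m_auto.size)                # 1: only the lone 'Q' anchors a match
print(sorted(sm_exact.bpopular))  # []
print(m_exact.size)               # 300: the whole run of "ab" pairs
```

With autojunk=False the same call finds the 300-character run, at the cost of more work on long inputs with many repeated elements.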
History
Date  User  Action  Args
2022-02-06 22:42:41  tim.peters  set  recipients: + tim.peters, jonathan-lp
2022-02-06 22:42:41  tim.peters  set  messageid: <1644187361.61.0.218680261016.issue46667@roundup.psfhosted.org>
2022-02-06 22:42:41  tim.peters  link  issue46667 messages
2022-02-06 22:42:41  tim.peters  create