classification
Title: difflib.SequenceMatcher stores matching blocks as tuples, not Match named tuples
Type: behavior
Stage: needs patch
Components: Library (Lib)
Versions: Python 3.4, Python 3.5, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: rhettinger
Nosy List: drevicko, python-dev, rhettinger, terry.reedy, tim.peters
Priority: high
Keywords:

Created on 2014-06-02 10:03 by drevicko, last changed 2014-06-21 19:05 by rhettinger. This issue is now closed.

Messages (5)
msg219565 - Author: (drevicko) Date: 2014-06-02 10:03
The last lines of difflib.SequenceMatcher.get_matching_blocks() are:


        non_adjacent.append( (la, lb, 0) )
        self.matching_blocks = non_adjacent
        return map(Match._make, self.matching_blocks)

should be something like:

        non_adjacent.append( (la, lb, 0) )
        self.matching_blocks = map(Match._make, non_adjacent)
        return self.matching_blocks
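
To make the difference between the two versions concrete, here is a minimal sketch of the caching pattern in isolation. The classes BuggyCache and FixedCache are hypothetical stand-ins written for illustration; they are not difflib's code, and they assume the method returns the cached attribute directly on later calls. Under Python 3, map() returns a lazy iterator, so the sketch wraps it in list() before storing it.

```python
from collections import namedtuple

# Same shape as the named tuple that difflib documents for matching blocks.
Match = namedtuple("Match", "a b size")


class BuggyCache:
    """Hypothetical stand-in for the reported pattern (not difflib's code)."""

    def __init__(self, triples):
        self._triples = triples
        self.matching_blocks = None

    def get_matching_blocks(self):
        # Later calls return whatever was cached on the first call.
        if self.matching_blocks is not None:
            return self.matching_blocks
        non_adjacent = list(self._triples)
        # Plain tuples are cached; only the first return value is converted.
        self.matching_blocks = non_adjacent
        return list(map(Match._make, self.matching_blocks))


class FixedCache(BuggyCache):
    """Same pattern, but the converted list is what gets cached."""

    def get_matching_blocks(self):
        if self.matching_blocks is not None:
            return self.matching_blocks
        non_adjacent = list(self._triples)
        # list() materializes the map object so the cache can be reused.
        self.matching_blocks = list(map(Match._make, non_adjacent))
        return self.matching_blocks
```

With BuggyCache, the first call yields Match instances and every later call yields plain tuples; with FixedCache, every call yields the same list of Match instances.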
msg219906 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-06-07 01:43
Why do you think this is a bug? What behavior both looks wrong and gets improved by the change?
msg221153 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-06-21 04:43
> What behavior both looks wrong and gets improved by the change?

The bug is that matching_blocks is cached as a list of plain tuples, so only the first call to get_matching_blocks() returns Match named tuples. Every later call returns the cached plain tuples, in contravention of the documented behavior:

>>> s = SequenceMatcher(None, "abxcd", "abcd")
>>> s.get_matching_blocks()
[Match(a=0, b=0, size=2), Match(a=3, b=2, size=2), Match(a=5, b=4, size=0)]
>>> s.get_matching_blocks()
[(0, 0, 2), (3, 2, 2), (5, 4, 0)]
>>> s.get_matching_blocks()
[(0, 0, 2), (3, 2, 2), (5, 4, 0)]
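
The session above can be turned into a small regression check. This is a sketch that assumes an interpreter that already includes the fix; the helper all_named is a hypothetical name introduced here, and it tests only for the a, b, and size attributes rather than for a specific class.

```python
from difflib import SequenceMatcher


def all_named(blocks):
    """Return True if every block exposes the named-tuple fields a, b, size."""
    return all(
        hasattr(m, "a") and hasattr(m, "b") and hasattr(m, "size")
        for m in blocks
    )


s = SequenceMatcher(None, "abxcd", "abcd")
first = s.get_matching_blocks()   # computed and cached
second = s.get_matching_blocks()  # served from the cache

# On an interpreter without the fix, the second value would be False.
print(all_named(first), all_named(second))
```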
msg221183 - Author: Roundup Robot (python-dev) (Python triager) Date: 2014-06-21 18:27
New changeset f02a563ad1bf by Raymond Hettinger in branch '2.7':
Issue 21635:  Fix caching in difflib.SequenceMatcher.get_matching_blocks().
http://hg.python.org/cpython/rev/f02a563ad1bf
msg221186 - Author: Roundup Robot (python-dev) (Python triager) Date: 2014-06-21 18:59
New changeset ed73c127421c by Raymond Hettinger in branch '3.4':
Issue 21635:  Fix caching in difflib.SequenceMatcher.get_matching_blocks().
http://hg.python.org/cpython/rev/ed73c127421c
History
Date                 User         Action  Args
2014-06-21 19:05:02  rhettinger   set     status: open -> closed; resolution: fixed
2014-06-21 18:59:55  python-dev   set     messages: + msg221186
2014-06-21 18:27:57  python-dev   set     nosy: + python-dev; messages: + msg221183
2014-06-21 04:43:09  rhettinger   set     priority: normal -> high; stage: test needed -> needs patch; messages: + msg221153; versions: + Python 3.4, Python 3.5
2014-06-21 04:34:17  rhettinger   set     assignee: rhettinger; nosy: + rhettinger
2014-06-07 01:43:10  terry.reedy  set     nosy: + tim.peters, terry.reedy; messages: + msg219906; stage: test needed
2014-06-02 10:03:56  drevicko     create