Author janpf
Recipients janpf, jimjjewett, rtvd, sjmachin, tim.peters
Date 2009-02-07.10:50:32
Message-id <1234003839.15.0.615749802724.issue1528074@psf.upfronthosting.co.za>
Content
hi all,

just got bitten by this, so i took the time to reiterate the issue.

according to the docs:

http://docs.python.org/library/difflib.html

find_longest_match() should return the longest matching string:

"If isjunk was omitted or None, find_longest_match() returns (i, j, k)
such that a[i:i+k] is equal to b[j:j+k], where alo <= i <= i+k <= ahi
and blo <= j <= j+k <= bhi. For all (i', j', k') meeting those
conditions, the additional conditions k >= k', i <= i', and if i == i',
j <= j' are also met. In other words, of all maximal matching blocks,
return one that starts earliest in a, and of all those maximal matching
blocks that start earliest in a, return the one that starts earliest in b."

but after a couple of hours debugging i finally convinced myself that
the bug was in the library ... and i ended up here :) 

any ideas on how to work around this bug/feature, and just get the
longest matching string? (from a normal/newbie user perspective, that
is, without patching the library code?)

from the comments (which i couldn't follow entirely), does it use some
concept of popularity that is not exposed by the API ? How is
"popularity" defined ?
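
A small sketch of how the "popularity" heuristic can hide the longest match. It rests on how later difflib documentation describes it: when the second sequence has at least 200 items, any item whose duplicates make up more than 1% of it is treated as junk. The autojunk=False switch used for comparison was only added in Python 2.7.1 and 3.2, so it is not available on 2.5.2. The strings a and b are made up for illustration.

```python
from difflib import SequenceMatcher

# Made-up inputs: "x" fills almost all of b, so the heuristic marks it popular.
a = "x" * 250 + "abcd"
b = "abcd" + "x" * 250

# Default behaviour: popular items are dropped from the index built over b.
with_heuristic = SequenceMatcher(None, a, b)
# autojunk=False (Python 2.7.1+ / 3.2+) turns the heuristic off.
without_heuristic = SequenceMatcher(None, a, b, autojunk=False)

m1 = with_heuristic.find_longest_match(0, len(a), 0, len(b))
m2 = without_heuristic.find_longest_match(0, len(a), 0, len(b))

print(m1.size)  # 4   -> only "abcd" is found
print(m2.size)  # 250 -> the run of "x" is found
```

The heuristic only looks at the second sequence, so it can matter which string is passed as a and which as b.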

many thanks!
- jan


ps.: using ubuntu's python 2.5.2

ps2.: an example of a string pair where the issue shows up:

s1='Floor Box SystemsFBS Floor Box Systems - Manufacturer &amp; supplier
of FBS floor boxes, electrical ... experience, FBS Floor Box Systems
continue ... raceways, floor box. ...www.floorboxsystems.com'

s2='FBS Floor Box SystemsFBS Floor Box Systems - Manufacturer &amp;
supplier of FBS floor boxes, electrical floor boxes, wood floor box,
concrete floor box, surface mount floor box, raised floor
...www.floorboxsystems.com'
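
One possible workaround that needs no changes to difflib is a plain dynamic-programming search for the longest common substring. The sketch below is not difflib's algorithm: the function name longest_common_substring is made up here, no junk or popularity filtering is applied, and the running time is proportional to len(a) * len(b), which should be fine for strings of a few hundred characters. It follows the tie-breaking rule quoted from the docs (earliest in a, then earliest in b).

```python
def longest_common_substring(a, b):
    """Return (i, j, k) with a[i:i+k] == b[j:j+k] and k as large as possible.

    Ties go to the earliest start in a, then the earliest start in b.
    Hypothetical helper, not part of difflib.
    """
    best_i, best_j, best_k = 0, 0, 0
    # prev[j + 1] is the length of the common run ending at a[i - 1] and b[j].
    prev = [0] * (len(b) + 1)
    for i in range(len(a)):
        curr = [0] * (len(b) + 1)
        for j in range(len(b)):
            if a[i] == b[j]:
                # Extend the run that ended at a[i - 1] and b[j - 1].
                k = prev[j] + 1
                curr[j + 1] = k
                # Strict ">" keeps the first (earliest) maximal match.
                if k > best_k:
                    best_i, best_j, best_k = i - k + 1, j - k + 1, k
        prev = curr
    return best_i, best_j, best_k


i, j, k = longest_common_substring(" abcd", "abcd abcd")
print(i, j, k)  # 0 4 5
```

The result has the same (i, j, k) shape as find_longest_match(), so a[i:i+k] gives the matching text.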