Message111372
For 2.6 and 3.1, this is a documentation-only issue.
For 2.7, this is a doc + behavior issue.
For 3.2, this is a doc + behavior + new feature issue.
For 2.6.6 (release candidate due Aug 2, in 10 days), I propose adding the following paragraph after the current 'Timing:' paragraph in the SequenceMatcher entry ('Heuristic:' should be bold-faced, like 'Timing:'):
Heuristic: To speed matching, items that appear more than 1% of the time in sequences of at least 200 items are treated as junk. This has the unfortunate side-effect of giving bad results for sequences constructed from a small set of items. An option to turn off the heuristic will be added to a future version.
I would have written 'to 2.7.1' instead of 'to a future version', but that has not happened yet. I thought about putting the heuristic paragraph first, but I think it fits better after the discussion of quadratic run time. I think it should be a separate paragraph, not tacked onto the end of the previous one, so that people are more likely to notice it.
I have marked this as a release blocker because at least 6 issues have been filed for this bug, so I think it is important that the explanation be added to the next released doc. I plan to temporarily reassign this to docs@python in a few days.
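To illustrate the side-effect described in the proposed 'Heuristic:' paragraph, here is a minimal sketch. It assumes a later Python version in which SequenceMatcher accepts an autojunk keyword to turn the heuristic off; that option did not exist when this message was written. The two input strings are made up for illustration: each has more than 200 items drawn almost entirely from a two-item alphabet, and they differ only in the first item.

```python
from difflib import SequenceMatcher

# Hypothetical inputs: 301 items each, built almost entirely from 'a' and 'b'.
# Both 'a' and 'b' occur far more often than 1% of the time in b.
a = "x" + "ab" * 150
b = "y" + "ab" * 150

# Default behaviour: the heuristic is on because len(b) >= 200.
with_heuristic = SequenceMatcher(None, a, b)
ratio_with = with_heuristic.ratio()

# Assumes the autojunk keyword is available (not in the versions discussed above).
without_heuristic = SequenceMatcher(None, a, b, autojunk=False)
ratio_without = without_heuristic.ratio()

# Items the heuristic dropped from the index of b.
popular = with_heuristic.bpopular

print("with heuristic:   ", ratio_with)
print("without heuristic:", ratio_without)
print("treated as junk:  ", sorted(popular))
```

With the heuristic on, both 'a' and 'b' are dropped from the index of the second sequence, so the matcher finds nothing in common even though the sequences share 300 consecutive items. With it off, that shared run is found.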
Date | User | Action | Args
2010-07-23 18:31:34 | terry.reedy | set | recipients: + terry.reedy, tim.peters, barry, georg.brandl, jimjjewett, sjmachin, gjb1002, ggenellina, pitrou, rtvd, vbr, LambertDW, hagna, r.david.murray, eli.bendersky, janpf, mrotondo
2010-07-23 18:31:34 | terry.reedy | set | messageid: <1279909894.2.0.755636795057.issue2986@psf.upfronthosting.co.za>
2010-07-23 18:31:32 | terry.reedy | link | issue2986 messages
2010-07-23 18:31:32 | terry.reedy | create |