classification
Title: False positives given through bisect module (binary search)
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 3.1
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: kaashif, r.david.murray, rhettinger
Priority: normal Keywords:

Created on 2009-08-24 20:59 by kaashif, last changed 2009-08-24 22:43 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description Edit
words.txt kaashif, 2009-08-24 20:48 Word list used to test the bisect module
Messages (2)
msg91940 - (view) Author: kaashif (kaashif) Date: 2009-08-24 20:48
I tried Python's bisect module on a large word list (words.txt contained
in http://www.greenteapress.com/thinkpython/swampy/swampy.1.1.zip)

If I search for something like 'musefully', 'museful' will come up as a
match. Maybe that's a feature... but seems to me like a bug.

Too much optimization going on here it seems, to such an extent that
false positives are given.

Here's the code I tried:

import bisect

fin = open('words.txt')
t = []

for line in fin:
    t.append(line.strip())

print(bisect.bisect(t,'musefully'))
msg91945 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2009-08-24 22:43
It's not a match, it's an insertion point.  The bisect module doesn't
have a method that can be used directly to determine if an item is not
in the list.  Take a careful look at the example, especially the second
part.

If you think it should have such a method, that would be an enhancement
request.
History
Date User Action Args
2009-08-24 22:43:29r.david.murraysetstatus: open -> closed
priority: normal


nosy: + rhettinger, r.david.murray
messages: + msg91945
resolution: not a bug
stage: resolved
2009-08-24 20:59:16kaashifcreate