Author haypo
Recipients Arfrever, PaulMcMillan, alex, barry, benjamin.peterson, christian.heimes, dmalcolm, georg.brandl, gvanrossum, haypo, pitrou, terry.reedy
Date 2012-01-04.01:54:51
Message-id <1325642093.98.0.114808804311.issue13703@psf.upfronthosting.co.za>
In-reply-to
Content
> https://gist.github.com/0a91e52efa74f61858b5

Please attach the file directly to the issue, or copy and paste the code into your comment. The interesting part of the code:
---

#Proposed replacement
#--------------------------------------
import os, array
size_exponent = 14 #adjust as a memory/security tradeoff
r = array.array('l', os.urandom(2**size_exponent))
len_r = len(r)

def _hash_string2(s):
    """The algorithm behind compute_hash() for a string or a unicode."""
    length = len(s)
    #print s
    if length == 0:
        return -1
    x = (ord(s[0]) << 7) ^ r[length % len_r]
    i = 0
    while i < length:
        x = intmask((1000003*x) ^ ord(s[i]))
        x ^= r[x % len_r]
        i += 1
    x ^= length
    return intmask(x)
---
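
The excerpt calls intmask() without defining it. As a minimal sketch for running it under plain Python, assuming intmask() is meant to wrap its argument to a signed integer the width of a C long (the behaviour of the PyPy helper of the same name), something like this could stand in. LONG_BITS is a name introduced here for illustration only:

```python
import struct

# Assumption: the hash is computed in a signed integer as wide as a C long.
LONG_BITS = struct.calcsize('l') * 8


def intmask(n):
    """Wrap n to a signed two's-complement integer of LONG_BITS bits."""
    n &= (1 << LONG_BITS) - 1          # keep only the low LONG_BITS bits
    if n >= 1 << (LONG_BITS - 1):      # top bit set: value is negative
        n -= 1 << LONG_BITS
    return n
```

With this helper, the multiplication by 1000003 in the loop overflows the same way it would in C instead of growing into an arbitrarily large Python integer.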

> r = array.array('l', os.urandom(2**size_exponent))
> len_r = len(r)

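The length of r should not depend on the size of a long: os.urandom() returns bytes, so the array above holds 2**size_exponent / sizeof(long) entries. You should write something like: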

import array, ctypes, os

sizeof_long = ctypes.sizeof(ctypes.c_long)  # bytes per C long on this platform
r_bits = 8  # r holds 2**r_bits entries
r = array.array('l', os.urandom((2**r_bits) * sizeof_long))
r_mask = 2**r_bits-1

and then replace "% len_r" with "& r_mask".
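
To illustrate the point, here is a small self-contained sketch, not a definitive implementation. It uses array's own itemsize instead of ctypes to find the size of a long, and lookup() is a hypothetical helper added only for this example. It shows that the table length is fixed by r_bits on every platform, and that masking selects the same entry as the modulo when the length is a power of two:

```python
import array
import os

R_BITS = 8                              # table holds 2**R_BITS entries
ITEMSIZE = array.array('l').itemsize    # bytes per C long on this platform

# Ask for exactly enough random bytes to fill 2**R_BITS longs.
r = array.array('l', os.urandom((2 ** R_BITS) * ITEMSIZE))
r_mask = 2 ** R_BITS - 1


def lookup(x):
    """Return the table entry selected by the low R_BITS bits of x."""
    return r[x & r_mask]


# The length no longer depends on sizeof(long).
assert len(r) == 2 ** R_BITS

# For a power-of-two length, "& r_mask" and "% len(r)" pick the same index,
# including for negative x in Python.
assert all((x & r_mask) == (x % len(r)) for x in range(-1000, 1000))
```

The mask also avoids a division in the inner loop of the hash, which matters more once the function is ported to C.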

What is the minimum safe value of r_bits? For example, would it be safe to use a single long integer (r_bits=0)?
History
Date                 User   Action  Args
2012-01-04 01:54:54  haypo  set     recipients: + haypo, gvanrossum, barry, georg.brandl, terry.reedy, pitrou, christian.heimes, benjamin.peterson, Arfrever, alex, dmalcolm, PaulMcMillan
2012-01-04 01:54:53  haypo  set     messageid: <1325642093.98.0.114808804311.issue13703@psf.upfronthosting.co.za>
2012-01-04 01:54:52  haypo  link    issue13703 messages
2012-01-04 01:54:51  haypo  create