Message114929
Dictionary keys are commonly numbers, strings, or tuples. Python has optimized numbers and strings to remember their hash values on successive calls. Tuples should do this too since their recursive hash function can take a long time to compute. Tuples are Python's official record type and the one obvious way of making non-scalar dictionary keys.
The code to do this in stringobject.c is short and sweet, so this major speed boost should be an easy thing to.
static long
string_hash(PyStringObject *a)
{
register Py_ssize_t len;
register unsigned char *p;
register long x;
if (a->ob_shash != -1) <==
return a->ob_shash; <==
len = Py_SIZE(a);
p = (unsigned char *) a->ob_sval;
x = *p << 7;
while (--len >= 0)
x = (1000003*x) ^ *p++;
x ^= Py_SIZE(a);
if (x == -1) <==
x = -2; <==
a->ob_shash = x;
return x;
}
The code in tupleobject.c would just need to add the four lines marked above. Here's what is looks like now.
static long
tuplehash(PyTupleObject *v)
{
register long x, y;
register Py_ssize_t len = Py_SIZE(v);
register PyObject **p;
long mult = 1000003L;
x = 0x345678L;
p = v->ob_item;
while (--len >= 0) {
y = PyObject_Hash(*p++);
if (y == -1)
return -1;
x = (x ^ y) * mult;
/* the cast might truncate len; that doesn't change hash stability */
mult += (long)(82520L + len + len);
}
x += 97531L;
if (x == -1)
x = -2;
return x;
}
Thank you guys for all of your work.
*David |
|
Date |
User |
Action |
Args |
2010-08-25 19:33:17 | dtorp | set | recipients:
+ dtorp |
2010-08-25 19:33:17 | dtorp | set | messageid: <1282764797.19.0.867524122951.issue9685@psf.upfronthosting.co.za> |
2010-08-25 19:33:15 | dtorp | link | issue9685 messages |
2010-08-25 19:33:14 | dtorp | create | |
|