This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author methane
Recipients christian.heimes, lemburg, methane, rhettinger, serhiy.storchaka
Date 2017-02-01.12:03:43
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1485950624.11.0.457171347122.issue29410@psf.upfronthosting.co.za>
In-reply-to
Content
Current Py_HASH_CUTOFF implementation seems weak.
```
        switch(len) {
            /* ((hash << 5) + hash) + *p == hash * 33 + *p */
            case 7: hash = ((hash << 5) + hash) + *p++; /* fallthrough */
...
            case 1: hash = ((hash << 5) + hash) + *p++; break;
            default:
                assert(0);
        }
        hash ^= len;
        hash ^= (Py_uhash_t) _Py_HashSecret.djbx33a.suffix;
        x = (Py_hash_t)hash;
```

HashSecret used at last.  Conflicting pair without secret conflict with secrets too.

```
>>> def djbx33a(bs):
...     h = 5381
...     for i in bs:
...         h = h*33+i
...     h ^= len(bs)
...     h ^~ secret
...     return h & 0xffff_ffff_ffff_ffff
...
>>> for i in [0, 3, 7]: # range(8)
...   for j in [0, 4, 7]: # range(8)
...     for k in [0, 5, 7]: # range(8)
...       print(i,j,k, djbx33a([7-i, 7-j+33*i, 7-k+33*j, 33*k]))
...
0 0 0 6381700318
0 0 5 6381700318
0 0 7 6381700318
0 4 0 6381700318
...
7 4 7 6381700318
7 7 0 6381700318
7 7 5 6381700318
7 7 7 6381700318
>>>
```

So there are 8^(Py_HASH_CUTOFF-1) conflicting sequence can be built.

Maybe, we should remove Py_HASH_CUTOFF completely?
History
Date User Action Args
2017-02-01 12:03:44methanesetrecipients: + methane, lemburg, rhettinger, christian.heimes, serhiy.storchaka
2017-02-01 12:03:44methanesetmessageid: <1485950624.11.0.457171347122.issue29410@psf.upfronthosting.co.za>
2017-02-01 12:03:44methanelinkissue29410 messages
2017-02-01 12:03:43methanecreate