Message 286617 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	methane
Recipients	christian.heimes, lemburg, methane, rhettinger, serhiy.storchaka
Date	2017-02-01.12:03:43
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1485950624.11.0.457171347122.issue29410@psf.upfronthosting.co.za>
In-reply-to

Content
Current Py_HASH_CUTOFF implementation seems weak. ``` switch(len) { /* ((hash << 5) + hash) + p == hash 33 + p / case 7: hash = ((hash << 5) + hash) + p++; / fallthrough / ... case 1: hash = ((hash << 5) + hash) + p++; break; default: assert(0); } hash ^= len; hash ^= (Py_uhash_t) _Py_HashSecret.djbx33a.suffix; x = (Py_hash_t)hash; ``` HashSecret used at last. Conflicting pair without secret conflict with secrets too. ``` >>> def djbx33a(bs): ... h = 5381 ... for i in bs: ... h = h33+i ... h ^= len(bs) ... h ^~ secret ... return h & 0xffff_ffff_ffff_ffff ... >>> for i in [0, 3, 7]: # range(8) ... for j in [0, 4, 7]: # range(8) ... for k in [0, 5, 7]: # range(8) ... print(i,j,k, djbx33a([7-i, 7-j+33i, 7-k+33j, 33k])) ... 0 0 0 6381700318 0 0 5 6381700318 0 0 7 6381700318 0 4 0 6381700318 ... 7 4 7 6381700318 7 7 0 6381700318 7 7 5 6381700318 7 7 7 6381700318 >>> ``` So there are 8^(Py_HASH_CUTOFF-1) conflicting sequence can be built. Maybe, we should remove Py_HASH_CUTOFF completely?

Current Py_HASH_CUTOFF implementation seems weak.
```
        switch(len) {
            /* ((hash << 5) + hash) + *p == hash * 33 + *p */
            case 7: hash = ((hash << 5) + hash) + *p++; /* fallthrough */
...
            case 1: hash = ((hash << 5) + hash) + *p++; break;
            default:
                assert(0);
        }
        hash ^= len;
        hash ^= (Py_uhash_t) _Py_HashSecret.djbx33a.suffix;
        x = (Py_hash_t)hash;
```

HashSecret used at last.  Conflicting pair without secret conflict with secrets too.

```
>>> def djbx33a(bs):
...     h = 5381
...     for i in bs:
...         h = h*33+i
...     h ^= len(bs)
...     h ^~ secret
...     return h & 0xffff_ffff_ffff_ffff
...
>>> for i in [0, 3, 7]: # range(8)
...   for j in [0, 4, 7]: # range(8)
...     for k in [0, 5, 7]: # range(8)
...       print(i,j,k, djbx33a([7-i, 7-j+33*i, 7-k+33*j, 33*k]))
...
0 0 0 6381700318
0 0 5 6381700318
0 0 7 6381700318
0 4 0 6381700318
...
7 4 7 6381700318
7 7 0 6381700318
7 7 5 6381700318
7 7 7 6381700318
>>>
```

So there are 8^(Py_HASH_CUTOFF-1) conflicting sequence can be built.

Maybe, we should remove Py_HASH_CUTOFF completely?

History
Date	User	Action	Args
2017-02-01 12:03:44	methane	set	recipients: + methane, lemburg, rhettinger, christian.heimes, serhiy.storchaka
2017-02-01 12:03:44	methane	set	messageid: <1485950624.11.0.457171347122.issue29410@psf.upfronthosting.co.za>
2017-02-01 12:03:44	methane	link	issue29410 messages
2017-02-01 12:03:43	methane	create