This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author tim.peters eric.smith, jdemeyer, mark.dickinson, rhettinger, sir-sigurd, tim.peters 2018-09-25.16:54:28 -1.0 Yes <1537894468.79.0.545547206417.issue34751@psf.upfronthosting.co.za>
Content
```> Suppose that there is a hash collision, say hash((3, 3)) ==
> hash((-3, -3)) and you change the hashing algorithm to fix
> this collision.

There are _two_ hash functions at play in that collision:  the tuple hash function, and the integer hash function.  Regardless of whether the _tuple_ hash function does

t ^= t << shift;
or
t += t >> shift;

or anything else involving just `t`, that only directly affects the result of the _int_ hash function.  Whose low order bits cannot be improved in general - they're already optimally spread out in the most important cases.

> If that change does NOT affect the lower N bits,
> then you still have a collision hash((3, 3)) % 2**N ==
> hash((-3, -3)) % 2**N. This is relevant because the dict
> implementation starts by taking the lower bits first. So
> ideally we want to avoid hash collisions in the lower
> bits too.

Which the int hash on its own already does a superb job of doing in the most important cases.  If you feel you just have to mess with low-order bits, do it instead on the _tuple_ hash intermediate results, not on its inputs.  Like

x += x >> 16;

after t is folded in to x.  It's x you care about directly, not t.  Damaging desirable properties in t to _indirectly_ gain something in x is too clever by half.

Mucking with t to avoid the nested-tuple catastrophes is worth it, but I prefer to do that with as little damage to t's desirable properties as reasonably possible.

> This may also be a reason to avoid the standard FNV
> multipliers: the 64-bit FNV multiplier was chosen with
> the full 64-bit hash range in mind. It was never meant
> to work well when truncated to some lower N bits.

Do you believe any other multiplier would work better toward that end?  For any odd multiplier M, the last k bits of i*M are determined solely by the last k bits of i, and

[(last k bits of i*M) for i in range(anything, anything + 2**k)]

is a permutation of range(2**k).  The high order bits of i have nothing to do with it, and any odd multiplier permutes the lower k bits over any stretch of 2**k consecutive multiplicands.

Extracting low-order bits is intended to be done by "xor folding" in FNV, and I _expect_ the same would be prudent for any other multiplier:

https://tools.ietf.org/html/draft-eastlake-fnv-15#page-7

The dict and set lookup functions already do something stronger than xor folding to bring high bits into play, but do so incrementally as the probe sequence unfolds.```
History
Date User Action Args
2018-09-25 16:54:28tim.peterssetrecipients: + tim.peters, rhettinger, mark.dickinson, eric.smith, jdemeyer, sir-sigurd