Message115617
Currently, Python produces hash values which fit in a C "long". This is fine at first sight, but in the context of the dict and set implementations, it means that:
1) holding hashes and indices in the same field of a structure requires some care (see issue1646068)
2) on platforms where a long is smaller than a Py_ssize_t (e.g. Win64), very big hash tables could suffer from lots of artificial collisions (the hash table being bigger than the range of possible hash values)
3) when a long is smaller than Py_ssize_t, we don't save any size anyway, since having some pointers follow a C "long" in a structure implies some padding to keep all fields naturally aligned
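Point 2 can be illustrated with a scaled-down sketch in Python. This is not how dictobject.c works (it ignores probing entirely and only counts the initial slot each key lands in), and the `used_slots` helper, its parameters, and the multiplicative hash are all assumptions made up for the illustration. The idea is simply that when the hash has fewer bits than the table index, most of the table can never be reached directly.

```python
def used_slots(hash_bits, table_bits, n_keys):
    """Count the distinct initial slots hit when hashes are limited to hash_bits bits.

    Hypothetical helper for illustration only; real dict/set lookups also probe.
    """
    hash_mask = (1 << hash_bits) - 1
    table_mask = (1 << table_bits) - 1
    slots = set()
    for key in range(n_keys):
        # Assumed hash function: multiply by an odd constant, then truncate
        # to the available hash width (standing in for a narrow C "long").
        h = (key * 2654435761) & hash_mask
        # The table index is taken from the low bits of the hash.
        slots.add(h & table_mask)
    return len(slots)


if __name__ == "__main__":
    # Hash narrower than the table: only 2**8 slots are ever targeted.
    print(used_slots(hash_bits=8, table_bits=12, n_keys=4096))
    # Hash as wide as the table: every slot can be targeted.
    print(used_slots(hash_bits=12, table_bits=12, n_keys=4096))
```

With an 8-bit hash and a 4096-slot table, at most 256 slots can be the first choice for any key, which is the scaled-down analogue of a 32-bit hash feeding a table with more than 2**32 entries.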
A future-proof option would be to change all hash values to be Py_ssize_t values rather than C longs, either directly or by defining a new dedicated alias, Py_hash_t. This would also impact the ABI, I suppose.
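The padding argument behind this suggestion can be checked roughly with ctypes. The two structures below are stand-ins invented for the illustration, not the real dict entry layout: `NarrowEntry` uses a 32-bit hash field (as a C "long" would be on Win64) and `WideEntry` uses a `Py_ssize_t`-sized one, each followed by two pointers.

```python
import ctypes


class NarrowEntry(ctypes.Structure):
    # Hypothetical entry with a 32-bit hash field followed by two pointers.
    _fields_ = [
        ("hash", ctypes.c_int32),
        ("key", ctypes.c_void_p),
        ("value", ctypes.c_void_p),
    ]


class WideEntry(ctypes.Structure):
    # Same hypothetical entry, with the hash widened to a ssize_t-sized field.
    _fields_ = [
        ("hash", ctypes.c_ssize_t),
        ("key", ctypes.c_void_p),
        ("value", ctypes.c_void_p),
    ]


def entry_sizes():
    """Return (narrow, wide) structure sizes in bytes under native alignment."""
    return ctypes.sizeof(NarrowEntry), ctypes.sizeof(WideEntry)


if __name__ == "__main__":
    narrow, wide = entry_sizes()
    print("narrow hash field:", narrow, "bytes")
    print("wide hash field:  ", wide, "bytes")
```

On a typical 64-bit build, I would expect both sizes to come out the same, because the compiler pads the 32-bit field so that the following pointer stays naturally aligned. On a 32-bit build the two fields are the same width to begin with.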
Date | User | Action | Args
2010-09-04 22:11:32 | pitrou | set | recipients: + pitrou, tim.peters, loewis, georg.brandl, rhettinger, jimjjewett, mark.dickinson, belopolsky, ked-tao
2010-09-04 22:11:31 | pitrou | set | messageid: <1283638291.76.0.407786717056.issue9778@psf.upfronthosting.co.za>
2010-09-04 22:11:29 | pitrou | link | issue9778 messages
2010-09-04 22:11:28 | pitrou | create |