Author pitrou
Recipients belopolsky, georg.brandl, jimjjewett, ked-tao, loewis, mark.dickinson, pitrou, rhettinger, tim.peters
Date 2010-09-04.22:11:28
Message-id <1283638291.76.0.407786717056.issue9778@psf.upfronthosting.co.za>
In-reply-to
Content
Currently, Python produces hash values that fit in a C "long". This looks fine at first sight, but in the context of the dict and set implementations, it means that
1) holding hashes and indices in the same field of a structure requires some care (see issue1646068)
2) on platforms where a long is smaller than a Py_ssize_t (e.g. Win64), very big hash tables could suffer from lots of artificial collisions (the hash table being bigger than the range of possible hash values)
3) when a long is smaller than a Py_ssize_t, we don't save any space anyway, since having some pointers follow a C "long" in a structure implies some padding to keep all fields naturally aligned
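
Points 2 and 3 can be illustrated with a small C sketch. It is not CPython's actual code: the struct layouts and the names entry_narrow, entry_wide, narrow_padding and reachable_slots are hypothetical. It uses int32_t to stand in for a 32-bit long and intptr_t for a pointer-sized hash, and it assumes a typical ABI where pointers are naturally aligned. The slot count models only the initial probe position (hash reduced to the table size), not any later collision resolution.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry with a 32-bit hash (a C long on Win64). */
struct entry_narrow {
    int32_t hash;
    void *key;
    void *value;
};

/* Hypothetical table entry with a pointer-sized hash. */
struct entry_wide {
    intptr_t hash;
    void *key;
    void *value;
};

/* Padding bytes the compiler inserts between `hash` and `key`
   in the narrow layout (4 on a typical 64-bit ABI, 0 on 32-bit). */
size_t narrow_padding(void)
{
    return offsetof(struct entry_narrow, key) - sizeof(int32_t);
}

/* Upper bound on the number of distinct initial probe positions when
   hashes have `hash_bits` bits and the table has `table_slots` slots.
   For 64 bits or more, 2**64 is approximated by UINT64_MAX. */
uint64_t reachable_slots(unsigned hash_bits, uint64_t table_slots)
{
    uint64_t hash_values =
        (hash_bits >= 64) ? UINT64_MAX : ((uint64_t)1 << hash_bits);
    return table_slots < hash_values ? table_slots : hash_values;
}
```

Under these assumptions the two structs have the same size, so the narrow hash saves nothing, and a table with 2**33 slots can only be entered at 2**32 distinct positions when hashes are 32 bits wide.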

A future-proof option would be to make all hash values Py_ssize_t rather than C long, either directly or through a new dedicated alias, Py_hash_t. This would also affect the ABI, I suppose.