Message238552
This tracker item is for a thought experiment I'm running, so that I can collect the thoughts and discussions in one place. It is not an active proposal for inclusion in Python.
The idea is to greatly speed up set/dict lookups of unicode values by skipping the exact comparison when the unicode type is exact and the 64-bit hash values are known to match.
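As a rough illustration of the idea, the key-matching step of a lookup could be sketched in pure Python as follows. This is only a sketch: the real logic would live in the C implementation of set/dict, and the function name `keys_match` and the `trust_hash` flag are hypothetical names invented for this example.

```python
def keys_match(stored_key, stored_hash, probe_key, probe_hash, trust_hash=True):
    """Decide whether probe_key should be treated as equal to stored_key.

    Hypothetical helper: mimics the per-entry check done during a
    set/dict probe, with the proposed shortcut controlled by trust_hash.
    """
    if stored_key is probe_key:
        return True   # identity shortcut, already part of normal lookups
    if stored_hash != probe_hash:
        return False  # equal keys must have equal hashes
    if trust_hash and type(stored_key) is str and type(probe_key) is str:
        # Proposed shortcut: both keys are exact str objects and their
        # hashes match, so skip the character-by-character comparison.
        return True
    return stored_key == probe_key  # fall back to the exact comparison
```

With explicit (made-up) hash values, `keys_match("abc", 7, "abd", 7)` returns True, which is exactly the false positive being traded for speed, while `keys_match("abc", 7, "abd", 7, trust_hash=False)` returns False. Subclasses of str always take the exact comparison path.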
Given the siphash and hash randomization, we get a 1 in 2**64 chance of a false positive (which is better than the error rate for non-ECC DRAM itself).
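To put a number on that figure, here is a small back-of-the-envelope calculation. It assumes an idealized hash whose 64-bit outputs are uniform and independent for distinct strings; the names `P_FALSE_POSITIVE` and `expected_false_positives` are made up for this example.

```python
from fractions import Fraction

# Assumption: for two distinct strings, the chance that their 64-bit
# hashes coincide is exactly 1 in 2**64 (idealized uniform hash).
P_FALSE_POSITIVE = Fraction(1, 2**64)

def expected_false_positives(num_comparisons):
    """Expected number of wrong matches over num_comparisons comparisons
    between distinct strings, under the assumption above."""
    return num_comparisons * P_FALSE_POSITIVE

# Roughly 5.4e-20 per comparison.
print(float(P_FALSE_POSITIVE))
# A billion comparisons per second for a year is about 3.2e16 comparisons.
print(float(expected_false_positives(10**9 * 60 * 60 * 24 * 365)))
```

Under these assumptions, even that sustained workload gives an expected count well below one false positive per year.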
However, since the siphash isn't cryptographically secure, presumably a malicious chooser of keys could generate a false positive on purpose.
This technique is currently used by git and mercurial, which use hash values for file and version graphs without checking for an exact match (because the chance of a false positive is vanishingly small).
The Python test suite passes, as do the test suites for a number of packages I have installed.
Date | User | Action | Args
2015-03-19 19:57:42 | rhettinger | set | recipients: + rhettinger
2015-03-19 19:57:42 | rhettinger | set | messageid: <1426795062.72.0.102774381262.issue23712@psf.upfronthosting.co.za>
2015-03-19 19:57:42 | rhettinger | link | issue23712 messages
2015-03-19 19:57:42 | rhettinger | create |