Message 325883 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	rhettinger
Recipients	jdemeyer, rhettinger
Date	2018-09-20.15:06:50
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1537456010.22.0.956365154283.issue34751@psf.upfronthosting.co.za>
In-reply-to

Content
> It's not hard to come up with a hash collision for tuples: Nor is it hard to come up with collisions for most simple hash functions. For typical use cases of tuples, what we have works out fine (it just combined the effects of the underlying hashes). > The underlying reason is that the hashing code mixes ^ and *. Addition also has a relationship to multiplication. I think the XOR is fine. > A simple Bernstein hash suffices. That is more suited to character data and small hash ranges (as opposed to 64-bit). > On top of that, the hashing code for tuples looks > needlessly complicated. The important thing is that it has passed our testing (see the comment above) and that it compiles nicely (see the tight disassembly below). ---------------------- L82: xorq %r14, %rax imulq %rbp, %rax leaq 82518(%rbp,%r12,2), %rbp movq %r13, %r12 movq %rax, %r14 leaq -1(%r13), %rax cmpq $-1, %rax je L81 movq %rax, %r13 L73: addq $8, %rbx movq -8(%rbx), %rdi call _PyObject_Hash cmpq $-1, %rax jne L82

> It's not hard to come up with a hash collision for tuples:

Nor is it hard to come up with collisions for most simple hash functions.  For typical use cases of tuples, what we have works out fine (it just combined the effects of the underlying hashes).

> The underlying reason is that the hashing code mixes ^ and *. 

Addition also has a relationship to multiplication.   I think the XOR is fine.

> A simple Bernstein hash suffices.

That is more suited to character data and small hash ranges (as opposed to 64-bit).  

>  On top of that, the hashing code for tuples looks 
> needlessly complicated.

The important thing is that it has passed our testing (see the comment above) and that it compiles nicely (see the tight disassembly below).


----------------------

L82:
    xorq    %r14, %rax
    imulq   %rbp, %rax
    leaq    82518(%rbp,%r12,2), %rbp
    movq    %r13, %r12
    movq    %rax, %r14
    leaq    -1(%r13), %rax
    cmpq    $-1, %rax
    je  L81
    movq    %rax, %r13
L73:
    addq    $8, %rbx
    movq    -8(%rbx), %rdi
    call    _PyObject_Hash
    cmpq    $-1, %rax
    jne L82

History
Date	User	Action	Args
2018-09-20 15:06:50	rhettinger	set	recipients: + rhettinger, jdemeyer
2018-09-20 15:06:50	rhettinger	set	messageid: <1537456010.22.0.956365154283.issue34751@psf.upfronthosting.co.za>
2018-09-20 15:06:50	rhettinger	link	issue34751 messages
2018-09-20 15:06:50	rhettinger	create