Message 234308 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	rhettinger
Recipients	rhettinger
Date	2015-01-19.09:15:40
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1421658942.27.0.429398283891.issue23269@psf.upfronthosting.co.za>
In-reply-to

Content
First draft of patch to switch from a table[(i+j)&mask] style of entry calculation to an entry++ style. The entry computation simplifies from add/shl4/and/lea to a single add16. To do this, the linear probes are limited to the length of table rather than wrapping-around. I haven't timed this one yet (have just looked at the assembly) or worked through the implications. The new approach is a little less attractive but may be easier to reason about than the mask wrap-around. Also, it disadvantages small sets where the wrap-around property assured that the linear probes would always fine a hit without any entry being checked more than once. There is a extra setup time before the linear probes to compute the limit. Also, since the loop size is now variable instead of fixed, the inner loop for set_insert_clean() no longer gets unrolled by either GCC or CLANG. With the change, the inner loop of set_insert_clean() is very tight: L436: addq $16, %rax #, entry entry < limit cmpq %rdx, %rax # limit, entry ja L399 #, entry->key == NULL cmpq $0, (%rax) #,* entry jne L436 The code for lookkey() is also tight (though I can't tell why the entry->key gets loaded from memory twice): L102: cmpq %r15, %r13 # tmp170, D.12706 entry->key == dummy jne L103 #, testq %r12, %r12 # freeslot cmove %r14, %r12 # entry -> freeslot L103: addq $16, %r14 #, entry entry++ cmpq %r14, %rbx # entry, limit jb L146 #, movq (%r14), %r13 # MEM[base: entry_65, offset: 0B], D.12706 testq %r13, %r13 # D.12706 entry->key == NULL je L147 #, cmpq %rbp, 8(%r14) # hash, MEM[base: entry_104, offset: 8B] je L148 #, entry->hash == hash ? movq (%r14), %r13 # MEM[base: entry_104, offset: 0B], D.12706 jmp L102 #

First draft of patch to switch from a table[(i+j)&mask] style of entry calculation to an entry++ style.  The entry computation simplifies from add/shl4/and/lea to a single add16.  To do this, the linear probes are limited to the length of table rather than wrapping-around.

I haven't timed this one yet (have just looked at the assembly) or worked through the implications.  The new approach is a little less attractive but may be easier to reason about than the mask wrap-around.  Also, it disadvantages small sets where the wrap-around property assured that the linear probes would always fine a hit without any entry being checked more than once.  There is a extra setup time before the linear probes to compute the limit.  Also, since the loop size is now variable instead of fixed, the inner loop for set_insert_clean() no longer gets unrolled by either GCC or CLANG.

With the change, the inner loop of set_insert_clean() is very tight:

L436:
        addq    $16, %rax       #, entry
entry < limit    
        cmpq    %rdx, %rax      # limit, entry
        ja      L399    #,
entry->key == NULL
        cmpq    $0, (%rax)      #,* entry
        jne     L436 

The code for lookkey() is also tight (though I can't tell why the entry->key gets loaded from memory twice):

L102:
        cmpq    %r15, %r13      # tmp170, D.12706      entry->key == dummy
        jne     L103    #,
        testq   %r12, %r12      # freeslot
        cmove   %r14, %r12      # entry -> freeslot
L103:
        addq    $16, %r14       #, entry            entry++ 
        cmpq    %r14, %rbx      # entry, limit
        jb      L146    #,
        movq    (%r14), %r13    # MEM[base: entry_65, offset: 0B], D.12706
        testq   %r13, %r13      # D.12706           entry->key == NULL
        je      L147    #,
        cmpq    %rbp, 8(%r14)   # hash, MEM[base: entry_104, offset: 8B]
        je      L148    #,                          entry->hash == hash
?       movq    (%r14), %r13    # MEM[base: entry_104, offset: 0B], D.12706
        jmp     L102    #

History
Date	User	Action	Args
2015-01-19 09:15:42	rhettinger	set	recipients: + rhettinger
2015-01-19 09:15:42	rhettinger	set	messageid: <1421658942.27.0.429398283891.issue23269@psf.upfronthosting.co.za>
2015-01-19 09:15:41	rhettinger	link	issue23269 messages
2015-01-19 09:15:40	rhettinger	create