Message78910
The patch makes a huge difference on 64-bit Linux. I get a 20% speed-up
and the lowest run time so far. That is quite impressive!
At first glance, it seems the extra registers of the x86-64 architecture
permit GCC to avoid spilling registers onto the stack (see the assembly just
below). However, I don't know why the speed-up due to the patch is much
more significant on x86-64 than on x86.
This is the x86 assembly generated by GCC 4.3 (annotated and
slightly edited for readability):
    movl    -440(%ebp), %eax               # tmp = next_instr
    movl    $145, %esi                     # opcode = LIST_APPEND
    movl    8(%ebp), %ecx                  # f
    subl    -408(%ebp), %eax               # tmp -= first_instr
    movl    %eax, 60(%ecx)                 # f->f_lasti = tmp
    movl    -440(%ebp), %ebx               # next_instr
    movzbl  (%ebx), %eax                   # tmp = *next_instr
    addl    $1, %ebx                       # next_instr++
    movl    %ebx, -440(%ebp)               # next_instr
    movl    opcode_targets(,%eax,4), %eax  # tmp = opcode_targets[tmp]
    jmp     *%eax                          # goto *tmp
And this is the x86-64 assembly, also generated by GCC 4.3:
    movl    %r15d, %eax                    # tmp = next_instr
    subl    76(%rsp), %eax                 # tmp -= first_instr
    movl    $145, %ebp                     # opcode = LIST_APPEND
    movl    %eax, 120(%r14)                # f->f_lasti = tmp
    movzbl  (%r15), %eax                   # tmp = *next_instr
    addq    $1, %r15                       # next_instr++
    movq    opcode_targets(,%rax,8), %rax  # tmp = opcode_targets[tmp]
    jmp     *%rax                          # goto *tmp
Both assembly listings above are equivalent to the following C code:
    opcode = LIST_APPEND;
    f->f_lasti = ((int)(next_instr - first_instr));
    goto *opcode_targets[*next_instr++];
On the register-starved x86 architecture, the assembly has 4 stack loads
and 1 stack store. On the x86-64 architecture, most variables are kept in
registers, so the only stack access left is a single load (first_instr,
at 76(%rsp)); the remaining store goes to f->f_lasti in the frame
object, not to the stack. And from what I saw in the assembly, the
traditional switch dispatch makes little use of the extra registers,
especially with the opcode prediction macros, which avoid manipulating
f->f_lasti.
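
To make the two dispatch styles concrete, here is a minimal sketch of a toy
stack interpreter written both ways. It is not the ceval.c code from the
patch: the opcodes (OP_PUSH1, OP_ADD, OP_HALT) and the function names
run_switch and run_threaded are made up for illustration, and only the names
opcode_targets and next_instr are borrowed from the listings above. The
threaded version assumes a compiler with the GCC "labels as values"
extension (GCC or Clang) and a program that contains only valid opcodes.

```c
#include <assert.h>

/* Hypothetical toy opcodes for illustration; these are not CPython's. */
enum { OP_PUSH1, OP_ADD, OP_HALT, NUM_OPS };

#define STACK_MAX 16

/* Traditional dispatch: every opcode goes back through one shared switch. */
int run_switch(const unsigned char *code)
{
    int stack[STACK_MAX];
    int sp = 0;
    const unsigned char *next_instr = code;

    for (;;) {
        switch (*next_instr++) {
        case OP_PUSH1:
            assert(sp < STACK_MAX);
            stack[sp++] = 1;
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_HALT:
            return sp > 0 ? stack[sp - 1] : 0;
        default:
            return -1;  /* unknown opcode */
        }
    }
}

/* Threaded dispatch: each handler ends with its own indirect jump.
 * Assumes GCC/Clang labels-as-values and a program made only of
 * valid opcodes (the table lookup is not bounds-checked). */
int run_threaded(const unsigned char *code)
{
    static void *const opcode_targets[NUM_OPS] = {
        [OP_PUSH1] = &&do_push1,
        [OP_ADD]   = &&do_add,
        [OP_HALT]  = &&do_halt,
    };
    int stack[STACK_MAX];
    int sp = 0;
    const unsigned char *next_instr = code;

/* Same shape as the C line above: goto *opcode_targets[*next_instr++] */
#define DISPATCH() goto *opcode_targets[*next_instr++]

    DISPATCH();

do_push1:
    assert(sp < STACK_MAX);
    stack[sp++] = 1;
    DISPATCH();
do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    DISPATCH();
do_halt:
    return sp > 0 ? stack[sp - 1] : 0;

#undef DISPATCH
}
```

Both functions compute the same result for the same bytecode. The difference
is that run_threaded has one indirect jump per handler instead of a single
shared one, which is the structure visible in the assembly listings above.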
That said, I am glad to hear the patch makes Python on PowerPC faster,
because this supports the hypothesis that extra registers are better
used with indirect threading (PowerPC has 32 general-purpose registers).
Date | User | Action | Args
2009-01-03 00:45:29 | alexandre.vassalotti | set | recipients: + alexandre.vassalotti, lemburg, skip.montanaro, rhettinger, pitrou, christian.heimes, blaisorblade
2009-01-03 00:45:29 | alexandre.vassalotti | set | messageid: <1230943529.41.0.969700026517.issue4753@psf.upfronthosting.co.za>
2009-01-03 00:45:28 | alexandre.vassalotti | link | issue4753 messages
2009-01-03 00:45:27 | alexandre.vassalotti | create |