
Author blaisorblade
Recipients ajaksu2, alexandre.vassalotti, bboissin, blaisorblade, christian.heimes, djc, lemburg, pitrou, rhettinger, skip.montanaro, theatrus
Date 2009-01-04.02:28:08
1st note: is that code from the threaded version? Note that you need to
modify the source so that it also accepts ICC before you can try that.
If you already did that, I guess the patch is not useful at all with
ICC since, as far as I can see, the jump is shared. It is vital to this
patch that the jump is not shared, so an ICC equivalent of GCC's
-fno-crossjumping needs to be found.
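
To make "the jump is not shared" concrete, here is a minimal sketch of threaded dispatch. It is not the patch itself: the three opcodes, the run_threaded name and the fixed-size stack are hypothetical and exist only for this illustration. It relies on GCC's "labels as values" extension. The point is that DISPATCH() is expanded at the end of every handler, so each handler has its own indirect jump in the source; whether the compiler keeps them separate (instead of merging them back into one) is exactly what -fno-crossjumping is meant to control in GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-opcode bytecode, used only for this illustration. */
enum { OP_PUSH1, OP_ADD, OP_HALT };

/* Threaded dispatch via the GCC "labels as values" extension. */
int run_threaded(const unsigned char *next_instr)
{
    /* One label address per opcode, indexed by opcode value. */
    static void *const targets[] = { &&op_push1, &&op_add, &&op_halt };
    int stack[16];
    size_t sp = 0;

/* Fetch the next opcode, advance the program counter, jump to its handler.
   Every use of this macro is a separate indirect jump in the source. */
#define DISPATCH() goto *targets[*next_instr++]

    DISPATCH();

op_push1:
    assert(sp < 16);
    stack[sp++] = 1;
    DISPATCH();

op_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    DISPATCH();

op_halt:
    return sp ? stack[sp - 1] : 0;

#undef DISPATCH
}
```

Running it on the sequence OP_PUSH1, OP_PUSH1, OP_ADD, OP_HALT returns 2. To check whether a given compiler shares the jump, one would look at the generated assembly and count the indirect jumps.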

2nd note: the answer to your question seems to be yes, ICC has fewer
register spills. Compare, for instance:
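GCC:
       movl    -272(%ebp), %ecx    # reload the program counter from its stack slot
       movzbl  (%ecx), %eax        # load the next opcode
       addl    $1, %ecx            # increment the program counter

ICC:
       movzbl    (%esi), %ecx      # load the next opcode; program counter stays in %esi
       incl      %esi              # increment the program counter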

Both sequences load the next opcode and then increment the program
counter. In the code you posted, GCC spills the program counter to
memory (it is reloaded from -272(%ebp)), while ICC keeps it in %esi.
Either ICC spills elsewhere, or ICC is better here. ICC is widely known
to have a much better optimizer in many cases, and as far as I remember
GCC's register allocator really needs improvement.
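
For reference, the source-level operation behind those instructions is a byte load followed by a pointer increment. The sketch below is only an illustration: count_until_zero is a hypothetical helper, not code from the interpreter. It shows the pattern in isolation; whether next_instr lives in a register or in a stack slot is the compiler's decision and can only be seen in the generated assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the bytes before a zero terminator.
   next_instr plays the role of the program counter. */
size_t count_until_zero(const unsigned char *next_instr)
{
    size_t n = 0;

    assert(next_instr != NULL);
    for (;;) {
        unsigned char opcode = *next_instr; /* byte load (movzbl in the listings) */
        next_instr++;                       /* increment (incl or addl $1) */
        if (opcode == 0)
            break;
        n++;
    }
    return n;
}
```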

Finally, I'm a bit surprised by "addl $1, %ecx", since any peephole
optimizer should remove that; I'm not shocked only because I have never
seen perfect GCC output.

History
Date                 User          Action  Args
2009-01-04 02:28:10  blaisorblade  set     recipients: + blaisorblade, lemburg, skip.montanaro, rhettinger, pitrou, christian.heimes, ajaksu2, alexandre.vassalotti, djc, bboissin, theatrus
2009-01-04 02:28:10  blaisorblade  set     messageid: <>
2009-01-04 02:28:10  blaisorblade  link    issue4753 messages
2009-01-04 02:28:08  blaisorblade  create