Author rhettinger
Recipients rhettinger, serhiy.storchaka
Date 2015-12-08.09:34:10
Message-id <1449567251.61.0.853791637694.issue25823@psf.upfronthosting.co.za>
Content
On little-endian machines, the decoding of an oparg can be sped up by using a single 16-bit pointer dereference.

Current decoding:
    leaq    2(%rcx), %rbp
    movzbl  -1(%rbp), %eax
    movzbl  -2(%rbp), %r14d
    sall    $8, %eax
    addl    %eax, %r14d

New decoding:
    leaq    2(%rdx), %r12
    movzwl  -2(%r12), %r8d

The patch uses (unsigned short *) like the struct module does, but it could use uint16_t if necessary.
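
For illustration only, here is a minimal C sketch of the two decoding strategies. It is not the patch itself: the function names are hypothetical, and the 16-bit load goes through memcpy rather than the (unsigned short *) cast the patch uses, so that the sketch stays well-defined on targets with strict alignment. Compilers typically reduce the memcpy to a single load.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode a 2-byte oparg one byte at a time.
   Gives the same result on any endianness. */
static unsigned int oparg_bytewise(const unsigned char *p)
{
    return ((unsigned int)p[1] << 8) + p[0];
}

/* Hypothetical helper: decode with one 16-bit load.
   Only matches oparg_bytewise() on little-endian machines.
   Assumption: memcpy stands in for the patch's pointer cast. */
static unsigned int oparg_load16(const unsigned char *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Runtime check for little-endian byte order. */
static int is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

A real build would pick between the two at compile time instead of calling is_little_endian() at run time; the check is here only so the sketch can be tried on either kind of machine.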

If next_instr can be advanced after the lookup rather than before, the generated code would be tighter still (removing the data dependency and shortening the movzwl instruction to drop the offset byte):

    movzwl  (%rdx), %r8d
    leaq    2(%rdx), %rbp
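
The two orderings can be sketched in C as below. The function names and the pointer-to-pointer interface are assumptions made for the example, not code from ceval.c, and the 16-bit load is written with memcpy instead of a pointer cast. Both functions return the same value and leave the pointer in the same place; whether the compiler actually emits the shorter instruction sequence depends on the surrounding code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-bit load in native byte order. */
static unsigned int load16(const unsigned char *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Advance first, then read behind the pointer.
   The load address depends on the updated pointer (the -2 offset form). */
static unsigned int fetch_advance_then_load(const unsigned char **next_instr)
{
    *next_instr += 2;
    return load16(*next_instr - 2);
}

/* Load first, then advance.
   The load uses the pointer as-is, with no offset. */
static unsigned int fetch_load_then_advance(const unsigned char **next_instr)
{
    unsigned int oparg = load16(*next_instr);
    *next_instr += 2;
    return oparg;
}
```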
History
Date User Action Args
2015-12-08 09:34:11 rhettinger set recipients: + rhettinger, serhiy.storchaka
2015-12-08 09:34:11 rhettinger set messageid: <1449567251.61.0.853791637694.issue25823@psf.upfronthosting.co.za>
2015-12-08 09:34:11 rhettinger link issue25823 messages
2015-12-08 09:34:10 rhettinger create