Author blaisorblade
Recipients alexandre.vassalotti, blaisorblade, christian.heimes, lemburg, pitrou, rhettinger, skip.montanaro
Date 2009-01-03.02:20:57
Message-id <1230949258.6.0.586239956138.issue4753@psf.upfronthosting.co.za>
In-reply-to
Content
> I'm not an expert in this kind of optimizations. Could we gain more
> speed by making the dispatcher table more dense? Python has less than
> 128 opcodes (len(opcode.opmap) == 113) so they can be squeezed in a
> smaller table. I naively assume a smaller table increases the amount of
> cache hits.

Well, you have no binary compatibility constraint with a new release, so
it can be tried and benchmarked, or it can be done anyway!
On x86_64 the jump table takes 8 bytes per pointer * 256 pointers =
2 KiB, and the L1 data cache of a Pentium 4 is either 8 KiB or 16 KiB,
depending on the model.
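
The arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope check: the function names are made up for illustration, and the 128-entry case assumes the opcodes were renumbered into a dense table as suggested in the quoted question.

```python
# Rough footprint of an opcode dispatch table.
# All names here are illustrative; none come from the CPython sources.

KIB = 1024


def table_bytes(entries, pointer_size=8):
    """Bytes used by a table of `entries` code pointers (8 bytes each on x86_64)."""
    return entries * pointer_size


def share_of_cache(entries, cache_kib, pointer_size=8):
    """Fraction of an L1 data cache of `cache_kib` KiB taken up by the table."""
    return table_bytes(entries, pointer_size) / (cache_kib * KIB)


if __name__ == "__main__":
    for entries in (256, 128):          # current table vs. hypothetical dense table
        for cache_kib in (8, 16):       # the two Pentium 4 L1 data cache sizes
            print(entries, table_bytes(entries), cache_kib,
                  share_of_cache(entries, cache_kib))
```

With these numbers the 256-entry table is a quarter of an 8 KiB cache, and a 128-entry table would be an eighth of it.
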
But I don't expect the table size to make a noticeable difference in
most synthetic microbenchmarks. Matrix multiplication would be the
perfect one, I guess: the repeated column access would thrash the L1
data cache if the matrices don't fit in it entirely.
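
A minimal sketch of such a microbenchmark, assuming a naive triple loop over lists of lists. The function names and the matrix size are hypothetical and chosen only for illustration; no timings are claimed here.

```python
import random
import time


def matmul(a, b):
    """Naive matrix product of two lists of lists.

    The inner loop reads b[k][j] with k varying, i.e. it walks down a
    column of b, which is the cache-unfriendly access pattern.
    """
    n, m, p = len(a), len(b), len(b[0])
    result = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0
            for k in range(m):
                total += a[i][k] * b[k][j]
            result[i][j] = total
    return result


def time_matmul(size, seed=0):
    """Time one multiplication of two random size x size matrices (seconds)."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(size)] for _ in range(size)]
    b = [[rng.random() for _ in range(size)] for _ in range(size)]
    start = time.perf_counter()
    matmul(a, b)
    return time.perf_counter() - start


if __name__ == "__main__":
    # The size is an arbitrary assumption; pick one large enough that the
    # matrices exceed the L1 data cache of the machine under test.
    print(time_matmul(100))
```

Running it on interpreter builds with different dispatch table layouts would give the comparison; the result only means something if the matrices are too large for the L1 data cache.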