This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author rhettinger
Recipients rhettinger, vstinner
Date 2013-08-17.18:57:57
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1376765878.32.0.652855927715.issue18771@psf.upfronthosting.co.za>
In-reply-to
Content
I'm working on a patch for the lookkey() functions in Object/setobject.c.  The core idea is to follow the probe sequence as usual but to also check an adjacent entry for each probe (i.e. check &table[i & mask] as usual and then check &table[(i & mask) ^ 1] before going on the next probes which are scattered around memory).

On modern processors (anything in the last decade), this is likely to reduce the cost of hash collisions and reduce the number of cache lines fetched from memory.

+ Cache line benefit:  improves the odds that the adjacent probe will be on the same cache line, thereby reducing the total number of cache lines fetched.  This benefit will work for set insertions as well as set lookups (a partial write to a cache line requires that a full cache line be read prior to writing, so it is helpful if we've just examined another slot on the same cache line).

+ Parallel execution:  because the second lookup has no data dependency on the first lookup, some of its instructions can be executed in parallel by the out-of-order engine.

+ Reduced loop overhead:  the second lookup doesn't require a new computation of the index *i* (we just do a XOR 1), a new perturb shift, a new masking of *i*, or a jump to the top of the loop.  In other words, it has all the usual benefits of loop unrolling.

- On the downside, even this partial unrolling when increase the total code size which has the negative effect of blowing other code out of the I-cache (typically 32kb).

? I'm unsure whether the additional if-statements will have a net positive or net negative effect on branch prediction rates (positive because each branch can be tracked separately or negative because additional branches use up more space in the branch prediction tables).  Once the patch is ready, I'll run CacheGrind to get a better sense of what is happening.
History
Date User Action Args
2013-08-17 18:57:58rhettingersetrecipients: + rhettinger, vstinner
2013-08-17 18:57:58rhettingersetmessageid: <1376765878.32.0.652855927715.issue18771@psf.upfronthosting.co.za>
2013-08-17 18:57:58rhettingerlinkissue18771 messages
2013-08-17 18:57:57rhettingercreate