
Author pitrou
Recipients mark.dickinson, pitrou, rhettinger, tim.peters
Date 2010-11-13.23:12:16
Message-id <1289689933.3561.7.camel@localhost.localdomain>
In-reply-to <1289689354.8.0.363857776985.issue10408@psf.upfronthosting.co.za>
Content
> My previous experiments along these lines showed it was a dead-end.
> The number of probes was the most important factor and beat-out any
> effort to improve cache utilization from increased density.  

Can you describe your experiments? What workloads or benchmarks did you
use?

Do note that modern CPUs have several levels of cache. L1 is very fast
(3-4 cycle latency) but rather small (32 or 64 KB). L2, depending on the
CPU, has a latency between 10 and 20+ cycles and ranges from 256 KB to
1 MB. L3, when present, is considerably larger but also much slower
(latency sometimes up to 50 cycles).
So, even if access patterns are uneven, it is probably rare for all
frequently accessed data to fit in L1 (especially with Python, since
objects are big).
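
As a rough illustration of those latency cliffs (a sketch I put together
for this message, not a rigorous benchmark; the function name and sizes
are placeholders), a pointer-chasing loop over growing working sets
should show the time per access climbing as the data spills out of L1,
then L2, then L3. In pure Python the interpreter overhead blurs the
steps, but the upward trend is usually still visible:

    import array
    import random
    import time

    def time_per_access(n_items, n_steps=1_000_000):
        """Average time to follow one random pointer in an n_items cycle."""
        # Build a random cyclic permutation: each element stores the index
        # of the next one, so every access depends on the previous access
        # and hardware prefetching cannot hide the memory latency.
        order = list(range(n_items))
        random.shuffle(order)
        nxt = array.array('q', [0] * n_items)   # 'q' = 8 bytes per element
        for i in range(n_items):
            nxt[order[i]] = order[(i + 1) % n_items]
        j = 0
        t0 = time.perf_counter()
        for _ in range(n_steps):
            j = nxt[j]
        return (time.perf_counter() - t0) / n_steps

    # Working sets from 8 KB (fits in L1) up to 32 MB (beyond most L3s).
    for exp in range(10, 23, 2):
        n = 2 ** exp
        print(f"{n * 8 // 1024:6d} KB: {time_per_access(n) * 1e9:6.1f} ns/access")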

> Another result from earlier experiments is that benchmarking the
> experiment is laden with pitfalls.  Tight timing loops don't mirror
> real world programs, nor do access patterns with uniform random
> distributions.

I can certainly understand that; can you suggest workloads approaching
"real world programs"?
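
For instance (a sketch of my own, with made-up names and parameters,
rather than anything proposed in this thread), a Zipf-like key mix seems
closer to real programs than uniform sampling: a few hot keys get
hammered while the long tail is touched only rarely.

    import random
    import time

    def zipf_workload(n_keys=100_000, n_ops=1_000_000, s=1.2):
        """Time n_ops dict updates whose keys follow a Zipf-ish distribution."""
        # Weight rank k proportionally to 1 / k**s: rank 1 is by far the
        # hottest key, and the tail falls off quickly.
        weights = [1.0 / (k ** s) for k in range(1, n_keys + 1)]
        keys = [f"key{k}" for k in range(n_keys)]
        counts = dict.fromkeys(keys, 0)
        sample = random.choices(keys, weights=weights, k=n_ops)
        t0 = time.perf_counter()
        for key in sample:
            counts[key] += 1
        return time.perf_counter() - t0

    print(f"{zipf_workload():.3f}s for 1M skewed dict updates")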