Message262622
> This also means that if I want to create something like a tracer that tracks some information for each instruction, I can allocate an array of codesize/2 bytes, then index off of half the instruction index. This isn't currently done in peephole.c, nor does this include halving jump opargs
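The per-instruction side table the quote describes can be sketched as follows. This is illustrative code only, not anything from peephole.c: with fixed 2-byte (wordcode) instructions, a table of codesize/2 entries is indexed by instruction_offset/2.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tracer: one slot per 2-byte instruction,
 * so codesize / 2 entries, indexed by instruction_offset / 2. */
typedef struct {
    uint8_t *counts;    /* e.g. execution count per instruction */
    size_t   n_instrs;
} tracer;

static int tracer_init(tracer *t, size_t codesize) {
    t->n_instrs = codesize / 2;
    t->counts = calloc(t->n_instrs, sizeof(uint8_t));
    return t->counts != NULL;
}

static void tracer_hit(tracer *t, size_t instr_offset) {
    /* Halve the byte offset to get the instruction index. */
    t->counts[instr_offset / 2]++;
}
```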
There is a technique called "inline caching": the cache is stored between the instructions, in the same memory block. An example of a paper applying it to CPython:
"Efficient Inline Caching without Dynamic Translation" by Stefan Brunthaler (2009)
https://www.sba-research.org/wp-content/uploads/publications/sac10.pdf
Yury's approach is a standard lookup table: offset => cache. In issue #26219, he even used two tables: co->co_opt_opcodemap is an array mapping an instruction offset to an offset in the cache, and that second offset is then used to retrieve the cache data from a second array. So you have 3 structures (co_code, co_opt_opcodemap, co_opt), whereas inline caching proposes to use only one flat structure (a single array).
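The two-table lookup scheme can be sketched like this. The structure and field names mimic the description above but are illustrative, not the actual code from issue #26219:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-instruction cache entry. */
typedef struct {
    uint64_t version;   /* e.g. a type version tag */
    void    *cached;    /* e.g. a cached method pointer */
} cache_entry;

/* Simplified code object with the three structures described above. */
typedef struct {
    uint8_t     *co_code;          /* bytecode */
    size_t       codesize;
    int32_t     *co_opt_opcodemap; /* instruction offset -> cache index, -1 = no cache */
    cache_entry *co_opt;           /* compact cache array (cached instructions only) */
} code_object;

/* Lookup costs two indirections: map first, then the cache array. */
static cache_entry *get_cache(code_object *co, size_t instr_offset) {
    int32_t idx = co->co_opt_opcodemap[instr_offset];
    return (idx < 0) ? NULL : &co->co_opt[idx];
}
```

One design consequence: because the map can say "no cache" for most offsets, the cache array only needs entries for the instructions actually worth caching, at the price of an extra memory access on the hot path.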
The paper promises "improved data locality and instruction decoding efficiency", but also warns: "The new combined data-structure requires significantly more space—two native machine words for each instruction byte. To compensate for the additional space requirements, we use a profiling infrastructure to decide when to switch to this new instruction encoding at run time."
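By contrast, the flat inline-caching layout interleaves each instruction with its cache slot in a single array. A minimal sketch, assuming a made-up 3-field instruction format (the paper uses two native words of cache per instruction byte; one word here keeps the example short):

```c
#include <stdint.h>
#include <stddef.h>

/* One "fat" instruction: opcode, oparg, and its cache slot inline. */
typedef struct {
    uint8_t   opcode;
    uint8_t   oparg;
    uintptr_t cache;   /* inline cache word, initially 0 */
} fat_instr;

/* Dispatch walks a single array: instruction and cache are adjacent in
 * memory, which is the "improved data locality" the paper refers to. */
static uintptr_t run(fat_instr *code, size_t n) {
    uintptr_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        switch (code[i].opcode) {
        case 1: /* fake LOAD: first execution fills the cache inline */
            if (code[i].cache == 0)
                code[i].cache = (uintptr_t)code[i].oparg * 10; /* fake slow path */
            acc += code[i].cache;
            break;
        default: /* NOP */
            break;
        }
    }
    return acc;
}
```

The space cost the paper mentions is visible here: every instruction pays for a cache word, even those that never use it, which is why the paper only switches hot code to this encoding.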
Memory footprint and detection of hot code are handled in issue #26219.
History:
Date | User | Action | Args
2016-03-29 20:20:13 | vstinner | set | recipients: + vstinner, brett.cannon, georg.brandl, ncoghlan, benjamin.peterson, serhiy.storchaka, yselivanov, Demur Rumed
2016-03-29 20:20:13 | vstinner | set | messageid: <1459282813.49.0.455482321709.issue26647@psf.upfronthosting.co.za>
2016-03-29 20:20:13 | vstinner | link | issue26647 messages
2016-03-29 20:20:12 | vstinner | create |