Message262622
> This also means that if I want to create something like a tracer that tracks some information for each instruction, I can allocate an array of codesize/2 bytes, then index off of half the instruction index. This isn't currently done in peephole.c, nor does this include halving jump opargs
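The per-instruction side table the quote describes can be sketched as follows. This is illustrative code only, not anything from peephole.c: with fixed 2-byte (wordcode) instructions, a table of codesize/2 entries is indexed by instruction_offset/2.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tracer: one slot per 2-byte instruction,
 * so codesize / 2 entries, indexed by instruction_offset / 2. */
typedef struct {
    uint8_t *counts;    /* e.g. execution count per instruction */
    size_t   n_instrs;
} tracer;

static int tracer_init(tracer *t, size_t codesize) {
    t->n_instrs = codesize / 2;
    t->counts = calloc(t->n_instrs, sizeof(uint8_t));
    return t->counts != NULL;
}

static void tracer_hit(tracer *t, size_t instr_offset) {
    /* Halve the byte offset to get the instruction index. */
    t->counts[instr_offset / 2]++;
}
```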
There is a technique called "inline caching": the cache is stored between the instructions, in the same memory block. An example of a paper applying it to CPython:
"Efficient Inline Caching without Dynamic Translation" by Stefan Brunthaler (2009)
https://www.sba-research.org/wp-content/uploads/publications/sac10.pdf
Yury's approach is a standard lookup table: offset => cache. In issue #26219, he even used two tables: co->co_opt_opcodemap is an array mapping an instruction offset to an offset in the cache, and that second offset is then used to retrieve the cache data from a second array. So you have 3 structures (co_code, co_opt_opcodemap, co_opt), whereas inline caching proposes to use only one flat structure (a single array).
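The two-table lookup scheme can be sketched like this. The structure and field names mimic the description above but are illustrative, not the actual code from issue #26219:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-instruction cache entry. */
typedef struct {
    uint64_t version;   /* e.g. a type version tag */
    void    *cached;    /* e.g. a cached method pointer */
} cache_entry;

/* Simplified code object with the three structures described above. */
typedef struct {
    uint8_t     *co_code;          /* bytecode */
    size_t       codesize;
    int32_t     *co_opt_opcodemap; /* instruction offset -> cache index, -1 = no cache */
    cache_entry *co_opt;           /* compact cache array (cached instructions only) */
} code_object;

/* Lookup costs two indirections: map first, then the cache array. */
static cache_entry *get_cache(code_object *co, size_t instr_offset) {
    int32_t idx = co->co_opt_opcodemap[instr_offset];
    return (idx < 0) ? NULL : &co->co_opt[idx];
}
```

One design consequence: because the map can say "no cache" for most offsets, the cache array only needs entries for the instructions actually worth caching, at the price of an extra memory access on the hot path.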
The paper promises "improved data locality and instruction decoding efficiency", but also warns: "The new combined data-structure requires significantly more space—two native machine words for each instruction byte. To compensate for the additional space requirements, we use a profiling infrastructure to decide when to switch to this new instruction encoding at run time."
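By contrast, the flat inline-caching layout interleaves each instruction with its cache slot in a single array. A minimal sketch, assuming a made-up 3-field instruction format (the paper uses two native words of cache per instruction byte; one word here keeps the example short):

```c
#include <stdint.h>
#include <stddef.h>

/* One "fat" instruction: opcode, oparg, and its cache slot inline. */
typedef struct {
    uint8_t   opcode;
    uint8_t   oparg;
    uintptr_t cache;   /* inline cache word, initially 0 */
} fat_instr;

/* Dispatch walks a single array: instruction and cache are adjacent in
 * memory, which is the "improved data locality" the paper refers to. */
static uintptr_t run(fat_instr *code, size_t n) {
    uintptr_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        switch (code[i].opcode) {
        case 1: /* fake LOAD: first execution fills the cache inline */
            if (code[i].cache == 0)
                code[i].cache = (uintptr_t)code[i].oparg * 10; /* fake slow path */
            acc += code[i].cache;
            break;
        default: /* NOP */
            break;
        }
    }
    return acc;
}
```

The space cost the paper mentions is visible here: every instruction pays for a cache word, even those that never use it, which is why the paper only switches hot code to this encoding.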
Memory footprint and detection of hot code are handled in issue #26219.
History:
Date | User | Action | Args
2016-03-29 20:20:13 | vstinner | set | recipients: + vstinner, brett.cannon, georg.brandl, ncoghlan, benjamin.peterson, serhiy.storchaka, yselivanov, Demur Rumed
2016-03-29 20:20:13 | vstinner | set | messageid: <1459282813.49.0.455482321709.issue26647@psf.upfronthosting.co.za>
2016-03-29 20:20:13 | vstinner | link | issue26647 messages
2016-03-29 20:20:12 | vstinner | create |