Issue 26300: "unpacked" bytecode

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/70488

classification

Title:	"unpacked" bytecode
Type:	enhancement	Stage:	resolved
Components:	Interpreter Core	Versions:

process

Status:	closed	Resolution:	out of date
Dependencies:		Superseder:
Assigned To:		Nosy List:	Mark.Shannon, abarnert, benjamin.peterson, georg.brandl, pitrou, serhiy.storchaka, vstinner, yselivanov
Priority:	normal	Keywords:

Created on 2016-02-05 22:17 by abarnert, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (4)
msg259693 - (view)	Author: Andrew Barnert (abarnert) *	Date: 2016-02-05 22:17
Currently, the compiler starts with a list of arrays of instructions, packs them to 1/3/6-bytes-apiece bytecodes, fixes up all the jumps, and then calls PyCode_Optimize on the result. This makes the peephole optimizer much more complicated. Assuming PEP 511 is accepted, it will also make plug-in bytecode optimizers much more complicated (and probably wasteful--they'll each be repeating the same work to re-do the fixups).

The simplest alternative (as suggested by Serhiy on -ideas) is to expose an "unpacked" bytecode to the optimizer (in the code parameter and return value and lnotab_obj in-out parameter for PyCode_Optimize, and similarly for PEP 511) where each instruction takes a fixed 4 bytes. This is much easier to process. After the optimizer returns, the compiler packs opcodes into the usual 1/3/6-byte format, removing NOPs, retargeting jumps, and adjusting the lnotab as it goes. (Note that it already pretty much has code to do all of this except the NOP removal; it's just doing it before the optimizer instead of after.)

Negatives:

* Arguments can now only go up to 2**23 instead of 2**31. I don't think that's a problem (has anyone ever created a code object with 4 million instructions?).
* A bit more work for the compiler; we'd need to test to make sure there's no measurable performance impact.

We could also expose this functionality through C API PyCode_Pack/Unpack and Python dis.pack_code/unpack_code functions (and also make the dis module know how to parse unpacked code), which would allow import hooks, post-processing decorators, etc. to be simplified as well. This would remove some, but not all, of the need for things like byteplay. I think this may be worth doing, but I'm not sure until I see how complicated it is.

We could even allow code objects with unpacked bytecode to be executed, but I think that's unnecessary complexity. Nobody should want to do that intentionally, and if an optimizer lets such code escape by accident, a SystemError is fine.

MRAB implied an alternative: exposing some slightly-higher-level label-based format. That would be even nicer to work with. But it's also more complicated for the compiler and for the API, and I think it's already easy enough to handle jumps with fixed-width instructions.
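
The packing step described above (fixed 4-byte instructions in, 1/3/6-byte instructions out, with NOPs dropped and jumps retargeted) could look roughly like the sketch below. It is an illustration only, not CPython's code. Several things are assumptions that the issue does not specify: the unpacked layout (one opcode byte followed by a 3-byte little-endian argument), jump arguments in unpacked code being byte offsets that are multiples of 4, the function name pack, the helper ins, and the small opcode table, which is meant to be illustrative rather than authoritative. The lnotab adjustment is left out entirely.

```python
# Illustrative opcode numbers for this sketch; not an authoritative table.
NOP = 9
RETURN_VALUE = 83
HAVE_ARGUMENT = 90       # opcodes >= this value carry an argument
LOAD_CONST = 100
JUMP_FORWARD = 110       # relative: counted from the end of the jump
JUMP_ABSOLUTE = 113      # absolute: byte offset from the start of the code
EXTENDED_ARG = 144
REL_JUMPS = {JUMP_FORWARD}
ABS_JUMPS = {JUMP_ABSOLUTE}


def ins(op, arg=0):
    """Build one unpacked instruction (assumed layout: opcode + 3-byte arg)."""
    return bytes([op]) + arg.to_bytes(3, "little")


def packed_size(op, arg):
    """Size of one instruction in the packed 1/3/6-byte format."""
    if op == NOP:
        return 0                      # NOPs are dropped while packing
    if op < HAVE_ARGUMENT:
        return 1
    return 3 if arg <= 0xFFFF else 6  # 6 = EXTENDED_ARG prefix + instruction


def pack(unpacked):
    """Pack fixed-width code, removing NOPs and retargeting jumps."""
    if len(unpacked) % 4:
        raise ValueError("unpacked code must be a multiple of 4 bytes")
    instrs = [(unpacked[i], int.from_bytes(unpacked[i + 1:i + 4], "little"))
              for i in range(0, len(unpacked), 4)]

    # Resolve every jump to the index of the instruction it targets.
    target = {}
    for i, (op, arg) in enumerate(instrs):
        if op in ABS_JUMPS:
            target[i] = arg // 4
        elif op in REL_JUMPS:
            target[i] = i + 1 + arg // 4

    args = [arg for _, arg in instrs]
    sizes = [0] * len(instrs)
    while True:
        # offsets[i] is the packed offset of instruction i. A NOP has size 0,
        # so a jump aimed at a NOP lands on the next surviving instruction.
        offsets = [0]
        for size in sizes:
            offsets.append(offsets[-1] + size)
        for i, t in target.items():
            base = 0 if instrs[i][0] in ABS_JUMPS else offsets[i + 1]
            args[i] = offsets[t] - base
        # Sizes never shrink, so the loop reaches a fixed point.
        new_sizes = [max(s, packed_size(op, a))
                     for s, (op, _), a in zip(sizes, instrs, args)]
        if new_sizes == sizes:
            break
        sizes = new_sizes

    out = bytearray()
    for (op, _), arg, size in zip(instrs, args, sizes):
        if size == 0:
            continue
        if size == 6:
            out.append(EXTENDED_ARG)
            out += (arg >> 16).to_bytes(2, "little")
        out.append(op)
        if size > 1:
            out += (arg & 0xFFFF).to_bytes(2, "little")
    return bytes(out)


demo = b"".join([
    ins(JUMP_FORWARD, 8),    # skips two unpacked instructions, lands on a NOP
    ins(NOP),
    ins(LOAD_CONST, 0),
    ins(NOP),
    ins(LOAD_CONST, 1),
    ins(JUMP_ABSOLUTE, 4),   # targets the first NOP
    ins(RETURN_VALUE),
])
packed = pack(demo)
```

Because instruction sizes depend on jump arguments and jump arguments depend on offsets, the sketch recomputes both until nothing changes. An optimizer working on the unpacked form never has to do this itself: it can overwrite an instruction with a NOP in place and leave every other offset untouched, which is the simplification the proposal is after.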
msg259720 - (view)	Author: Andrew Barnert (abarnert) *	Date: 2016-02-06 07:58
Reading more about wpython (slide 23 of https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/wpython2/Cleanup%20and%20new%20optimizations%20in%20WPython%201.1.pdf), one of his optimizations was moving the peephole optimizer into the compiler, so it could just use the linked list of block objects of arrays of instruction objects, instead of raw bytecode. Obviously that idea isn't compatible with PEP 511. But on the off chance that PEP 511 founders, that might be the simplest answer to this problem.
msg389706 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2021-03-29 14:57
PEP 511 was rejected. The "peephole" optimizer now operates on the internal IR, not the bytecode.
msg389738 - (view)	Author: STINNER Victor (vstinner) *	Date: 2021-03-29 20:26
> The "peephole" optimizer now operates on the internal IR, not the bytecode.

Python/ast_opt.c is cool ;-) Thanks INADA-san for creating it!

History
Date	User	Action	Args
2022-04-11 14:58:27	admin	set	github: 70488
2021-03-29 20:26:45	vstinner	set	messages: + msg389738
2021-03-29 14:57:07	Mark.Shannon	set	status: open -> closed nosy: + Mark.Shannon messages: + msg389706 resolution: out of date stage: resolved
2020-11-04 21:39:05	brett.cannon	set	nosy: - brett.cannon
2016-02-06 19:04:53	brett.cannon	set	nosy: + brett.cannon
2016-02-06 07:58:28	abarnert	set	messages: + msg259720
2016-02-05 22:17:44	abarnert	create