Message 120562 - Python tracker


Author	dmalcolm
Recipients	belopolsky, benjamin.peterson, dmalcolm, jhylton, nnorwitz, rhettinger, sdahlbac, thomaslee, titanstar
Date	2010-11-05.23:59:47
Message-id	<1289001594.62.0.713384511884.issue1346238@psf.upfronthosting.co.za>

Content

I've been working on this against the py3k branch.  I'm attaching what I've got so far.

I foolishly didn't check the tlee-ast-optimize branch, instead using file6850 as a base.

Rune Holm/titanstar, I assume you've signed a PSF contributor agreement?

Changes since titanstar's original patch:
  - rework to apply against py3k
  - reformatted tabs to 4-space indentation; tried to reformat to PEP-7 as much as possible
  - added stmt types: With_kind, Nonlocal_kind
  - added expr types: IfExp_kind, Bytes_kind, SetComp_kind, DictComp_kind, Ellipsis_kind
  - removed Print_kind, Exec_kind, Repr_kind
  - reworked Raise_kind
  - added "col_offset" and "arena" arguments; pass in the PyArena from the compiler as the context of the visitor
  - removal of all "free_expr" and "asdl_seq_free" calls on the assumption that PyArena now handles all of this (am I correct in thinking this?)
  - String -> Bytes in create_ast_from_constant_object
  - added test_optimize selftest suite, though this is based on bytecode disassembly, rather than direct inspection of the AST
  - I've fixed it up so it compiles and passes regrtest, but I suspect I've missed optimization possibilities
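
To illustrate what a disassembly-based check like the ones mentioned above might look like, here is a minimal sketch. It is not taken from the attached test_optimize suite; the helper name compile_and_inspect and the sample expression are made up for this example. It compiles a constant expression and looks for the folded value in the code object's constants.

```python
import dis


def compile_and_inspect(source):
    """Compile source as a module and return (constants, opcode names).

    Hypothetical helper for illustration only; not part of the patch.
    """
    code = compile(source, "<example>", "exec")
    names = {instr.opname for instr in dis.get_instructions(code)}
    return code.co_consts, names


# 1000 * 1000 is a constant expression, so an optimizing compiler can
# replace it with the single constant 1000000.
consts, names = compile_and_inspect("x = 1000 * 1000")

# Folded if the product is stored as a constant and no multiply
# instruction remains (BINARY_MULTIPLY on older CPython, BINARY_OP on newer).
folded = 1000000 in consts and not ({"BINARY_MULTIPLY", "BINARY_OP"} & names)
print("constant folded:", folded)
```

A check like this cannot tell whether the folding happened on the AST or in a later bytecode pass, which is the limitation noted in the test_optimize bullet.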

I did a little performance testing using the py3k version of the benchmark suite; currently it's a slight regression for some tests, a slight improvement for others; nothing impressive yet.

Thomas Lee's AST optimization branch (branched from r62457) has lots of interesting work:
  e.g. http://svn.python.org/view/python/branches/tlee-ast-optimize/Python/optimize.c?view=log

This does not appear to be quite the same starting point as mine; he added a PyCF_NO_OPTIMIZE flag to Include/pythonrun.h (and other places), which seems like a good way to see the effect of the optimization pass.  He also removed the peepholer; maybe it's worth doing that, but it seems worth at least keeping the peepholer's test suite around to guard against regressions.

I can look at cherrypicking Thomas' work/porting it to py3k.

Re: "aiming high": I'd love to add new optimizations, but it's not clear to me what's permissible.  In particular, is it permissible for an optimization pass to assume that there are no external modifications to the locals within a frame?

It's possible to write code like this:
    frame = inspect.currentframe()
    inspect.getouterframes(frame)[-depth][0].f_locals[name] = value
to manipulate locals; whether or not this actually affects running code in the current implementation of CPython seems hit-or-miss to me right now, I think depending on exactly when fastlocals get written back to the f_locals dictionary (I could have miswritten the precise code).
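
A self-contained variant of the snippet above, as a rough sketch: the function names poke_caller_local and victim are invented for the example, and it uses f_back rather than getouterframes to reach the caller's frame. Whether the write is visible to the running function depends on the interpreter version. As far as I know, CPython before 3.13 normally leaves the fast local untouched, while 3.13 and later (PEP 667) make f_locals a write-through view for function frames.

```python
import inspect


def poke_caller_local(name, value):
    """Write into the f_locals mapping of the calling frame.

    Hypothetical helper for illustration; inspect.currentframe() may
    return None on implementations without frame support.
    """
    caller = inspect.currentframe().f_back
    caller.f_locals[name] = value


def victim():
    x = 1
    poke_caller_local("x", 99)
    # Which value comes back depends on whether the interpreter writes
    # f_locals changes through to the function's fast locals.
    return x


result = victim()
print("victim() returned", result)
```

An optimizer that propagated the constant 1 into the return statement would be correct only on interpreters where the write is not visible.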

By strategically manipulating locals in other frames, we can break pretty much any typical compiler optimization: locals can appear or change from under us, or change attribute values, or gain side-effects to their __getattr__ (e.g. writing to disk).

If it is permissible for an optimization pass to assume that there are no external modifications to the locals within a frame, then issue 4264 might be one to investigate: this is a patch on top of Tom Lee's work (to do local type-inference to replace list.append with LIST_APPEND).  Ideas for other optimizations would be most welcome.
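
For context on the LIST_APPEND idea, the sketch below (function names are made up for the example) compares the opcodes of an explicit append loop with those of a list comprehension. In the CPython versions I'm aware of, only the comprehension uses LIST_APPEND; the loop goes through an ordinary attribute lookup and call, because the compiler cannot assume that "out" is still a real list at that point.

```python
import dis
import types


def all_opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {instr.opname for instr in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= all_opnames(const)
    return names


def with_loop(items):
    out = []
    for item in items:
        out.append(item)   # generic method call on a local
    return out


def with_comprehension(items):
    return [item for item in items]   # compiler knows the target is a list


loop_ops = all_opnames(with_loop.__code__)
comp_ops = all_opnames(with_comprehension.__code__)
print("loop uses LIST_APPEND:", "LIST_APPEND" in loop_ops)
print("comprehension uses LIST_APPEND:", "LIST_APPEND" in comp_ops)
```

Rewriting the loop form into the comprehension form is safe only if nothing outside the frame can rebind or replace "out", which is exactly the assumption in question.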
