Title: scary frame speed hacks
Type: Stage:
Components: Interpreter Core Versions: Python 2.5
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: Nosy List: arigo, georg.brandl, jhylton, mwh, rhettinger, richard, tim.peters
Priority: normal Keywords: patch

Created on 2004-01-13 16:49 by mwh, last changed 2006-05-23 11:16 by tim.peters. This issue is now closed.

File name Uploaded Description Edit
scary-frame-hacks.diff mwh, 2004-01-13 18:23 mwh#s patch #1 attempt 2
frame_dealloc_clears_more.diff arigo, 2004-01-14 17:35 arigo patch #1
zombie-frames-2.diff arigo, 2004-01-27 17:08 arigo patch #2 attempt 2
Messages (16)
msg45295 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2004-01-13 16:49
In ceval.c we find

		/* XXX Perhaps we should create a specialized
		   PyFrame_New() that doesn't take locals, but does
		   take builtins without sanity checking them.

This patch takes that idea rather further than you
might have expected... it creates a "light" subtype of
frame that assumes certain things about the frame,
gives this type its own free list (so it can assume
more about objects on the freelist) and converts light
frames into "heavy" frames when assumptions stop being

Good for a ~5% improvement on "./python -s 'def f():
pass' 'f()'"; a bit less on pystone.  It also conflicts
slightly with my function reorg patch -- apply that
first, apply this, ignore the reject and edit
func_caller_nofrees in funcobject.c to call

All three patches I just submitted together get ~6% on
msg45296 - (view) Author: Jeremy Hylton (jhylton) Date: 2004-01-13 18:20
Logged In: YES 

I don't see any files attached.

msg45297 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2004-01-13 18:23
Logged In: YES 

msg45298 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2004-01-14 17:35
Logged In: YES 

(Side note first: I'm not sure 'builtins = back->f_builtins'
is right.)

Is the whole subclassing complexity worth the effort, given
that the invariants of light frames only seem to be that
four specific fields are null?  Changing the type of an
object under Python code's feet is calling for troubles. 
Moreover it is bound to break code that expect
'type(frame)==FrameType', even if such code can be
considered bad style.

Moreover it requires a number of hacks here and there --
e.g. you turn a light frame into a "heavy" frame when
f_trace is set; is it on purpose that you don't do it when
f_locals is set?

I cannot seem to get reliable performance results on my
machine, but maybe you want to compare with the attached
patch which speeds up the regular PyFrame_New by putting
stronger invariants on all the frames in the free_list.
msg45299 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2004-01-15 11:02
Logged In: YES 

I'm fairly sure this made a difference on my iBook; haven't
tried on x86.

It's possible that the correct response to this patch is to
add "... nah, not worth it" to the XXX comment in ceval.c...
msg45300 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2004-01-27 17:03
Logged In: YES 

Here is yet another try, which seems to perform better on my PentiumIII.  I get the following speed improvements for this patch alone, for a loop calling an empty function:

zombie-frames.diff: 11.4%      (PyStone 3.8%)
scary-frame-hacks.diff: 6.4%   (PyStone 0.85%)

The idea is to get rid of the free_list and instead store the most recently finished ("zombie") frame in an internal field of the code object.  This saves half of the frame creation overhead because half of the fields are already correct when the frame is reused, e.g. f_code, f_nlocals, f_stacksize, f_valuestack...

(you might need to cvs up frameobject.c before you can apply the patch)
msg45301 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2004-02-02 11:20
Logged In: YES 

Did I mention this idea to you or did you come up with it 
independently?  I forget...

I'll try to time stuff on my iBook tomorrow.
msg45302 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2004-02-02 13:04
Logged In: YES 

I guess the idea was just in the air, after your published attempts.

Ideally I'd have liked to have the cached frame depend on the globals as well as the code object itself; I considered moving the cache field to function objects.  This way you also save the f_globals and f_builtins initialization.  There were problems but maybe we should try harder.
msg45303 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2004-03-01 12:20
Logged In: YES 

Armin's second patch gives gives the expected speedups on a
Pentium3 running WinME, and the test suite runs without
exception.  I recommend accepting and applying this patch as
is.  Further improvements can be considered separately.
msg45304 - (view) Author: Michael Hudson (mwh) (Python committer) Date: 2004-03-01 12:23
Logged In: YES 

It slows recursive functions down a noticeable amount (did
this get noted anywhere?  Maybe Armin & I just talked about
it on IRC), so that should be considered before this patch
is applied.  But I think it's probably worth it, FWIW.
msg45305 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2004-03-01 13:25
Logged In: YES 

The effect on recursive functions could be mitigated by
restoring the freelist and falling back to it when
code->co_zombieframe == NULL.

I don't know if that is worth it.  The current patch is a
code simplication as well as an optimization.  Adding back
the freelist, adds a lot of clutter.  Python is not
especially friendly to recursive functions anyway.
msg45306 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2004-03-01 13:59
Logged In: YES 

On a small recursive example it slows down from 2.64s to 3.26s. 
This is a serious difference (20%).  Is it bad enough to keep the 
freelist ?
msg45307 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2004-03-01 15:50
Logged In: YES 

I think that's a question for python-dev.
msg45308 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-02-20 11:44
Logged In: YES 

Can this be reviewed for 2.5? The relevant discussion on
python-dev is at
msg45309 - (view) Author: Richard Jones (richard) * (Python committer) Date: 2006-05-22 17:25
Logged In: YES 

Patch modified and applied to python2.5. Mods:

1. updated to python2.5
2. reinstated use of free list

See the "rjones-funccall" branch in SVN.
msg45310 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2006-05-23 11:16
Logged In: YES 

Closed as Accepted.  While re-adding the free list removed
the code simplification benefit, the measurable x-platform
speedup is well worth getting.
Date User Action Args
2004-01-13 16:49:51mwhcreate