classification
Title: quadratic-time compilation in the number of 'and' or 'or' expressions
Type: performance Stage: resolved
Components: Interpreter Core Versions: Python 3.5, Python 3.4, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: benjamin.peterson, brett.cannon, dalke, georg.brandl, ncoghlan, pitrou, python-dev, rhettinger
Priority: normal Keywords: patch

Created on 2014-05-18 12:31 by dalke, last changed 2014-05-23 10:00 by pitrou. This issue is now closed.

Files
File name Uploaded Description Edit
quadratic_shortcircuit_compilation.py dalke, 2014-05-18 12:31 Demo quadratic-time compilation with 'and' expressions
stackdepth.patch pitrou, 2014-05-22 18:46 review
stackdepth2.patch pitrou, 2014-05-22 19:02 review
Messages (9)
msg218742 - (view) Author: Andrew Dalke (dalke) * (Python committer) Date: 2014-05-18 12:31
Python's compiler has quadratic-time time behavior based on the number of "and" or "or" expressions. A profile shows that stackdepth_walk is calling itself in a stack at least 512 levels deep. (My profiler doesn't go higher than that.)

I've reduced it to a simple test case. Compiling functions of the form

    def f(x):
        x * x  # Repeat N times

takes linear time in the number of lines N, while functions of the form
        
    def g(x):
        x and x  # Repeat N times

takes quadratic time in N. Here's an example of running the attached demonstration code on a fresh build of Python from version control:

    Results from 3.5.0a0 (default:de01f7c37b53, May 18 2014, 13:18:43)  

      num    using  using
     tests    '*'   'and'
       100   0.002  0.002
       200   0.003  0.004
       400   0.005  0.010
       800   0.012  0.040
      1600   0.023  0.133
      3200   0.042  0.906
      6400   0.089  5.871
     12800   0.188  27.581
     25600   0.404  120.800

The same behavior occurs when I replace 'and' with 'or'.

The same behavior also occurs under Python 2.7.2, 3.3.5, 3.4.0. (I don't have builds of 3.1 or 3.2 for testing.)

However, the demonstration code shows linear time under Python 2.6.6:

    Results from 2.6.6 (r266:84374, Aug 31 2010, 11:00:51)  
    
      num    using  using
     tests    '*'   'and'
       100   0.003  0.001
       200   0.002  0.002
       400   0.006  0.008
       800   0.010  0.010
      1600   0.019  0.022
      3200   0.039  0.045
      6400   0.085  0.098
     12800   0.176  0.203
     25600   0.359  0.423
     51200   0.726  0.839

I came across this problem because my code imports a large machine-generated module. It was originally written for Python 2.6, where it worked just fine. When I tried to import it under new Pythons, the import appeared to hang, and I killed it after a minute or so.

As a work-around, I have re-written the code generator to use chained if-statements instead of the short-circuit and operator.
msg218886 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-05-21 21:25
Andrew, have you tried to bisect the Mercurial repository to find where the regression was introduced?
msg218889 - (view) Author: Andrew Dalke (dalke) * (Python committer) Date: 2014-05-22 00:03
Live and learn. I did my first bisect today.

The first bad revision is:
changeset:   51920:ef8fe9088696
branch:      legacy-trunk
parent:      51916:4e1556012584
user:        Jeffrey Yasskin <jyasskin@gmail.com>
date:        Sat Feb 28 19:03:21 2009 +0000
summary:     Backport r69961 to trunk, replacing JUMP_IF_{TRUE,FALSE} with

I confirmed that the parent did not have the problem.

If you want me to diagnose this further, then I'll need some hints on what to do next.
msg218907 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-05-22 18:34
You're right, CPU time is burnt by stackdepth_walk().
The underlying issue is that max stacksize is computed a bit pessimistically with the new opcodes (JUMP_IF_{TRUE,FALSE}_OR_POP). On normal functions there wouldn't be a sizable difference, but on pathological cases such as yours, a code object could end up claiming a large stack size (as large of the number of such opcodes) rather than a very small number.

Still, of course, stackdepth_walk() should not blow up when computing a large stack size, but fixing the opcode's stack effect in compile.c seems to fix the issue anyway.
msg218909 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-05-22 18:46
Here is a patch.
msg218911 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-05-22 19:02
Updated patch adding some tests.
msg218922 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-05-22 22:15
The patch looks like the correct solution.

That said, I'm more impressed that you were able to test it so cleanly :-)
msg218955 - (view) Author: Roundup Robot (python-dev) Date: 2014-05-23 09:58
New changeset cd62cc331572 by Antoine Pitrou in branch '3.4':
Issue #21523: Fix over-pessimistic computation of the stack effect of some opcodes in the compiler.
http://hg.python.org/cpython/rev/cd62cc331572

New changeset e61462e18112 by Antoine Pitrou in branch 'default':
Issue #21523: Fix over-pessimistic computation of the stack effect of some opcodes in the compiler.
http://hg.python.org/cpython/rev/e61462e18112

New changeset 49588a510ed4 by Antoine Pitrou in branch '2.7':
Issue #21523: Fix over-pessimistic computation of the stack effect of some opcodes in the compiler.
http://hg.python.org/cpython/rev/49588a510ed4
msg218956 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-05-23 10:00
This should be fixed now!
History
Date User Action Args
2014-05-23 10:00:06pitrousetstatus: open -> closed
resolution: fixed
messages: + msg218956

stage: patch review -> resolved
2014-05-23 09:58:19python-devsetnosy: + python-dev
messages: + msg218955
2014-05-22 22:15:30rhettingersetnosy: + rhettinger
messages: + msg218922
2014-05-22 19:02:48pitrousetfiles: + stackdepth2.patch

stage: patch review
messages: + msg218911
versions: - Python 3.3
2014-05-22 18:46:12pitrousetfiles: + stackdepth.patch
keywords: + patch
messages: + msg218909
2014-05-22 18:34:48pitrousetmessages: + msg218907
2014-05-22 00:03:26dalkesetmessages: + msg218889
2014-05-21 21:25:14pitrousetmessages: + msg218886
2014-05-21 21:10:17rhettingersetnosy: + pitrou
2014-05-18 18:34:35pitrousetnosy: + brett.cannon, georg.brandl, ncoghlan, benjamin.peterson
2014-05-18 12:31:45dalkecreate