classification
Title: Negative tuple elements produce inefficient code.
Type: performance Stage: commit review
Components: Versions: Python 3.1, Python 3.2, Python 3.3
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: mark.dickinson Nosy List: Arfrever, eltoder, jdharper, mark.dickinson, nadeem.vawda, pitrou, python-dev, r.david.murray, rhettinger
Priority: high Keywords: patch

Created on 2011-02-18 17:41 by jdharper, last changed 2011-03-23 18:20 by mark.dickinson. This issue is now closed.

Files
File name Uploaded Description Edit
test_peepholer.patch jdharper, 2011-02-18 22:25 Patch to test_peephole that verifies expressions containing negative constants are optimized.
constfold.patch pitrou, 2011-02-18 23:09 review
constfold2.patch pitrou, 2011-02-25 21:09 review
fold-0.patch eltoder, 2011-03-10 21:17
fold-0.2.patch eltoder, 2011-03-11 15:30
Messages (28)
msg128795 - (view) Author: Jeffrey Harper (jdharper) Date: 2011-02-18 17:41
In Python 3.2, a tuple like (1,-2,3) will not be optimized into a constants at compile time.  The tuple is built at run-time.  Earlier versions of Python optimized these tuples at compile time.

Here's an example program.

# test.py
from dis import dis

def x(): return (1,2,3)
def y(): return (1,-2,3)

print ("dis x:")
dis(x)
print()

print("dis y:")
dis(y)

The compiler in 3.2rc3 produces code for function y() that builds the tuple at run-time while the tuple in x() is optimized at compile time.

C:\tmp>c:\python32\python --version
Python 3.2rc3

C:\tmp>c:\python32\python test.py
dis x:
  3           0 LOAD_CONST               4 ((1, 2, 3))
              3 RETURN_VALUE

dis y:
  4           0 LOAD_CONST               1 (1)
              3 LOAD_CONST               4 (-2)
              6 LOAD_CONST               3 (3)
              9 BUILD_TUPLE              3
             12 RETURN_VALUE

However, under 3.1.3, the tuples in both functions are optimized at compile time.

C:\tmp>c:\python31\python test.py
dis x:
  3           0 LOAD_CONST               4 ((1, 2, 3))
              3 RETURN_VALUE

dis y:
  4           0 LOAD_CONST               4 ((1, -2, 3))
              3 RETURN_VALUE

Although the compiled code produced 3.2rc3 is correct, it is much slower than the code generated by 3.1.3.
msg128801 - (view) Author: Jeffrey Harper (jdharper) Date: 2011-02-18 19:05
I have also determined that negative elements interfere with the frozenset optimization described in issue6690.  http://bugs.python.org/issue6690.

Here's an example program:

# test.py
from dis import dis

def x(var): return var in {1,2,3} # Note curly braces.  These are sets.
def y(var): return var in {1,-2,3}

print ("dis x:")
dis(x)
print()

print("dis y:")
dis(y)

Running this produces:

C:\tmp>c:\Python32\python.exe test.py
dis x:
  3           0 LOAD_FAST                0 (var)
              3 LOAD_CONST               4 (frozenset({1, 2, 3}))
              6 COMPARE_OP               6 (in)
              9 RETURN_VALUE

dis y:
  4           0 LOAD_FAST                0 (var)
              3 LOAD_CONST               1 (1)
              6 LOAD_CONST               4 (-2)
              9 LOAD_CONST               3 (3)
             12 BUILD_SET                3
             15 COMPARE_OP               6 (in)
             18 RETURN_VALUE
msg128805 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-02-18 19:25
I wonder if this has anything to do with issue 9011?
msg128806 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-02-18 19:34
The culprit is r82043: "Issue #9011: Remove buggy and unnecessary ST->AST compilation code".
msg128809 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-02-18 19:58
The problem is that the "UNARY_NEGATIVE + LOAD_CONST" optimization, which occurs first in bytecode order, inserts a NOP and prevents the tuple optimization to work.
msg128810 - (view) Author: Jeffrey Harper (jdharper) Date: 2011-02-18 20:00
I think r82043 may also explain why 3.1.3 can fold the expression 2 * -3 into -6 while 3.2rc3 cannot.

# test.py
from dis import dis

def y(): 2 * -3

print("dis y:")
dis(y)



C:\tmp>c:\Python32\python.exe --version
Python 3.2rc3

C:\tmp>c:\Python32\python.exe test.py
dis y:
  3           0 LOAD_CONST               1 (2)
              3 LOAD_CONST               3 (-3)
              6 BINARY_MULTIPLY
              7 POP_TOP
              8 LOAD_CONST               0 (None)
             11 RETURN_VALUE

C:\tmp>c:\Python31\python.exe --version
Python 3.1.3

C:\tmp>c:\Python31\python.exe test.py
dis y:
  3           0 LOAD_CONST               3 (-6)
              3 POP_TOP
              4 LOAD_CONST               0 (None)
              7 RETURN_VALUE
msg128812 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-02-18 20:02
Ouch. Obviously test_peephole doesn't have enough tests...
msg128817 - (view) Author: Jeffrey Harper (jdharper) Date: 2011-02-18 22:25
Here's a patch against the version of test_peepholer.py in 3.2rc3.  It verifies that expressions like the following are optimized:

3*-4
(1,-2,3)
a in {1,-2,3)
msg128825 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-02-18 23:09
Here is a patch that enables advanced (recursive) constant folding.
Example:

python -c "import dis; f=lambda x: x in {(3*-5)+(-1-6),(1,-2,3)*2,None};dis.dis(f)"

With 3.1:
  1           0 LOAD_FAST                0 (x) 
              3 LOAD_CONST               7 (-15) 
              6 LOAD_CONST               8 (-7) 
              9 BINARY_ADD           
             10 LOAD_CONST              10 ((1, -2, 3, 1, -2, 3)) 
             13 LOAD_CONST              11 (None) 
             16 BUILD_SET                3 
             19 COMPARE_OP               6 (in) 
             22 RETURN_VALUE         

With 3.2+patch:
  1           0 LOAD_FAST                0 (x) 
              3 LOAD_CONST              14 (frozenset({None, -22, (1, -2, 3, 1, -2, 3)})) 
              6 COMPARE_OP               6 (in) 
              9 RETURN_VALUE
msg128841 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-02-19 10:45
Unassigning.  I don't think that r82043 is the *real* culprit here;  that bugfix just happened to expose a deficiency in the peepholer;  one that's already present in other situations:

>>> dis.dis(lambda: 2*(3*4))
  1           0 LOAD_CONST               1 (2)
              3 LOAD_CONST               4 (12)
              6 BINARY_MULTIPLY     
              7 RETURN_VALUE        
>>> dis.dis(lambda: (2*3)*4)
  1           0 LOAD_CONST               5 (24)
              3 RETURN_VALUE        

Antoine, does your patch take care of this case too?
msg128844 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-02-19 11:04
Le samedi 19 février 2011 à 10:45 +0000, Mark Dickinson a écrit :
> Mark Dickinson <dickinsm@gmail.com> added the comment:
> 
> Unassigning.  I don't think that r82043 is the *real* culprit here;  that bugfix just happened to expose a deficiency in the peepholer;  one that's already present in other situations:
> 
> >>> dis.dis(lambda: 2*(3*4))
>   1           0 LOAD_CONST               1 (2)
>               3 LOAD_CONST               4 (12)
>               6 BINARY_MULTIPLY     
>               7 RETURN_VALUE        
> >>> dis.dis(lambda: (2*3)*4)
>   1           0 LOAD_CONST               5 (24)
>               3 RETURN_VALUE        
> 
> Antoine, does your patch take care of this case too?

It should. Can you test?
msg128876 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-02-19 19:01
> It should. Can you test?

Ah, you're asking me to actually do some (minimal) work instead of just complaining?

Yep, the patch tests fine over here (OS X 10.6), and fixes the 2*(3*4) case.
msg129429 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-02-25 21:08
This patch has tests and is also able to constant-fold tuples larger than 256 elements.
msg129430 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-02-25 21:09
Forgot to attach new patch, sorry.
msg130529 - (view) Author: Eugene Toder (eltoder) Date: 2011-03-10 21:17
As discussed on the list, peephole refuses to fold -0. The reasons for this are unclear. Folding was disabled with this commit:
http://hg.python.org/cpython/diff/660419bdb4ae/Python/compile.c

Here's a trivial patch to enable the folding again, along with a test case. make test passes with the patch.
The change is independent from Antoine's patches.
msg130550 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-03-11 08:32
fold-0.patch looks good to me, but why do you include tests only for the float case (-0.0) and not the integer case (-0)?

Style nitpick:  "def negzero(): return -(1.0-1.0)" should be on two source lines, not one.
msg130576 - (view) Author: Eugene Toder (eltoder) Date: 2011-03-11 14:12
Mark, looks better now?
msg130579 - (view) Author: Eugene Toder (eltoder) Date: 2011-03-11 15:30
(forgot parens around 0)
msg130582 - (view) Author: Roundup Robot (python-dev) Date: 2011-03-11 16:27
New changeset 14205d0fee45 by Antoine Pitrou in branch 'default':
Issue #11244: The peephole optimizer is now able to constant-fold
http://hg.python.org/cpython/rev/14205d0fee45
msg130583 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-11 16:29
Ok, Eugene's patch is left to apply now. Can I assign this to you, Mark?
msg130640 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-03-11 23:13
Eugene's patch looks like a correct fix to the regression.  I'll review further in the next couple of days.

Antoine, we need to further discuss your checkin to the peephole optimizer.  I believe it was inappropriate.
msg130641 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-11 23:15
> Eugene's patch looks like a correct fix to the regression.

No, it's orthogonal.
msg130642 - (view) Author: Eugene Toder (eltoder) Date: 2011-03-11 23:17
Yes, my patch doesn't fix the regression, only a special case of -0.
msg130665 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-03-12 09:35
Eugene's new patch looks good to me;  +1 on applying it.

Raymond, do you happen to remember why it was necessary to add the zero-check in the first place?
msg131068 - (view) Author: Eugene Toder (eltoder) Date: 2011-03-15 23:41
Is anyone reviewing my patch? It's just 1 line long. Should it be moved to another issue? Though, technically, it's needed to address the regression in question: Python 3.1 folds -0, the current code still does not.
msg131069 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-15 23:48
Eugene, according to Mark your patch is fine. It just needs someone to commit it. I would personally prefer to let Mark handle it, since he's our specialist in negative zeroes (and other numerical subtleties) ;)
msg131901 - (view) Author: Roundup Robot (python-dev) Date: 2011-03-23 18:14
New changeset ead9c1b9f547 by Mark Dickinson in branch 'default':
Issue #11244: Remove outdated peepholer check that was preventing the peepholer from folding -0 and -0.0.  Thanks Eugene Toder for the patch.
http://hg.python.org/cpython/rev/ead9c1b9f547
msg131902 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-03-23 18:20
Fixed in 'default' branch.  Note that the regression still exists in 3.2;  I'm not sure that it's worth backporting the two fixes.
History
Date User Action Args
2011-03-23 18:20:49mark.dickinsonsetstatus: open -> closed

messages: + msg131902
2011-03-23 18:14:13python-devsetmessages: + msg131901
2011-03-22 17:39:25rhettingersetassignee: rhettinger -> mark.dickinson
nosy: rhettinger, mark.dickinson, pitrou, nadeem.vawda, Arfrever, r.david.murray, jdharper, python-dev, eltoder
2011-03-15 23:48:50pitrousetnosy: rhettinger, mark.dickinson, pitrou, nadeem.vawda, Arfrever, r.david.murray, jdharper, python-dev, eltoder
versions: + Python 3.1
messages: + msg131069
stage: patch review -> commit review
2011-03-15 23:41:24eltodersetnosy: rhettinger, mark.dickinson, pitrou, nadeem.vawda, Arfrever, r.david.murray, jdharper, python-dev, eltoder
messages: + msg131068
2011-03-12 10:13:59nadeem.vawdasetnosy: + nadeem.vawda
2011-03-12 09:35:08mark.dickinsonsetnosy: rhettinger, mark.dickinson, pitrou, Arfrever, r.david.murray, jdharper, python-dev, eltoder
messages: + msg130665
2011-03-11 23:17:32eltodersetnosy: rhettinger, mark.dickinson, pitrou, Arfrever, r.david.murray, jdharper, python-dev, eltoder
messages: + msg130642
2011-03-11 23:15:12pitrousetnosy: rhettinger, mark.dickinson, pitrou, Arfrever, r.david.murray, jdharper, python-dev, eltoder
messages: + msg130641
2011-03-11 23:13:09rhettingersetassignee: mark.dickinson -> rhettinger
messages: + msg130640
nosy: rhettinger, mark.dickinson, pitrou, Arfrever, r.david.murray, jdharper, python-dev, eltoder
2011-03-11 16:29:08pitrousetassignee: mark.dickinson
messages: + msg130583
nosy: rhettinger, mark.dickinson, pitrou, Arfrever, r.david.murray, jdharper, python-dev, eltoder
2011-03-11 16:27:58python-devsetnosy: + python-dev
messages: + msg130582
2011-03-11 15:30:03eltodersetfiles: + fold-0.2.patch
nosy: rhettinger, mark.dickinson, pitrou, Arfrever, r.david.murray, jdharper, eltoder
messages: + msg130579
2011-03-11 15:29:39eltodersetfiles: - fold-0.patch
nosy: rhettinger, mark.dickinson, pitrou, Arfrever, r.david.murray, jdharper, eltoder
2011-03-11 14:12:58eltodersetfiles: + fold-0.patch
nosy: rhettinger, mark.dickinson, pitrou, Arfrever, r.david.murray, jdharper, eltoder
messages: + msg130576
2011-03-11 08:32:48mark.dickinsonsetnosy: rhettinger, mark.dickinson, pitrou, Arfrever, r.david.murray, jdharper, eltoder
messages: + msg130550
2011-03-10 21:17:50eltodersetfiles: + fold-0.patch
nosy: + eltoder
messages: + msg130529

2011-02-25 21:09:17pitrousetfiles: + constfold2.patch
nosy: rhettinger, mark.dickinson, pitrou, Arfrever, r.david.murray, jdharper
messages: + msg129430
2011-02-25 21:08:39pitrousetnosy: rhettinger, mark.dickinson, pitrou, Arfrever, r.david.murray, jdharper
versions: + Python 3.3
messages: + msg129429
stage: patch review
2011-02-19 19:01:45mark.dickinsonsetnosy: rhettinger, mark.dickinson, pitrou, Arfrever, r.david.murray, jdharper
messages: + msg128876
2011-02-19 15:29:11Arfreversetnosy: + Arfrever
2011-02-19 11:04:57pitrousetnosy: rhettinger, mark.dickinson, pitrou, r.david.murray, jdharper
messages: + msg128844
2011-02-19 10:45:36mark.dickinsonsetassignee: mark.dickinson -> (no value)
messages: + msg128841
nosy: rhettinger, mark.dickinson, pitrou, r.david.murray, jdharper
2011-02-18 23:09:24pitrousetfiles: + constfold.patch
nosy: rhettinger, mark.dickinson, pitrou, r.david.murray, jdharper
messages: + msg128825
2011-02-18 22:25:20jdharpersetfiles: + test_peepholer.patch

messages: + msg128817
keywords: + patch
nosy: rhettinger, mark.dickinson, pitrou, r.david.murray, jdharper
2011-02-18 22:22:57rhettingersetpriority: normal -> high
nosy: rhettinger, mark.dickinson, pitrou, r.david.murray, jdharper
2011-02-18 20:02:01pitrousetnosy: rhettinger, mark.dickinson, pitrou, r.david.murray, jdharper
messages: + msg128812
2011-02-18 20:00:24jdharpersetnosy: rhettinger, mark.dickinson, pitrou, r.david.murray, jdharper
messages: + msg128810
2011-02-18 19:58:30pitrousetnosy: rhettinger, mark.dickinson, pitrou, r.david.murray, jdharper
messages: + msg128809
2011-02-18 19:34:21pitrousetassignee: mark.dickinson

messages: + msg128806
nosy: + pitrou
2011-02-18 19:25:05r.david.murraysetnosy: + r.david.murray, mark.dickinson
messages: + msg128805
2011-02-18 19:14:22pitrousetnosy: + rhettinger
2011-02-18 19:05:19jdharpersetmessages: + msg128801
2011-02-18 17:41:04jdharpercreate