This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: ast_for_factor unary minus optimization changes AST
Type: behavior Stage: needs patch
Components: Interpreter Core Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: mark.dickinson Nosy List: alexhsamuel, amaury.forgeotdarc, antocuni, benjamin.peterson, mark.dickinson, meador.inge, nascheme, python-dev, r.david.murray
Priority: normal Keywords: patch

Created on 2010-06-16 15:38 by alexhsamuel, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue9011.patch mark.dickinson, 2010-06-17 13:31 review
issue9011_v2.patch mark.dickinson, 2012-11-25 16:00 review
Messages (24)
msg107928 - (view) Author: Alex Samuel (alexhsamuel) Date: 2010-06-16 15:38
The unary negative optimization in ast_for_factor() seems to modify the ST in a way changes its meaning.

In Python 3.1.2, the ST is no longer compilable:

$ cat exprbug.py
import parser

st = parser.expr("-3")
print(st.totuple())
compiled = st.compile()
print(eval(compiled))
print(st.totuple())
print(eval(st.compile()))

$ ~/sw/Python-3.1.2/bin/python3 exprbug.py
(258, (326, (301, (305, (306, (307, (308, (310, (311, (312, (313, (314, (315, (316, (317, (15, '-'), (317, (318, (319, (2, '3')))))))))))))))))), (4, ''), (0, ''))
-3
(258, (326, (301, (305, (306, (307, (308, (310, (311, (312, (313, (314, (315, (316, (317, (15, '-'), (317, (318, (319, (2, '-3')))))))))))))))))), (4, ''), (0, ''))
Traceback (most recent call last):
  File "exprbug.py", line 8, in <module>
    print(eval(st.compile()))
ValueError: could not convert string to float: --3


In earlier versions of Python (I have confirmed 2.5 and 2.6), it is compiled to incorrect code and produces wrong results when evaluated:

$ ~/sw/Python-2.6.2/bin/python exprbug.py
(258, (327, (304, (305, (306, (307, (308, (310, (311, (312, (313, (314, (315, (316, (15, '-'), (316, (317, (318, (2, '3'))))))))))))))))), (4, ''), (0, ''))
-3
(258, (327, (304, (305, (306, (307, (308, (310, (311, (312, (313, (314, (315, (316, (15, '-'), (316, (317, (318, (2, '-3'))))))))))))))))), (4, ''), (0, ''))
-1.0


If I remove the big if statement from the front of ast_to_factor(), the code behaves correctly.  I think this is because STR(pnum) is changed in place and never restored to its previous value.
msg107998 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-17 09:21
Confirmed in trunk and py3k, both of which raise ValueError on the second compile:

Python 2.7rc1+ (trunk:82042, Jun 17 2010, 09:52:12) 
[GCC 4.0.1 (Apple Inc. build 5490)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import parser
>>> st = parser.expr("-3")
>>> st.compile()
<code object <module> at 0x450448, file "<syntax-tree>", line 1>
>>> st.compile()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: --3

This looks nasty; unless there's some rule that says that you're not supposed to hang on to AST objects after compiling them.  (Is there?)

Benjamin, would it be worth just getting rid of this optimization for 2.7, and trying to reinstate a correct version for 2.7.1?

[Removing 2.5 from the versions list since there's nothing we can do about it there (2.5 is only getting security fixes at this point).]

See also issue 1441486.
msg108001 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-17 11:15
N.B.  That if block isn't pure optimization:  removing it gives a minor change in behaviour:

Currently (Python 2.7, on a 64-bit machine):

>>> -9223372036854775808
-9223372036854775808

And with the optimization removed:

>>> -9223372036854775808
-9223372036854775808L

I actually consider the second behaviour more correct than the first, since it follows clearly from the language rules (numeric literals have no sign, so the above *should* be interpreted as the unary minus operator applied to a literal, and that literal really is a PyLong).  But obviously the contributors to issue 1441486 either disagree, or didn't want to introduce a regression from 2.4.

I still consider that removing that if block is the right thing to do for 2.7.  The change in behaviour really shouldn't affect any reasonable code---anywhere that an int is acceptable, a long should be too.



Neil, any comments?
msg108002 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-17 11:36
Results from Jython:

newton:jython2.5.1 dickinsm$ ./jython
Jython 2.5.1 (Release_2_5_1:6813, Sep 26 2009, 13:47:54) 
[Java HotSpot(TM) 64-Bit Server VM (Apple Inc.)] on java1.6.0_20
Type "help", "copyright", "credits" or "license" for more information.
>>> -2147483648
-2147483648L

On the other hand, IronPython appears to detect this special case and produces an int instead of a long.
msg108007 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-17 12:40
Fixed in r82043 (py3k) and r82044 (release31-maint), simply by removing the relevant 'if' block.

For 2.x, Antoine (on IRC) pointed out that there might well be third party code that depends on -0x80000000 being an int rather than a long (32-bit machine), so changing that behaviour could cause breakage.

Can anyone propose a fix that doesn't change behaviour?
msg108011 - (view) Author: Alex Samuel (alexhsamuel) Date: 2010-06-17 13:04
How about saving the original value of STR(pnum) and restoring it after 
calling ast_for_atom()?

This is not thread-safe, but I don't understand Python's threading model 
well enough to know whether the GIL is held in this function.

On 6/17/2010 8:40 AM, Mark Dickinson wrote:
>
> Mark Dickinson<dickinsm@gmail.com>  added the comment:
>
> Fixed in r82043 (py3k) and r82044 (release31-maint), simply by removing the relevant 'if' block.
>
> For 2.x, Antoine (on IRC) pointed out that there might well be third party code that depends on -0x80000000 being an int rather than a long (32-bit machine), so changing that behaviour could cause breakage.
>
> Can anyone propose a fix that doesn't change behaviour?
>
> ----------
>
> _______________________________________
> Python tracker<report@bugs.python.org>
> <http://bugs.python.org/issue9011>
> _______________________________________
msg108014 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-17 13:31
That sounds like a reasonable quick fix.

Here's a patch.
msg108237 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-20 14:29
Summary of brief discussion on #python-dev:

- This is a hacky patch; it would be better to rewrite that portion of ast.c in a way that doesn't modify the original CST at all. With 2.7 so close, I daren't try anything more invasive than this patch, though. (I don't know whether there are any other portions of this file that deliberately change the CST.)

- Given that no-one noticed this problem in 5 years, the fix can probably wait until 2.7.1, when it can be done properly.
msg108969 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-30 10:21
It turns out that removing this code from py3k wasn't a no-op.  It changed the semantics for complex literals.  In Python 3.1.2:

>>> -7j
-7j
>>> (-7j).real
0.0

But in current release31-maint branch, and in 3.2a0:

Python 3.2a0 (py3k:82347M, Jun 28 2010, 22:25:11) 
[GCC 4.2.1 (Apple Inc. build 5659)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> -7j
(-0-7j)
>>> (-7j).real
-0.0

So IMO r82044 (the release31-maint branch) should be reverted, to avoid nasty surprises when people upgrade from 3.1.2 to 3.1.3.

But I'd like to keep this change in py3k:  I consider that the 3.1 behaviour is buggy, and that the 3.2 behaviour is more correct (strange though it may seem at first sight), so I'd prefer to leave the current py3k behaviour as is, and add a Misc/NEWS entry describing the change if necessary.

Assigning this to me because I don't want it to get forgotten.  But if anyone else is interested in producing a patch for the cleanup of the CST->AST code, please do!
msg108970 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-30 10:35
r82044 reverted in r82389.
msg108971 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-30 10:56
Added tests for the current Python < 3.2 treatment of imaginary literals in r82390 (release31-maint).  These tests should be backported to trunk, but I'll wait until after the release of 2.7 for that.
msg108973 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-30 11:18
Added tests for Python 3.2's corrected treatment of negated imaginary literals in r82391, along with a Misc/NEWS entry indicating the changed behaviour.

Restored version numbers for this issue, which I seem to have accidentally deleted earlier.  (Including 3.1, since it could also benefit from the same cleanup.)
msg119089 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2010-10-18 22:32
I'm late to the party but looks like Mark has a good handle on the issues.  I would recommend not changing behavior in a bugfix release.  I'm pretty sure code "in the wild" would be broken.
msg124180 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-12-17 03:42
Mark, are you still planning to do something for 3.1/2.7, or should this be closed?
msg124206 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-12-17 15:03
Yes, sorry;  I'm not likely to find time to do anything with this.  Unassigning, and downgrading priority.

Is it worth leaving this open in case anyone wants to do something about it?
msg124209 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-12-17 15:16
Given the long projected lifetime of 2.7, I suppose it is.
msg173258 - (view) Author: Antonio Cuni (antocuni) * Date: 2012-10-18 12:02
there is still an inconsistency in handling negative imaginary literals:

>>> -1j.real
-0.0
>>> complex('-1j').real
0.0
msg173259 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-10-18 12:04
No, that's expected behaviour.

1j is complex(0.0, 1.0)
-1j is complex(-0.0, -1.0)

so -1j.real is -0.0.  There's not really any other sensible way to handle this.

The complex-from-string constructor, on the other hand, is more careful about interpreting signs.
msg173260 - (view) Author: Antonio Cuni (antocuni) * Date: 2012-10-18 12:13
I would say that the complex-from-string constructor should be fixed to handle this special case "correctly".
I find very confusing that we get a different result whether we use a string literal or not.

For example, in pypy we use the same code for parsing literals and converting strings, so you get -0.0 in both cases.
msg173261 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-10-18 12:18
With the string, the minus sign applies only to the imaginary part;  with the expression '-1j', it applies to the whole complex number (both real and imaginary parts).

I don't see any sensible way to 'fix' the string to complex conversion; indeed, I think any change would make it worse than before.  It's a known issue with complex arithmetic that x + 1j*y doesn't give you complex(x, y);  the conversions from string and the complex(x, y) form are there to make it possible to carefully create a complex number with known real and imaginary parts.


> For example, in pypy we use the same code for parsing literals and
> converting strings, so you get -0.0 in both cases.

But -1j isn't a literal.  It's unary minus applied to a the complex number given by the literal '1j'.  Python's code *does* give the same results both for converting strings and parsing literals.
msg173262 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2012-10-18 12:22
Indeed, -1j is not a literal:

>>> dis.dis(lambda :-1j.real)
  1           0 LOAD_CONST               0 (1j)
              3 LOAD_ATTR                0 (real)
              6 UNARY_NEGATIVE      
              7 RETURN_VALUE
msg173263 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-10-18 12:31
And -1j.real is -(1j.real), of course, not (-1j).real.  My bad.
msg176368 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-11-25 16:00
New patch.
msg176375 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-11-25 17:11
New changeset ea36de7926f8 by Mark Dickinson in branch '2.7':
Issue #9011: AST creation no longer modifies CST for negated numeric literals.
http://hg.python.org/cpython/rev/ea36de7926f8
History
Date User Action Args
2022-04-11 14:57:02adminsetgithub: 53257
2012-11-25 17:12:08mark.dickinsonsetstatus: open -> closed
resolution: fixed
2012-11-25 17:11:43python-devsetnosy: + python-dev
messages: + msg176375
2012-11-25 16:00:42mark.dickinsonsetfiles: + issue9011_v2.patch
assignee: mark.dickinson
messages: + msg176368
2012-11-25 15:18:18mark.dickinsonsetversions: - Python 2.6, Python 3.1
2012-10-18 12:31:49mark.dickinsonsetmessages: + msg173263
2012-10-18 12:22:20amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg173262
2012-10-18 12:18:24mark.dickinsonsetmessages: + msg173261
2012-10-18 12:13:35antocunisetmessages: + msg173260
2012-10-18 12:04:01mark.dickinsonsetmessages: + msg173259
2012-10-18 12:02:18antocunisetnosy: + antocuni
messages: + msg173258
2010-12-17 15:16:49r.david.murraysetnosy: nascheme, mark.dickinson, benjamin.peterson, alexhsamuel, r.david.murray, meador.inge
messages: + msg124209
2010-12-17 15:03:14mark.dickinsonsetpriority: high -> normal

messages: + msg124206
assignee: mark.dickinson -> (no value)
nosy: nascheme, mark.dickinson, benjamin.peterson, alexhsamuel, r.david.murray, meador.inge
2010-12-17 03:42:46r.david.murraysetnosy: + r.david.murray

messages: + msg124180
stage: commit review -> needs patch
2010-10-18 22:32:20naschemesetmessages: + msg119089
2010-06-30 11:18:23mark.dickinsonsetmessages: + msg108973
versions: + Python 2.6, Python 3.1, Python 2.7
2010-06-30 10:56:53mark.dickinsonsetmessages: + msg108971
2010-06-30 10:35:11mark.dickinsonsetmessages: + msg108970
2010-06-30 10:21:42mark.dickinsonsetassignee: mark.dickinson
messages: + msg108969
versions: - Python 2.6, Python 2.7
2010-06-21 12:42:56meador.ingesetnosy: + meador.inge
2010-06-20 14:29:24mark.dickinsonsetmessages: + msg108237
2010-06-17 13:31:42mark.dickinsonsetfiles: + issue9011.patch
keywords: + patch
messages: + msg108014

stage: commit review
2010-06-17 13:04:13alexhsamuelsetmessages: + msg108011
2010-06-17 12:40:27mark.dickinsonsetversions: - Python 3.1, Python 3.2
2010-06-17 12:40:18mark.dickinsonsetmessages: + msg108007
2010-06-17 11:36:09mark.dickinsonsetmessages: + msg108002
2010-06-17 11:15:59mark.dickinsonsetpriority: normal -> high
nosy: + nascheme
messages: + msg108001

2010-06-17 09:29:21mark.dickinsonsetversions: - Python 2.5
2010-06-17 09:21:03mark.dickinsonsetmessages: + msg107998
versions: + Python 2.7, Python 3.2
2010-06-17 08:49:09mark.dickinsonsetnosy: + mark.dickinson
2010-06-17 01:03:41r.david.murraysetnosy: + benjamin.peterson
2010-06-16 15:38:58alexhsamuelcreate