Title: ast.literal_eval() shouldn't accept booleans as numbers in AST
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.8
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: benjamin.peterson, rhettinger, serhiy.storchaka, terry.reedy
Priority: normal Keywords: patch

Created on 2018-02-21 09:56 by serhiy.storchaka, last changed 2020-05-16 19:59 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 5798 closed serhiy.storchaka, 2018-02-21 10:02
PR 340 Rosuav, 2019-05-13 15:49
Messages (6)
msg312485 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-02-21 09:56
Currently ast.literal_eval() accepts an AST representing expressions like "+True" or "True+2j" if constants are represented as Constant nodes. This is because the type of the value is tested with `isinstance(left, (int, float))`, and since bool is a subclass of int, it passes this test.

The proposed PR makes ast.literal_eval() test for the exact type. I don't think it is worth backporting, since it only affects passing an AST to ast.literal_eval(). Usually ast.literal_eval() is used for evaluating strings.
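
The difference between the two kinds of check can be seen in a minimal sketch. The helper names `accepts_loose` and `accepts_exact` are hypothetical and are not taken from Lib/ast.py or from the PR; they only contrast a subclass-aware test with an exact-type test.

```python
def accepts_loose(value):
    # isinstance() follows the class hierarchy, and bool is a subclass of int,
    # so True and False pass this test.
    return isinstance(value, (int, float))


def accepts_exact(value):
    # Comparing type() directly ignores subclasses, so bool is excluded.
    return type(value) in (int, float)


for value in (1, 1.5, True):
    print(repr(value), accepts_loose(value), accepts_exact(value))
```

Only the exact-type variant tells `True` apart from `1`.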
msg312498 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-02-21 18:04
What harm comes from accepting expressions like "+True" or "True+2j"?  While weird looking, those are valid Python expressions, so this doesn't seem like a bug.  A key feature of booleans is that they are interoperable with integers.
msg312500 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-02-21 18:30
This doesn't look like a Python literal. And if the function accepts one particular non-literal, the user can expect that it accepts other similar-looking non-literals, which is false.

Actually, ast.literal_eval("+True") is an error. But ast.literal_eval(ast.UnaryOp(ast.UAdd(), ast.Constant(True))) succeeds by oversight.
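
A small sketch for comparing the string form with the hand-built node form. The wrapper `try_literal_eval` is a hypothetical helper introduced only for illustration. What the node form prints depends on the Python version in use, so no expected output is given for it.

```python
import ast


def try_literal_eval(node_or_string):
    """Return ('ok', value) or ('error', message) instead of raising."""
    try:
        return ("ok", ast.literal_eval(node_or_string))
    except ValueError as exc:
        return ("error", str(exc))


# String form: the source text is parsed and the resulting tree is validated.
print(try_literal_eval("+True"))

# Node form: a hand-built AST equivalent to "+True".
node = ast.UnaryOp(ast.UAdd(), ast.Constant(True))
print(try_literal_eval(node))
```

If both calls report an error, the string and AST inputs are treated consistently on that interpreter.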

And look at this from the other side. What is the benefit of accepting "+True"? It doesn't make the code simpler.
msg312687 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-02-24 02:04
Whoops, my previous response was wrong as written because I wrongly thought that if literal_eval accepts number_literal + imaginary_literal, it would also accept number_literal + number_literal. I am replacing it with the following.

The ast.literal_eval doc says 
"The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None."

Since 'True', '(True,)', '[True]', '{True:True}', '+1', '-1' and '1+1j' are accepted as 'literal structures', it seems arbitrary that at least '+True' and 'True+1j' are not.  ('+1', '1+1j' and especially '-1' seem to violate the limitations to 'literal', 'container', and 'no operator', but that is a different issue.)

I strongly agree that the acceptable string inputs and acceptable AST inputs should match.  The question is which version of the domain should be changed.  I would at least mildly prefer that the issue be "ast.literal_eval should consistently treat False and True the same as 0 and 1", which means expanding the string version.  As Raymond said, this is the general rule in Python.  What is the benefit of having a different rule for this one function?
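
The string inputs listed above can be checked with a short loop. This is only a sketch; `classify` is a hypothetical helper, and the outcome for each input is whatever the running interpreter reports.

```python
import ast

# The string inputs mentioned in the message above.
samples = [
    "True", "(True,)", "[True]", "{True:True}",
    "+1", "-1", "1+1j",
    "+True", "True+1j",
]


def classify(source):
    """Report whether ast.literal_eval() accepts the given source string."""
    try:
        ast.literal_eval(source)
    except ValueError:
        return "rejected"
    return "accepted"


results = {source: classify(source) for source in samples}
for source, outcome in results.items():
    print(f"{source!r:15} {outcome}")
```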
msg312700 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-02-24 06:55
The definition of "Python literal structures" is not specified, but it is implied that ast.literal_eval() should accept signed numbers and tuple/list/set/dict displays consisting of "Python literal structures".

ast.literal_eval() accepts unary minus to support negative numbers. In Python 2 "-42" was parsed as a single AST node Num(42), but in Python 3 it is parsed as UnaryOp(USub(), Num(42)). ast.literal_eval() accepts binary plus and minus to support complex numbers (there are oddities in this). Support for unary plus was added at some point for an unknown reason (see 884c71cd8dc6, no issue number), but it does no harm.
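
The Python 3 parse of "-42" can be inspected directly. This is a minimal sketch; on Python 3.8 and later the operand is reported as a Constant node rather than Num, and the exact ast.dump() text varies between versions.

```python
import ast

# Parse "-42" as an expression and look at the top-level node.
node = ast.parse("-42", mode="eval").body
print(ast.dump(node))

# The minus sign is a separate unary operator wrapped around the number 42.
is_unary_minus = isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub)
print(is_unary_minus)

# literal_eval() has to accept this UnaryOp to handle negative numbers at all.
print(ast.literal_eval("-42"))
```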

Further down, the documentation says: "It is not capable of evaluating arbitrarily complex expressions, for example involving operators or indexing." Arbitrarily complex expressions involving addition and subtraction (like "2017-10-10") were supported in Python 3 (but not in Python 2), but this was fixed in 3.7 (see issue31778). It doesn't support other operations or operations with non-numeric literals. It is an oversight that UnaryOp(USub(), Constant(True)) is still accepted.
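
A short sketch of the boundary described here, assuming Python 3.7 or later, where the issue31778 fix applies. `is_accepted` is a hypothetical helper.

```python
import ast


def is_accepted(source):
    """Return True if ast.literal_eval() evaluates the source string."""
    try:
        ast.literal_eval(source)
    except ValueError:
        return False
    return True


# Binary plus and minus are meant only for writing complex numbers.
print(is_accepted("1+2j"))

# Chained subtraction of plain integers is an arbitrary expression.
print(is_accepted("2017-10-10"))

# Other operators and indexing are outside the supported subset.
print(is_accepted("2*3"))
print(is_accepted("[1, 2][0]"))
```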
msg369063 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-05-16 19:59
No longer reproduced.
Date                 User              Action  Args
2020-05-16 19:59:42  serhiy.storchaka  set     status: open -> closed; resolution: out of date; messages: + msg369063; stage: patch review -> resolved
2019-05-13 15:49:45  Rosuav            set     pull_requests: + pull_request13196
2018-07-03 17:56:15  serhiy.storchaka  link    issue32888 dependencies
2018-02-24 06:55:07  serhiy.storchaka  set     messages: + msg312700
2018-02-24 02:04:23  terry.reedy       set     messages: + msg312687
2018-02-24 01:17:14  terry.reedy       set     messages: - msg312682
2018-02-24 01:10:24  terry.reedy       set     nosy: + terry.reedy; messages: + msg312682
2018-02-21 18:30:09  serhiy.storchaka  set     messages: + msg312500
2018-02-21 18:04:03  rhettinger        set     nosy: + rhettinger; messages: + msg312498
2018-02-21 10:02:21  serhiy.storchaka  set     keywords: + patch; stage: patch review; pull_requests: + pull_request5577
2018-02-21 09:56:39  serhiy.storchaka  set     nosy: + benjamin.peterson
2018-02-21 09:56:21  serhiy.storchaka  create