Issue 215126: Over restricted type checking on eval() function

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/33178

classification

Title:	Over restricted type checking on eval() function
Type:	enhancement	Stage:
Components:	Interpreter Core	Versions:

process

Status:	closed	Resolution:
Dependencies:		Superseder:
Assigned To:	arigo	Nosy List:	aaiyer, arigo, gregsmith, mcherm, pfremy, rhettinger, tim.peters
Priority:	high	Keywords:

Created on 2000-09-22 19:36 by anonymous, last changed 2022-04-10 16:02 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
any_locals.diff	arigo, 2004-06-22 21:07	Any object can be used a f_locals now.
geneval3.diff	rhettinger, 2004-06-27 09:21	Patch with unittests, ready for 2nd review
geneval4a.diff	rhettinger, 2004-06-29 08:51	Revised patch allowing stores and deletes

Messages (24)
msg1555 - (view)	Author: Nobody/Anonymous (nobody)	Date: 2000-09-22 19:36
The built-in function eval() takes a string argument and a dictionary. The second argument should allow any instance which defines __getitem__ as opposed to just dictionaries. The following example creates a type error: eval, argument 2: expected dictionary, instance found class SpreadSheet: _cells = {} def __setitem__( self, key, formula ): self._cells[key] = formula def __getitem__( self, key ): return eval( self._cells[key], self ) ss = SpreadSheet() ss['a1'] = '5' ss['a2'] = 'a1*5' ss['a2']
msg1556 - (view)	Author: Tim Peters (tim.peters) *	Date: 2000-09-22 19:42
Changed Group to Feature Request. Should be added to PEP 42 (I'll do that if nobody beats me to it). CPython requires a genuine dict now for speed. I believe JPython allows any mapping object.
msg1557 - (view)	Author: Tim Peters (tim.peters) *	Date: 2000-09-23 04:18
Added to PEP 42.
msg1558 - (view)	Author: Michael Chermside (mcherm)	Date: 2003-12-16 15:37
Logged In: YES user_id=99874 I'll just add my voice as somebody who would find this to be "darn handy" if it could ever done without noticably impacting the speed of python code that DOESN'T use eval().
msg1559 - (view)	Author: Gregory Smith (gregsmith)	Date: 2004-01-12 19:38
Logged In: YES user_id=292741 This may be a compromise solution, which could also be useful in other contexts: What if the object passed is derived from dict - presumably that doesn't help because the point is to do low-level calls to the actual dict's lookup functions. Now, suppose we modify the basic dict type, so that before throwing a KeyError, it checks to see if it is really a derived object with a method __keyerror__, and if so, calls that and returns its result (or lets it throw)? Now you can make objects that look like dicts, and act like them at the low level, but can automatically populate themselves when non-existent keys are requested. Of course, __keyerror__('x') does not have to make an 'x' entry in the dict; it could make no change, or add several entries, depending on the desired semantics regarding future lookups. It could be set up so that every lookup fails and is forwarded by __keyerror__ to the __getitem__ of another object, for instance. The cost of this to the 'normal' dict lookup is that the need to do PyDict_CheckExact() each time a lookup fails.
msg1560 - (view)	Author: Michael Chermside (mcherm)	Date: 2004-01-12 22:01
Logged In: YES user_id=99874 Hmm... I like this! Of course, I am wary of adding yet another special double- underscore function to satisfy a single special purpose, but this would satisfy all of my needs, and without any performance impact for lookups that are FOUND. Lookups that are NOT found would have a slight performance degrade (I know better than to speculate about the size of the effect without measuring it). I'm not really sure what percentage of dict lookups succeed. At any rate, what are these "other contexts" you mention in which a __keyerror__ would "also be useful"? Because if we can find other places where it is useful, that significantly improves the usefulness. OTOH, if the tests can be done ONLY for eval (I don't really think that's possible), then I'm certain no one cares about the performance of eval.
msg1561 - (view)	Author: Bluebird (pfremy)	Date: 2004-03-03 09:40
Logged In: YES user_id=233844 I have exactly the same need as the original poster. The only difference in my case is that I inhreted a dictionnary. I want to use the eval() function as a simple expression evaluator. I have the follwing dictionnary: d['a']='1' d['b']='2' d['c']='a+b' I want the following results: d[a] -> 1 d[b] -> 2 d[c] -> 3 To do that, I was planning to use the eval() function and overloading the __getitem__ of the global or local dictionnary: class MyDict( dict ) : def __getitem__( self, key ): print "__getitem__", key val = dict.__getitem__( self, key ) print "val = '%s'" % val return eval( val , self ) But it does not work: d[a]: __getitem__ a val = '1' -> 1 d[b]: __getitem__ b val = '2' -> 2 d[c]: __getitem__ c val = 'e+1' ERROR Traceback (most recent call last): File "test_parse_jaycos_config.py", line 83, in testMyDict self.assertEquals( d['c'], 2 ) File "parse_config_file.py", line 10, in __getitem__ return eval( val , self ) File "<string>", line 0, in ? TypeError: cannot concatenate 'str' and 'int' objects d['c'] did fetch the 'a+1' value, which was passed to eval. However, eval() tried to evaluate the expression using the content of the dictionnary instead of using __getitem__. So it fetched the string '1' for 'a' instead of the value 1, so the result is not suitable for the addition.
msg1562 - (view)	Author: Armin Rigo (arigo) *	Date: 2004-05-19 18:21
Logged In: YES user_id=4771 With minimal work and performance impact, we could allow frame->f_locals to be something else than a real dictionary. It looks like it would be easily possible as long as we don't touch the time-critical frame->f_globals. Code compiled by eval() always uses LOAD_NAME and STORE_NAME, which is anyway allowed to be a bit slower than LOAD_FAST or LOAD_GLOBAL. Note that PyDict_GetItem() & co., as called by LOAD/STORE_NAME, do a PyDict_Check() anyway. We could just replace it with PyDict_CheckExact() and if it fails fall back to calling the appropriate ob_type->tp_as_mapping method. This would turn some of the PyDict_Xxx() API into a general mapping API, half-way between its current dict-only usage and the fully abstract PyObject_Xxx() API. This is maybe a bit strange from the user's point of view, because eval("expr", mydict) still wouldn't work: only eval("expr", {}, mydict) would. which does which does Now, there is no way I can think of that code compiled by eval() could contain LOAD_GLOBAL or STORE_GLOBAL. The only way to tell the difference between eval("expr", mydict) and eval("expr", {}, mydict) appears to be by calling globals() or somehow inspecting the frame directly. Therefore it might be acceptable to redefine the two-argument eval(expr, dict) to be equivalent not to eval(expr, dict, dict) but eval(expr, {}, dict). This hack might be useful enough even if it won't work with the exec statement.
msg1563 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-05-22 16:53
Logged In: YES user_id=80475 +1 Amrin's idea provides most of the needed functionality with zero performance impact. Also, using a local dictionary for the application variables is likely to be just exactly what a programmer would want to do.
msg1564 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-06-12 07:27
Logged In: YES user_id=80475 Armin, can you whip-up a quick patch so that we can explore the implications of your suggestion. Ideally, I would like to see something like this go in for Py2.4.
msg1565 - (view)	Author: Armin Rigo (arigo) *	Date: 2004-06-22 21:07
Logged In: YES user_id=4771 Quick patch attached. I didn't try to use the PyDict_GetItem trick described, but just systematically use PyObject_GetItem/SetItem/DelItem when working with f_locals. This might confuse some extension modules that expect PyEval_GetLocals() to return a dict object. The eval trick is now: eval(code, nondict) --> eval(code, globals(), nondict). Besides eval() I removed the relevant typecheck from execfile() and the exec statement. Any other place I am missing? We might want to still somehow check the type of the locals, to avoid strange errors caused by e.g. eval("a", "b"). PyMapping_Check() is the obvious candidate, but it looks like a hack. More testing is needed. test_descrtut.py line 84 now succeeds, unexpectedly, which is interpreted as a test failure. Needs some docs, too.
msg1566 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-06-23 16:59
Logged In: YES user_id=80475 The quick patch works fine. Do change the PyArg_ParseTuple() into the faster PyArg_UnpackTuple(). Does this patch show any changes to pystone or other key metrics? Would the PyDict_GetItem trick have better performance? My guess is that it would. +1 on using PyMapping_Check() for checking the locals argument to eval(). That is as good as you can get without actually running it.
msg1567 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-06-27 07:50
Logged In: YES user_id=80475 Attaching my minimal version of the patch. It runs the attached demo without exception and does not measurably show on any of my timings. The approach is to restrict the generalization to eval() instead of exec(). Since eval() can't set values in the locals dict, no changes are needed to the setitem and delitem calls. Instead of using PyObject_GetItem() directly, I do a regular lookup and fallback to the generalizaiton if necessary -- this is why the normal case doesn't get slowed down (the cost is a PyDict_Check which uses values already in cache, and a branch predicatable comparison).. While the demo script runs, and the test_suite passes, it is slightly too simple and doesn't yet handle eval('dir()', globals(), M()) where M is a non-dict mapping.
msg1568 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-06-27 08:37
Logged In: YES user_id=80475 Okay, the patch is ready for second review. Attaching the revision with unittests.
msg1569 - (view)	Author: Armin Rigo (arigo) *	Date: 2004-06-27 11:33
Logged In: YES user_id=4771 eval() can set items in the locals, e.g. by using list comprehension. Moreover the ability to exec in custom dicts would be extremely useful too, e.g. to catch definitions while they appear. If you really want to avoid any performance impact, using PyDict_CheckExact() in all critical parts looks good (do not use PyDict_Check(), it's both slower and not what we want because it prevents subclasses of dicts to override __xxxitem__).
msg1570 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-06-27 17:36
Logged In: YES user_id=80475 Changed to PyDict_CheckExact(). Are you sure about having to change the sets and dels. I've tried several things at the interactive prompt and can't get it to fail: >>> [locals() for i in (2,3)] Do you have any examples (in part, so I can use them as test cases)?
msg1571 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-06-27 18:08
Logged In: YES user_id=80475 Nix, that last comment. Have examples that call setitem().
msg1572 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-06-29 08:51
Logged In: YES user_id=80475 Okay, a new patch is ready for second review. It is essentially, Armin's patch with the following changes: * leaves the syntax for eval() intact with no automatic globals=locals trick * has the faster approach for LOAD_NAME, incorporating your suggestion for PyDict_CheckExact * omits the code to enable the same liberties for '.exec'. That is beyond the OP's request and well beyond something whose implications I've thought through. Am not opposed to it, but would like it to be left for a separate patch with thorough unittests.
msg1573 - (view)	Author: Armin Rigo (arigo) *	Date: 2004-06-29 10:36
Logged In: YES user_id=4771 Doing the type check in exec and execfile() but not in eval() is not something that seems particularly useful to me. Any program can be written as an expression in Python if you are crazy enough to do that... So it doesn't offer any extra security to be more strict in exec than in eval(). People who really want to do it would have to go through incredible pain just to work around the type check. For the implications, I believe it is sufficient (and necessary) to carefully review all usages of f_locals throughout the code, and document f_locals as no longer necessary a dictionary for those extension writers that would have used this fact.
msg1574 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-06-29 20:38
Logged In: YES user_id=80475 Are you comfortable doing this in two steps. Commit the patch as-is so that eval() works properly and then craft a separate patch to thoroughly implement, test, and document the same thing for exec and execfile()?
msg1575 - (view)	Author: Armin Rigo (arigo) *	Date: 2004-07-01 15:06
Logged In: YES user_id=4771 Patch reviewed: As eval(expr, mapping) is not accepted but eval(expr, {}, mapping) is, we could have a helpful error message along the line of if (!PyDict_Check(globals)) { PyErr_SetString(PyExc_TypeError, PyMapping_Check(globals) ? "globals must be a real dict; try eval(expr, {}, mapping)" : "globals must be a dict"); } LOAD_NAME: you are doing a speed hack here we will bypass a user-defined __getitem__ on a subclass of dict if the key actually exists in the dictionary. In other words your test with the subclass of dict would fail if the dict aditionally had another real value for the key 'a'. Better use PyDict_CheckExact() in all cases and only call PyDict_GetItem() if it is true, as this is really cheap. Also, no PyErr_Clear() masking arbitrary exceptions please!
msg1576 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2004-07-02 06:41
Logged In: YES user_id=80475 Made the requested changes, tested, and applied. Will tackle exec and execfile on another day.
msg1577 - (view)	Author: Anand Aiyer (aaiyer)	Date: 2006-08-01 06:20
Logged In: YES user_id=1554342 There is a missing PyErr_Clear() in the GetItem at LOAD_NAMES
msg1578 - (view)	Author: Armin Rigo (arigo) *	Date: 2006-08-01 10:49
Logged In: YES user_id=4771 I don't see any missing PyErr_Clear() in LOAD_NAME, but I may be overseeing it. Can you be more specific? Where and why?

History
Date	User	Action	Args
2022-04-10 16:02:25	admin	set	github: 33178
2000-09-22 19:36:01	anonymous	create