Issue 23967: Make inspect.signature expression evaluation more powerful

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/68155

classification

Title:	Make inspect.signature expression evaluation more powerful
Type:	enhancement	Stage:	patch review
Components:	Library (Lib)	Versions:	Python 3.5

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:	larry	Nosy List:	Eric Wieser, larry, ncoghlan, pdmccormick, serhiy.storchaka, yselivanov, zach.ware
Priority:	normal	Keywords:	patch

Created on 2015-04-15 18:52 by larry, last changed 2022-04-11 14:58 by admin.

Files
File name	Uploaded	Description	Edit
larry.improved.signature.expressions.1.txt	larry, 2015-04-15 18:52		review
pdm-argument_clinic-mixed_py_and_c_defaults-v1.patch	pdmccormick, 2015-04-16 06:33	Argument Clinic patch simplifying the use of the improved signatures
larry.improved.signature.expressions.2.txt	larry, 2015-04-19 18:43		review
larry.improved.signature.expressions.3.txt	larry, 2015-04-23 08:31		review

Messages (11)
msg241140 - (view)	Author: Larry Hastings (larry) *	Date: 2015-04-15 18:52
Peter's working on converting socket to use Argument Clinic. He had a default that really should look like this:

    min(SOME_SOCKET_MODULE_CONSTANT, 128)

"min" wasn't something we'd needed before. I thought about it and realized we could do a much better job of simulating the evaluation context of a shared module.

Initially I thought, all I needed was to bolster the environment we used for eval() to add the builtins. (Which I've done.) But this wasn't sufficient because we deliberately used ast.literal_eval(), which doesn't support function calls by design for superior security. Or subscripting, or attribute access. We already worked around those I think.

But how concerned are we about security? What is the attack vector here? If the user is able to construct an object that has a villainous __text_signature__ on it... surely they could already do as they like?

So here's a first draft at modifying the __text_signature__ evaluation environment so it can handle much more sophisticated expressions. It can use anything from builtins, or anything in sys.modules, or anything in the current module; it can call functions, and subscript, and access attributes, and everything.

To make this work I had to write an ast printer that produces evaluatable Python code. Note that it's not complete, I know it's not complete, it's missing loads of operators. Assume that if this is a good idea I will add all the missing operators.

Nick was worried that in the future we might expose a "turn this string into a signature" function. That might make an easier attack vector. So he asked that the "trusted=" keyword flag be added, and the full-on eval only happen if the string is trusted.
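The gap described above can be sketched in a few lines. This is only an illustration, not code from the attached patch: SOMAXCONN is a hypothetical stand-in value for the socket module constant, and module_namespace stands in for "anything in the current module".

```python
import ast
import builtins

SOMAXCONN = 4096  # hypothetical stand-in for the real socket constant

expr = "min(SOMAXCONN, 128)"

# ast.literal_eval() accepts only literals, so a function call is rejected.
try:
    ast.literal_eval(expr)
    literal_ok = True
except ValueError:
    literal_ok = False

# A fuller environment: the builtins plus the (stand-in) module namespace.
module_namespace = {"SOMAXCONN": SOMAXCONN}
env = {"__builtins__": builtins}
env.update(module_namespace)
default = eval(expr, env)
```

With this environment the call to min() resolves through builtins and the constant resolves through the module namespace, which is the behaviour the draft patch is after.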
msg241204 - (view)	Author: Peter McCormick (pdmccormick) *	Date: 2015-04-16 06:33
This definitely works for the _socket.listen use case!

In terms of generating such a signature using Argument Clinic, currently this is required:

    backlog: int(py_default="builtins.min(SOMAXCONN, 128)", c_default="Py_MIN(SOMAXCONN, 128)") = 000

The attached patch lets Tools/clinic/clinic.py make an exception when both C and Python defaults are specified, simplifying the above to:

    backlog: int(py_default="builtins.min(SOMAXCONN, 128)", c_default="Py_MIN(SOMAXCONN, 128)")
msg241205 - (view)	Author: Peter McCormick (pdmccormick) *	Date: 2015-04-16 07:04
I missed the fact that Larry's patch obviates the need for the `builtins.` prefix, shortening the Argument Clinic parameter specification into:

    backlog: int(py_default="min(SOMAXCONN, 128)", c_default="Py_MIN(SOMAXCONN, 128)")
msg241478 - (view)	Author: Larry Hastings (larry) *	Date: 2015-04-19 05:40
I should mention that evalify_node() is pretty hacked up here, and is not ready to be checked in. (I'm proposing separately that we simply add something like this directly into the standard library, see issue #24002.)
msg241533 - (view)	Author: Larry Hastings (larry) *	Date: 2015-04-19 18:42
Thanks to #24002 I now know how to write evalify_node properly. This patch is now much better. Note that I deliberately made the new function _eval_ast_expr() as a "private" module-level routine. I need that same functionality in Argument Clinic too, so if both patches are accepted I'll have Clinic switch to calling this version.
msg241534 - (view)	Author: Larry Hastings (larry) *	Date: 2015-04-19 18:43
Whoops. Here's the revised patch.
msg241850 - (view)	Author: Larry Hastings (larry) *	Date: 2015-04-23 08:31
Cleaned up the patch some more--the code was stupid in a couple places. I think it's ready to go in.
msg241853 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2015-04-23 08:49
Using complex expressions is deceptive. In Python functions the default value is evaluated only once, at function creation time, but inspect.signature will evaluate it every time. For example, foo(x={}) and foo(x=dict()) mean the same thing in a function declaration, but not in a signature.

It could also affect security, because it allows arbitrary code execution at a place where it was not allowed before.

I think this issue should be discussed on Python-Dev. I'm not sure that it is pythonic.
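The "evaluated once" versus "evaluated every time" distinction can be illustrated with a small sketch. It uses an ordinary Python function and a bare eval() call as stand-ins; it does not exercise the __text_signature__ code path that the patch changes.

```python
import inspect

def foo(x={}):
    return x

# The default is created once, at definition time, so every Signature
# built for foo refers to the same dict object.
a = inspect.signature(foo).parameters["x"].default
b = inspect.signature(foo).parameters["x"].default
same_object = a is b

# Re-evaluating a default expression string produces a new object each time.
c = eval("dict()")
d = eval("dict()")
fresh_each_time = c is not d
```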
msg241855 - (view)	Author: Larry Hastings (larry) *	Date: 2015-04-23 09:20
It's only used for signatures in builtins. Any possible security hole here is uninteresting because the evil hacker already got to run arbitrary C code in the module init.

Because it's only used for signatures in builtins, we shouldn't encounter a function with a mutable default value like {} or [] which gets mutated later. Builtins don't have those.

In case you're wondering about the "trusted" parameter, that was suggested by Nick Coghlan at the PyCon sprints. He's thinking that other callers may use _signature_fromstr() in the future, and he wanted the API to make it clear that future uses may be on non-trustworthy sources.

And, finally, consider that the original version already calls eval(). Admittedly it uses eval() in a way that should be much harder to exploit. But it's not an enormous difference between the two calls.

I don't really think we need to post to python-dev about this.
msg242006 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2015-04-25 08:02
Right, Larry and I had a fairly long discussion about this idea at the sprints, and I was satisfied that all the cases where he's proposing to use this are safe: in order to exploit them you need to be able to set __text_signature__ on arbitrary objects, and if an attacker can do that, you've already lost control of the process. However, a natural future extension is to expose this as a public alternative constructor for Signature objects, and for that, the fact that it ultimately calls eval() under the hood presents more of a security risk. The "trusted=False" default on _signature_fromstr allows the function to be used safely on untrusted data, while allowing additional flexibility when you do trust the data you're evaluating.
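A rough sketch of how a "trusted" switch could gate the two evaluation modes follows. eval_default and its parameters are hypothetical names chosen for illustration; this is not the _signature_fromstr implementation from the patch.

```python
import ast
import builtins

def eval_default(expr, namespace, trusted=False):
    """Hypothetical helper: evaluate a default-value expression string."""
    if not trusted:
        # Untrusted input: literals only, no calls or attribute access.
        return ast.literal_eval(expr)
    # Trusted input: full eval() with builtins and the given namespace.
    env = {"__builtins__": builtins, **namespace}
    return eval(expr, env)
```

Called as eval_default("min(N, 128)", {"N": 4096}, trusted=True), the helper evaluates the call; with the default trusted=False the same string is rejected by ast.literal_eval() with a ValueError.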
msg365315 - (view)	Author: Eric Wieser (Eric Wieser)	Date: 2020-03-30 14:20
> To make this work I had to write an ast printer that produces evaluatable Python code. Note that it's not complete, I know it's not complete, it's missing loads of operators. Assume that if this is a good idea I will add all the missing operators.

Now that `ast.unparse` is in (bpo-38870), can this patch be simplified?
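For reference, a minimal sketch of the round trip that ast.unparse makes possible, assuming Python 3.9 or later. The SOMAXCONN value is a hypothetical stand-in, and this is not a rewrite of the patch itself.

```python
import ast

source = "min(SOMAXCONN, 128)"

# Parse as an expression, then turn the expression node back into source.
tree = ast.parse(source, mode="eval")
round_tripped = ast.unparse(tree.body)

# The regenerated source can be evaluated like the original string.
value = eval(round_tripped, {"SOMAXCONN": 4096})
```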

History
Date	User	Action	Args
2022-04-11 14:58:15	admin	set	github: 68155
2020-05-29 17:47:00	brett.cannon	set	nosy: - brett.cannon
2020-03-30 14:20:16	Eric Wieser	set	messages: + msg365315
2020-03-30 14:17:14	Eric Wieser	set	nosy: + Eric Wieser
2015-04-25 08:02:33	ncoghlan	set	messages: + msg242006
2015-04-23 09:20:46	larry	set	messages: + msg241855
2015-04-23 08:49:52	serhiy.storchaka	set	messages: + msg241853
2015-04-23 08:31:06	larry	set	files: + larry.improved.signature.expressions.3.txt messages: + msg241850
2015-04-19 18:43:06	larry	set	files: + larry.improved.signature.expressions.2.txt messages: + msg241534
2015-04-19 18:42:40	larry	set	nosy: + brett.cannon messages: + msg241533
2015-04-19 05:40:12	larry	set	messages: + msg241478
2015-04-16 07:04:11	pdmccormick	set	messages: + msg241205
2015-04-16 06:33:07	pdmccormick	set	files: + pdm-argument_clinic-mixed_py_and_c_defaults-v1.patch keywords: + patch messages: + msg241204
2015-04-15 18:52:37	larry	create