classification
Title: Improve the C code for calling Python code: _PyEval_EvalCode()
Type: Stage: resolved
Components: Interpreter Core Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Mark.Shannon Nosy List: Mark.Shannon, brett.cannon, gvanrossum, petr.viktorin, rhettinger, serhiy.storchaka, vstinner, yselivanov
Priority: normal Keywords: patch

Created on 2021-01-21 16:05 by Mark.Shannon, last changed 2021-03-17 21:00 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 24298 merged Mark.Shannon, 2021-01-22 15:01
PR 24368 merged Mark.Shannon, 2021-01-29 14:27
PR 24559 merged vstinner, 2021-02-17 18:01
PR 24564 merged vstinner, 2021-02-18 14:20
PR 24566 merged vstinner, 2021-02-18 17:53
Messages (25)
msg385432 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-01-21 16:05
Currently, to make a call to Python (modules, classes, etc, not just functions) from C has to use the monster that is `_PyEval_EvalCode`.
As Python has adding features over the years, _PyEval_EvalCode has grown and grown. It is time for a refactor.

`_PyEval_EvalCode` takes 16, yes sixteen parameters!
Those 16 parameters fall into two main groups, one describing the function being called, and the other describing the arguments to the call.
Due to the jumbo size and complexity of _PyEval_EvalCode, we also have specialised forms of it, e.g. _PyFunction_Vectorcall that handle common cases and then bail to _PyEval_EvalCode in other cases.

In outline _PyEval_EvalCode performs the following actions:
1. Make a new frame.
2. Fill in that frame using the arguments and defaults provided.
3. If the code object has flags set for a generator, or coroutine, return a new generator or coroutine.
4. Otherwise call the interpreter with the newly created frame.

As _PyEval_EvalCode or its earlier equivalents have gained arguments, they have a left of list of legacy functions. It is not clear what is the "correct" function to use in C extensions. Hopefully extension code uses the `PyObject_Call` API, but `PyEval_EvalCodeEx` is part of the C-API.


To improve this situation, I propose:

A new C struct, the "frame descriptor" which contains the code object, defaults, globals, and names that describe the code to be executed. `PyFunctionObject` would wrap this, to simplify the common case of calling a Python function. 

Split the Eval functions into vector and tuple/dict forms, both forms taking a "frame descriptor", as well as the arguments.

Mark the older forms of the Eval functions as "legacy", creating a local "frame descriptor" and transforming the arguments in vector or tuple/dict forms, in the legacy functions.

Factor out the frame creation part of the Eval functions.


The above changes will be necessary for PEP 651, but I think would be a worthwhile improvement even if PEP 651 is not accepted.
msg385491 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-22 11:36
PyObject *
_PyEval_EvalCode(PyThreadState *tstate,
           PyObject *_co, PyObject *globals, PyObject *locals,
           PyObject *const *args, Py_ssize_t argcount,
           PyObject *const *kwnames, PyObject *const *kwargs,
           Py_ssize_t kwcount, int kwstep,
           PyObject *const *defs, Py_ssize_t defcount,
           PyObject *kwdefs, PyObject *closure,
           PyObject *name, PyObject *qualname)

Hum no, only 16 arguments, everything is fine! :-D

More seriously, I already considered to rework this code.

The pseudo code is:

  if (...) return <new generator>;
  frame = (...);
  retval = _PyEval_EvalFrame(tstate, f, 0);
  _PyObject_GC_TRACK(f);
  return retval;

I like the idea of splitting these two parts:

  f = create_frame_or_gen(...);
  if (<is generator>) return f;
  retval = _PyEval_EvalFrame(tstate, f, 0);
  _PyObject_GC_TRACK(f);
  return retval;

I see one advantage: stack memory consumation, we don't have to hold the 16 arguments on the stack, only 3 parameters (tstate, f, 0).

> Split the Eval functions into vector and tuple/dict forms, both forms taking a "frame descriptor", as well as the arguments.

Hum, it seems like you have a different idea how to refactor the code.

Would it be worth it to have more specialized create_frame_or_gen() functions for the common cases?

--

I would also be interested by the ability to not create a frame at all, but I guess that it's a way larger refactoring.
msg385500 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-01-22 14:54
Rather than:

  f = create_frame_or_gen(...);
  if (<is generator>) return f;
  retval = _PyEval_EvalFrame(tstate, f, 0);
  _PyObject_GC_TRACK(f);
  return retval;

I was thinking:

  f = create_frame(...);
  if (<is generator>) return make_gen(f);
  retval = _PyEval_EvalFrame(tstate, f, 0);
  _PyObject_GC_TRACK(f);
  return retval;

The complicated part is create_frame(...), so I want to clean that up first.
msg385908 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-01-29 13:25
New changeset d6c33fbd346765c6a8654dccacb2338006bf2b47 by Mark Shannon in branch 'master':
bpo-42990: Introduce 'frame constructor' struct to simplify API for PyEval_CodeEval and friends (GH-24298)
https://github.com/python/cpython/commit/d6c33fbd346765c6a8654dccacb2338006bf2b47
msg386061 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-02-01 10:42
New changeset 0332e569c12d3dc97171546c6dc10e42c27de34b by Mark Shannon in branch 'master':
bpo-42990: Further refactoring of PyEval_ functions. (GH-24368)
https://github.com/python/cpython/commit/0332e569c12d3dc97171546c6dc10e42c27de34b
msg387187 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-02-17 19:58
+1 on exposing f,__builtins__.

Of course, the thing I'd really want is a way to state that all references to builtins are meant to have the exact semantics of those builtins, so the compiler can translate e.g. len(x) into a new opcode that just calls PyObject_Size().  (I can dream, can't I?)

Another dream: assume that globals that refer to modules, classes or functions don't change, so they can be cached more aggressively.

I suppose enough checking of dict version tags can get us there, or at least close enough.
msg387190 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-02-17 22:00
> +1 on exposing f,__builtins__.

This change is related to bpo-43228 "Regression in function builtins": cloudpickle is broken by this issue (new PyFunctionObject.func_builtins member).

> Of course, the thing I'd really want is a way to state that all references to builtins are meant to have the exact semantics of those builtins, so the compiler can translate e.g. len(x) into a new opcode that just calls PyObject_Size().  (I can dream, can't I?)

I tried to implement such optimization in my old https://faster-cpython.readthedocs.io/fat_python.html project. I implemented guards to de-optimize the code if a builtin is overriden.

> I suppose enough checking of dict version tags can get us there, or at least close enough.

The dict version comes from my FAT Python project: PEP 509. It is used to optimize method calls. See also PEP 510 and PEP 511 ;-)
msg387192 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-02-17 22:14
Well maybe I'll be picking up those ideas again... Thanks for documenting
so much of what you did then!
msg387196 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2021-02-17 23:48
> I tried to implement such optimization in my old https://faster-cpython.readthedocs.io/fat_python.html project. I implemented guards to de-optimize the code if a builtin is overriden.

FWIW the globals opcode cache handles all of this now. There's no point in specifically optimizing the builtins lookup since we optimize all global lookups for a code object that's hot enough.
msg387199 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-02-18 02:28
> FWIW the globals opcode cache handles all of this now. There's no point in specifically optimizing the builtins lookup since we optimize all global lookups for a code object that's hot enough.

So you think that even a dedicated "LEN" opcode would not be any faster? (This is getting in Paul Sokolovsky territory -- IIRC he has a variant of Python that doesn't allow overriding builtins.)
msg387223 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-02-18 11:36
New changeset a3c3ffa68e6fc4524b1149a6a14d56c3a2e9b612 by Victor Stinner in branch 'master':
bpo-42990: Add __builtins__ attribute to functions (GH-24559)
https://github.com/python/cpython/commit/a3c3ffa68e6fc4524b1149a6a14d56c3a2e9b612
msg387233 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-02-18 14:26
I reopen the issue since there is bpo-43228 regression, caused by this issue, which is still under discussion, and Mark also proposed to add a new builtins parameter to the function constructor (FunctionType).

I wrote PR 24564 to help fixing bpo-43228 regression: with this change, functions now inherit the current builtins if the globals namespace is overriden, but the new globals has no "__builtins__" key. This change is backward incompatible on purpose. If someone really wants to run a function in a different builtins namespace, globals['__builtins__'] must be set explicitly.

Once PR 24564 will be merged, I plan to write a 3rd PR to add an optional builtins keyword-only parameter to the function constructor (FunctionType).
msg387243 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2021-02-18 16:30
> So you think that even a dedicated "LEN" opcode would not be any faster? (This is getting in Paul Sokolovsky territory -- IIRC he has a variant of Python that doesn't allow overriding builtins.)

Yeah, a dedicated LEN opcode could only be faster if it would not be possible to shadow builtins (or if there was a "len" operator in Python).  If that's not the case, this hypothetical LEN opcode would still have to check if "len" was shadowed or not, and that's slower than the optimized LOAD_GLOBAL we have now.
msg387244 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-02-18 16:56
Victor
> the new globals has no "__builtins__" key. This change is backward incompatible on purpose. If someone really wants to run a function in a different builtins namespace, globals['__builtins__'] must be set explicitly.

You say it's on purpose, what's the purpose? Aren't you worried this is going to break stuff? And why is this necessary given the LOAD_GLOBAL cache?

Yury
> this hypothetical LEN opcode would still have to check if "len" was shadowed or not, and that's slower than the optimized LOAD_GLOBAL we have now.

It could use the same check though? Just check the version tags.
msg387245 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-02-18 17:01
> You say it's on purpose, what's the purpose? Aren't you worried this is going to break stuff?

There is a subtle behavior difference between Python 3.9 and Python 3.10. func_builtins2.py of bpo-43228 works on Python 3.9 but fails on Python 3.10. With my PR 24564, func_builtins2.py works again on Python 3.10.

See bpo-43228 for the details.

> And why is this necessary given the LOAD_GLOBAL cache?

My PR 24564 is not related to LOAD_GLOBAL, but how a frame fills its f_builtins member from a function.

LOAD_GLOBAL uses f_globals and f_builtins members of a frame.
msg387260 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-02-18 18:20
New changeset 44085a3fc9a150478aec1872dd1079c60dcc42f6 by Victor Stinner in branch 'master':
bpo-42990: Refactor _PyFrame_New_NoTrack() (GH-24566)
https://github.com/python/cpython/commit/44085a3fc9a150478aec1872dd1079c60dcc42f6
msg387304 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-02-19 11:15
I rephrased PR 24564 to clarify the scope of the incompatible change: in practice, only the types.FunctionType constructor changes.

Defining functions in Python using "def function(...): ...", eval(code, {}) and exec(code, {}) are not affected. eval() and exec() already inherit the current builtins if globals does not contain "__builtins__" key.
msg387306 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-02-19 11:29
Updated PR documentation:
---
The types.FunctionType constructor now inherits the current builtins
if the globals parameter is used and the globals dictionary has no
"__builtins__" key, rather than rather than using {"None": None} as
builtins: same behavior than eval() and exec() functions.

Defining a function with "def function(...): ..." in Python is not
affected, globals cannot be overriden with this syntax: it also
inherits the current builtins.
---

This PR makes FunctionType makes more consistent with other Python functions.

Also, it doesn't prevent people to attempt building a "sandbox", it remains possible to override __builtins__ in FunctionType, eval(), exec(), etc.

Usally, such sandbox pass a modified builtins namespace to eval() and exec() and the functions simply inherit it, functions defines with "def function(...): ..." and functions created with types.FunctionType constructor: my PR only impacts a very specific case, when types.FunctionType is called with a different globals dictionary which has no "__builtins__" key.
msg387340 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-02-19 18:01
Thanks, that's clearer.

I'm still worried about the change in semantics where globals["__builtins__"] is assigned a different dict after the function object has been created (similar to https://bugs.python.org/file49816/func_builtins2.py).

I.e.

def foo(): return len("abc")
code = foo.__code__
g = {"__builtins__": {"len": len}}
f = FunctionType(code, g)
f()  # Succeeds
g["__builtins__"] = {}
f()  # Fails in 3.9 and before, passes in 3.10

Assuming code uses len, does f() succeed or fail?

I realize this is a pretty esoteric, but it does show the change in semantics (from later to earlier binding). Should we care? I like early binding because it allows more optimizations[1], but traditionally Python's semantics use late binding.

[1] Not in this case, the user could still change the meaning of len() with e.g.

g["__builtins__"]["len"] = lambda x: return 42
msg387341 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-02-19 18:44
In Python 3.9 the binding is more late-ish binding, than true late binding.

Because globals['__builtins__'] is cached for each function activation, executing functions don't see updates.

Example:

>>> def f():
...     print(len("test"))
...     bltns = f.__globals__["__builtins__"]
...     if hasattr(bltns, "__dict__"):
...         bltns = bltns.__dict__
...     new = bltns.copy()
...     new["len"] = lambda x : 7
...     f.__globals__["__builtins__"] = new
...     print(len("test"))
... 
>>> 
>>> f()
4
4
>>> f()
7
7

True late binding would print:

>>> f()
4
7
>>> f()
7
7
msg387345 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-02-19 19:33
Guido: "I'm still worried about the change in semantics where globals["__builtins__"] is assigned a different dict after the function object has been created (...)"

Well, there is a semantics change of Python 3.10 documented at:
https://docs.python.org/dev/whatsnew/3.10.html#other-language-changes

"Functions have a new __builtins__ attribute which is used to look for builtin symbols when a function is executed, instead of looking into __globals__['__builtins__']. (Contributed by Mark Shannon in bpo-42990.)"

And the function __builtins__ attribute is read-only.


Your example is not affected by PR 24564 because the globals has the "__builtins__" key.


In Python 3.10, you can modify func.__builtins__ (new attribute):
---
def foo(s): return len(s)
code = foo.__code__
FunctionType = type(foo)
f = FunctionType(code, {"__builtins__": {"len": len}})
print(f("abc"))
f.__builtins__.clear()
print(f("abc"))
---

Output:
---
3
Traceback (most recent call last):
  (...)
NameError: name 'len' is not defined
---


Mark: "Because globals['__builtins__'] is cached for each function activation, executing functions don't see updates."

In Python 3.10, if someone wants to hack builtins while the function is running, modifying the builtins namespace in-place works as expected:
---
def f():
    print(len("test"))
    builtins_ns = f.__globals__['__builtins__'].__dict__
    #builtins_ns = f.__builtins__
    builtins_ns['len'] = lambda x: 7
    print(len("test"))

f()
---

Output:
---
4
7
---

It also works with "builtins_ns = f.__builtins__".


Guido: "I realize this is a pretty esoteric, but it does show the change in semantics (from later to earlier binding). Should we care? I like early binding because it allows more optimizations[1], but traditionally Python's semantics use late binding."

Modifying built-in functions/types is commonly done in tests. Example:
---
import unittest.mock

def func():
    with unittest.mock.patch('builtins.chr', return_value='mock'):
        return chr(65)

print(func())
---

The expected output is: "mock". Overriding an attribute of the builtins module immediately updates func.__builtins__. It works because func.__builtins__ is builtins.__dict__.

In FAT Python, I implemented an optimization which copies builtin functions to constants, replace LOAD_GLOBAL with LOAD_CONST:
https://fatoptimizer.readthedocs.io/en/latest/optimizations.html#copy-builtin-to-constant

This optimization breaks this Python semantics, it is no longer possible to override builtin functions in tests:
https://fatoptimizer.readthedocs.io/en/latest/semantics.html#builtin-functions-replaced-in-the-middle-of-a-function
msg387377 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-02-20 01:03
Thanks for the clarifications, Victor!

So I now understand: the *identity* of the builtins dict used by a function's code is captured when the function object is created. And I understand why: it's so that in the fast path for the LOAD_GLOBAL opcode we won't have to do a dict lookup in globals to find the builtins.

Regarding the text in What's New 3.10 about this at https://docs.python.org/dev/whatsnew/3.10.html#other-language-changes, I recommend adding there that func.__builtins__ is initialized from globals["__builtins__"], if it exists, else from the frame's builtins, when the function object is created; like you state in https://github.com/python/cpython/pull/24564. Or perhaps make one of these paragraphs refer to the other for details, since they are duplicate mentions of the same behavior change (once the latter PR lands).

Also, thanks to you and Mark and everyone else who has worked on optimizations like the globals cache and the method cache.
msg387401 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-02-20 12:14
> Regarding the text in What's New 3.10 about this at https://docs.python.org/dev/whatsnew/3.10.html#other-language-changes, I recommend adding there that func.__builtins__ is initialized from globals["__builtins__"], if it exists, else from the frame's builtins, when the function object is created; like you state in https://github.com/python/cpython/pull/24564.

Good idea, I updated my PR.

> Or perhaps make one of these paragraphs refer to the other for details, since they are duplicate mentions of the same behavior change (once the latter PR lands).

IMO it's useful to have two different paragraphs. One about the new attribute which is not really a semantics change, and one about the semantics changes when globals["__builtins__"] doesn't exist. For people who only care about Python 3.10 incompatible changes, they can simply read the Porting to Python 3.10 > Changes in the Python API section ;-)
msg387408 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-02-20 14:17
New changeset 46496f9d12582bf11f4911ad0f23315d6f277907 by Victor Stinner in branch 'master':
bpo-42990: Functions inherit current builtins (GH-24564)
https://github.com/python/cpython/commit/46496f9d12582bf11f4911ad0f23315d6f277907
msg388969 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-17 21:00
I checked manually cloudpickle_bug.py attached to bpo-43228: it works as expected. The bpo-43228 regression has been fixed. I close again the issue.

It was proposed to add a builtins parameter to the types.FunctionType constructor. If someone wants to add it: please go ahead ;-)
History
Date User Action Args
2021-03-17 21:00:33vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg388969

stage: patch review -> resolved
2021-02-20 14:17:25vstinnersetmessages: + msg387408
2021-02-20 12:14:33vstinnersetmessages: + msg387401
2021-02-20 01:03:12gvanrossumsetmessages: + msg387377
2021-02-19 19:33:20vstinnersetmessages: + msg387345
2021-02-19 18:44:36Mark.Shannonsetmessages: + msg387341
2021-02-19 18:01:23gvanrossumsetmessages: + msg387340
2021-02-19 11:29:58vstinnersetmessages: + msg387306
2021-02-19 11:15:08vstinnersetmessages: + msg387304
2021-02-18 18:20:43vstinnersetmessages: + msg387260
2021-02-18 17:53:23vstinnersetstage: resolved -> patch review
pull_requests: + pull_request23347
2021-02-18 17:01:38vstinnersetmessages: + msg387245
2021-02-18 16:56:27gvanrossumsetmessages: + msg387244
2021-02-18 16:30:10yselivanovsetmessages: + msg387243
2021-02-18 14:26:32vstinnersetversions: + Python 3.10
2021-02-18 14:26:24vstinnersetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg387233
2021-02-18 14:20:37vstinnersetpull_requests: + pull_request23345
2021-02-18 11:36:09vstinnersetmessages: + msg387223
2021-02-18 02:28:54gvanrossumsetmessages: + msg387199
2021-02-17 23:48:55yselivanovsetmessages: + msg387196
2021-02-17 22:14:29gvanrossumsetmessages: + msg387192
2021-02-17 22:00:40vstinnersetmessages: + msg387190
2021-02-17 19:58:16gvanrossumsetmessages: + msg387187
2021-02-17 18:01:34vstinnersetpull_requests: + pull_request23340
2021-02-02 00:10:18Mark.Shannonsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-02-01 10:42:36Mark.Shannonsetmessages: + msg386061
2021-01-29 14:27:31Mark.Shannonsetpull_requests: + pull_request23192
2021-01-29 13:25:07Mark.Shannonsetmessages: + msg385908
2021-01-22 18:00:18gvanrossumsetnosy: + gvanrossum
2021-01-22 15:01:52Mark.Shannonsetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request23121
2021-01-22 14:54:36Mark.Shannonsetmessages: + msg385500
2021-01-22 11:36:23vstinnersettitle: Improve the C code for calling Python code -> Improve the C code for calling Python code: _PyEval_EvalCode()
2021-01-22 11:36:07vstinnersetmessages: + msg385491
2021-01-22 01:31:39gvanrossumsetnosy: + brett.cannon, rhettinger, vstinner, petr.viktorin, serhiy.storchaka, yselivanov
2021-01-21 16:05:52Mark.Shannoncreate