Title: Python function call optimization: avoid temporary tuple to pass **kwargs
Components: Interpreter Core Versions: Python 3.8
Nosy List: inada.naoki, jdemeyer, serhiy.storchaka, vstinner, xtreak
Created on 2018-07-10 23:13 by vstinner, last changed 2019-04-05 10:27 by jdemeyer.

In the following code, f() uses the CALL_FUNCTION_EX bytecode to call g(). The bytecode loads the 'kw' variable, which is a dictionary. Internally, however, the dictionary is converted to a temporary tuple, and later a new dictionary is created from it. Maybe the temporary tuple could be avoided?

def g(*args, **kw):
    ...

def f(*args, **kw):
    g(*args, **kw)
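
To check which call opcode the compiler emits for f(), the dis module can list the instructions. This is only a sketch: the variable names opnames and call_ops are illustrative, and opcode names vary between Python versions, so the exact output is not shown here. On the versions discussed in this issue, CALL_FUNCTION_EX is the one to look for.

```python
import dis

def g(*args, **kw):
    ...

def f(*args, **kw):
    g(*args, **kw)

# Collect the names of all instructions compiled for f().
opnames = [instr.opname for instr in dis.get_instructions(f)]

# Keep only the call-related opcodes (names are version dependent).
call_ops = [name for name in opnames if name.startswith("CALL")]
print(call_ops)
```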

In Python 3.6, before FASTCALL, CALL_FUNCTION_EX calls:

* do_call_core(): kw dict
* PyObject_Call(): kw dict
* function_call(): kw dict -> create a temporary tuple of keys and values: (key[0], value[0], ...)
* _PyEval_EvalCodeWithName(): if CO_VARKEYWORDS, rebuild a new dictionary for keyword arguments (**kw)
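
The round trip in the list above can be modelled in pure Python. This is a rough sketch, not the C implementation: pack_kwargs_flat() and rebuild_kwargs() are hypothetical helper names, and the binding of keywords to named parameters is ignored, so every key ends up in the rebuilt **kw dict.

```python
def pack_kwargs_flat(kw):
    """Model of the function_call() step: dict -> (key0, value0, key1, value1, ...)."""
    flat = []
    for key, value in kw.items():
        flat.append(key)
        flat.append(value)
    return tuple(flat)

def rebuild_kwargs(flat):
    """Model of the CO_VARKEYWORDS step: build a fresh dict from the flat tuple."""
    return {flat[i]: flat[i + 1] for i in range(0, len(flat), 2)}

kw = {"x": 1, "y": 2}
flat = pack_kwargs_flat(kw)     # temporary tuple
rebuilt = rebuild_kwargs(flat)  # new dict with the same content as kw
```

The tuple exists only to carry the items from one dict to another, which is the overhead this issue asks about.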

In Python master branch (future 3.8) with FASTCALL, CALL_FUNCTION_EX calls:

* do_call_core(): kw dict
* _PyFunction_FastCallDict(): kw dict -> a temporary tuple for keyword names ('kwnames') is created
* _PyEval_EvalCodeWithName(): if CO_VARKEYWORDS, rebuild a new dictionary for keyword arguments (**kw)
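
A similar pure-Python model for the FASTCALL path, assuming the usual FASTCALL keyword convention (keyword values stored after the positional arguments, keyword names in a separate 'kwnames' tuple). split_kwargs() and rebuild() are hypothetical names, and the exact internal layout used by _PyFunction_FastCallDict() is not reproduced here.

```python
def split_kwargs(args, kw):
    """Model: kw dict -> (positional args + keyword values, kwnames tuple)."""
    kwnames = tuple(kw)                        # temporary tuple of keyword names
    stack = tuple(args) + tuple(kw.values())   # keyword values follow positional args
    return stack, kwnames

def rebuild(stack, kwnames):
    """Model of the CO_VARKEYWORDS step: rebuild a new dict from names and values."""
    nargs = len(stack) - len(kwnames)
    return stack[:nargs], dict(zip(kwnames, stack[nargs:]))

stack, kwnames = split_kwargs((1, 2), {"x": 3, "y": 4})
new_args, new_kw = rebuild(stack, kwnames)
```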

To be clear: FASTCALL made this specific function call (Python => Python with **kw) neither worse nor better.
Related issues with similar discussion in the past: - Similar discussion where PyDict_Copy was proposed

Yeah, I recall these issues since I wrote them :-D

The fix for bpo-29318 was to document that we must copy the kwargs dict ;-)
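
The copy requirement is visible from Python: the callee always gets its own dict, so mutating it does not affect the caller's dict. A small sketch (the names caller_kw and received are only for illustration):

```python
def g(**kw):
    kw["added_in_g"] = True   # mutate the dict that g() received
    return kw

caller_kw = {"x": 1}
received = g(**caller_kw)

# g() worked on a separate dict: the caller's dict is unchanged.
print(received is caller_kw)
print(caller_kw)
```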

But this issue is not about the copy of the kwargs dict, but about two useless conversions: kwargs dict -> kwnames tuple -> kwargs dict.
This might be solvable with PEP 580, by using METH_VARARGS instead of METH_FASTCALL for such functions. This would still require a temporary tuple for the positional arguments, but no additional dict would need to be allocated.
