Title: Use FASTCALL in dict.update()
Type: enhancement Stage:
Components: Argument Clinic Versions: Python 3.7
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: haypo, inada.naoki, larry, python-dev, rhettinger
Priority: normal Keywords: patch

Created on 2017-01-18 17:07 by haypo, last changed 2017-01-19 11:45 by python-dev. This issue is now closed.

File name Uploaded Description Edit
dict_update_fastcall.patch haypo, 2017-01-18 17:07
dict_update_fastcall-2.patch haypo, 2017-01-19 10:03 review
Messages (8)
msg285744 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-01-18 17:07
Follow-up of the issue #29311 "Argument Clinic: convert dict methods".

The dict.update() method hs a special prototype:

   def update(arg=None **kw): ...

I don't think that Argument Clinic supports it right now, so I propose to first optimize dict_update() to use METH_FASTCALL.

Attached patch is a first step: convert kwnames tuple to a dict.

A second step would be to avoid the temporary dict and update the dict using args + kwnames directly.
msg285745 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-01-18 17:10
My patch doesn't use _PyStack_AsDict() since this function currently fails with an assertion error if a key is not exactly a string (PyUnicode_CheckExact). dict_update_common() already checks if all keys are string.
msg285747 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-01-18 17:14
See also issue #20291: "Argument Clinic should understand *args and **kwargs parameters".
msg285766 - (view) Author: INADA Naoki (inada.naoki) * (Python committer) Date: 2017-01-19 01:38
Using FASTCALL for methods accepting **kwargs can't skip creating dict in most cases.  Because they accepts dict.

Even worth, when calling it like `d.update(**config)` (yes, it's abuse of
**, but it can be happen in some C methods), FASTCALL may unpack the passed
dict, and pack it again.

So, when considering METH_FASTCALL, supporting **kwargs is lowest priority.
(Off course, supporting it by AC with METH_KEYWORDS is nice to have)
msg285768 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-01-19 06:48
I like the other AC changes to dict in 29311, but this one seems like it shouldn't be done.  There is too much twisting around existing code to force it to use AC and the benefit will be almost nothing.   dict.update() is mainly used with a list of tuples argument or with another mapping.  The O(1) time spent on the method call is inconsequential compared to the O(n) step of looping over all the inputs and putting them in the dict.  Accordingly, I think this method should be skipped, leaving the current clear, stable, fast-enough code in-place.
msg285770 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-01-19 10:03
Oops, I forgot a DECREF: fixed in the patch version 2.


Oh wait, I misunderstood how dict.update() is called. In fact, they are two bytecodes to call a function with keyword arguments.

(1) Using **kwargs:

>>> def f():
...  d.update(**d2)
>>> dis.dis(f)
  2           0 LOAD_GLOBAL              0 (d)
              2 LOAD_ATTR                1 (update)
              4 BUILD_TUPLE              0
              6 LOAD_GLOBAL              2 (d2)
              8 CALL_FUNCTION_EX         1
             10 POP_TOP
             12 LOAD_CONST               0 (None)
             14 RETURN_VALUE

(2) Using a list of key=value:

>>> def g():
...  d.update(x=1, y=2)
>>> dis.dis(g)
  2           0 LOAD_GLOBAL              0 (d)
              2 LOAD_ATTR                1 (update)
              4 LOAD_CONST               1 (1)
              6 LOAD_CONST               2 (2)
              8 LOAD_CONST               3 (('x', 'y'))
             10 CALL_FUNCTION_KW         2
             12 POP_TOP
             14 LOAD_CONST               0 (None)
             16 RETURN_VALUE

The problem is that the dict.update() method has a single implementation, the C dict_update() function.

For (2), there is a speedup, but it's minor:

$ ./python -m perf timeit -s 'd={"x": 1, "y": 2}' 'd.update(x=1, y=2)' -p10 --compare-to=../default-ref/python 
Median +- std dev: [ref] 185 ns +- 62 ns -> [patched] 177 ns +- 2 ns: 1.05x faster (-5%)

For (1), I expected that **kwargs would be unpacked *before* calling dict.update(), but kwargs is passed unchanged to dict.update() directly! With my patch, CALL_FUNCTION_EX calls PyCFunction_Call() which uses _PyStack_UnpackDict() to create kwnames and then dict_update() rebuilds a new temporary dictionary. It's completely inefficient! As Raymond expected, it's much slower:

haypo@smithers$ ./python -m perf timeit -s 'd={"x": 1, "y": 2}; d2=dict(d)' 'd.update(**d2)' -p10 --compare-to=../default-ref/python
Median +- std dev: [ref] 114 ns +- 1 ns -> [patched] 232 ns +- 21 ns: 2.04x slower (+104%)

I expect that (1) dict.update(**kwargs) is more common than (2) dict.update(x=1, y=2). Moreover, the speedup for (2) is low (5%), so I prefer to reject this issue.


Naoki: "So, when considering METH_FASTCALL, supporting **kwargs is lowest priority. (Off course, supporting it by AC with METH_KEYWORDS is nice to have)"

Hum, dict.update() is the first function that I found that really wants a Python dict at the end.

For dict1.update(**dict2), METH_VARARGS|METH_KEYWORDS is already optimal.

So I don't think that it's worth it to micro-optimize the way to pass positional arguments. The common case is to call dict1.update(dict2) which requires to build a temporary tuple of 1 item. PyTuple_New() uses a free list for such small tuple, so it should be fast enough.

I found a few functions which pass through keyword arguments, but they are "proxy". I'm converting all METH_VARARGS|METH_KEYWORDS to METH_FASTCALL, so most functions will expects a kwnames tuple at the end of the call for keyword arguments. In this case, using METH_FASTCALL for the proxy is optimum for func(x=1, y=2) (CALL_FUNCTION_KW), but slower for func(**kwargs) (CALL_FUNCTION_EX).

With METH_FASTCALL, CALL_FUNCTION_EX requires to unpack the dictionary if I understood correctly.
msg285774 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-01-19 11:34
When analyzing how FASTCALL handles "func(**kwargs)" calls for Python functions, I identified a missed optimization. I created the issue #29318: "Optimize _PyFunction_FastCallDict() for **kwargs".
msg285778 - (view) Author: Roundup Robot (python-dev) Date: 2017-01-19 11:45
New changeset e371686229e7 by Victor Stinner in branch 'default':
Add a note explaining why dict_update() doesn't use METH_FASTCALL
Date User Action Args
2017-01-19 11:45:26python-devsetnosy: + python-dev
messages: + msg285778
2017-01-19 11:34:29hayposetresolution: rejected
messages: + msg285774
2017-01-19 10:03:50hayposetstatus: open -> closed
files: + dict_update_fastcall-2.patch
messages: + msg285770
2017-01-19 06:48:40rhettingersetnosy: + rhettinger
messages: + msg285768
2017-01-19 01:38:56inada.naokisetmessages: + msg285766
2017-01-18 17:14:13hayposetmessages: + msg285747
2017-01-18 17:13:27hayposetnosy: + larry
components: + Argument Clinic
2017-01-18 17:10:15hayposetmessages: + msg285745
2017-01-18 17:07:42haypocreate