Classification
Title: ENH: Fix performance issue in keyword extraction
Type:
Stage: resolved
Components: C API
Versions: Python 3.9

Process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: methane, pablogsal, seberg, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2019-12-12 04:04 by seberg, last changed 2019-12-18 06:51 by methane. This issue is now closed.

Pull Requests
PR 17576: merged (seberg, 2019-12-12 04:05)
Messages (4)
msg358287 - (view) Author: Sebastian Berg (seberg) * Date: 2019-12-12 04:04
The keyword argument extraction/finding function seems to have a performance issue (unless I am missing something here). It reads:


```
    for (i=0; i < nkwargs; i++) {
        PyObject *kwname = PyTuple_GET_ITEM(kwnames, i);

        /* ptr==ptr should match in most cases since keyword keys
           should be interned strings */
        if (kwname == key) {
            return kwstack[i];
        }
        assert(PyUnicode_Check(kwname));
        if (_PyUnicode_EQ(kwname, key)) {
            return kwstack[i];
        }
    }
```

However, it should be split into two separate for loops, so that the `_PyUnicode_EQ` check runs only after the pointer comparison has failed for _all_ keyword names.

I will open a PR for this (it seems a bpo number is wanted for almost everything).
msg358290 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-12-12 06:17
Could you please provide benchmarks which demonstrate the performance issue?
msg358303 - (view) Author: Sebastian Berg (seberg) * Date: 2019-12-12 16:11
Fair enough; we had code that does it the other way, so the change seemed "too obvious", since the current check appears mainly useful with few kwargs. However, a single kwarg is very common in Python, while many kwargs seem very rare (in any Argument Clinic function).

That is why I even had trouble finding a function where it could make a difference, since it needs many kwargs.

With changes:

```
sebastian@seberg-x1 ~/python-dev/bin
 % ./python3 -m timeit -s 'i = 4' 'i.to_bytes(length=5, byteorder="big", signed=True)'
1000000 loops, best of 5: 205 nsec per loop
sebastian@seberg-x1 ~/python-dev/bin
 % ./python3 -m timeit -s 'i = 4' 'i.to_bytes(length=5, byteorder="big", signed=True)'
1000000 loops, best of 5: 207 nsec per loop
```

On master:

```
sebastian@seberg-x1 ~/python-dev/bin
 % ./python3 -m timeit -s 'i = 4' 'i.to_bytes(length=5, byteorder="big", signed=True)'
1000000 loops, best of 5: 221 nsec per loop
sebastian@seberg-x1 ~/python-dev/bin
 % ./python3 -m timeit -s 'i = 4' 'i.to_bytes(length=5, byteorder="big", signed=True)'
1000000 loops, best of 5: 218 nsec per loop
```

At close to 5%, that is barely noticeable; I suppose with more kwargs it could start to make a difference, while with fewer than 3 it just does not matter. Also, I am currently not sure how likely non-interned strings actually are, and I am not sure it matters to optimize for them (unless PyArg_From* functions use them).

In any case, sorry if this was noise; happy to fix things up or just drop it if many kwargs are considered non-existent.
msg358614 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-12-18 06:51
New changeset 75bb07e92baa7267a61056d03d7e6b475588e793 by Inada Naoki (Sebastian Berg) in branch 'master':
bpo-39028: Performance enhancement in keyword extraction (GH-17576)
https://github.com/python/cpython/commit/75bb07e92baa7267a61056d03d7e6b475588e793
History
Date | User | Action | Args
2019-12-18 06:51:48 | methane | set | messages: + msg358614
2019-12-18 06:51:42 | methane | set | status: open -> closed; stage: patch review -> resolved; resolution: fixed; versions: - Python 3.8
2019-12-12 16:11:03 | seberg | set | messages: + msg358303
2019-12-12 12:09:50 | pablogsal | set | nosy: + pablogsal
2019-12-12 07:59:47 | methane | set | nosy: + methane
2019-12-12 06:17:43 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg358290
2019-12-12 04:05:12 | seberg | set | keywords: + patch; stage: patch review; pull_requests: + pull_request17049
2019-12-12 04:04:55 | seberg | create |