
string method lookup is too slow #61372

Closed
gvanrossum opened this issue Feb 9, 2013 · 27 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@gvanrossum
Member

BPO 17170
Nosy @gvanrossum, @warsaw, @terryjreedy, @jcea, @amauryfa, @ncoghlan, @pitrou, @scoder, @vstinner, @larryhastings, @ezio-melotti, @florentx, @markshannon, @serhiy-storchaka, @1st1, @MojoVampire
Files
  • getargs_freelist.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2014-06-22.16:38:56.013>
    created_at = <Date 2013-02-09.15:59:30.975>
    labels = ['interpreter-core', 'performance']
    title = 'string method lookup is too slow'
    updated_at = <Date 2014-06-22.16:38:56.012>
    user = 'https://github.com/gvanrossum'

    bugs.python.org fields:

    activity = <Date 2014-06-22.16:38:56.012>
    actor = 'pitrou'
    assignee = 'none'
    closed = True
    closed_date = <Date 2014-06-22.16:38:56.013>
    closer = 'pitrou'
    components = ['Interpreter Core']
    creation = <Date 2013-02-09.15:59:30.975>
    creator = 'gvanrossum'
    dependencies = []
    files = ['29024']
    hgrepos = []
    issue_num = 17170
    keywords = ['patch']
    message_count = 27.0
    messages = ['181741', '181742', '181743', '181744', '181753', '181754', '181755', '181761', '181774', '181775', '181776', '181933', '181952', '181965', '181969', '182001', '182002', '182006', '182007', '182250', '182607', '182612', '182613', '183721', '212851', '221085', '221274']
    nosy_count = 19.0
    nosy_names = ['gvanrossum', 'barry', 'terry.reedy', 'jcea', 'amaury.forgeotdarc', 'ncoghlan', 'pitrou', 'scoder', 'vstinner', 'larry', 'ezio.melotti', 'flox', 'BreamoreBoy', 'Mark.Shannon', 'python-dev', 'serhiy.storchaka', 'yselivanov', 'isoschiz', 'josh.r']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = None
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue17170'
    versions = ['Python 3.5']

    @gvanrossum
    Member Author

    I'm trying to speed up a web template engine and I find that the code needs to do a lot of string replacements of this form:

      name = name.replace('_', '-')

    Characteristics of the data: the names are relatively short (1-10 characters usually), and the majority don't contain a '_' at all.

    For this combination I've found that the following idiom is significantly faster:

      if '_' in name:
          name = name.replace('_', '-')

    I'd hate for that idiom to become popular. I looked at the code (in the default branch) briefly, but it is already optimized for this case. So I am at a bit of a loss to explain the speed difference...

    Some timeit experiments:

    bash-3.2$ ./python.exe -m timeit -s "a = 'hundred'" "'x' in a"
    bash-3.2$ ./python.exe -m timeit -s "a = 'hundred'" "a.replace('x', 'y')"
    bash-3.2$ ./python.exe -m timeit -s "a = 'hundred'" "if 'x' in a: a.replace('x', 'y')"
    bash-3.2$ ./python.exe -m timeit -s "a = 'hunxred'" "a.replace('x', 'y')"
    bash-3.2$ ./python.exe -m timeit -s "a = 'hunxred'" "if 'x' in a: a.replace('x', 'y')"
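A minimal script reproducing the comparison above (the function names and loop counts here are illustrative, not from the original report). Both forms must return identical results; the guarded form simply skips the method call when the needle is absent.

```python
# Reproduce the guarded-vs-plain replace comparison from the report.
# plain() always pays for the method lookup and call; guarded() skips
# the replace() call entirely when there is nothing to replace.
import timeit

def plain(name):
    return name.replace('_', '-')

def guarded(name):
    if '_' in name:
        name = name.replace('_', '-')
    return name

for sample in ('hundred', 'hun_red'):
    # The two forms must agree before any timing comparison is meaningful.
    assert plain(sample) == guarded(sample)
    t_plain = timeit.timeit(lambda: plain(sample), number=100_000)
    t_guard = timeit.timeit(lambda: guarded(sample), number=100_000)
    print(f'{sample!r}: plain={t_plain:.4f}s guarded={t_guard:.4f}s')
```

Absolute timings will vary by machine and Python version; the point is the relative gap on short strings without the needle.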

    @gvanrossum gvanrossum added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Feb 9, 2013
    @pitrou
    Member

    pitrou commented Feb 9, 2013

    Characteristics of the data: the names are relatively short (1-10 characters usually)

    $ ./python -m timeit -s "a = 'hundred'" "'x' in a"
    10000000 loops, best of 3: 0.0431 usec per loop
    $ ./python -m timeit -s "a = 'hundred'" "a.find('x')"
    1000000 loops, best of 3: 0.206 usec per loop
    $ ./python -m timeit -s "a = 'hundred'" "a.replace('x', 'y')"
    10000000 loops, best of 3: 0.198 usec per loop

    Basically, it's simply the overhead of method calls over operator calls. You only see it because the strings are very short, and therefore the cost of finding / replacing is tiny.

    @gvanrossum
    Member Author

    Hm, you seem to be right. Changing the bug title.

    So, can we speed up method lookup? It's a shame that I have to start promoting this ugly idiom. There's a similar issue where s[:5]=='abcde' is faster than s.startswith('abcde'):

    ./python.exe -m timeit -s "a = 'hundred'" "a.startswith('foo')"
    1000000 loops, best of 3: 0.281 usec per loop

    ./python.exe -m timeit -s "a = 'hundred'" "a[:3] == 'foo'"
    10000000 loops, best of 3: 0.158 usec per loop
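One caveat worth making explicit about the faster slice idiom (this check is mine, not from the thread): the slice comparison is only equivalent to `startswith()` when the slice length exactly matches the prefix length, including for strings shorter than the prefix.

```python
# s[:len(prefix)] == prefix agrees with s.startswith(prefix) for every
# input, because slicing never raises on short strings -- it just
# yields a shorter string that fails the equality test.
prefix = 'foo'
for s in ('foolish', 'hundred', 'fo', ''):
    assert (s[:len(prefix)] == prefix) == s.startswith(prefix)
```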

    @gvanrossum gvanrossum changed the title string replace is too slow string method lookup is too slow Feb 9, 2013
    @serhiy-storchaka
    Member

    There are two overheads: an attribute lookup and a function call.

    $ ./python -m timeit -s "a = 'hundred'"  "'x' in a"
    10000000 loops, best of 3: 0.0943 usec per loop
    $ ./python -m timeit -s "a = 'hundred'"  "a.__contains__('x')"
    1000000 loops, best of 3: 0.271 usec per loop
    $ ./python -m timeit -s "a = 'hundred'"  "a.__contains__"
    10000000 loops, best of 3: 0.135 usec per loop

    Time of "a.__contains__('x')" is greater than the sum of times of "a.__contains__" and "'x' in a".
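The decomposition above can be restated in plain Python (example mine): the bound method can be looked up once and reused, which separates the attribute-lookup cost from the per-call cost.

```python
# Pay the attribute lookup once, then reuse the bound method; only the
# call overhead remains per use. Semantics are identical to 'in'.
a = 'hundred'
contains = a.__contains__   # attribute lookup paid once
assert contains('h') is True
assert contains('x') is False
assert contains('x') == ('x' in a)
```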

    @pitrou
    Member

    pitrou commented Feb 9, 2013

    Indeed the function call cost actually dominates:

    $ ./python -m timeit -s "a = 'hundred'" "a.find('x')"
    1000000 loops, best of 3: 0.206 usec per loop
    $ ./python -m timeit -s "a = 'hundred'; f=a.find" "f('x')"
    10000000 loops, best of 3: 0.176 usec per loop
    $ ./python -m timeit -s "a = 'hundred'" "'x' in a"
    10000000 loops, best of 3: 0.0431 usec per loop

    @pitrou
    Member

    pitrou commented Feb 9, 2013

    Some crude C benchmarking on this computer:

    • calling PyUnicode_Replace is 35 ns (per call)
    • calling "hundred".replace is 125 ns
    • calling PyArg_ParseTuple with the same signature as "hundred".replace is 80 ns

    Therefore, most of the overhead (125 - 35 = 90 ns) is in calling PyArg_ParseTuple() to unpack the method arguments.

    @serhiy-storchaka
    Member

    And PyArg_ParseTupleAndKeywords() is even slower.

    $ ./python -m timeit "str(b'', 'utf-8', 'strict')"
    1000000 loops, best of 3: 0.554 usec per loop
    $ ./python -m timeit "str(object=b'', encoding='utf-8', errors='strict')"
    1000000 loops, best of 3: 1.74 usec per loop
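To be clear about what is being compared (example mine): the two calls construct the same string; only the argument-parsing path differs, positional tuple versus keywords.

```python
# str() accepts object/encoding/errors either positionally or by
# keyword; the results are identical, so the timing gap above is pure
# argument-parsing overhead.
positional = str(b'hi', 'utf-8', 'strict')
keyword = str(object=b'hi', encoding='utf-8', errors='strict')
assert positional == keyword == 'hi'
```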

    @pitrou
    Member

    pitrou commented Feb 9, 2013

    Here is a patch yielding a decent speedup (~ 40%) on PyArg_ParseTuple itself.
    More generally though, this would be improved by precompiling some of the information (like Argument Clinic does, perhaps).

    (note: PyArg_ParseTupleAndKeywords is a completely separate implementation...)

    @pitrou
    Member

    pitrou commented Feb 10, 2013

    Updated patch to also handle PyArg_ParseTupleAndKeywords.

    @gvanrossum
    Member Author

    Great to see some action. Would there be a problem in backporting this? It's not a new feature after all...

    @pitrou
    Member

    pitrou commented Feb 10, 2013

    That would be left to the discretion of release managers.
    In all honesty the real-world benefit should be small (around 2% on the benchmark suite, apparently).
    Also, the principle of this patch doesn't apply to 2.7.

    @terryjreedy
    Member

    A related issue: the speed of finding, and hence replacing, characters in strings is known to have regressed in 3.3 relative to 3.2, especially on Windows. For long strings, that regression will negate the speedup from the initial method call in 3.3. See bpo-16061, with patches. The holdup seems to be deciding which of two good patches to apply.

    @amauryfa
    Member

    I left some comments on Rietveld.

    I wonder if PyArg_ParseTupleAndKeywords can be replaced by something that would compute and cache the set of keywords; a bit like _Py_IDENTIFIER.

    @gvanrossum
    Member Author

    What's the status of Argument Clinic? Won't that make this obsolete?

    --Guido van Rossum (sent from Android phone)

    @pitrou
    Member

    pitrou commented Feb 12, 2013

    I left some comments on Rietveld.

    I wonder if PyArg_ParseTupleAndKeywords can be replaced by something
    that would compute and cache the set of keywords; a bit like
    _Py_IDENTIFIER.

    It would make sense indeed.

    @ncoghlan
    Contributor

    To answer Guido's question about clinic, see http://bugs.python.org/issue16612

    Mostly positive feedback, but several of us would like a PEP to make sure we're happy with the resolution of the limited negative feedback.

    @larryhastings
    Contributor

    Argument Clinic has languished for lack of time. I didn't get much feedback, though a couple people were shouting for a PEP, which I was resisting. I figured, if they have something to say, they can go ahead and reply on the tracker issue, and if they don't have something to say, why do we need a PEP?

    I need to reply to one bit of thorough feedback, and after that--I don't know. I'd like to get things moving before PyCon so we can point sprinters at it.

    @larryhastings
    Contributor

    Oh, and, as to whether Argument Clinic would solve this problem, the answer is "not yet". Right now Argument Clinic literally generates calls to PyArg_ParseTupleAndKeywords. (In special cases it switches to PyArg_ParseTuple.)

    I'm more interested in Argument Clinic from the API perspective; I wanted to make a better way of specifying arguments to functions so we got all the metadata we needed without having to endlessly repeat ourselves. Truthfully I was hoping someone else would pick up the gauntlet once it was checked in and make a new argument processing API / hack up the Argument Clinic output to make it faster.

    @pitrou
    Member

    pitrou commented Feb 13, 2013

    Truthfully I was hoping someone else would pick up the gauntlet once it
    was checked in and make a new argument processing API / hack up the
    Argument Clinic output to make it faster.

    Argument Clinic's preprocessing would be a very nice building block to generate faster parsing sequences.
    Like Nick I'd still like to see a PEP, though ;-)

    @python-dev
    Mannequin

    python-dev mannequin commented Feb 17, 2013

    New changeset 4e985a96a612 by Antoine Pitrou in branch 'default':
    Issue bpo-17170: speed up PyArg_ParseTuple[AndKeywords] a bit.
    http://hg.python.org/cpython/rev/4e985a96a612

    @scoder
    Contributor

    scoder commented Feb 21, 2013

    Let me throw in a quick reminder that Cython has substantially faster argument parsing than the C-API functions provide because it translates function signatures like

        def func(int a, b=1, *, list c, d=2):
            ...

    into tightly specialised unpacking code, while keeping it as compatible as possible with the equivalent Python function (better than manually implemented C functions, BTW). It might be an alternative to Argument Clinic, one that has been working for a couple of years now and has already proven its applicability to a large body of real-world code.

    @terryjreedy
    Member

    (Stefan) > into tightly specialised unpacking code,

    Are you suggesting that func.__call__ should be specialized to func's signature, more than it is now (which is perhaps not at all), or something else?

    @scoder
    Contributor

    scoder commented Feb 21, 2013

    Cython does that in general, sure. However, this ticket is about a specific case where string methods (which are implemented in C) are slow when called from Python. Antoine found out that the main overhead is not so much from the method lookup itself but from argument parsing inside of the function. The unpacking code that Cython generates for the equivalent Python signature would speed this up, while keeping or improving the compatibility with Python call semantics.

    @vstinner
    Member

    vstinner commented Mar 8, 2013

    More generally though, this would be improved by precompiling some of the information (like Argument Clinic does, perhaps).

    The same idea was already proposed to optimize str%args and str.format(args). struct.unpack() also compiles the format into an optimized structure (and has a cache).

    We may do something like Martin von Loewis's _Py_IDENTIFIER API: compile at runtime on the first call, and cache the result in a static variable.

    It's not a tiny project, and I don't know exactly how to build a "JIT compiler" for getargs.c, nor how complex it would be. But it would speed up *all* Python calls, and hence any Python application.
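The "compile once, reuse" pattern Victor refers to is already visible at the Python level in the struct module (illustration mine, using the analogy from the message above): struct.Struct precompiles the format string, so repeated unpacking skips the per-call format parsing.

```python
# struct.Struct compiles the format string once; each unpack() call
# then reuses the precompiled layout instead of re-parsing '<HH'.
import struct

fmt = struct.Struct('<HH')          # compiled once
data = b'\x01\x00\x02\x00'          # two little-endian unsigned shorts
assert fmt.unpack(data) == (1, 2)
# Semantically identical to the module-level function, which parses
# the format on every call (subject to an internal cache):
assert fmt.unpack(data) == struct.unpack('<HH', data)
```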

    @BreamoreBoy
    Mannequin

    BreamoreBoy mannequin commented Mar 6, 2014

    What's the status of this issue? Code was committed to the default branch over a year ago, see msg182250

    @BreamoreBoy
    Mannequin

    BreamoreBoy mannequin commented Jun 20, 2014

    I don't think there's anything to do here so can it be closed? If anything else needs discussing surely it can go to python-ideas, python-dev or a new issue as appropriate.

    @pitrou
    Member

    pitrou commented Jun 22, 2014

    Indeed keeping this issue open wouldn't be very productive since it relates to the more general problem of Python's slow interpretation.

    @pitrou pitrou closed this as completed Jun 22, 2014
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022