This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Allow subtypes of unicode/str to hit the optimized unicode_concatenate block
Type: performance Stage:
Components: Interpreter Core Versions: Python 3.6
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: ammar2, benjamin.peterson, ezio.melotti, lemburg, rhettinger, serhiy.storchaka, steven.daprano, vstinner
Priority: normal Keywords: patch

Created on 2016-07-05 20:10 by ammar2, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
test.py ammar2, 2016-07-05 21:44 Convenience script to trace concatenation execution path
python.diff ammar2, 2016-07-06 03:17 review
bench.py ammar2, 2016-07-06 19:20
Messages (10)
msg269849 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2016-07-05 20:10
So currently as far as string concatenation goes. ceval has this nice little branch it can take if both operators are unicode types. However, since this check is an Exact check, it means that subtypes of unicode end up going through the slow code path through: PyNumber_Add -> PyUnicode_Concat.

This patch aims to allow subtypes to take that optimized branch without breaking any existing behavior and without any more memory copy calls than necessary.

The motivation for this change is that some templating engines (Mako/Jinja2/Cheetah) use stuff like MarkupSafe which is implemented with a unicode subtype called `Markup`. Concatenating these custom objects (pretty common for templating engines) is fairly slow. This change modifies and uses the existing cpython code to make it a fair bit faster.

I think the only real "dangerous" change in here is in the cast_unicode_subtype_to_base function which uses a trick at the end to prevent deallocation of memory. I've made sure to keep it well commented but I'd appreciate any feedback on it.

From what I can tell from running the test suite, all tests pass and there don't seem to be any new reference leaks.
msg269854 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2016-07-05 21:44
Side note, I was using this script which uses gdb to trace the execution path when it concatenates strings.
msg269862 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2016-07-06 01:34
I haven't studied your code in detail (I won't be qualified to judge it) but I notice this comment:

  /* Hit the faster unicode_concatenate method if and only if
     all the following conditions are true:
     1. The left operand is a unicode type
     2. The right operand is a unicode type
     3. The left operand's __add__ method isn't overriden */


I don't think that's sufficient. You also need to check the right operand's __radd__ method if it is a subclass of the left:

class A(str):
    pass

class B(str):
    def __radd__(self, other):
        return B(other.uppper() + str(self))

a = A("hello ")
b = B("world")
assert a + b == "HELLO world"



Also you have some tests, but I don't think they're sufficient. You have a comment about "Longer strings to prevent interning" but I'm not sure that a mere 21 characters is guaranteed now and forever to do that. I'd be much happier to see a long string which is not an identifier:

s = S("# ! ~"*50)  # hopefully will not be interned

plus an assertion to check that it is not interned. That way, if the rules for interning ever change, the test will fail loudly and clearly rather than silently do the wrong thing.

(Better still would be an actual language API for un-interning strings, if one exists.)

Also, I see that you have tests to check that the optimized path is taken for subclasses, but do you have tests to check that it is NOT taken when it shouldn't be?
msg269868 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2016-07-06 03:17
Thank you very much for the prompt feedback.

I didn't even realize there was a __radd__ method, great catch. I've fixed this and added an appropriate test.

Very good point about the interning. Seeing as its an implementation detail, I've changed the interned tests to cpython only and added the assertion to ensure they are not interned. I don't think there's any language level features to ensure strings are not interned.

As far as testing the execution path goes. I was updating the source code in my test.py script between concatenating with subtypes vs just plain old unicode objects since doing multiple traces was starting to get a lot more complicated with gdb scripting :)
msg269869 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-07-06 07:55
1. It would be interesting to see an example showing the benefit of this change. How large is the benefit, and how common is this case?

2. The optimization of string concatenation is CPython specific, and AFAIK it was decided not to extend it to other cases (e.g. to bytes). The recommended way for efficient string concatenation is using str.join or io.StringIO. Or classic string formatting -- this is yet one CPython specific optimization.

3. Non-compact string representation is legacy. It is kept for compatibility with existing code, but will be removed in future. The work with non-compact strings is not always efficient.
msg269888 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2016-07-06 16:46
I'm not interested, sorry.
msg269890 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-07-06 17:30
We really don't want to encourage any reliance on this optimization.  It was put there only to help mitigate the performance impact of a common mistake.

I don't think subclasses warrant extended coverage.  This was already a hack at the outer limits of how far we were willing to go.   It doesn't make sense to propagate it further.
msg269898 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-07-06 19:01
I agree with Raymond. Use ''.join(list of str) or a similar pattern.
msg269903 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2016-07-06 19:20
> We really don't want to encourage any reliance on this optimization.  It was put there only to help mitigate the performance impact of a common mistake.

Aah, I didn't realize the extra context behind why the unicode_concatenate path actually exists in ceval. Makes sense though since it was the only exceptional case that popped out in ceval.

> It would be interesting to see an example showing the benefit of this change

I'm no expert at benchmarking but I threw together a quick script to compare the different ways of string concatenation and it seems like an INPLACE_ADD is the fastest. (which makes sense because it avoids the overhead of having a list object that ''.join brings in, not sure why StringIO is slower than both though, maybe that would be a better place to improve?)

Either way if you guys think this adds too much complexity on top of an existing hack, this is fine to close.


Benchmarking results:

  INPLACE_ADD short ascii
0.4307783489348367
  ''.join     short ascii
0.6934443039353937
  StringIO    short ascii
0.9447220619767904

  INPLACE_ADD short unicode
0.4411839219974354
  ''.join     short unicode
0.666951927007176
  StringIO    short unicode
0.9783720930572599

  INPLACE_ADD long ascii
3.6157665309729055
  ''.join     long ascii
6.938268916099332
  StringIO    long ascii
5.279585674987175

  INPLACE_ADD long unicode
3.7768358619650826
  ''.join     long unicode
4.641092017991468
  StringIO    long unicode
7.6051657549105585
msg269904 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-07-06 19:21
I close the issue.
History
Date User Action Args
2022-04-11 14:58:33adminsetgithub: 71645
2016-07-06 19:21:20vstinnersetstatus: open -> closed
resolution: wont fix
messages: + msg269904
2016-07-06 19:20:35ammar2setfiles: + bench.py

messages: + msg269903
2016-07-06 19:01:48vstinnersetmessages: + msg269898
2016-07-06 17:30:23rhettingersetnosy: + rhettinger
messages: + msg269890
2016-07-06 16:47:22gvanrossumsetnosy: - gvanrossum
2016-07-06 16:46:58gvanrossumsetnosy: lemburg, gvanrossum, vstinner, benjamin.peterson, ezio.melotti, steven.daprano, serhiy.storchaka, ammar2
messages: + msg269888
2016-07-06 07:55:54serhiy.storchakasetnosy: + gvanrossum, serhiy.storchaka
messages: + msg269869
2016-07-06 07:25:50pitrousetnosy: - pitrou
2016-07-06 03:17:33ammar2setfiles: - python.diff
2016-07-06 03:17:07ammar2setfiles: + python.diff

messages: + msg269868
2016-07-06 01:34:53steven.dapranosetnosy: + steven.daprano

messages: + msg269862
versions: + Python 3.6
2016-07-05 21:44:31ammar2setfiles: + test.py

messages: + msg269854
2016-07-05 20:10:11ammar2create