Issue 27458: Allow subtypes of unicode/str to hit the optimized unicode_concatenate block

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/71645

classification

Title:	Allow subtypes of unicode/str to hit the optimized unicode_concatenate block
Type:	performance	Stage:
Components:	Interpreter Core	Versions:	Python 3.6

process

Status:	closed	Resolution:	wont fix
Dependencies:		Superseder:
Assigned To:		Nosy List:	ammar2, benjamin.peterson, ezio.melotti, lemburg, rhettinger, serhiy.storchaka, steven.daprano, vstinner
Priority:	normal	Keywords:	patch

Created on 2016-07-05 20:10 by ammar2, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
test.py	ammar2, 2016-07-05 21:44	Convenience script to trace concatenation execution path
python.diff	ammar2, 2016-07-06 03:17		review
bench.py	ammar2, 2016-07-06 19:20

Messages (10)
msg269849 - (view)	Author: Ammar Askar (ammar2) *	Date: 2016-07-05 20:10
So currently as far as string concatenation goes. ceval has this nice little branch it can take if both operators are unicode types. However, since this check is an Exact check, it means that subtypes of unicode end up going through the slow code path through: PyNumber_Add -> PyUnicode_Concat. This patch aims to allow subtypes to take that optimized branch without breaking any existing behavior and without any more memory copy calls than necessary. The motivation for this change is that some templating engines (Mako/Jinja2/Cheetah) use stuff like MarkupSafe which is implemented with a unicode subtype called `Markup`. Concatenating these custom objects (pretty common for templating engines) is fairly slow. This change modifies and uses the existing cpython code to make it a fair bit faster. I think the only real "dangerous" change in here is in the cast_unicode_subtype_to_base function which uses a trick at the end to prevent deallocation of memory. I've made sure to keep it well commented but I'd appreciate any feedback on it. From what I can tell from running the test suite, all tests pass and there don't seem to be any new reference leaks.
msg269854 - (view)	Author: Ammar Askar (ammar2) *	Date: 2016-07-05 21:44
Side note, I was using this script which uses gdb to trace the execution path when it concatenates strings.
msg269862 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2016-07-06 01:34
I haven't studied your code in detail (I won't be qualified to judge it) but I notice this comment: /* Hit the faster unicode_concatenate method if and only if all the following conditions are true: 1. The left operand is a unicode type 2. The right operand is a unicode type 3. The left operand's __add__ method isn't overriden / I don't think that's sufficient. You also need to check the right operand's __radd__ method if it is a subclass of the left: class A(str): pass class B(str): def __radd__(self, other): return B(other.uppper() + str(self)) a = A("hello ") b = B("world") assert a + b == "HELLO world" Also you have some tests, but I don't think they're sufficient. You have a comment about "Longer strings to prevent interning" but I'm not sure that a mere 21 characters is guaranteed now and forever to do that. I'd be much happier to see a long string which is not an identifier: s = S("# ! ~"50) # hopefully will not be interned plus an assertion to check that it is not interned. That way, if the rules for interning ever change, the test will fail loudly and clearly rather than silently do the wrong thing. (Better still would be an actual language API for un-interning strings, if one exists.) Also, I see that you have tests to check that the optimized path is taken for subclasses, but do you have tests to check that it is NOT taken when it shouldn't be?
msg269868 - (view)	Author: Ammar Askar (ammar2) *	Date: 2016-07-06 03:17
Thank you very much for the prompt feedback. I didn't even realize there was a __radd__ method, great catch. I've fixed this and added an appropriate test. Very good point about the interning. Seeing as its an implementation detail, I've changed the interned tests to cpython only and added the assertion to ensure they are not interned. I don't think there's any language level features to ensure strings are not interned. As far as testing the execution path goes. I was updating the source code in my test.py script between concatenating with subtypes vs just plain old unicode objects since doing multiple traces was starting to get a lot more complicated with gdb scripting :)
msg269869 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-07-06 07:55
1. It would be interesting to see an example showing the benefit of this change. How large is the benefit, and how common is this case? 2. The optimization of string concatenation is CPython specific, and AFAIK it was decided not to extend it to other cases (e.g. to bytes). The recommended way for efficient string concatenation is using str.join or io.StringIO. Or classic string formatting -- this is yet one CPython specific optimization. 3. Non-compact string representation is legacy. It is kept for compatibility with existing code, but will be removed in future. The work with non-compact strings is not always efficient.
msg269888 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2016-07-06 16:46
I'm not interested, sorry.
msg269890 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2016-07-06 17:30
We really don't want to encourage any reliance on this optimization. It was put there only to help mitigate the performance impact of a common mistake. I don't think subclasses warrant extended coverage. This was already a hack at the outer limits of how far we were willing to go. It doesn't make sense to propagate it further.
msg269898 - (view)	Author: STINNER Victor (vstinner) *	Date: 2016-07-06 19:01
I agree with Raymond. Use ''.join(list of str) or a similar pattern.
msg269903 - (view)	Author: Ammar Askar (ammar2) *	Date: 2016-07-06 19:20
> We really don't want to encourage any reliance on this optimization. It was put there only to help mitigate the performance impact of a common mistake. Aah, I didn't realize the extra context behind why the unicode_concatenate path actually exists in ceval. Makes sense though since it was the only exceptional case that popped out in ceval. > It would be interesting to see an example showing the benefit of this change I'm no expert at benchmarking but I threw together a quick script to compare the different ways of string concatenation and it seems like an INPLACE_ADD is the fastest. (which makes sense because it avoids the overhead of having a list object that ''.join brings in, not sure why StringIO is slower than both though, maybe that would be a better place to improve?) Either way if you guys think this adds too much complexity on top of an existing hack, this is fine to close. Benchmarking results: INPLACE_ADD short ascii 0.4307783489348367 ''.join short ascii 0.6934443039353937 StringIO short ascii 0.9447220619767904 INPLACE_ADD short unicode 0.4411839219974354 ''.join short unicode 0.666951927007176 StringIO short unicode 0.9783720930572599 INPLACE_ADD long ascii 3.6157665309729055 ''.join long ascii 6.938268916099332 StringIO long ascii 5.279585674987175 INPLACE_ADD long unicode 3.7768358619650826 ''.join long unicode 4.641092017991468 StringIO long unicode 7.6051657549105585
msg269904 - (view)	Author: STINNER Victor (vstinner) *	Date: 2016-07-06 19:21
I close the issue.

History
Date	User	Action	Args
2022-04-11 14:58:33	admin	set	github: 71645
2016-07-06 19:21:20	vstinner	set	status: open -> closed resolution: wont fix messages: + msg269904
2016-07-06 19:20:35	ammar2	set	files: + bench.py messages: + msg269903
2016-07-06 19:01:48	vstinner	set	messages: + msg269898
2016-07-06 17:30:23	rhettinger	set	nosy: + rhettinger messages: + msg269890
2016-07-06 16:47:22	gvanrossum	set	nosy: - gvanrossum
2016-07-06 16:46:58	gvanrossum	set	nosy: lemburg, gvanrossum, vstinner, benjamin.peterson, ezio.melotti, steven.daprano, serhiy.storchaka, ammar2 messages: + msg269888
2016-07-06 07:55:54	serhiy.storchaka	set	nosy: + gvanrossum, serhiy.storchaka messages: + msg269869
2016-07-06 07:25:50	pitrou	set	nosy: - pitrou
2016-07-06 03:17:33	ammar2	set	files: - python.diff
2016-07-06 03:17:07	ammar2	set	files: + python.diff messages: + msg269868
2016-07-06 01:34:53	steven.daprano	set	nosy: + steven.daprano messages: + msg269862 versions: + Python 3.6
2016-07-05 21:44:31	ammar2	set	files: + test.py messages: + msg269854
2016-07-05 20:10:11	ammar2	create