
Author ammar2
Recipients ammar2, benjamin.peterson, ezio.melotti, lemburg, pitrou, vstinner
Date 2016-07-05.20:10:09
Message-id <1467749411.27.0.801295605617.issue27458@psf.upfronthosting.co.za>
Content
As far as string concatenation goes, ceval currently has a nice little fast branch it can take if both operands are unicode objects. However, since this check is an exact-type check, subtypes of unicode end up going through the slow code path: PyNumber_Add -> PyUnicode_Concat.
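
To make the exact-versus-subtype distinction concrete, here is a minimal Python-level sketch. The helper names and the `MyStr` class are hypothetical, introduced only for illustration; they mirror what the C macros PyUnicode_CheckExact and PyUnicode_Check test, and are not the code in ceval itself.

```python
class MyStr(str):
    """Hypothetical str subclass, used only for illustration."""


def is_exact_str(obj):
    # Python-level analogue of PyUnicode_CheckExact: true only for str itself.
    return type(obj) is str


def is_str_or_subtype(obj):
    # Python-level analogue of PyUnicode_Check: true for str and any subclass.
    return isinstance(obj, str)


plain = "abc"
sub = MyStr("abc")

# A plain str passes both checks; the subclass instance passes only the
# looser one, which is why it misses a branch guarded by the exact check.
checks = {
    "plain": (is_exact_str(plain), is_str_or_subtype(plain)),
    "sub": (is_exact_str(sub), is_str_or_subtype(sub)),
}
```

Any branch guarded by the exact check is skipped as soon as one operand is a subclass instance, even when that subclass adds nothing to str.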

This patch aims to allow subtypes to take that optimized branch without breaking any existing behavior and without any more memory copy calls than necessary.

The motivation for this change is that some templating engines (Mako/Jinja2/Cheetah) use libraries like MarkupSafe, which is implemented with a unicode subtype called `Markup`. Concatenating these custom objects (pretty common in templating engines) is fairly slow. This change modifies and reuses the existing CPython code to make it a fair bit faster.
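
One rough way to see the gap on a given interpreter is a small timeit comparison. This is only a sketch: `Tagged` is a hypothetical stand-in for a unicode subtype, not MarkupSafe's actual `Markup` class, and the timings depend on the Python version and machine.

```python
import timeit


class Tagged(str):
    """Hypothetical stand-in for a unicode subtype; not the real Markup."""


def concat(a, b):
    # Plain binary add; which path the interpreter takes depends on the
    # operand types.
    return a + b


def time_concat(a, b, number=100_000):
    # Total seconds spent running `a + b` `number` times.
    return timeit.timeit(lambda: a + b, number=number)


plain_result = concat("<b>", "x")
sub_result = concat(Tagged("<b>"), Tagged("x"))

if __name__ == "__main__":
    print("str + str:      ", time_concat("<b>", "x"))
    print("Tagged + Tagged:", time_concat(Tagged("<b>"), Tagged("x")))
```

Because `Tagged` does not override `__add__`, both concatenations produce an equal plain `str`, so any timing difference comes from how the add is dispatched rather than from the result.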

I think the only really "dangerous" change in here is in the cast_unicode_subtype_to_base function, which uses a trick at the end to prevent deallocation of memory. I've made sure to keep it well commented, but I'd appreciate any feedback on it.

From what I can tell from running the test suite, all tests pass and there don't seem to be any new reference leaks.
History
Date                 User    Action  Args
2016-07-05 20:10:12  ammar2  set     recipients: + ammar2, lemburg, pitrou, vstinner, benjamin.peterson, ezio.melotti
2016-07-05 20:10:11  ammar2  set     messageid: <1467749411.27.0.801295605617.issue27458@psf.upfronthosting.co.za>
2016-07-05 20:10:11  ammar2  link    issue27458 messages
2016-07-05 20:10:10  ammar2  create