Title: The "lazy strings" patch
Type: enhancement
Stage: resolved
Components: Interpreter Core
Versions: Python 2.7
Status: closed
Resolution: out of date
Dependencies:
Superseder: Speed up using + for string concatenation (issue 1569040)
Assigned To:
Nosy List: ajaksu2, christian.heimes, collinwinter, larry, paulhankin, pitrou
Priority: high
Keywords: patch

Created on 2006-11-04 06:30 by larry, last changed 2010-08-11 20:18 by eric.araujo. This issue is now closed.

File name: python.lch.lazy.string.patch.52618.diff
Uploaded: larry, 2006-11-04 06:30
Description: Patch against 2.6 trunk, revision 52618.

File name: lazy.strings.patch.monograph.txt
Uploaded: larry, 2006-11-04 06:33
Description: An in-depth description of the patch and its ramifications, as of revision 52618.

Messages (7)
msg51321 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2006-11-04 06:30
This patch consists of three changes to CPython:
 * changing PyStringObject.ob_sval,
 * "lazy concatenations", and
 * "lazy slices".
None of these changes adds new functionality to CPython;
they are all speed or memory optimizations.

In detail:

PyStringObject.ob_sval was changed from a char[] array
to a char *.  This is not in and of itself particularly
desirable.  It was necessary in order to implement the
other two changes.

"lazy concatenations" change string concatenation ("a" + "b") so that,
instead of directly calculating the resulting string, it returns a
placeholder object representing the result.  As a result, string
concatenation in CPython is now more than 150% faster on average (as
reported by pybench 2.0), and is approximately as fast as the standard
string concatenation idiom ("".join([a, b, c])).
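
The placeholder idea can be sketched in pure Python.  This is only an
illustration under stated assumptions, not the patch itself: the patch is
written in C inside Objects/stringobject.c, and the names LazyConcat and
render below are hypothetical.  The sketch shows "+" recording its two
operands in constant time, with the characters copied once, by a single
join, when the flat string is first needed.

```python
class LazyConcat:
    """Hypothetical placeholder for left + right (not the patch's C code)."""

    def __init__(self, left, right):
        self._left = left
        self._right = right
        # The length is known up front without copying any characters.
        self._length = len(left) + len(right)
        self._value = None  # flat string, filled in on first render

    def __len__(self):
        return self._length

    def __add__(self, other):
        # Constant time: wrap the operands instead of copying them.
        return LazyConcat(self, other)

    def render(self):
        """Flatten the tree of pending concatenations into one str."""
        if self._value is None:
            pieces = []
            stack = [self]
            # Iterative left-to-right walk, so long chains of "+" do not
            # hit the recursion limit.
            while stack:
                node = stack.pop()
                if isinstance(node, LazyConcat):
                    if node._value is not None:
                        pieces.append(node._value)
                    else:
                        stack.append(node._right)
                        stack.append(node._left)
                else:
                    pieces.append(node)
            self._value = "".join(pieces)
            # The operands are no longer needed once the result is cached.
            self._left = self._right = None
        return self._value

    def __str__(self):
        return self.render()


s = LazyConcat("a", "b") + "c" + "d"
print(len(s), str(s))
```

However many "+" operations build the chain, the characters are copied
only once, which is the same property that makes "".join() fast.
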

"lazy slices" change string slicing ("abc"[1], "a".strip()) so
that, instead of directly calculating the resulting string, it
returns a placeholder object representing the result.  As a result,
string slicing in CPython is now more than 60% faster on average
(as reported by pybench 2.0).
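
A similar pure-Python sketch for slices, again under stated assumptions:
LazySlice and render are hypothetical names, and the real change lives in
the C string object.  The placeholder stores a reference to the source
string plus two offsets, and copies the characters only when the flat
result is requested.

```python
class LazySlice:
    """Hypothetical placeholder for source[start:stop]."""

    def __init__(self, source, start=None, stop=None):
        # Normalize None and negative indices the way built-in slicing does.
        start, stop, _ = slice(start, stop).indices(len(source))
        self._source = source
        self._start = start
        self._stop = max(start, stop)
        self._value = None  # flat string, filled in on first render

    def __len__(self):
        # Known from the offsets alone; no characters are copied.
        return self._stop - self._start

    def render(self):
        if self._value is None:
            self._value = self._source[self._start:self._stop]
            # Release the source once the copy exists.
            self._source = None
        return self._value

    def __str__(self):
        return self.render()


big = "x" * 1000 + "needle"
view = LazySlice(big, 1000)
print(len(view), str(view))
```

One trade-off of this design: until render() runs, the placeholder keeps
the whole source string alive, so a six-character view can pin a much
larger buffer in memory.
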

When considering this patch, please keep in mind that the "lazy" changes
are distinct, and could be incorporated independently.  In particular
I'm guessing that "lazy concatenations" have a lot higher chance of
being accepted than "lazy slices".

These changes were implemented almost entirely in
Include/stringobject.h and Objects/stringobject.c.

With this patch applied, trunk builds and passes all expected tests
on Win32 and Linux.

For a more thorough discussion of this patch, please see the attached
text file(s).
msg51322 - (view) Author: Paul Hankin (paulhankin) Date: 2007-03-11 17:27
I really like the idea of the lazy cats, and can believe that it's a really good optimisation, but before I review this code properly I'd like to see:
a. convincing that it doesn't break strict aliasing (a casual reading suggests it does)
b. lazy slices removed into their own patch (or just removed) - I don't want to recommend a patch containing them
c. adherence to coding standard
d. a little more explanation of how the cat objects work: it's important because they're a future minefield of bugs.

msg51323 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2007-03-11 18:19
Howdy!  Much has transpired since I posted this patch.
* Guido expressed interest in having it in Py3k.
* I ported it to Py3k; it's Python patch #1629305 on SourceForge.
* Guido didn't like it, specifically discussing the pathological behavior of "lazy slices".
* I created a "v2 lazy slices" that eliminated the pathological behavior but added a lot of complexity.
* I ran a poll on the Py3k mailing list to see how interested people were in "lazy concatenation" and "v2 lazy slices".  Most people were +1 on lazy concatenation, and -1 on lazy slices (v1 or v2), a position I can completely endorse.  However, no Python luminaries replied, which--given the patch's checkered past--seemed like a vote of no-confidence.
* Guido closed patch #1629305.

Is there life after Guido patch-closing?  I'd be happy to spend the time answering your questions if my patch had some sort of future.  (Though you'll have to tell me what you mean by "break strict aliasing".)
msg51324 - (view) Author: Paul Hankin (paulhankin) Date: 2007-03-11 20:16
Hi Larry,
It doesn't sound too promising - I'm new and have no powers of resurrection :(

By strict aliasing, I just meant it's illegal to access members of one type if the object is of a different (incompatible) type (actually I was wrong, this isn't the strict aliasing rule - it's a more fundamental one). In your case, it means it's illegal to pass a concat object where a string object is expected, even if the function accesses only members that are common to them both. If this is happening, the answer is to make a union with the string object and cat object as members, and to use this union type instead, but it's not pretty.

I suggest this patch is closed anyway. If you still believe in your code and think that lazy string cats have support, I suggest making a new patch with just those in (fixed up to be correct C, and PEP 7 compliant).
msg58745 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-12-18 12:16
I'm raising the level to draw more attention to this feature. It should
either be considered for inclusion or closed.
msg84583 - (view) Author: Daniel Diniz (ajaksu2) (Python triager) Date: 2009-03-30 17:23
ISTM that 2.x won't receive this kind of enhancement anymore.

Collin, I'm adding you to the nosy list because you may be interested in
having this (either on unladen or CPython). If so, also take a look at
issue 1569040.
msg84588 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-03-30 17:25
Either this bug or #1569040 should be closed as duplicate of the other
(it's really the same approach by the same author at two different times).

Date                 User              Action  Args
2010-08-11 20:18:44  eric.araujo       set     status: open -> closed; resolution: out of date; superseder: Speed up using + for string concatenation; stage: resolved
2009-03-30 17:25:47  pitrou            set     nosy: + pitrou; messages: + msg84588
2009-03-30 17:23:20  ajaksu2           set     nosy: + ajaksu2, collinwinter; messages: + msg84583; versions: + Python 2.7, - Python 2.6
2007-12-18 12:16:02  christian.heimes  set     priority: normal -> high; type: enhancement; messages: + msg58745; nosy: + christian.heimes
2006-11-04 06:30:59  larry             create