Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The "lazy strings" patch #44194

Closed
larryhastings opened this issue Nov 4, 2006 · 7 comments
Closed

The "lazy strings" patch #44194

larryhastings opened this issue Nov 4, 2006 · 7 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@larryhastings
Copy link
Contributor

BPO 1590352
Nosy @pitrou, @larryhastings, @tiran, @devdanzin
Superseder
  • bpo-1569040: Speed up using + for string concatenation
  • Files
  • python.lch.lazy.string.patch.52618.diff: Patch against 2.6 trunk, revision 52618.
  • lazy.strings.patch.monograph.txt: An in-depth description of the patch and its ramifications, as of revision 52618.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2010-08-11.20:18:44.543>
    created_at = <Date 2006-11-04.06:30:59.000>
    labels = ['interpreter-core', 'type-feature']
    title = 'The "lazy strings" patch'
    updated_at = <Date 2010-08-11.20:18:44.543>
    user = 'https://github.com/larryhastings'

    bugs.python.org fields:

    activity = <Date 2010-08-11.20:18:44.543>
    actor = 'eric.araujo'
    assignee = 'none'
    closed = True
    closed_date = <Date 2010-08-11.20:18:44.543>
    closer = 'eric.araujo'
    components = ['Interpreter Core']
    creation = <Date 2006-11-04.06:30:59.000>
    creator = 'larry'
    dependencies = []
    files = ['7603', '7604']
    hgrepos = []
    issue_num = 1590352
    keywords = ['patch']
    message_count = 7.0
    messages = ['51321', '51322', '51323', '51324', '58745', '84583', '84588']
    nosy_count = 6.0
    nosy_names = ['collinwinter', 'pitrou', 'larry', 'christian.heimes', 'ajaksu2', 'paulhankin']
    pr_nums = []
    priority = 'high'
    resolution = 'out of date'
    stage = 'resolved'
    status = 'closed'
    superseder = '1569040'
    type = 'enhancement'
    url = 'https://bugs.python.org/issue1590352'
    versions = ['Python 2.7']

    @larryhastings
    Copy link
    Contributor Author

    This patch consists of three changes to CPython:

    • changing PyStringObject.ob_sval,
    • "lazy concatenations", and
    • "lazy slices".
      None of these changes adds new functionality to CPython;
      they are all speed or memory optimizations.

    In detail:

    PyStringObject.ob_sval was changed from a char[] array
    to a char *. This is not in and of itself particularly
    desirable. It was necessary in order to implement the
    other two changes.

    "lazy concatenations" change string concatenation ("a" + "b") so that,
    instead of directly calculating the resulting string, it returns a
    placeholder object representing the result. As a result, string
    concatenation in CPython is now more than 150% faster on average (as
    reported by pystone 2.0), and is approximately as fast as the standard
    string concatenation idiom ("".join([a + b + c])).

    "lazy slices" changes string slicing ("abc"[1], "a".strip()) so
    that, instead of directly calculating the resulting string, it
    returns a placeholder object representing the result. As a result,
    string slicing in CPython is now more than 60% faster on average
    (as reported by pystone 2.0).

    When considering this patch, please keep in mind that the "lazy" changes
    are distinct, and could be incorporated independently. In particular
    I'm guessing that "lazy concatenations" have a lot higher chance of
    being accepted than "lazy slices".

    These changes were implemented almost entirely in
    Include/stringobject.h and Objects/stringobject.c.

    With this patch applied, trunk builds and passes all expected tests
    on Win32 and Linux.

    For a more thorough discussion of this patch, please see the attached
    text file(s).

    @larryhastings larryhastings added interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Nov 4, 2006
    @paulhankin
    Copy link
    Mannequin

    paulhankin mannequin commented Mar 11, 2007

    I really like the idea of the lazy cats, and can believe that it's a really good optimisation, but before I review this code properly I'd like to see:
    a. convincing that it doesn't break strict aliasing (a casual reading suggests it does)
    b. lazy slices removed into their own patch (or just removed) - I don't want to recommend a patch containing them
    c. adherence to coding standard
    d. a little more explanation of how the cat objects work: it's important because they're a future minefield of bugs.

    @larryhastings
    Copy link
    Contributor Author

    Howdy! Much has transpired since I posted this patch.

    • Guido expressed interest in having it in Py3k.
    • I ported it to Py3k; it's Python patch bpo-1629305 on SourceForge.
    • Guido didn't like it, specifically discussing the pathological behavior of "lazy slices".
    • I created a "v2 lazy slices" that eliminated the pathological behavior but added a lot of complexity.
    • I ran a poll on the Py3k mailing list to see how interested people were in "lazy concatenation" and "v2 lazy slices". Most people were +1 on lazy concatenation, and -1 on lazy slices (v1 or v2), a position I can completely endorse. However, no Python luminaries replied, which--given the patch's checkered past--seemed like a vote of no-confidence.
    • Guido closed patch bpo-1629305.

    Is there life after Guido patch-closing? I'd be happy to spend the time answering your questions if my patch had some sort of future. (Though you'll have to tell me what you mean by "break strict aliasing".)

    @paulhankin
    Copy link
    Mannequin

    paulhankin mannequin commented Mar 11, 2007

    Hi Larry,
    It doesn't sound too promising - I'm new and have no powers of resurrection :(

    By strict aliasing, I just meant it's illegal to access members of one type if the object is of a different (incompatible) type (actually I was wrong, this isn't the strict aliasing rule - it's a more fundamental one). In your case, it means it's illegal to pass a concat object where a string object is expected, even if the function accesses members that are common to them both. If this is happening, the answer is to make a union with the string object and cat object as members, and to use this union type instead but it's not pretty.

    I suggest this patch is closed anyway. If you still believe in your code and think that lazy string cats have support, I suggest making a new patch with just those in (fixed up to be correct C, and PEP-7 compliant).

    @tiran
    Copy link
    Member

    tiran commented Dec 18, 2007

    I'm raising the level to draw more attention to this featue. It should
    either be considered for inclusion or closed.

    @tiran tiran added type-feature A feature request or enhancement labels Dec 18, 2007
    @devdanzin
    Copy link
    Mannequin

    devdanzin mannequin commented Mar 30, 2009

    ISTM that 2.x won't receive this kind on enhancement anymore.

    Collin, I'm adding you to the nosy list because you may be interested in
    having this (either on unladen or CPython). If so, also take a look at
    bpo-1569040.

    @pitrou
    Copy link
    Member

    pitrou commented Mar 30, 2009

    Either this bug or bpo-1569040 should be closed as duplicate of the other
    (it's really the same approach by the same author at two different times
    :-)).

    @merwok merwok closed this as completed Aug 11, 2010
    @merwok merwok closed this as completed Aug 11, 2010
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants