Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Unicode "lazy strings" patches #44420

Closed
larryhastings opened this issue Jan 6, 2007 · 26 comments
Closed

The Unicode "lazy strings" patches #44420

larryhastings opened this issue Jan 6, 2007 · 26 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@larryhastings
Copy link
Contributor

BPO 1629305
Nosy @malemburg, @gvanrossum, @josiahcarlson, @larryhastings
Files
  • lch.py3k.unicode.lazy.concat.patch.1.txt: First cut of the "lazy concatenation" patch for Unicode strings in py3k
  • lch.py3k.unicode.lazy.concat.patch.3.txt: Third version of the "lazy concatenation" patch for Unicode strings in py3k
  • lch.py3k.unicode.lazy.concat.patch.53392.txt: "Lazy concatenation" only for Unicode in Py3k, patch against Python revision 53392.
  • lch.py3k.unicode.lazy.slice.and.concat.patch.53392.txt: "Lazy strings" patch, with "lazy slices" and "lazy concatenation", patch against Python revision 53392.
  • pybench.first.results.zip: Results from initial pybench testing.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2008-01-06.22:29:46.238>
    created_at = <Date 2007-01-06.09:37:36.000>
    labels = ['interpreter-core']
    title = 'The Unicode "lazy strings" patches'
    updated_at = <Date 2008-01-06.22:29:46.238>
    user = 'https://github.com/larryhastings'

    bugs.python.org fields:

    activity = <Date 2008-01-06.22:29:46.238>
    actor = 'admin'
    assignee = 'none'
    closed = True
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2007-01-06.09:37:36.000>
    creator = 'larry'
    dependencies = []
    files = ['7695', '7696', '7697', '7698', '7699']
    hgrepos = []
    issue_num = 1629305
    keywords = ['patch']
    message_count = 26.0
    messages = ['51660', '51661', '51662', '51663', '51664', '51665', '51666', '51667', '51668', '51669', '51670', '51671', '51672', '51673', '51674', '51675', '51676', '51677', '51678', '51679', '51680', '51681', '51682', '51683', '51684', '51685']
    nosy_count = 4.0
    nosy_names = ['lemburg', 'gvanrossum', 'josiahcarlson', 'larry']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue1629305'
    versions = ['Python 3.0']

    @larryhastings
    Copy link
    Contributor Author

    These are patches to add lazy processing to Unicode strings for Python 3000. I plan to post separate patches for both "lazy concatenation" and "lazy slices", as I suspect "lazy concatenation" has a much higher chance of being accepted.

    There is a long discussion about "lazy concatenation" here:
    http://mail.python.org/pipermail/python-dev/2006-October/069224.html
    And another long discussion about "lazy slices" here:
    http://mail.python.org/pipermail/python-dev/2006-October/069506.html

    Note that, unlike the 8-bit-character strings patches, I don't expect the "lazy slices" patch to be dependent on the "lazy concatenation" patch. Unicode objects are stored differently, and already use a pointer to a separately-allocated buffer. This was the big (and mildly controversial) change made by the 8-bit-character "lazy concatenation" patch, and "lazy slices" needed it too. Since Unicode objects already look like that, the Unicode lazy patches should be independent.

    @larryhastings larryhastings added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Jan 6, 2007
    @larryhastings larryhastings added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Jan 6, 2007
    @josiahcarlson
    Copy link
    Mannequin

    josiahcarlson mannequin commented Jan 7, 2007

    What are the performance characteristics of each operation? I presume that a + b for unicode strings a and b is O(1) time (if I understand your implementation correctly). But according to my reading, (a + b + c + ...)[i] is O(number of concatenations performed). Is this correct?

    @malemburg
    Copy link
    Member

    While I don't think the added complexity in the implementation is worth it, given that there are other ways of achieving the same kind of performance (e.g. list of Unicode strings), some comments:

    • you add a long field to every Unicode object - so every single object in the system pays 4-8 bytes for the small performance advantage

    • Unicode objects are often references using PyUnicode_AS_UNICODE(); this operation doesn't allow passing back errors, yet your lazy evaluation approach can cause memory errors - how are you going to deal with them ? (currently you don't even test for them)

    • the lazy approach keeps all partial Unicode objects alive until they finally get concatenated; if you have lots of those (e.g. if you use x += y in a loop), then you pay the complete Python object overhead for every single partial Unicode object in the list of strings - given that most such operations use short strings, you are likely creating a memory overhead far greater than the the total length of all the strings

    @larryhastings
    Copy link
    Contributor Author

    jcarlson:
    The first time someone calls PyUnicode_AsUnicode() on a concatenation object, it renders the string, and that's an O(something) operation. In general this rendering is O(i), aka linear time, though linear related to *what* depends. (It iterates over the m concatenated strings, and each of the n characters in those strings, and whether n or m is more important depends on their values.) After rendering, the object behaves like any other Unicode string, including O(1) for array element lookup.

    If you're referring to GvR's statement "I mention performance because s[i] should remain an O(1) operation.", here:
    http://mail.python.org/pipermail/python-3000/2006-December/005281.html
    I suspect this refers to the UCS-2 vs. UTF-16 debate.

    lemberg:
    Your criticisms are fair; lazy evaluation is a tradeoff. In general my response to theories about how it will affect performance is "I invite you to try it and see".

    As for causing memory errors, the only problem I see is not checking for a NULL return from PyMem_NEW() in PyUnicode_AsUnicode(). But that's a bug, not a flaw in my approach, and I'll fix that bug today. I don't see how "[my] approach can cause memory errors" in any sort of larger sense.

    @larryhastings
    Copy link
    Contributor Author

    Revised the lazy concatenation patch to add (doh!) a check for when PyMem_NEW() fails in PyUnicode_AsUnicode().
    File Added: lch.py3k.unicode.lazy.concat.patch.2.txt

    @larryhastings
    Copy link
    Contributor Author

    Continuing the comedy of errors, concat patch #2 was actually the same as #1, it didn't have the fix for detecting a NULL return of PyMem_NEW(). Fixed in concat patch #3. (Deleting concat patch #2.)
    File Added: lch.py3k.unicode.lazy.concat.patch.3.txt

    @josiahcarlson
    Copy link
    Mannequin

    josiahcarlson mannequin commented Jan 10, 2007

    From what I understand, the point of the lazy strings patch is to make certain operations faster. What operations? Generally speaking, looped concatenation (x += y), and other looping operations that have traditionally been slow; O(n^2).

    While this error is still common among new users of Python, generally users only get bit once. They ask about it on python-list and are told: z = []; z.append(y); x = ''.join(z) .

    Then again, the only place where I've seen the iterative building up of *text* is really in document reformatting (like textwrap). Basically all other use-cases (that I have seen) generally involve the manipulation of binary data. Larry, out of curiosity, have you found code out there that currently loops and concatenates unicode?

    @larryhastings
    Copy link
    Contributor Author

    Much of what I do in Python is text processing. My largest Python project to date was an IDL which spewed out loads of text; I've also written an HTML formatter or two. I seem to do an awful lot of string concatenation in Python, and I'd like it to be fast. I'm not alone in this, as there have been several patches to Python in recent years to speed up string concatenation.

    Perhaps you aren't familiar with my original justification for the patch. I've always hated the "".join() idiom for string concatenation, as it violates the "There should be one--and preferably only one--obvious way to do it" principle (and arguably others). With lazy concatenation, the obvious way (using +) becomes competitive with "".join(), thus dispensing with the need for this inobvious and distracting idiom.

    For a more thorough dissection of the (original) patch, including its implementation and lots of discussion from other people, please see the original thread on c.l.p:
    http://groups.google.com/group/comp.lang.python/browse_frm/thread/b8a8f20bc3c81bcf
    Please ignore the benchmarks there, as they were quite flawed.

    And, no, I haven't seen a lot of code manipulating Unicode strings yet, but then I'm not a Python shaker-and-mover. Obviously I expect to see a whole lot more when Py3k is adopted.

    @malemburg
    Copy link
    Member

    Larry, I probably wasn't clear enough:

    PyUnicode_AS_UNICODE() returns a pointer to the underlying Py_UNICODE buffer. No API using this macro checks for a NULL return value of the macro since a Unicode object is guaranteed to have a non-NULL Py_UNICODE buffer. As a result, a memory caused during the concatenation process cannot be passed back up the call stack. The NULL return value would result in a plain segfault in the calling API.

    Regarding the tradeoff and trying such an approach: I've done such tests myself (not with Unicode but with 8-bit strings) and it didn't pay off. The memory consumption outweighs the performance you gain by using the 'x += y' approach. The ''.join(list) approach also doesn't really help if you're after performance (for much the same reasons).

    In mxTextTools I used slice integers pointing into the original parsed string to work around these problems, which works great and avoids creating short strings altogether (so you gain speed and memory).

    A patch I would find a lot more useful is one to create a Unicode alternative to cStringIO - for strings, this is by far the most performant way of creating a larger string from lots of small pieces. To complement this, a smart slice type might also be an attractive target; one that breaks up a larger string into slices and provides operations on these, including joining them to form a new string.

    I'm not convinced that murking with the underlying object type and doing "subtyping" on-the-fly is a clean design.

    @larryhastings
    Copy link
    Contributor Author

    lemburg:

    You're right, the possibility of PyUnicode_AS_UNICODE() returning NULL is new behavior, and this could conceivably result in crashes. To be clear: NULL return values will only happen when allocation of the final "str" buffer fails during lazy rendering. This will only happen in out-of-memory conditions; for right now, while the patch is under early review, I suspect that's okay.

    So far I've come up with four possible ways to resolve this problem, which I will list here from least-likely to most-likely:

    1. Redefine the API such that PyUnicode_AS_UNICODE() is allowed to return NULL, and fix every place in the Python source tree that calls it to check for a NULL return. Document this with strong language for external C module authors.
    2. Change the length to 0 and return a constant empty string. Suggest that users of the Unicode API ask for the pointer *first* and the length *second*.
    3. Change the length to 0 and return a previously-allocated buffer of some hopefully-big-enough-size (4096 bytes? 8192 bytes?), such that even if the caller iterates over the buffer, odds are good they'll stop before they hit the end. Again, suggest that users of the Unicode API ask for the pointer *first* and the length *second*.
    4. The patch is not accepted.

    Of course, I'm open to suggestions of other approaches. (Not to mention patches!)

    Regarding your memory usage and "slice integers" comments, perhaps you'll be interested in the full lazy patch, which I hope to post later today. "Lazy concatenation" is only one of the features of the full patch; the other is "lazy slices". For a full description of my "lazy slices" implementation, see this posting (and the subsequent conversation) to Python-Dev:
    http://mail.python.org/pipermail/python-dev/2006-October/069506.html
    And yes, lazy slices suffer from the same possible-NULL-return-from-PyUnicode_AS_UNICODE() problem that lazy concatenation does.

    As for your final statement, I never claimed that this was a particularly clean design. I merely claim it makes things faster and is (so far) self-contained. For the Unicode versions of my lazy strings patches, the only files I touched were "Include/unicodeobject.h" and "Objects/unicodeobject.c". I freely admit my patch makes those files *even fussier* to work on than they already are. But if you don't touch those files, you won't notice the difference*, and the patch makes some Python string operations faster without making anything else slower. At the very least I suggest the patches are worthy of examination.

    • Barring API changes to rectify the possible NULL return from PyUnicode_AS_UNICODE() problem, that is.

    @larryhastings
    Copy link
    Contributor Author

    File Added: lch.py3k.unicode.lazy.concat.patch.53392.txt

    @larryhastings
    Copy link
    Contributor Author

    Attached below you will find the full "lazy strings" patch, which has both "lazy concatenation" and "lazy slices". The diff is against the current revision of the Py3k branch, bpo-53392. On my machine (Win32) rt.bat produces identical output before and after the patch, for both debug and release builds.

    As I mentioned in a previous comment, you can read the description (and ensuing conversation) about "lazy slices" here:
    http://mail.python.org/pipermail/python-dev/2006-October/069506.html

    One new feature of this version: I added a method on a Unicode string, s.simplify(), which forces the string to "render" if it's one of my exotic string subtypes (a lazy concatenation or lazy slice). My goal is to assuage fears about pathological memory-use cases where you have long-lived tiny slices of gigantic strings. If you realize you're having that problem, simply add calls to .simplify() on the slices and the problem should go away.

    As for the semantics of .simplify(), it returns a reference to the string s. Honestly I wasn't sure whether it should return a new string or just monkey with the existing string. Really, rendering doesn't change the string; it's the same string, with the exact same external behavior, just with different bits floating around underneath. For now it monkeys with the existing string, as that seemed best. (But I'd be happy to switch it to returning a new string if it'd help.)

    I had planned to make the "lazy slices" patch independent of the "lazy concatenation" patch. However, it wound up being a bigger pain that I thought, and anyway I figure the likelyhood that "lazy slices" would be accepted and "lazy concatenation" would not is effectively zero. So I didn't bother. If there's genuine interest in "lazy slices" without "lazy concatenation", I can produce such a thing.
    File Added: lch.py3k.unicode.lazy.slice.and.concat.patch.53392.txt

    @larryhastings
    Copy link
    Contributor Author

    File Added: lch.py3k.unicode.lazy.concat.patch.53392.txt

    @larryhastings
    Copy link
    Contributor Author

    Just fixed the build under Linux--sorry, should have done that before posting the original patch. Patches now built and tested under Win32 and Linux, and produce the same output as an unpatched py3k trunk.

    lemburg: A minor correction: the full "lazy strings" patch (with "lazy slices") also touches "stringlib/partition.h", "stringlib/readme.txt", and "Objects/stringobject.c", in addition to the two unicodeobject.* files. The changes to these three files are minuscule, and don't affect their maintainability, so the gist of my statements still hold. (Besides, all three of those files will probably go away before Py3k ships.)
    File Added: lch.py3k.unicode.lazy.slice.and.concat.patch.53392.txt

    @josiahcarlson
    Copy link
    Mannequin

    josiahcarlson mannequin commented Jan 12, 2007

    I don't think that changing the possible return of PyUnicode_AS_UNICODE is reasonable. (option 1)

    Option 2 breaks the buffer interface.

    Option 3 severely limits the size of potential unicode strings. If you are only manipulating tiny unicode strings (8k?), then the effect of fast concatenation, slicing, etc., isn't terribly significant.

    Option 4 is possible, but I know I would feel bad if all of this work went to waste.

    Note what M. A. Lemburg mentioned. The functionality is useful, it's the polymorphic representation that is the issue. Rather than attempting to change the unicode representation, what about a wrapper type? Keep the base unicode representation simple (both Guido and M. A. have talked about this). Guido has also stated that he wouldn't be against views (slicing and/or concatenation) if they could be shown to have real use-cases. The use-cases you have offered here are still applicable, and because it wouldn't necessitate a (not insignificant) change in semantics and 3rd party code, would make it acceptable.

    @larryhastings
    Copy link
    Contributor Author

    josiahcarlson:

    I think you misunderstood options 2 and 3. The empty string (option 2) or
    nonempty but fixed size string (option 3) would *only* be returned in the
    event of an allocation failure, aka "the process is out of memory". Since
    it's out of memory yet trying to allocate more, it has *already* failed.
    My goal in proposing options 2 and 3 was that, when this happens (and it
    eventually will), Python would fail *gracefully* with an exception, rather
    than *miserably* with a bus error.

    As for writing a wrapper, I'm just not interested. I'm a strong believer
    in "There should be one--and preferably only one--obvious way to do it",
    and I feel a special-purpose wrapper class for good string performance
    adds mental clutter. The obvious way to do string concatenation is with
    "+"; the obvious way to to string slices is with "[:]". My goal is to
    make those fast so that you can use them *everywhere*--even in
    performance-critical code. I don't want a wrapper class, and have no
    interest in contributing to one.

    For what it's worth, I came up with a fifth approach this morning while
    posting to the Python-3000 mailing list: pre-allocate the str buffer,
    updating it to the correct size whenever the lazy object changes size.
    That would certainly fix the problem; the error would occur in a much
    more reportable place. But it would also slow down the code quite a lot,
    negating many of the speed gains of this approach.

    @larryhastings
    Copy link
    Contributor Author

    File Added: pybench.first.results.zip

    @gvanrossum
    Copy link
    Member

    Problems so far:

    • Style: you set your tab stops to 4 spaces. That is an absolute
      no-no! You can indent using 4 spaces, but you should NEVER assume
      that a TAB character is anything except 8 spaces.

    • Segfault in test_array. It seems that it's receiving a unicode slice
      object and treating it like a "classic" unicode object.

    • I got it to come to a grinding halt with the following worst-case scenario:

      a = []
      while True:
          x = u"x"*1000000
          x = x[30:60]  # Short slice of long string
          a.append(x)

    If you can't do better than that, I'll have to reject it.

    PS I used your combined patch, if it matters.

    @larryhastings
    Copy link
    Contributor Author

    Thanks for taking the time!

    • Style: you set your tab stops to 4 spaces. That is an absolute
      no-no!

    Sorry about that; I'll fix it if I resubmit.

    • Segfault in test_array. It seems that it's receiving a unicode
      slice object and treating it like a "classic" unicode object.

    I tested on Windows and Linux, and I haven't seen that behavior.

    Which test_array, by the way? In Lib/test, or Lib/ctypes/test?
    I'm having trouble with most of the DLL extensions on Windows;
    they complain that the module uses the incompatible python26.dll
    or python26_d.dll. So I haven't tested ctypes/test_array.py
    on Windows, but I have tested the other three permutations of
    Linux vs Windows and Lib/test/test_array vs
    Lib/ctypes/test/test_array.

    Can you give me a stack trace to the segfault? With that I bet I
    can fix it even without a reproducible test case.

    • I got it to come to a grinding halt with the following worst-case
      scenario:

      a = []
      while True:
      x = u"x"*1000000
      x = x[30:60] # Short slice of long string
      a.append(x)

    If you can't do better than that, I'll have to reject it.

    PS I used your combined patch, if it matters.

    It matters. The combined patch has "lazy slices", the other
    patch does not.

    When you say "grind to a halt" I'm not sure what you mean.
    Was it thrashing? How much CPU was it using?

    When I ran that test, my Windows computer got to 1035 iterations
    then threw a MemoryError. My Linux box behaved the same, except
    it got to 1605 iterations.

    Adding a call to .simplify() on the slice defeats this worst-case
    scenario:

    a = []
    while True:
        x = u"x"*1000000
        x = x[30:60].simplify()  # Short slice of long string
        a.append(x)

    .simplify() forces lazy strings to render themselves. With that
    change, this test will run until the cows come home. Is that
    acceptable?

    Failing that, is there any sort of last-ditch garbage collection
    pass that gets called when a memory allocation fails but before
    it returns NULL? If so, I could hook in to that and try to render
    some slices. (I don't see such a pass, but maybe I missed it.)

    Failing that, I could add garbage-collect-and-retry-once logic to
    memory allocation myself, either just for unicodeobject.c or as a
    global change. But I'd be shocked if you were interested in that
    approach; if Python doesn't have such a thing by now, you probably
    don't want it.

    And failing that, "lazy slices" are probably toast. It always was
    a tradeoff of speed for worst-case memory use, and I always knew
    it might not fly. If that's the case, please take a look at the
    other patch, and in the meantime I'll see if anyone can come up with
    other ways to mitigate the worst-case scenario.

    @larryhastings
    Copy link
    Contributor Author

    Here's another possible fix for the worst-case scenario:

    #define MAX_SLICE_DELTA (64*1024)
    if ( ((size_of_slice + MAX_SLICE_DELTA) > size_of_original) 
        || (size_of_slice > (size_of_original / 2))  )
        use_lazy_slice();
    else
        create_string_as_normal();

    You'd still get the full benefit of lazy slices most of the time, but it takes the edge off the really pathological cases.

    How's that?

    @gvanrossum
    Copy link
    Member

    Sorry, the test_array failure was due to not rebuilding after patching. Because extension modules are built using distutils, they don't get automatically rebuilt when a relevant header has changed.

    "grind to a halt": swapping, probably, due to memory filling up with 1M-character string objects, as you experienced yourself.

    Your proposal takes the edge off, although I can still come up with a worst-case scenario (just use 64K strings instead of 1M strings, and leave the rest the same).

    I am far from convinced that replacing one pathological case (O(N**2) concatenation, which is easily explained and avoided) with another (which is harder to explain due to the more complicated algorithms and heuristics involved) is a good trade-off.

    This is all the worse since your optimization doesn't have a clear time/space trade-off: it mostly attempts to preserve time *and* space, but in the worst case it can *waste* space. (And I'm not convinced there can't be a pathological case where it is slower, too.) And the gains are dependent on the ability to *avoid* ultimately rendering the string; if every string ends up being rendered, there is no net gain in space, and there might be no net gain in time either (at least not for slices).

    I believe I would rather not pursue this patch further at this time; a far more important programming task is the str/unicode unification (now that the int/long unification is mostly there).

    If you want to clean up the patch, I suggest that you add a large comment section somewhere (unicode.h?) describing the algorithms in a lot of detail, including edge cases and performance analysis, to make review of the code possible. But you're most welcome to withdraw it, too; it would save me a lot of headaches.

    @larryhastings
    Copy link
    Contributor Author

    As discussed (briefly) over email, I'm moving this discussion back to the Python-3000 mailing list. But before I do I wanted to clear up something from your reply.

    "lazy concatenation" and "lazy slices" are really two patches, filed under the "lazy slices" penumbra. They are different optimizations, with different implementations and different behaviors. I implemented them cumulatively to save work because they intertwine when merged, but I had hoped they would be considered independently. I apologize if this point was unclear (and moreso if it was a bad idea). My reason for doing so: I suspected "lazy slices" were doomed from the start; doing the patch this way meant wasting less work.

    One downside of "lazy slices" is their ability to waste loads of memory in the worst-case. Now, "lazy concatenation" simply doesn't have that problem. Yet the fourth and fifth paragraphs of your most recent reply imply you think it can.

    A quick recap of lazy concatenation:
    a = u"a"
    b = u"b"
    concat = a + b
    "concat" is a PyUnicodeConcatenationObject holding references to a and b (or rather their values). Its "value" is NULL, indicating that it is unrendered. The moment someone asks for the value of "concat", the object allocates space for its value, constructs the value by walking its tree of children, and frees its children. The implementation is heavily optimized for the general case (concatenation) and avoids recursion where possible.

    The worst-case memory consumption behavior of lazy concatenation is adding lots and lots of tiny strings and never rendering; that will allocate lots of PyUnicodeConcatenationObjects. But it's nowhere near as bad as a short lazy slice of a long string.

    Does that make "lazy concatenation" more palatable?

    2 similar comments
    @larryhastings
    Copy link
    Contributor Author

    As discussed (briefly) over email, I'm moving this discussion back to the Python-3000 mailing list. But before I do I wanted to clear up something from your reply.

    "lazy concatenation" and "lazy slices" are really two patches, filed under the "lazy slices" penumbra. They are different optimizations, with different implementations and different behaviors. I implemented them cumulatively to save work because they intertwine when merged, but I had hoped they would be considered independently. I apologize if this point was unclear (and moreso if it was a bad idea). My reason for doing so: I suspected "lazy slices" were doomed from the start; doing the patch this way meant wasting less work.

    One downside of "lazy slices" is their ability to waste loads of memory in the worst-case. Now, "lazy concatenation" simply doesn't have that problem. Yet the fourth and fifth paragraphs of your most recent reply imply you think it can.

    A quick recap of lazy concatenation:
    a = u"a"
    b = u"b"
    concat = a + b
    "concat" is a PyUnicodeConcatenationObject holding references to a and b (or rather their values). Its "value" is NULL, indicating that it is unrendered. The moment someone asks for the value of "concat", the object allocates space for its value, constructs the value by walking its tree of children, and frees its children. The implementation is heavily optimized for the general case (concatenation) and avoids recursion where possible.

    The worst-case memory consumption behavior of lazy concatenation is adding lots and lots of tiny strings and never rendering; that will allocate lots of PyUnicodeConcatenationObjects. But it's nowhere near as bad as a short lazy slice of a long string.

    Does that make "lazy concatenation" more palatable?

    @larryhastings
    Copy link
    Contributor Author

    As discussed (briefly) over email, I'm moving this discussion back to the Python-3000 mailing list. But before I do I wanted to clear up something from your reply.

    "lazy concatenation" and "lazy slices" are really two patches, filed under the "lazy slices" penumbra. They are different optimizations, with different implementations and different behaviors. I implemented them cumulatively to save work because they intertwine when merged, but I had hoped they would be considered independently. I apologize if this point was unclear (and moreso if it was a bad idea). My reason for doing so: I suspected "lazy slices" were doomed from the start; doing the patch this way meant wasting less work.

    One downside of "lazy slices" is their ability to waste loads of memory in the worst-case. Now, "lazy concatenation" simply doesn't have that problem. Yet the fourth and fifth paragraphs of your most recent reply imply you think it can.

    A quick recap of lazy concatenation:
    a = u"a"
    b = u"b"
    concat = a + b
    "concat" is a PyUnicodeConcatenationObject holding references to a and b (or rather their values). Its "value" is NULL, indicating that it is unrendered. The moment someone asks for the value of "concat", the object allocates space for its value, constructs the value by walking its tree of children, and frees its children. The implementation is heavily optimized for the general case (concatenation) and avoids recursion where possible.

    The worst-case memory consumption behavior of lazy concatenation is adding lots and lots of tiny strings and never rendering; that will allocate lots of PyUnicodeConcatenationObjects. But it's nowhere near as bad as a short lazy slice of a long string.

    Does that make "lazy concatenation" more palatable?

    @larryhastings
    Copy link
    Contributor Author

    Whoops, sorry. I refreshed a summary page I had lying around, which I guess re-posted the comment! Didn't mean to spam you with extra updates.

    @gvanrossum
    Copy link
    Member

    Looks like I forgot to close and reject this.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs)
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants