Message 51663 - Python tracker

Author	larry
Date	2007-01-08.18:50:02

Content
jcarlson:
The first time someone calls PyUnicode_AsUnicode() on a concatenation object, it renders the string, and that's an O(something) operation.  In general this rendering is O(i), i.e. linear time, though *what* it is linear in depends.  (It iterates over the m concatenated strings and over each of the n characters in those strings, so whether n or m matters more depends on their values.)  After rendering, the object behaves like any other Unicode string, including O(1) for array element lookup.
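
To make the render-once behaviour concrete, here is a rough Python sketch of the idea. It is not the C patch itself: the class name LazyConcat and its render() method are hypothetical stand-ins (render() plays the role that PyUnicode_AsUnicode() plays above), and the real object layout is assumed rather than shown.

```python
class LazyConcat:
    """Hypothetical stand-in for a lazily concatenated string object."""

    def __init__(self, *parts):
        self._parts = list(parts)  # pending pieces: str or LazyConcat
        self._rendered = None      # flat string, built on first demand

    def __add__(self, other):
        # O(1): remember both operands, copy no characters yet.
        return LazyConcat(self, other)

    def render(self):
        # First call walks every pending piece and copies every character
        # (linear time); later calls return the cached flat string.
        if self._rendered is None:
            pieces = []
            stack = [self]
            while stack:
                node = stack.pop()
                if isinstance(node, LazyConcat):
                    if node._rendered is not None:
                        pieces.append(node._rendered)
                    else:
                        # Reverse so the leftmost piece is popped first.
                        stack.extend(reversed(node._parts))
                else:
                    pieces.append(node)
            self._rendered = "".join(pieces)
            self._parts = None  # the pieces are no longer needed
        return self._rendered

    def __getitem__(self, index):
        # Indexing forces a render once; after that it is a plain O(1)
        # lookup into the flat string.
        return self.render()[index]


s = LazyConcat("ab") + "cd" + LazyConcat("ef", "gh")
first = s[0]  # triggers the one-time render
```

In this sketch the concatenations themselves copy nothing; the whole cost is paid at the first s[0], and every later lookup hits the cached flat string.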

If you're referring to GvR's statement "I mention performance because s[i] should remain an O(1) operation.", here:
http://mail.python.org/pipermail/python-3000/2006-December/005281.html
I suspect this refers to the UCS-2 vs. UTF-16 debate.

lemberg:
Your criticisms are fair; lazy evaluation is a tradeoff.  In general my response to theories about how it will affect performance is "I invite you to try it and see".

As for causing memory errors, the only problem I see is not checking for a NULL return from PyMem_NEW() in PyUnicode_AsUnicode().  But that's a bug, not a flaw in my approach, and I'll fix that bug today.  I don't see how "[my] approach can cause memory errors" in any sort of larger sense.

History
Date	User	Action	Args
2007-08-23 15:56:02	admin	link	issue1629305 messages
2007-08-23 15:56:02	admin	create