
Author lemburg
Date 2007-01-08.10:59:50
Content
While I don't think the added complexity in the implementation is worth it, given that there are other ways of achieving the same kind of performance (e.g. collecting the parts in a list of Unicode strings and joining them once; see the second sketch after these comments), some comments:

 * you add a long field to every Unicode object - so every single Unicode object in the system pays 4-8 extra bytes for this small performance advantage

 * Unicode objects are often accessed using PyUnicode_AS_UNICODE(); this operation provides no way to pass back errors, yet your lazy evaluation approach can cause memory errors at exactly that point - how are you going to deal with them? (currently you don't even test for them)

 * the lazy approach keeps all partial Unicode objects alive until they finally get concatenated; if you have lots of those (e.g. if you use x += y in a loop), then you pay the complete Python object overhead for every single partial Unicode object in the list of strings - given that most such operations use short strings, you are likely creating a memory overhead far greater than the total length of all the strings (the first sketch below illustrates this)
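
To illustrate the last point, here is a minimal Python sketch of what a deferred concatenation scheme keeps alive. The class and names are hypothetical and only model the idea, not the patch's actual C implementation:

    class LazyConcat(object):
        """Toy model of deferred concatenation; illustrative only."""
        def __init__(self, parts):
            self.parts = parts            # every partial string stays referenced here
        def __add__(self, other):
            # Concatenation just records the new piece instead of copying it.
            return LazyConcat(self.parts + [other])
        def render(self):
            # Only now are the pieces actually copied into one string.
            return u"".join(self.parts)

    s = LazyConcat([])
    for piece in [u"ab", u"cde", u"f"] * 1000:
        s = s + piece                     # 3000 short string objects stay alive
    text = s.render()                     # the partials can only be released after this

Each of those partials carries the full Python object overhead, so for short strings the bookkeeping easily outweighs the character data itself.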
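The alternative alluded to above - collecting the parts in a plain list and joining them once - gives the same linear amount of copying without touching the Unicode object layout. A sketch, not code from the patch:

    parts = []
    for piece in [u"ab", u"cde", u"f"] * 1000:
        parts.append(piece)               # no copying of character data yet
    text = u"".join(parts)                # one pass over all pieces at the end

    # Compare with repeated concatenation, which in the general case copies
    # the accumulated string on each iteration:
    text2 = u""
    for piece in [u"ab", u"cde", u"f"] * 1000:
        text2 += piece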

History
Date                 User   Action  Args
2007-08-23 15:56:01  admin  link    issue1629305 messages
2007-08-23 15:56:01  admin  create