
Author lemburg
Recipients jafo, lemburg, orivej, pitrou
Date 2008-03-20.10:03:58
Antoine, as I've already mentioned in my other comments, I'm -1 on
changing the Unicode object to a variable size object.

I also don't think that the micro-benchmarks you are applying really
test the implementation in real-life situations. The only case where
your patch appears significantly faster is the "long string" case. If
you look at the distribution of the Unicode strings generated by this
case, you'll find that most strings have fewer than 10-20 characters.
Optimizing pymalloc for these cases and tuning the parameters in the
Unicode implementation will likely give you the same performance
increase without having to sacrifice the advantages of using a pointer
in the object rather than inlining the data.
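
To make the trade-off concrete, here is a minimal C sketch of the two layouts being compared. It is not the actual CPython code: uchar_t, PtrString, InlineString and the helper functions are hypothetical names invented for illustration, and the sketch shows only one reading of the advantages of the pointer layout (the object itself never moves on resize, and the pointer could refer to a buffer owned elsewhere).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned short uchar_t;    /* hypothetical stand-in for Py_UNICODE */

/* Pointer layout: fixed-size header, characters in a separate block. */
typedef struct {
    size_t   length;
    uchar_t *str;                  /* can be reallocated or point elsewhere */
} PtrString;

/* Inline layout: header and characters share one variable-size block. */
typedef struct {
    size_t  length;
    uchar_t str[];                 /* C99 flexible array member */
} InlineString;

PtrString *ptr_new(size_t n)
{
    PtrString *s = malloc(sizeof *s);              /* allocation 1: header */
    if (s == NULL)
        return NULL;
    s->str = calloc(n + 1, sizeof *s->str);        /* allocation 2: data */
    if (s->str == NULL) {
        free(s);
        return NULL;
    }
    s->length = n;
    return s;
}

/* Resizing touches only the data block; the object address is stable. */
int ptr_resize(PtrString *s, size_t n)
{
    assert(s != NULL);
    uchar_t *p = realloc(s->str, (n + 1) * sizeof *p);
    if (p == NULL)
        return -1;
    p[n] = 0;
    s->str = p;
    s->length = n;
    return 0;
}

InlineString *inline_new(size_t n)
{
    /* A single allocation covers both the header and the characters. */
    InlineString *s = calloc(1, sizeof *s + (n + 1) * sizeof s->str[0]);
    if (s != NULL)
        s->length = n;
    return s;
}

/* Resizing may move the whole object, so callers must use the result. */
InlineString *inline_resize(InlineString *s, size_t n)
{
    assert(s != NULL);
    InlineString *t = realloc(s, sizeof *t + (n + 1) * sizeof t->str[0]);
    if (t == NULL)
        return NULL;
    t->str[n] = 0;
    t->length = n;
    return t;
}
```

The inline layout saves one allocation per string, which is where a variable-size object would be expected to gain speed. The pointer layout pays for the second allocation but keeps the header at a fixed address and leaves the data pointer free to be redirected.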

I'm +1 on the free list changes, though, in the long run, I think that
putting more effort into improving pymalloc and removing the free lists
altogether would be better.

BTW: Unicode slices would be a possible and fairly attractive target for
a C level subclass of Unicode objects. The pointer in the Unicode object
implementation could then point to the original Unicode object's buffer
and the subclass would add an extra pointer to the original object
itself (in order to keep it alive). The Unicode type (written by Fredrik
Lundh) I used as basis for the Unicode implementation worked with this idea.
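
The slice idea can be sketched in plain C as follows. This is a self-contained toy model under stated assumptions, not the CPython object layout and not Fredrik Lundh's original type: UString, USlice, the manual refcnt field and all function names are hypothetical, and the reference counting is simplified to what the example needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned short uchar_t;    /* hypothetical stand-in for Py_UNICODE */

typedef struct UString UString;
struct UString {
    int      refcnt;
    size_t   length;
    uchar_t *str;                  /* pointer to the character data */
    void   (*dealloc)(UString *);  /* type-specific destructor */
};

/* "Subclass": the base part comes first, plus one extra pointer. */
typedef struct {
    UString  head;
    UString *base;                 /* owner of the buffer, kept alive */
} USlice;

void u_decref(UString *s)
{
    assert(s->refcnt > 0);
    if (--s->refcnt == 0)
        s->dealloc(s);
}

static void ustring_dealloc(UString *s)
{
    free(s->str);                  /* an ordinary string owns its buffer */
    free(s);
}

static void uslice_dealloc(UString *s)
{
    USlice *v = (USlice *)s;       /* valid because head is the first member */
    u_decref(v->base);             /* release the original, not the buffer */
    free(v);
}

UString *ustring_new(const uchar_t *data, size_t n)
{
    UString *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->str = malloc((n + 1) * sizeof *s->str);
    if (s->str == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->str, data, n * sizeof *s->str);
    s->str[n] = 0;
    s->refcnt = 1;
    s->length = n;
    s->dealloc = ustring_dealloc;
    return s;
}

/* Create a view of base[start:stop] without copying any characters. */
UString *uslice_new(UString *base, size_t start, size_t stop)
{
    if (start > stop || stop > base->length)
        return NULL;
    USlice *v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;
    v->head.refcnt = 1;
    v->head.length = stop - start;
    v->head.str = base->str + start;   /* points into the original buffer */
    v->head.dealloc = uslice_dealloc;
    v->base = base;
    base->refcnt++;                    /* the slice keeps the original alive */
    return &v->head;
}
```

Code that only reads length and str can treat a slice like any other string. One caveat of this sketch is that a slice is generally not NUL-terminated, so consumers must rely on length rather than on a terminator.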