
Author lemburg
Recipients ajaksu2, amaury.forgeotdarc, collinwinter, ezio.melotti, jafo, lemburg, orivej, pitrou, vstinner
Date 2009-05-25.09:21:02
Amaury Forgeot d'Arc wrote:
> Amaury Forgeot d'Arc <> added the comment:
> Looking at the comments, it seems that the performance gain comes from
> the removal of the double allocation which is needed by the current design.
> Was the following implementation considered:
> - keep the current PyUnicodeObject structure
> - for small strings, allocate one chunk of memory:
> sizeof(PyUnicodeObject)+2*length. Then set self->str=(Py_UNICODE*)(self+1);
> - for large strings, self->str may be allocated separately.
> - unicode_dealloc() must be careful and not free self->str if it is
> contiguous to the object (it's probably a good idea to reuse the
> self->state field for this purpose).

AFAIK, this has not yet been investigated.

Note that in real-life applications, you hardly ever have to
call malloc on small strings - these are managed by pymalloc as
pieces of larger chunks and allocation/deallocation is generally
fast. You have the same situation for PyUnicodeObject itself
(which, as noted earlier, could be optimized in pymalloc even further,
since the size of PyUnicodeObject is fixed).

The OS malloc() is only called for longer strings and then only
for the string buffer itself - the PyUnicodeObject is again completely
managed by pymalloc, even in this case.
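
For illustration, the single-allocation layout proposed in the quoted
comment could be sketched roughly as follows. This is not CPython code:
demo_unicode, demo_char, DEMO_SMALL_LIMIT and the DEMO_STATE_* flags are
hypothetical stand-ins for PyUnicodeObject, Py_UNICODE, an unspecified
size cutoff and the self->state field, and plain malloc()/free() stand in
for whatever allocator would actually be used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the real CPython definitions. */
typedef uint16_t demo_char;        /* plays the role of a 2-byte Py_UNICODE */

#define DEMO_SMALL_LIMIT    64     /* assumed cutoff, in characters */
#define DEMO_STATE_SEPARATE 0      /* buffer was allocated on its own */
#define DEMO_STATE_INLINE   1      /* buffer is contiguous to the header */

typedef struct {
    size_t length;                 /* number of characters */
    demo_char *str;                /* points at the character buffer */
    int state;                     /* records how str was allocated */
} demo_unicode;

/* Allocate a zero-filled string of `length` characters.
   Overflow checks on the size computation are omitted in this sketch. */
demo_unicode *demo_unicode_new(size_t length)
{
    demo_unicode *self;
    size_t nbytes = (length + 1) * sizeof(demo_char);  /* +1 for terminator */

    if (length <= DEMO_SMALL_LIMIT) {
        /* Small string: one chunk holds the header and the buffer. */
        self = malloc(sizeof(demo_unicode) + nbytes);
        if (self == NULL)
            return NULL;
        self->str = (demo_char *)(self + 1);
        self->state = DEMO_STATE_INLINE;
    } else {
        /* Large string: the buffer is allocated separately. */
        self = malloc(sizeof(demo_unicode));
        if (self == NULL)
            return NULL;
        self->str = malloc(nbytes);
        if (self->str == NULL) {
            free(self);
            return NULL;
        }
        self->state = DEMO_STATE_SEPARATE;
    }
    self->length = length;
    memset(self->str, 0, nbytes);
    return self;
}

/* Free the string; the buffer is freed only if it is a separate chunk. */
void demo_unicode_dealloc(demo_unicode *self)
{
    if (self == NULL)
        return;
    if (self->state == DEMO_STATE_SEPARATE)
        free(self->str);
    free(self);
}

/* Return nonzero if the buffer starts right after the header. */
int demo_unicode_is_inline(const demo_unicode *self)
{
    return (const void *)self->str == (const void *)(self + 1);
}
```

With these assumptions, demo_unicode_new(5) needs a single allocation,
while demo_unicode_new(1000) needs two; demo_unicode_dealloc() uses the
state flag to decide whether the buffer has to be freed on its own.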