
Author loewis
Recipients dabeaz, ezio.melotti, loewis
Date 2012-10-16.21:35:23
Message-id <1350423323.27.0.367484867962.issue16254@psf.upfronthosting.co.za>
Content
As stated, this is not a bug: there is no memory leak, nor any deviation from documented behavior.

You are right that it fills the wstr pointer: unicode_aswidechar calls PyUnicode_AsUnicodeAndSize, and the data is then copied to a fresh buffer.

This is merely the simplest implementation; it's certainly possible to improve it. Contributions are welcome.

A number of things need to be considered:
- Computing the wstr size is somewhat expensive on a system with a 16-bit wchar_t, since the result may need surrogate pairs for non-BMP characters.
- I would suggest that, if possible, the wstr representation be handed out of the unicode object to the caller (resetting the object's wstr to NULL). This should give the greatest code reuse while avoiding unnecessary copying.
- It's not possible to do so for strings where wstr is shared with the canonical representation (i.e. a UCS-2 string on 16-bit wchar_t, and a UCS-4 string on 32-bit wchar_t).
- I don't think wstr should be cleared if it was already filled when the function got called. Instead, wstr should only be returned if it was originally NULL.
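
To make the considerations above concrete, here is a rough, self-contained sketch in C. It is not CPython code: toy_str, toy_take_wstr and utf16_units are hypothetical names invented for illustration, the wide-character type is modelled as a 16-bit uint16_t, and had_wstr stands in for "wstr was already filled when the function got called". It only shows the size computation and the hand-over-or-copy decision, under those assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Number of 16-bit units needed to store `len` UCS-4 code points as UTF-16.
   Each code point above U+FFFF takes a surrogate pair, so every character
   has to be inspected before the buffer size is known. */
static size_t utf16_units(const uint32_t *cp, size_t len)
{
    size_t units = len;
    for (size_t i = 0; i < len; i++) {
        if (cp[i] > 0xFFFF)
            units++;            /* one extra unit for the second surrogate */
    }
    return units;
}

/* Hypothetical stand-in for a string object; NOT CPython's PyUnicodeObject. */
typedef struct {
    uint16_t *wstr;     /* cached wide-character form, or NULL */
    size_t wstr_len;    /* length in 16-bit units, excluding the terminator */
    int wstr_shared;    /* nonzero if wstr aliases the canonical data */
} toy_str;

/* Return a NUL-terminated buffer that the caller must free(), or NULL if
   allocation fails. had_wstr says whether the cache already existed before
   the current call filled it. */
static uint16_t *toy_take_wstr(toy_str *s, int had_wstr)
{
    assert(s->wstr != NULL);    /* the cache must have been filled first */

    if (!had_wstr && !s->wstr_shared) {
        /* The cache was created only for this call and is not shared with
           the canonical data: hand it over and reset it, no copy needed. */
        uint16_t *out = s->wstr;
        s->wstr = NULL;
        s->wstr_len = 0;
        return out;
    }

    /* Otherwise leave the cache in place and return a fresh copy. */
    size_t bytes = (s->wstr_len + 1) * sizeof(uint16_t);
    uint16_t *out = malloc(bytes);
    if (out != NULL)
        memcpy(out, s->wstr, bytes);
    return out;
}
```

With this shape, the copy is only skipped in the one case where nobody else can be relying on the cached buffer; the shared and previously-filled cases fall back to the current copying behaviour.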
History
Date                 User    Action  Args
2012-10-16 21:35:23  loewis  set     recipients: + loewis, ezio.melotti, dabeaz
2012-10-16 21:35:23  loewis  set     messageid: <1350423323.27.0.367484867962.issue16254@psf.upfronthosting.co.za>
2012-10-16 21:35:23  loewis  link    issue16254 messages
2012-10-16 21:35:23  loewis  create