
Author vstinner
Recipients vstinner
Date 2012-10-24.18:38:18
Message-id <1351103901.39.0.0945105753188.issue16311@psf.upfronthosting.co.za>
In-reply-to
Content
The attached patch modifies the text decoders to use the _PyUnicodeWriter API in order to factor out duplicated code. It removes the unicode_widen() and unicode_putchar() functions.

 * Don't overallocate by default (except for the "raw-unicode-escape" codec); enable overallocation on the first decode error (as is done currently)
 * _PyUnicodeWriter_Prepare() only overallocates by 25%, instead of 100% for unicode_decode_call_errorhandler()
 * Use _PyUnicodeWriter_Prepare() + PyUnicode_WRITE() (two macros) instead of unicode_putchar() (a function)
 * The _PyUnicodeWriter structure stores many useful fields, so we don't have to pass multiple parameters to functions, only the writer
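
To illustrate the allocation strategy described in the list above, here is a minimal standalone sketch in C. It does not call the private _PyUnicodeWriter API and is not taken from the patch: the names sketch_writer, sketch_writer_prepare() and sketch_decode_ascii() are hypothetical, and the toy decoder simply replaces each non-ASCII byte with a 4-character \xNN escape. It only shows the general idea of a single writer structure, exact allocation by default, and 25% overallocation switched on at the first decode error.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a writer structure: it bundles the buffer,
   its capacity, the write position and the overallocation flag, so
   helpers only need one pointer argument. */
typedef struct {
    unsigned char *buf;   /* output buffer */
    size_t size;          /* allocated capacity, in characters */
    size_t pos;           /* number of characters written so far */
    int overallocate;     /* 0: allocate exactly, 1: add 25% headroom */
} sketch_writer;

static void sketch_writer_init(sketch_writer *w)
{
    memset(w, 0, sizeof(*w));
}

static void sketch_writer_free(sketch_writer *w)
{
    free(w->buf);
    memset(w, 0, sizeof(*w));
}

/* Make room for `extra` more characters.
   Returns 0 on success, -1 if memory allocation fails. */
static int sketch_writer_prepare(sketch_writer *w, size_t extra)
{
    size_t needed = w->pos + extra;
    if (needed <= w->size)
        return 0;
    if (w->overallocate)
        needed += needed / 4;           /* 25% headroom */
    unsigned char *p = realloc(w->buf, needed);
    if (p == NULL)
        return -1;
    w->buf = p;
    w->size = needed;
    return 0;
}

/* Write one character; the caller must have prepared the space. */
static void sketch_writer_write(sketch_writer *w, unsigned char ch)
{
    assert(w->pos < w->size);
    w->buf[w->pos++] = ch;
}

/* Toy ASCII decoder: a byte >= 0x80 is treated as a decode error and
   replaced by a 4-character \xNN escape (an assumption of this sketch). */
static int sketch_decode_ascii(sketch_writer *w,
                               const unsigned char *in, size_t len)
{
    static const char hex[] = "0123456789abcdef";

    /* No error seen yet: the output is exactly as long as the input. */
    if (sketch_writer_prepare(w, len) < 0)
        return -1;
    for (size_t i = 0; i < len; i++) {
        unsigned char ch = in[i];
        if (ch < 0x80) {
            if (sketch_writer_prepare(w, 1) < 0)
                return -1;
            sketch_writer_write(w, ch);
            continue;
        }
        /* First error: the output will outgrow the input, so enable
           overallocation before asking for more space. */
        w->overallocate = 1;
        if (sketch_writer_prepare(w, 4 + (len - i - 1)) < 0)
            return -1;
        sketch_writer_write(w, '\\');
        sketch_writer_write(w, 'x');
        sketch_writer_write(w, (unsigned char)hex[ch >> 4]);
        sketch_writer_write(w, (unsigned char)hex[ch & 15]);
    }
    return 0;
}
```

With this sketch, decoding pure ASCII input never allocates more than the input length, while input containing an invalid byte triggers one reallocation with 25% headroom that later writes can reuse.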

I wrote the patch to factor out duplicated code, but it might also make decoding faster.
History
Date                 User      Action  Args
2012-10-24 18:38:22  vstinner  set     recipients: + vstinner
2012-10-24 18:38:21  vstinner  set     messageid: <1351103901.39.0.0945105753188.issue16311@psf.upfronthosting.co.za>
2012-10-24 18:38:21  vstinner  link    issue16311 messages
2012-10-24 18:38:21  vstinner  create