
Author: loewis
Recipients: ezio.melotti, jcea, loewis, pitrou, vstinner
Date: 2011-12-17.19:50:11
Message-id: <1324151412.21.0.479618520532.issue13624@psf.upfronthosting.co.za>
In-reply-to:
Content
Can you please provide your exact testing procedure? The standard iobench.py doesn't support testing ASCII, UCS-1, and UCS-2 data separately, so you must have used some other tool. Exact code, command-line parameters, a hardware description, and timing results would be appreciated.

Looking at the encoder, I think the first thing to change is to reduce the over-allocation for UCS-1 and UCS-2 strings. This may or may not help the run-time, but should reduce memory consumption.
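
To make that concrete, here is a minimal sketch (a hypothetical helper, not the actual encoder code) of sizing the output by the worst case for the string's kind: UCS-1 characters expand to at most 2 UTF-8 bytes and UCS-2 characters to at most 3, so a blanket 4-bytes-per-character allocation can be tightened.

#define PY_SSIZE_T_CLEAN
#include <Python.h>

/* Hypothetical helper: worst-case UTF-8 output size for a string of the
   given PEP 393 kind and length. */
static Py_ssize_t
utf8_worst_case_size(int kind, Py_ssize_t len)
{
    switch (kind) {
    case PyUnicode_1BYTE_KIND:   /* U+0000..U+00FF: at most 2 bytes each */
        return 2 * len;
    case PyUnicode_2BYTE_KIND:   /* U+0000..U+FFFF: at most 3 bytes each */
        return 3 * len;
    default:                     /* UCS-4: up to 4 bytes per code point */
        return 4 * len;
    }
}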

I wonder whether making two passes over the string (one to compute the exact output size, and a second to fill a result buffer allocated to that size) could improve performance.
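
For the UCS-1 case, such a two-pass encoder could look roughly like this sketch (plain C with hypothetical names; error handling reduced to a NULL return): the first pass counts the exact UTF-8 size, the second writes into a buffer allocated to exactly that size.

#include <stddef.h>
#include <stdlib.h>

/* First pass: exact UTF-8 length of a Latin-1 (UCS-1) buffer. */
static size_t
ucs1_utf8_size(const unsigned char *s, size_t len)
{
    size_t size = len;
    for (size_t i = 0; i < len; i++)
        if (s[i] >= 0x80)        /* U+0080..U+00FF take two bytes */
            size++;
    return size;
}

/* Second pass: encode into a buffer of exactly the computed size. */
static unsigned char *
ucs1_encode_utf8(const unsigned char *s, size_t len, size_t *out_len)
{
    size_t size = ucs1_utf8_size(s, len);
    unsigned char *out = malloc(size ? size : 1);
    if (out == NULL)
        return NULL;
    unsigned char *p = out;
    for (size_t i = 0; i < len; i++) {
        unsigned char ch = s[i];
        if (ch < 0x80) {
            *p++ = ch;
        }
        else {                   /* 110xxxxx 10xxxxxx */
            *p++ = (unsigned char)(0xC0 | (ch >> 6));
            *p++ = (unsigned char)(0x80 | (ch & 0x3F));
        }
    }
    *out_len = size;
    return out;
}

Whether the extra pass is cheaper than over-allocating and resizing at the end is exactly the kind of question the requested benchmark should answer.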

If there is further special-casing, I'd only special-case UCS-1. I doubt that the _READ() macro really is the bottleneck; I would rather expect loop unrolling to help. Because of disallowed surrogates, unrolling is not practical for UCS-2.
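
To illustrate the kind of unrolling I have in mind for a UCS-1 fast path, here is a hypothetical sketch (not the actual code) that copies runs of ASCII characters four at a time and leaves everything else to the generic loop:

#include <stddef.h>

/* Copy a leading run of ASCII bytes, four per iteration; returns the
   number of bytes copied.  The caller's generic loop handles the rest. */
static size_t
copy_ascii_run_unrolled(const unsigned char *s, size_t len, unsigned char *out)
{
    size_t i = 0;
    while (i + 4 <= len &&
           (s[i] | s[i + 1] | s[i + 2] | s[i + 3]) < 0x80) {
        out[i]     = s[i];
        out[i + 1] = s[i + 1];
        out[i + 2] = s[i + 2];
        out[i + 3] = s[i + 3];
        i += 4;
    }
    while (i < len && s[i] < 0x80) {
        out[i] = s[i];
        i++;
    }
    return i;
}

The combined OR test keeps the four loads independent of each other; whether this actually beats what the compiler already generates would need measurement.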