
Author: loewis
Recipients: ezio.melotti, jcea, loewis, pitrou, vstinner
Date: 2011-12-17.19:50:11
Message-id: <1324151412.21.0.479618520532.issue13624@psf.upfronthosting.co.za>
In-reply-to:
Content
Can you please provide your exact testing procedure? The standard iobench.py doesn't support testing ASCII, UCS-1, and UCS-2 data separately, so you must have used some other tool. Exact code, command-line parameters, a hardware description, and timing results would be appreciated.

Looking at the encoder, I think the first thing to change is to reduce the over-allocation for UCS-1 and UCS-2 strings. This may or may not help the run-time, but should reduce memory consumption.
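
To make that concrete, here is a minimal sketch (a hypothetical helper, not the actual encoder code) of sizing the output by the worst case for the string's kind: UCS-1 characters expand to at most 2 UTF-8 bytes and UCS-2 characters to at most 3, so a blanket 4-bytes-per-character allocation can be tightened.

#define PY_SSIZE_T_CLEAN
#include <Python.h>

/* Hypothetical helper: worst-case UTF-8 output size for a string of the
   given PEP 393 kind and length. */
static Py_ssize_t
utf8_worst_case_size(int kind, Py_ssize_t len)
{
    switch (kind) {
    case PyUnicode_1BYTE_KIND:   /* U+0000..U+00FF: at most 2 bytes each */
        return 2 * len;
    case PyUnicode_2BYTE_KIND:   /* U+0000..U+FFFF: at most 3 bytes each */
        return 3 * len;
    default:                     /* UCS-4: up to 4 bytes per code point */
        return 4 * len;
    }
}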

I wonder whether making two passes over the string (one to compute the exact output size, and a second to fill a result buffer allocated to that size) could improve performance.
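
For the UCS-1 case, such a two-pass encoder could look roughly like this sketch (plain C with hypothetical names; error handling reduced to a NULL return): the first pass counts the exact UTF-8 size, the second writes into a buffer allocated to exactly that size.

#include <stddef.h>
#include <stdlib.h>

/* First pass: exact UTF-8 length of a Latin-1 (UCS-1) buffer. */
static size_t
ucs1_utf8_size(const unsigned char *s, size_t len)
{
    size_t size = len;
    for (size_t i = 0; i < len; i++)
        if (s[i] >= 0x80)        /* U+0080..U+00FF take two bytes */
            size++;
    return size;
}

/* Second pass: encode into a buffer of exactly the computed size. */
static unsigned char *
ucs1_encode_utf8(const unsigned char *s, size_t len, size_t *out_len)
{
    size_t size = ucs1_utf8_size(s, len);
    unsigned char *out = malloc(size ? size : 1);
    if (out == NULL)
        return NULL;
    unsigned char *p = out;
    for (size_t i = 0; i < len; i++) {
        unsigned char ch = s[i];
        if (ch < 0x80) {
            *p++ = ch;
        }
        else {                   /* 110xxxxx 10xxxxxx */
            *p++ = (unsigned char)(0xC0 | (ch >> 6));
            *p++ = (unsigned char)(0x80 | (ch & 0x3F));
        }
    }
    *out_len = size;
    return out;
}

Whether the extra pass is cheaper than over-allocating and resizing at the end is exactly the kind of question the requested benchmark should answer.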

If there is further special-casing, I'd only special-case UCS-1. I doubt that the _READ() macro really is the bottleneck; I would rather expect loop unrolling to help. Because of disallowed surrogates, unrolling is not practical for UCS-2.
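
To illustrate the kind of unrolling I have in mind for a UCS-1 fast path, here is a hypothetical sketch (not the actual code) that copies runs of ASCII characters four at a time and leaves everything else to the generic loop:

#include <stddef.h>

/* Copy a leading run of ASCII bytes, four per iteration; returns the
   number of bytes copied.  The caller's generic loop handles the rest. */
static size_t
copy_ascii_run_unrolled(const unsigned char *s, size_t len, unsigned char *out)
{
    size_t i = 0;
    while (i + 4 <= len &&
           (s[i] | s[i + 1] | s[i + 2] | s[i + 3]) < 0x80) {
        out[i]     = s[i];
        out[i + 1] = s[i + 1];
        out[i + 2] = s[i + 2];
        out[i + 3] = s[i + 3];
        i += 4;
    }
    while (i < len && s[i] < 0x80) {
        out[i] = s[i];
        i++;
    }
    return i;
}

The combined OR test keeps the four loads independent of each other; whether this actually beats what the compiler already generates would need measurement.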