Message 199462 - Python tracker

Author	serhiy.storchaka
Recipients	barry, christian.heimes, kristjan.jonsson, pitrou, serhiy.storchaka, vstinner
Date	2013-10-11.11:57:42
Message-id	<1381492663.04.0.601157666373.issue19219@psf.upfronthosting.co.za>

Content
> - unmarshalling ASCII strings is faster: you can pass 127 to PyUnicode_New without scanning for non-ASCII chars

You should ensure that the loaded bytes are ASCII-only. Otherwise broken or malicious marshalled data will compromise your program. Decoding UTF-8 is as fast as decoding ASCII (with checks) and almost as fast as memcpy.
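
To illustrate the point about validation, here is a minimal Python sketch, not the marshal code itself. The helper names load_ascii_checked and load_utf8 are hypothetical and only model the idea: a payload that the stream claims is ASCII still has to be scanned, and decoding it as UTF-8 gives the same result for valid input.

```python
def load_ascii_checked(payload: bytes) -> str:
    """Decode a payload that the stream *claims* is ASCII-only (hypothetical helper)."""
    # bytes.isascii() (Python 3.7+) is the scan that must not be skipped.
    if not payload.isascii():
        raise ValueError("payload tagged as ASCII contains a byte >= 0x80")
    return payload.decode("ascii")


def load_utf8(payload: bytes) -> str:
    """Decode the same payload as UTF-8; pure ASCII input gives the same string."""
    return payload.decode("utf-8")


good = b"spam"
bad = b"sp\xe4m"  # 0xE4 is outside the ASCII range

# For well-formed ASCII data both paths agree.
assert load_ascii_checked(good) == load_utf8(good) == "spam"

# Broken or malicious data is rejected instead of being trusted.
try:
    load_ascii_checked(bad)
except ValueError as exc:
    rejected = str(exc)
else:
    rejected = None
```

Skipping the isascii() check here would be the equivalent of passing 127 to PyUnicode_New without looking at the bytes.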

As for output, we could use the cached UTF-8 representation of the string (it always exists for ASCII-only strings) before calling PyUnicode_AsUTF8String().
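
A small Python sketch of why an ASCII-only string needs no separate UTF-8 buffer: its UTF-8 encoding is one byte per character, each equal to the code point. This only demonstrates the encoding property; it does not inspect CPython's internal cache, and the helper name utf8_matches_code_points is hypothetical.

```python
def utf8_matches_code_points(s: str) -> bool:
    """Return True when the UTF-8 bytes of s are exactly its code points (hypothetical helper)."""
    encoded = s.encode("utf-8")
    # Same length and the same value at every position means the encoded
    # form is byte-for-byte identical to the characters themselves.
    return len(encoded) == len(s) and all(b == ord(c) for b, c in zip(encoded, s))


ascii_result = utf8_matches_code_points("spam")      # ASCII-only string
non_ascii_result = utf8_matches_code_points("späm")  # 'ä' needs two bytes in UTF-8
```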

I'm fine with the buffering and the codes for short strings and tuples (I have not examined the code closely yet), but special-casing ASCII does not look so good to me.

History
Date	User	Action	Args
2013-10-11 11:57:43	serhiy.storchaka	set	recipients: + serhiy.storchaka, barry, pitrou, kristjan.jonsson, vstinner, christian.heimes
2013-10-11 11:57:43	serhiy.storchaka	set	messageid: <1381492663.04.0.601157666373.issue19219@psf.upfronthosting.co.za>
2013-10-11 11:57:43	serhiy.storchaka	link	issue19219 messages
2013-10-11 11:57:42	serhiy.storchaka	create