Message 281869 - Python tracker



Author	lemburg
Recipients	John Helour, lemburg, loewis, mdk, serhiy.storchaka, vstinner, xiang.zhang
Date	2016-11-28.12:40:36
Message-id	<1480336836.1.0.893057658555.issue24339@psf.upfronthosting.co.za>
In-reply-to

Content
The codec code has a few (performance) issues:

 * nonspacing_diacritical_marks should be a set for fast lookup
 * ord(c) in range(0x00, 0xA0) should be rewritten using < and >=
 * result += bytes([ord(c)]) has quadratic timing (it copies
   the whole bytes string for every single operation); better
   use a bytearray and convert this to bytes in one final step
 * the error messages should include more useful information
   about the cause and location of the error, instead of just
   UnicodeError("Unacceptable unicode character") and
   raise KeyError
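
The four points above can be sketched as follows. This is only a minimal
illustration, not the patch under review: the function name, the codec
name "iso6937-sketch", and the three-element set of marks are assumptions
made for the example, and characters outside the directly mapped range
are simply rejected.

```python
# Hypothetical sketch; names and table contents are illustrative only.

# A frozenset gives constant-time membership tests.
NONSPACING_DIACRITICAL_MARKS = frozenset({"\u0300", "\u0301", "\u0302"})


def encode_sketch(text):
    """Encode the directly mapped range and report precise errors otherwise."""
    result = bytearray()  # grows in place; no full copy per character
    for pos, c in enumerate(text):
        code = ord(c)
        if 0x00 <= code < 0xA0:  # plain comparisons instead of `in range(...)`
            result.append(code)
        elif c in NONSPACING_DIACRITICAL_MARKS:
            # The error carries the input, the position and a reason.
            raise UnicodeEncodeError(
                "iso6937-sketch", text, pos, pos + 1,
                "combining marks are not handled in this sketch")
        else:
            raise UnicodeEncodeError(
                "iso6937-sketch", text, pos, pos + 1,
                "character has no mapping in this sketch")
    return bytes(result)  # single conversion at the end
```

Because UnicodeEncodeError stores the start and end offsets, a caller can
tell from the exception alone which character caused the failure.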

Please also check whether it's possible to reuse the charmap codec
functions we have. Thanks.
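
For reference, the charmap functions are used roughly as below in the
table-driven codecs under Lib/encodings. The table here is an assumption
made for the example (identity below 0xA0, two illustrative assignments,
everything else marked undefined with U+FFFE); it is not a complete
mapping, and whether a codec with multi-byte sequences can be expressed
this way is exactly what would need checking.

```python
import codecs

# Hypothetical 256-entry decoding table; U+FFFE marks an undefined byte.
_table = [chr(i) for i in range(0xA0)] + ["\ufffe"] * (0x100 - 0xA0)
_table[0xA0] = "\u00a0"  # illustrative assignment
_table[0xA1] = "\u00a1"  # illustrative assignment
DECODING_TABLE = "".join(_table)

# Build the reverse mapping once at import time.
ENCODING_TABLE = codecs.charmap_build(DECODING_TABLE)


def encode(text, errors="strict"):
    # Returns (bytes, number of characters consumed).
    return codecs.charmap_encode(text, errors, ENCODING_TABLE)


def decode(data, errors="strict"):
    # Returns (str, number of bytes consumed).
    return codecs.charmap_decode(data, errors, DECODING_TABLE)
```

With errors="strict", an unmapped character raises UnicodeEncodeError with
the offending position filled in, so the error reporting comes for free.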

History
Date	User	Action	Args
2016-11-28 12:40:36	lemburg	set	recipients: + lemburg, loewis, vstinner, serhiy.storchaka, xiang.zhang, John Helour, mdk
2016-11-28 12:40:36	lemburg	set	messageid: <1480336836.1.0.893057658555.issue24339@psf.upfronthosting.co.za>
2016-11-28 12:40:36	lemburg	link	issue24339 messages
2016-11-28 12:40:36	lemburg	create