Message 346134 - Python tracker



Author	vstinner
Recipients	methane, serhiy.storchaka, vstinner
Date	2019-06-20.16:33:54
Message-id	<1561048434.41.0.67038507249.issue37348@roundup.psfhosted.org>
In-reply-to

Content
> _PyUnicode_FromASCII(s, len) is faster than PyUnicode_FromString(s) because PyUnicode_FromString() uses temporary _PyUnicodeWriter to support UTF-8.

I don't understand how _PyUnicodeWriter could be slow. It does not overallocate by default. It's just a wrapper that implements efficient memory management.

> Oh, wait.  Why we used _PyUnicodeWriter here?

To optimize the handling of decoding errors: the error handler can return a replacement string longer than 1 character, so the output can outgrow the initial estimate. Overallocation is used in this case.
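
To illustrate the point from the Python side (a rough sketch, not the C code path itself): a single invalid byte can be replaced by several characters, so the decoded string may be longer than the input byte count. The handler name "demo_long_replacement" and the "<invalid>" marker are made up for this example.

```python
import codecs

data = b"abc\xffdef"  # 7 bytes; 0xff is never valid in UTF-8

# Built-in handlers: "replace" emits one character (U+FFFD) for the bad
# byte, "backslashreplace" emits the four characters \xff.
replaced = data.decode("utf-8", errors="replace")
escaped = data.decode("utf-8", errors="backslashreplace")


def long_replacement(exc):
    # Hypothetical handler, for illustration only: substitute a
    # 9-character marker and resume decoding after the bad input.
    if isinstance(exc, UnicodeDecodeError):
        return ("<invalid>", exc.end)
    raise exc


codecs.register_error("demo_long_replacement", long_replacement)
custom = data.decode("utf-8", errors="demo_long_replacement")

# Input length versus the three output lengths.
print(len(data), len(replaced), len(escaped), len(custom))
```

With "replace" the output stays within the input length, but with the other two handlers it grows past it, which is the case where a writer that can overallocate avoids repeated resizing.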

History
Date	User	Action	Args
2019-06-20 16:33:54	vstinner	set	recipients: + vstinner, methane, serhiy.storchaka
2019-06-20 16:33:54	vstinner	set	messageid: <1561048434.41.0.67038507249.issue37348@roundup.psfhosted.org>
2019-06-20 16:33:54	vstinner	link	issue37348 messages
2019-06-20 16:33:54	vstinner	create