
Author vstinner
Recipients barry, brett.cannon, christian.heimes, kristjan.jonsson, pitrou, ronaldoussoren, serhiy.storchaka, vstinner
Date 2013-10-11.21:35:33
Message-id <1381527333.51.0.257694807529.issue19219@psf.upfronthosting.co.za>
In-reply-to
Content
> (however, a quick test suggests that PyUnicode_DecodeUTF8 is quite slower)

It's surprising that PyUnicode_DecodeUTF8() is noticeably slower than _PyUnicode_FromUCS1(). _PyUnicode_FromUCS1() calls ucs1lib_find_max_char() and then memcpy(). PyUnicode_DecodeUTF8() first tries ascii_decode(), which is very similar to ucs1lib_find_max_char().

The difference may be that _PyUnicode_FromUCS1() copies all bytes at once with memcpy(), whereas ascii_decode() copies bytes while it checks whether the string is ASCII.
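
To make the two strategies concrete, here is a minimal sketch in plain C. It is not the CPython code: scan_then_memcpy() and check_and_copy() are hypothetical helpers that only mimic the shape of "find the max char, then memcpy()" versus "copy while checking for ASCII", and the real functions are more heavily optimized than these byte-at-a-time loops.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, strategy A: first scan the input for a non-ASCII
 * byte, then copy everything in a single memcpy() call.
 * Returns 1 if the input is pure ASCII, 0 otherwise. */
static int scan_then_memcpy(const unsigned char *src, size_t n,
                            unsigned char *dst)
{
    int is_ascii = 1;
    for (size_t i = 0; i < n; i++) {
        if (src[i] & 0x80) {   /* high bit set: not ASCII */
            is_ascii = 0;
            break;
        }
    }
    memcpy(dst, src, n);       /* one bulk copy after the scan */
    return is_ascii;
}

/* Hypothetical helper, strategy B: check and copy in the same loop,
 * stopping at the first non-ASCII byte.
 * Returns the number of ASCII bytes copied. */
static size_t check_and_copy(const unsigned char *src, size_t n,
                             unsigned char *dst)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (src[i] & 0x80)     /* stop: caller must fall back to a slower path */
            break;
        dst[i] = src[i];       /* copy interleaved with the check */
    }
    return i;
}
```

Timing both helpers on the same long ASCII buffer would be one way to check whether the interleaved copy explains the gap, although a compiler may optimize these simple loops differently from the real code.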