
Author vstinner
Recipients eric.smith, scoder, serhiy.storchaka, vstinner
Date 2016-08-22.13:10:35
Message-id <1471871435.98.0.691225428471.issue27818@psf.upfronthosting.co.za>
Content
-    while (pos<end && Py_ISDIGIT(PyUnicode_READ_CHAR(s, pos)))
+    while (pos<end && Py_ISDIGIT(PyUnicode_READ(ukind, udata, pos)))

Great change. It's really bad for performance to use such an inefficient macro in a loop: PyUnicode_READ_CHAR() uses 2 nested "if" on every call :-/
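To illustrate the difference, here is a minimal standalone sketch. It does not use the real CPython headers: toy_str, toy_read(), toy_read_char() and the skip_digits_* functions are hypothetical stand-ins that only model the shape of PyUnicode_READ_CHAR() (kind and data fetched from the object on every call) versus PyUnicode_READ() (kind and data passed in by the caller, so they can be loaded once before the loop).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a Unicode string object; NOT CPython's real layout. */
typedef struct {
    int kind;          /* bytes per character: 1, 2 or 4 */
    const void *data;  /* character buffer */
    size_t length;     /* number of characters */
} toy_str;

/* Models PyUnicode_READ(kind, data, index): the caller supplies kind and data. */
static uint32_t toy_read(int kind, const void *data, size_t i)
{
    return kind == 1 ? ((const uint8_t *)data)[i]
         : kind == 2 ? ((const uint16_t *)data)[i]
         : ((const uint32_t *)data)[i];
}

/* Models PyUnicode_READ_CHAR(s, index): kind and data are looked up again
   from the object on every call. */
static uint32_t toy_read_char(const toy_str *s, size_t i)
{
    assert(s->kind == 1 || s->kind == 2 || s->kind == 4);
    return toy_read(s->kind, s->data, i);
}

static int is_digit(uint32_t ch)
{
    return ch >= '0' && ch <= '9';
}

/* Before: per-character access goes through the object each time. */
static size_t skip_digits_slow(const toy_str *s, size_t pos, size_t end)
{
    while (pos < end && is_digit(toy_read_char(s, pos)))
        pos++;
    return pos;
}

/* After: kind and data are loaded once, outside the loop. */
static size_t skip_digits_fast(const toy_str *s, size_t pos, size_t end)
{
    const int kind = s->kind;
    const void *data = s->data;
    while (pos < end && is_digit(toy_read(kind, data, pos)))
        pos++;
    return pos;
}
```

Both functions return the same position; only the amount of work repeated per character differs. In this toy version an optimizing compiler may well produce identical code for both, so the sketch shows the structure of the change rather than a measurable speedup.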

faster_format.patch LGTM except for Serhiy's comment.

To get the best performance, it's even better to specialize the Unicode code into 4 versions: ASCII, Latin-1, UCS-2, UCS-4. The "stringlib" does that using C "templates".
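A rough sketch of the "template" idea, in one self-contained file: a macro stamps out one copy of the loop per character width, and the kind is tested only once, at dispatch time. The names DEFINE_SKIP_DIGITS, skip_digits_1byte/2byte/4byte and skip_digits are hypothetical and are not stringlib identifiers. As far as I know, the real stringlib gets the same effect by including the same source file several times with different macro definitions rather than with one function-generating macro. This sketch also has only 3 versions, because ASCII and Latin-1 both use 1 byte per character; a separate ASCII version would only differ in what it may assume about the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand one copy of the digit-skipping loop for a given character type.
   NAME and CHAR_T are illustrative parameters, not stringlib names. */
#define DEFINE_SKIP_DIGITS(NAME, CHAR_T)                             \
    static size_t NAME(const CHAR_T *data, size_t pos, size_t end)   \
    {                                                                \
        while (pos < end && data[pos] >= '0' && data[pos] <= '9')    \
            pos++;                                                   \
        return pos;                                                  \
    }

DEFINE_SKIP_DIGITS(skip_digits_1byte, uint8_t)   /* ASCII and Latin-1 */
DEFINE_SKIP_DIGITS(skip_digits_2byte, uint16_t)  /* UCS-2 */
DEFINE_SKIP_DIGITS(skip_digits_4byte, uint32_t)  /* UCS-4 */

/* Test the kind once, then run a loop that contains no kind check at all. */
static size_t skip_digits(int kind, const void *data, size_t pos, size_t end)
{
    switch (kind) {
    case 1:
        return skip_digits_1byte((const uint8_t *)data, pos, end);
    case 2:
        return skip_digits_2byte((const uint16_t *)data, pos, end);
    default:
        assert(kind == 4);
        return skip_digits_4byte((const uint32_t *)data, pos, end);
    }
}
```

The cost is more generated code (one loop per width); the gain is that each inner loop works on a fixed character type.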
History
Date                 User      Action  Args
2016-08-22 13:10:36  vstinner  set     recipients: + vstinner, scoder, eric.smith, serhiy.storchaka
2016-08-22 13:10:35  vstinner  set     messageid: <1471871435.98.0.691225428471.issue27818@psf.upfronthosting.co.za>
2016-08-22 13:10:35  vstinner  link    issue27818 messages
2016-08-22 13:10:35  vstinner  create