Issue 13570: Expose faster unicode<->ascii functions in the C-API



This issue has been migrated to GitHub: https://github.com/python/cpython/issues/57779

classification

Title:	Expose faster unicode<->ascii functions in the C-API
Type:	performance	Stage:	resolved
Components:	Unicode	Versions:	Python 3.3

process

Status:	closed	Resolution:	not a bug
Dependencies:		Superseder:
Assigned To:		Nosy List:	ezio.melotti, jcea, loewis, pitrou, skrah, vstinner
Priority:	normal	Keywords:

Created on 2011-12-09 21:12 by skrah, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (6)
msg149124 - (view)	Author: Stefan Krah (skrah) *	Date: 2011-12-09 21:12
I just ran the telco benchmark ...

http://www.bytereef.org/mpdecimal/quickstart.html#telco-benchmark

... on _decimal to see how the PEP-393 changes affect the module. The benchmark reads numbers from a binary file, does some calculations and prints the result strings to a file.

Average results (10 iterations each):

    Python 2.7:             5.87s
    Revision 1726fa560112:  6.07s
    Revision 7ffe3d304487:  6.56s

The bottleneck in telco.py is the line that writes a Decimal to the output file:

    outfil.write("%s\n" % t)

The bottleneck in _decimal is (res is ascii):

    PyUnicode_FromString(res);

PyUnicode_DecodeASCII(res) has the same performance.

With this function ...

    static PyObject*
    unicode_fromascii(const char* s, Py_ssize_t size)
    {
        PyObject *res;
        res = PyUnicode_New(size, 127);
        if (!res)
            return NULL;
        memcpy(PyUnicode_1BYTE_DATA(res), s, size);
        return res;
    }

... I get the same performance as with Python 2.7 (5.85s)!

I think it would be really beneficial for C-API users to have more ascii low level functions that don't do error checking and are simply as fast as possible.
msg149151 - (view)	Author: STINNER Victor (vstinner) *	Date: 2011-12-10 13:13
On 09/12/2011 22:12, Stefan Krah wrote:
> The bottleneck in _decimal is (res is ascii):
>
> PyUnicode_FromString(res);
>
> PyUnicode_DecodeASCII(res) has the same performance.
>
>
> With this function ...
>
> static PyObject*
> unicode_fromascii(const char* s, Py_ssize_t size)
> {
>     PyObject *res;
>     res = PyUnicode_New(size, 127);
>     if (!res)
>         return NULL;
>     memcpy(PyUnicode_1BYTE_DATA(res), s, size);
>     return res;
> }
>
> ... I get the same performance as with Python 2.7 (5.85s)!

The problem is that unicode_fromascii() is unsafe: it doesn't check that the string is pure ASCII. That's why this function is private.

Because of the PEP 383, ASCII and UTF-8 decoders (PyUnicode_DecodeASCII and PyUnicode_FromString) have to first scan the input to check for errors, and then do a fast memcpy. The scanner of these two decoders is already optimized to process the input string word by word (word = the C long type), instead of byte by byte, using a bit mask.

--

You can write your own super fast ASCII decoder using two lines:

    res = PyUnicode_New(size, 127);
    memcpy(PyUnicode_1BYTE_DATA(res), s, size);

(this is exactly what unicode_fromascii does)

> I think it would be really beneficial for C-API users to have
> more ascii low level functions that don't do error checking and
> are simply as fast as possible.

It is really important to ensure that an ASCII string doesn't contain characters outside [U+0000; U+007F] because many operations on ASCII strings are optimized (e.g. the UTF-8 pointer is shared with the ASCII pointer).

I prefer not to expose such a function, or someone will use it without understanding exactly how dangerous it is. Martin and others may disagree with me.

Do you know Murphy's Law? :-)
http://en.wikipedia.org/wiki/Murphy%27s_law
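
The word-by-word scan described in this message can be illustrated with a small standalone sketch. This is not CPython's actual scanner; the function name is_pure_ascii is hypothetical, and the code only assumes that a word is the C unsigned long type and that a bit mask with the high bit of every byte set is used to detect non-ASCII bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the C-API): return 1 if the first
 * `size` bytes of `s` are all in [0x00, 0x7F], otherwise return 0. */
static int
is_pure_ascii(const char *s, size_t size)
{
    /* 0x0101...01 * 0x80 = 0x8080...80: the high bit of every byte. */
    const unsigned long mask = (unsigned long)-1 / 0xFF * 0x80;
    size_t i = 0;

    /* Scan one word (sizeof(unsigned long) bytes) per iteration.
     * memcpy into a local avoids unaligned reads. */
    for (; i + sizeof(unsigned long) <= size; i += sizeof(unsigned long)) {
        unsigned long word;
        memcpy(&word, s + i, sizeof(word));
        if (word & mask)
            return 0;
    }

    /* Check the remaining tail byte by byte. */
    for (; i < size; i++) {
        if ((unsigned char)s[i] & 0x80)
            return 0;
    }
    return 1;
}
```

A caller that cannot guarantee ASCII input could run a check like this before the PyUnicode_New() + memcpy() pair quoted above, which is roughly the cost that the safe decoders already pay.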
msg149159 - (view)	Author: Stefan Krah (skrah) *	Date: 2011-12-10 14:16
> I prefer to not expose such function or someone will use it without
> understanding exactly how dangerous it is.

OK.

I'm afraid that I made an error in the benchmarks, since I accidentally used a changed version of telco.py, namely:

    # t is a Decimal

    outfil.write("%s\n" % t)  # original version
    ...

    outfil.write(str(t))      # changed version runs
    outfil.write('\n')        # faster since PEP-393
    ...

Since PEP-393 the changed version with two calls to write() runs quite a bit faster than the original. For Python-3.2 and 2.7 the original runs faster.

Do you have an idea what could cause this?
msg149253 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2011-12-11 22:22
It's reasonable that string % formatting might have become slower... I wonder what the issue is at this point. Unless you can state a clear issue that you want to see resolved, I propose to close this report as invalid.
msg149260 - (view)	Author: Stefan Krah (skrah) *	Date: 2011-12-11 23:36
Sorry, the title of the issue isn't correct any more. The revised issue is that in 3.3

    a) outfil.write("%s\n" % t)

is about 11% slower than in Python 2.7 and 8% slower than in Python 3.2. On the other hand, in 3.3 the hack

    b) outfil.write(str(t)); outfil.write('\n')

runs about as fast as a) in 3.2.

This doesn't necessarily show up in microbenchmarks with timeit, so I thought I'd leave this open for others to see (and comment).

But if I understand correctly, the slowdown in string formatting is expected, so we can indeed close this.
msg149261 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-12-11 23:44
> But if I understand correctly, the slowdown in string formatting is
> expected, so we can indeed close this.

Well, expected doesn't mean it shouldn't be improved, so finding a way to speed it up would be nice ;)
(probably difficult, though)

History
Date	User	Action	Args
2022-04-11 14:57:24	admin	set	github: 57779
2011-12-11 23:44:24	pitrou	set	nosy: + pitrou messages: + msg149261
2011-12-11 23:36:34	skrah	set	status: open -> closed resolution: not a bug messages: + msg149260 stage: resolved
2011-12-11 22:22:27	loewis	set	messages: + msg149253
2011-12-10 14:16:46	skrah	set	messages: + msg149159
2011-12-10 13:13:49	vstinner	set	messages: + msg149151
2011-12-09 22:30:54	jcea	set	nosy: + jcea
2011-12-09 21:12:31	skrah	create