
Author vstinner
Recipients loewis, mark.dickinson, pitrou, serhiy.storchaka, vstinner
Date 2012-05-08.23:58:39
Message-id <1336521522.38.0.748341957924.issue14744@psf.upfronthosting.co.za>
Content
> Fill the ascii buffer and then copying can be cheaper than using
> _PyUnicodeWriter with general non-ascii string.

Here is a new patch using _PyUnicodeWriter directly in longobject.c.

According to my benchmark (see below), formatting a small number (5 decimal digits) is 17% faster with version 2 of my patch than with tip, and 38% faster than Python 3.3 before my optimizations of str%tuple and str.format(). Creating a temporary PyUnicode object is not cheap, at least for short strings.

str%tuple and str.format() allocate len(format_string)+100 ASCII characters at the beginning, which is enough for "x={}".format(12345), for example. So only a resize is needed, and resizing appears to be cheap.
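
To illustrate the preallocation argument, here is a toy Python model. It is only a sketch: `ToyWriter`, `toy_format` and `OVERALLOCATE` are hypothetical names, and the real _PyUnicodeWriter is C code that works differently in detail. The model only counts how often the buffer would have to grow while formatting.

```python
# Toy model only: NOT the C implementation of _PyUnicodeWriter.
OVERALLOCATE = 100  # extra characters reserved up front, as described above


class ToyWriter:
    """Hypothetical writer that preallocates len(format_string) + 100 characters."""

    def __init__(self, format_string):
        self.capacity = len(format_string) + OVERALLOCATE
        self.chars = []
        self.grow_count = 0  # number of times the buffer had to be enlarged

    def write(self, text):
        needed = len(self.chars) + len(text)
        if needed > self.capacity:
            # The preallocated space was too small: grow to the needed size.
            self.capacity = needed
            self.grow_count += 1
        self.chars.extend(text)

    def finish(self):
        # Assumption: one final resize trims the buffer to its exact length.
        return "".join(self.chars)


def toy_format(format_string, value):
    """Format a single "{}" field and report how many times the buffer grew."""
    writer = ToyWriter(format_string)
    before, _, after = format_string.partition("{}")
    writer.write(before)
    writer.write(str(value))
    writer.write(after)
    return writer.finish(), writer.grow_count


if __name__ == "__main__":
    print(toy_format("x={}", 12345))         # short value: fits in the preallocation
    print(toy_format("x={}", 10 ** 200)[1])  # 201 digits: the buffer must grow
```

In this model a short value such as 12345 never triggers a grow step, so the only remaining cost is the final trim, which matches the reasoning above.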

I'm not completely satisfied with the use of Py_LOCAL_INLINE in unicodeobject.c for the _PyUnicodeWriter methods. The same "hacks" (?) should be used in formatter_unicode.c.

Shell script (bench.sh) used to benchmark:
--------
echo -n "{0}.{1}.{2}: "; ./python -m timeit -r 10 -s 'fmt="{0}.{1}.{2}"' 'fmt.format("http", "client", "HTTPConnection")'
echo -n " [line {0:2d}] : "; ./python -m timeit -r 10 -s 'fmt=" [line {0:2d}] "' 'fmt.format(5)'
echo -n "str: "; ./python -m timeit -r 10 -s 'fmt="{0}"*100' 'fmt.format("ABCDEF")'
echo -n "str conv: "; ./python -m timeit -r 10 -s 'fmt="{0:s}"*100' 'fmt.format("ABCDEF")'
echo -n "long x 3: "; ./python -m timeit -r 10 -s 'fmt="x={0} x={0} x={0}"' 'fmt.format(12345)'
echo -n "float x 3: "; ./python -m timeit -r 10 -s 'fmt="x={0} x={0} x={0}"' 'fmt.format(12.345)'
echo -n "complex x 3: "; ./python -m timeit -r 10 -s 'fmt="x={0} x={0} x={0}"' 'fmt.format(12.345+2j)'
echo -n "long, float, complex: "; ./python -m timeit -r 10 -s 'fmt="x={} y={} z={}"' 'fmt.format(12345, 12.345, 12.345+2j)'
echo -n "huge long: "; ./python -m timeit -r 10 -s 'import math; huge=math.factorial(2000); fmt="x={}"' 'fmt.format(huge)'
--------
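
For reference, a few of these cases can also be run from Python with the timeit module instead of the shell. This is a rough sketch, not the script used for the numbers in this message: `best_usec` and `CASES` are names made up for the example, and the loop count is fixed rather than chosen automatically as `python -m timeit` does.

```python
import timeit

# A subset of the bench.sh cases: label -> (setup, statement).
CASES = {
    "long x 3": ('fmt="x={0} x={0} x={0}"', "fmt.format(12345)"),
    "float x 3": ('fmt="x={0} x={0} x={0}"', "fmt.format(12.345)"),
    "str": ('fmt="{0}"*100', 'fmt.format("ABCDEF")'),
}


def best_usec(stmt, setup, number=10000, repeat=10):
    """Return the best time per loop in microseconds over `repeat` runs."""
    timings = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    return min(timings) / number * 1e6


if __name__ == "__main__":
    for label, (setup, stmt) in CASES.items():
        print("%s: %.3f usec per loop" % (label, best_usec(stmt, setup)))
```

Taking the minimum over the repeats mirrors the "best of 10" figure that `-r 10` reports.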

Results:
--------
3.3:

{0}.{1}.{2}: 1000000 loops, best of 10: 0.394 usec per loop
 [line {0:2d}] : 1000000 loops, best of 10: 0.519 usec per loop
str: 100000 loops, best of 10: 7.01 usec per loop
str conv: 100000 loops, best of 10: 13.3 usec per loop
long x 3: 1000000 loops, best of 10: 0.569 usec per loop
float x 3: 1000000 loops, best of 10: 1.62 usec per loop
complex x 3: 100000 loops, best of 10: 3.34 usec per loop
long, float, complex: 100000 loops, best of 10: 2.08 usec per loop
huge long: 1000 loops, best of 10: 666 usec per loop

3.3 + format_writer.patch :

{0}.{1}.{2}: 1000000 loops, best of 10: 0.412 usec per loop (+5%)
 [line {0:2d}] : 1000000 loops, best of 10: 0.461 usec per loop (-11%)
str: 100000 loops, best of 10: 6.85 usec per loop (-2%)
str conv: 100000 loops, best of 10: 11.1 usec per loop (-17%)
long x 3: 1000000 loops, best of 10: 0.605 usec per loop (+6%)
float x 3: 1000000 loops, best of 10: 1.57 usec per loop (-3%)
complex x 3: 100000 loops, best of 10: 3.54 usec per loop (+6%)
long, float, complex: 100000 loops, best of 10: 2.19 usec per loop (+5%)
huge long: 1000 loops, best of 10: 665 usec per loop (0%)

3.3 + format_writer-2.patch :

{0}.{1}.{2}: 1000000 loops, best of 10: 0.378 usec per loop (-4%)
 [line {0:2d}] : 1000000 loops, best of 10: 0.454 usec per loop (-13%)
str: 100000 loops, best of 10: 6.18 usec per loop (-12%)
str conv: 100000 loops, best of 10: 10.9 usec per loop (-18%)
long x 3: 1000000 loops, best of 10: 0.471 usec per loop (-17%)
float x 3: 1000000 loops, best of 10: 1.37 usec per loop (-15%)
complex x 3: 100000 loops, best of 10: 3.4 usec per loop (+2%)
long, float, complex: 1000000 loops, best of 10: 1.93 usec per loop (-7%)
huge long: 1000 loops, best of 10: 665 usec per loop (0%)
--------
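
The percentages in parentheses are relative to the unpatched 3.3 timings. A small sketch of that arithmetic, using three rows copied from the results above (`percent_change` is just a helper name for this example):

```python
def percent_change(baseline_usec, patched_usec):
    """Relative change of the patched timing versus the baseline, in percent."""
    return (patched_usec - baseline_usec) / baseline_usec * 100.0


# Timings in usec per loop, copied from the results above.
baseline = {"long x 3": 0.569, "str conv": 13.3, "float x 3": 1.62}
patch_v2 = {"long x 3": 0.471, "str conv": 10.9, "float x 3": 1.37}

for name in baseline:
    change = percent_change(baseline[name], patch_v2[name])
    print("%s: %+.0f%%" % (name, change))
```

A negative value means the patched build is faster; the "long x 3" row is the 17% figure quoted at the top of this message.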
History
Date User Action Args
2012-05-08 23:58:43 vstinner set recipients: + vstinner, loewis, mark.dickinson, pitrou, serhiy.storchaka
2012-05-08 23:58:42 vstinner set messageid: <1336521522.38.0.748341957924.issue14744@psf.upfronthosting.co.za>
2012-05-08 23:58:41 vstinner link issue14744 messages
2012-05-08 23:58:41 vstinner create