classification
Title: Accelerate 'string' % (value, ...) by using formatted string literals
Type: performance Stage: patch review
Components: Interpreter Core Versions: Python 3.7
process
Status: open Resolution:
Dependencies: 11549 Superseder:
Assigned To: serhiy.storchaka Nosy List: eric.smith, serhiy.storchaka, taleinat, ztane
Priority: low Keywords: patch

Created on 2016-09-29 08:49 by serhiy.storchaka, last changed 2018-09-07 21:15 by taleinat.

Pull Requests
URL Status Linked Edit
PR 5012 open serhiy.storchaka, 2017-12-26 00:04
Messages (7)
msg277688 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-09-29 08:49
For now, using formatted string literals (PEP 498) is the fastest way to format strings.

$ ./python -m perf timeit -s 'k = "foo"; v = "bar"' -- '"%s = %r" % (k, v)'
Median +- std dev: 2.27 us +- 0.20 us

$ ./python -m perf timeit -s 'k = "foo"; v = "bar"' -- 'f"{k!s} = {v!r}"'
Median +- std dev: 1.09 us +- 0.08 us

The compiler could translate C-style formatting with a literal format string into the equivalent formatted string literal. The code '%s = %r' % (k, v) could be translated to

    t1 = k; t2 = v; f'{t1!s} = {t2!r}'; del t1, t2

or even simpler if k and v are initialized local variables.

$ ./python -m perf timeit -s 'k = "foo"; v = "bar"' -- 't1 = k; t2 = v; f"{t1!s} = {t2!r}"; del t1, t2'
Median +- std dev: 1.22 us +- 0.05 us

This is not an easy issue, and it first requires implementing the AST optimizer.
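The comparison above can be reproduced without the perf module. The following is a minimal sketch using the standard-library timeit; the function names are illustrative only, and the extra function-call overhead means the absolute numbers will not match the figures quoted above.

```python
import timeit

k, v = "foo", "bar"

def percent_style():
    # C-style formatting with a literal format string and a tuple operand.
    return "%s = %r" % (k, v)

def fstring_style():
    # The equivalent formatted string literal.
    return f"{k!s} = {v!r}"

# The rewrite is only valid if both spellings give the same text.
assert percent_style() == fstring_style()

# Rough timing; absolute numbers depend on the interpreter and machine.
for func in (percent_style, fstring_style):
    seconds = timeit.timeit(func, number=100_000)
    print(f"{func.__name__}: {seconds:.4f} s per 100,000 calls")
```

Whether a gap shows up at all depends on the interpreter version being measured.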
msg277694 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2016-09-29 09:42
There isn't a direct mapping between %-formatting and __format__ format specifiers. Off the top of my head, I can think of at least one difference:

>>> '%i' % 3
'3'
>>> '{:i}'.format(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Unknown format code 'i' for object of type 'int'

So you'll need to be careful with edge cases like this.

Also, for all usages of %s, remember to call str() (or add !s):

>>> '%s' % 1
'1'
>>> f'{1:s}'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Unknown format code 's' for object of type 'int'
>>> f'{1!s:s}'
'1'

Although that also reminds me of this default alignment difference:
>>> x=0
>>> '%2s' % x
' 0'
>>> f'{x!s:2s}'
'0 '
>>> f'{x!s:>2s}'
' 0'

So, in general, the mapping will be difficult. On the other hand, if you can do it, and provide a function that maps between %-formatting codes and __format__ codes, then that might be a generally useful tool.
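As a rough illustration of such a mapping function, here is a sketch that covers only the codes with a direct counterpart (%s, %r, %a, with an optional '-' flag, width and precision). The name percent_code_to_field is hypothetical and not part of any library; every other code, including %i and zero-padded widths, is reported as unsupported instead of being guessed at.

```python
import re

# Hypothetical helper, not a library API. Only %s, %r and %a are handled.
# A width starting with 0 is rejected on purpose: '%05s' pads with spaces,
# while a '05' width in a format spec would pad with zeros.
_CODE = re.compile(
    r"%(?P<flags>-?)(?P<width>[1-9]\d*)?(?:\.(?P<prec>\d+))?(?P<conv>[sra])"
)

def percent_code_to_field(code):
    """Return the '!conv:spec' suffix of a replacement field, or None."""
    m = _CODE.fullmatch(code)
    if m is None:
        return None
    spec = ""
    if m["width"]:
        # %-formatting right-aligns by default; str.__format__ left-aligns,
        # so the alignment has to be spelled out explicitly.
        spec += ("<" if m["flags"] else ">") + m["width"]
    if m["prec"]:
        spec += "." + m["prec"]
    field = "!" + m["conv"]
    return field + (":" + spec if spec else "")

# Check the mapping against real %-formatting for a few codes.
for code, value in [("%2s", 0), ("%-5r", "a"), ("%.2s", "hello"), ("%a", "é")]:
    field = percent_code_to_field(code)
    assert ("{0" + field + "}").format(value) == code % (value,)
```

Returning None for anything unrecognized keeps the helper conservative: a caller can leave those codes as %-formatting.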
msg277700 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-09-29 10:44
'%s' % x should be translated to f'{x!s}', not to f'{x:s}'. Only %s, %r and %a can be supported. Formatting with %i should be left untranslated. Alternatively, '%r: %i' % (a, x) could be translated to f'{a!r}: {"%i" % x}'.

It is also possible to introduce special opcodes that convert the argument to an exact int or float. Then '%06i' % x could be translated to f'{__exact_int__(x):06}'.
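A hand-written check of the two ideas in this message, as a sketch only. The values of a and x are made up for illustration, and int() merely stands in for the hypothetical __exact_int__ conversion; it is not equivalent in general (for example, int() accepts strings, which %i rejects).

```python
a, x = "total", 42.9

original = '%r: %i' % (a, x)

# Partial translation: %r becomes !r, while the unsupported %i stays a
# nested %-expression inside the replacement field.
translated = f'{a!r}: {"%i" % x}'
assert original == translated

# Why %i cannot simply become a bare format spec: %i truncates a float to
# an integer first, whereas the float's own __format__ keeps the fraction.
padded_percent = '%06i' % x
padded_naive = f'{x:06}'
padded_converted = f'{int(x):06}'   # int() is a stand-in, see above
assert padded_percent == padded_converted
assert padded_percent != padded_naive
```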
msg277702 - Author: Antti Haapala (ztane) * Date: 2016-09-29 12:01
Serhiy, you actually did make a mistake above; `'%s' % x` cannot be rewritten as `f'{x!s}'`, only `'%s' % (x,)` can be optimized... 

(just try with `x = 1, 2`)
msg277703 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-09-29 12:40
Thanks for the correction, Antti. Yes, this is what I initially meant. This optimization is applicable only if the left operand of % is a string literal and the right operand is a tuple expression. When I wrote `'%s' % x`, I meant x as an element of that tuple.
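The tuple case raised above can be checked directly. This short sketch uses the `x = 1, 2` value suggested in the earlier message; the variable names are illustrative.

```python
x = 1, 2

# With a bare right operand, a tuple is treated as the whole argument list,
# so a single %s sees two arguments and the formatting fails.
try:
    '%s' % x
except TypeError:
    bare_failed = True
else:
    bare_failed = False

# Wrapped in a one-element tuple, x is formatted as a single value, and
# this form agrees with the f-string.
wrapped = '%s' % (x,)
as_fstring = f'{x!s}'
assert wrapped == as_fstring
```

This is why the rewrite is only safe when the right operand is syntactically a tuple expression: only then is the number of arguments known at compile time.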
msg309049 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-12-26 00:07
PR 5012 implements the transformation of simple format strings containing only %s, %r and %a into f-strings.
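To make the shape of such a transformation concrete, here is an illustrative sketch written against the public ast module (assuming Python 3.8 or later, where string literals are ast.Constant nodes). It is not the code from PR 5012; the class name PercentToFString is hypothetical. It also ignores one semantic detail: an f-string converts each value as soon as it is evaluated, rather than evaluating all operands first, which is what the temporaries in the first message would preserve.

```python
import ast
import re

# Matches only the bare codes %s, %r, %a and the escape %%.
_CODES = re.compile(r"%([sra%])")

class PercentToFString(ast.NodeTransformer):
    """Illustrative only: rewrite 'literal' % (tuple) into a JoinedStr."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if not (isinstance(node.op, ast.Mod)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.left.value, str)
                and isinstance(node.right, ast.Tuple)):
            return node
        args = node.right.elts
        if any(isinstance(arg, ast.Starred) for arg in args):
            return node
        # split() yields [text, code, text, code, ..., text].
        parts = _CODES.split(node.left.value)
        codes = parts[1::2]
        # Any leftover '%' means a code this sketch does not understand.
        if any("%" in text for text in parts[::2]):
            return node
        # Leave mismatched argument counts alone so the error is kept.
        if sum(code != "%" for code in codes) != len(args):
            return node
        values, text, remaining = [], parts[0], iter(args)
        for code, literal in zip(codes, parts[2::2]):
            if code == "%":
                text += "%" + literal      # '%%' is a literal percent sign
                continue
            if text:
                values.append(ast.Constant(value=text))
            values.append(ast.FormattedValue(
                value=next(remaining), conversion=ord(code), format_spec=None))
            text = literal
        if text:
            values.append(ast.Constant(value=text))
        return ast.copy_location(ast.JoinedStr(values=values), node)

def rewrite(source):
    """Parse an expression, apply the transformer, and fix up locations."""
    tree = PercentToFString().visit(ast.parse(source, mode="eval"))
    return ast.fix_missing_locations(tree)

k, v = "foo", "bar"
tree = rewrite("'%s = %r (100%%)' % (k, v)")
result = eval(compile(tree, "<sketch>", "eval"))
assert result == '%s = %r (100%%)' % (k, v)
```

Format strings with widths, flags or other codes, and right operands that are not tuple expressions, are returned unchanged.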
msg324795 - Author: Tal Einat (taleinat) * (Python committer) Date: 2018-09-07 21:15
I'm +1 on this optimization.
History
Date User Action Args
2018-09-07 21:15:06 taleinat set nosy: + taleinat
messages: + msg324795
2017-12-26 00:07:35 serhiy.storchaka set messages: + msg309049
2017-12-26 00:04:15 serhiy.storchaka set keywords: + patch
stage: patch review
pull_requests: + pull_request4901
2017-12-25 16:57:34 serhiy.storchaka set assignee: serhiy.storchaka
2016-09-29 12:40:20 serhiy.storchaka set messages: + msg277703
2016-09-29 12:01:36 ztane set nosy: + ztane
messages: + msg277702
2016-09-29 10:44:19 serhiy.storchaka set messages: + msg277700
2016-09-29 09:42:27 eric.smith set messages: + msg277694
2016-09-29 08:50:19 serhiy.storchaka set dependencies: + Build-out an AST optimizer, moving some functionality out of the peephole optimizer
2016-09-29 08:49:56 serhiy.storchaka create