Author eric.smith
Recipients Mark.Shannon, eric.smith, larry
Date 2015-10-26.15:44:56
Message-id <1445874299.22.0.773155754186.issue25483@psf.upfronthosting.co.za>
In-reply-to
Content
Currently, the f-string f'a{3!r:10}' compiles to bytecode that does the same thing as:

''.join(['a', format(repr(3), '10')])

That is, it literally looks up the names format and repr and calls whatever they are bound to. The same holds true for str() and ascii() with !s and !a, respectively.
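
The equivalence above can be checked directly. This is a minimal sketch, assuming an interpreter with f-string support (Python 3.6 or later); the variable names are only illustrative:

```python
# The f-string from the example above.
fstring_result = f'a{3!r:10}'

# The explicit spelling: apply repr() for the !r conversion,
# then format() with the '10' format spec, then join the pieces.
explicit_result = ''.join(['a', format(repr(3), '10')])

print(repr(fstring_result))
print(fstring_result == explicit_result)
```

The comparison only shows that the two spellings produce the same value; it says nothing about how the bytecode gets there.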

By redefining format, str, repr, and ascii, you can break or pervert the computation of the f-string's value:

>>> def format(v, fmt=None): return '42'
...
>>> f'{3}'
'42'

It's always been my intention to fix this. This patch adds an opcode FORMAT_VALUE, which instead of looking up format, etc., directly calls PyObject_Format, PyObject_Str, PyObject_Repr, and PyObject_ASCII. Thus, you can no longer modify what an f-string produces merely by overriding the named functions.
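
One way to observe the intended behavior is sketched below. It assumes an interpreter that already includes this change; the helper name render_with_shadowed_builtins is hypothetical, and the exact opcode names that dis reports vary between CPython versions, so the check looks only at which names are loaded at run time:

```python
import dis


def render_with_shadowed_builtins(value):
    # Local redefinitions that shadow the builtins inside this function.
    def format(v, fmt=None):
        return '42'

    def repr(v):
        return '42'

    # With the change applied, neither local function is consulted here.
    return f'{value}', f'{value!r:>4}'


plain, converted = render_with_shadowed_builtins(3)
print(repr(plain), repr(converted))

# Names the compiled f-string loads at run time. Only the variable
# itself should appear, not format, repr, str, or ascii.
code = compile("f'{x!r:10}'", '<fstring>', 'eval')
loaded_names = [
    ins.argval
    for ins in dis.get_instructions(code)
    if ins.opname in ('LOAD_NAME', 'LOAD_GLOBAL')
]
print(loaded_names)
```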


In addition, because this avoids the name lookups and function calls, performance is improved.

Here are the times without this patch:

$ ./python -m timeit -s 'x="test"' 'f"{x}"'
1000000 loops, best of 3: 0.3 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!s}"'
1000000 loops, best of 3: 0.511 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!r}"'
1000000 loops, best of 3: 0.497 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!a}"'
1000000 loops, best of 3: 0.461 usec per loop


And with this patch:

$ ./python -m timeit -s 'x="test"' 'f"{x}"'
10000000 loops, best of 3: 0.02 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!s}"'
100000000 loops, best of 3: 0.02 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!r}"'
10000000 loops, best of 3: 0.0896 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x!a}"'
10000000 loops, best of 3: 0.0923 usec per loop


So the time per loop drops by roughly 80% to 96% for these simple cases.

Also, now f-strings are faster than %-formatting, at least for some types:

$ ./python -m timeit -s 'x="test"' '"%s"%x'
10000000 loops, best of 3: 0.0755 usec per loop

$ ./python -m timeit -s 'x="test"' 'f"{x}"'
10000000 loops, best of 3: 0.02 usec per loop
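
The same comparison can be scripted with the timeit module instead of the command line. This is a rough sketch, not the measurement setup used above: the loop count and repeat count are arbitrary choices, and the absolute numbers will differ by machine and interpreter build.

```python
import timeit

# Bind x in the setup so neither statement can be reduced to a constant.
setup = 'x = "test"'
number = 200_000

# Take the best of three runs, as the timeit command line does.
percent_time = min(timeit.repeat('"%s" % x', setup=setup, number=number, repeat=3))
fstring_time = min(timeit.repeat('f"{x}"', setup=setup, number=number, repeat=3))

for label, total in (('%-formatting', percent_time), ('f-string', fstring_time)):
    print(f'{label}: {total / number * 1e6:.4f} usec per loop')
```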


Note that people often "benchmark" %-formatting with code like the following. But the optimizer converts this to a constant string, so it's not a fair comparison:

$ ./python -m timeit '"%s"%"test"'
100000000 loops, best of 3: 0.0161 usec per loop
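
To check whether a given benchmark statement still does any formatting work at run time, the compiled code can be inspected. This is a sketch under assumptions: the helper has_runtime_binary_op is hypothetical, it relies on CPython naming its binary-operator opcodes with a BINARY_ prefix, and whether the all-constant expression is folded depends on the interpreter version.

```python
import dis


def has_runtime_binary_op(expr):
    """Return True if the compiled expression still executes a binary operator."""
    code = compile(expr, '<bench>', 'eval')
    return any(
        ins.opname.startswith('BINARY_')
        for ins in dis.get_instructions(code)
    )


# A variable operand forces the % operation to happen at run time.
variable_case = has_runtime_binary_op('"%s" % x')

# With two literals the optimizer may replace the expression with a constant.
constant_case = has_runtime_binary_op('"%s" % "test"')

print('variable operand does work at run time:', variable_case)
print('constant operands do work at run time:', constant_case)
```

If the second line reports False, the statement measures little more than loading a constant and is not a fair comparison.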


These microbenchmarks aren't the end of the story, since the string concatenation also takes some time. That's another optimization I might implement in the future.

Thanks to Mark and Larry for some advice on this.