Title: Unexpected behavior of re.sub() with raw f-strings
Messages (3)
Author: dkreeft (dkreeft) Date: 2020-09-29 15:00
Steps to reproduce (Windows/Python 3.7.7):

1. Define a replacement string that starts with an integer:

REPLACEMENT = '12345'

2. Use re.sub() as follows:

re.sub(r'([a-z]+)', fr"\1{REPLACEMENT}", 'something')

3. The outcome is not 'something12345' as expected, but 'J345'.

Note that I am using the group in the replacement argument, which is a raw f-string.

A quick investigation with other replacement strings produces similarly unexpected behavior:

REPLACEMENT = '1': raises re.error (invalid group reference 11 at position 1)

So it seems the f-string is evaluated first, yielding a replacement string in which the group reference is immediately followed by digits. The '\1' that was meant to refer to group 1 is then read together with those digits, as '\1<first integer>', which leads to the behavior described above. Even if this is by design, it is confusing and makes using groups with re.sub() cumbersome whenever the text interpolated after the group reference starts with an integer.
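A minimal sketch that reproduces both observations and prints the template re.sub() actually receives. The value '12345' is an assumption inferred from the expected result 'something12345'. As far as the re documentation describes it, a backslash followed by three octal digits in a replacement template is an octal escape rather than a group reference, which would explain the 'J':

```python
import re

# Assumed value, inferred from the expected result 'something12345'.
REPLACEMENT = '12345'

# The f-string is evaluated before re.sub() is called, so re.sub() only
# ever receives the finished template.
template = fr"\1{REPLACEMENT}"
print(template)  # \112345

# A backslash followed by three octal digits is an octal escape:
# \112 is chr(0o112), i.e. 'J'. The remaining '345' stays literal.
result = re.sub(r'([a-z]+)', template, 'something')
print(result)  # J345

# With a single interpolated digit the template is \11, which is read as
# a reference to group 11. The pattern has only one group, so re.error
# is raised.
error_message = None
try:
    re.sub(r'([a-z]+)', fr"\1{'1'}", 'something')
except re.error as exc:
    error_message = str(exc)
print(error_message)
```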
Author: Eric V. Smith (eric.smith) Date: 2020-09-29 15:19
f-strings are indeed evaluated when the value of the string is needed. Your example is equivalent to:

>>> re.sub(r'([a-z]+)', fr"\112345", 'something')

As always with regexes, you need to be careful when dynamically composing them.
Author: Matthew Barnett (mrabarnett) Date: 2020-09-29 15:35
Arguments are evaluated first and then the results are passed to the function. That's true throughout the language.

In this instance, you can use \g<1> in the replacement string to refer to group 1:

re.sub(r'([a-z]+)', fr"\g<1>{REPLACEMENT}", 'something')
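A short sketch checking the \g<1> form, alongside one possible alternative: passing a callable as the replacement. The value '12345' is assumed from the expected result in the report, and the variable names are only illustrative. A callable's return value is inserted as-is, so digits or backslashes in the interpolated text are not parsed as escapes at all:

```python
import re

# Assumed value, taken from the expected result in the report.
REPLACEMENT = '12345'

# \g<1> closes the group reference explicitly, so the digits that follow
# are treated as literal text.
with_g_syntax = re.sub(r'([a-z]+)', fr"\g<1>{REPLACEMENT}", 'something')
print(with_g_syntax)  # something12345

# Alternative: a callable replacement. Its return value is used verbatim,
# without any template parsing.
with_callable = re.sub(
    r'([a-z]+)',
    lambda match: match.group(1) + REPLACEMENT,
    'something',
)
print(with_callable)  # something12345
```

The \g<1> form is the smaller change to the original call. The callable is worth considering when the interpolated text comes from outside the program and might contain backslashes.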
