
Author serhiy.storchaka
Recipients ezio.melotti, mrabarnett, sabakauser, serhiy.storchaka, xtreak
Date 2018-08-01.10:19:42
Message-id <1533118782.74.0.56676864532.issue34304@psf.upfronthosting.co.za>
Content
If you want to replace %d with a literal \d, you need to repeat the backslash 4 times:

    pattern = re.sub('%d', '\\\\d+', pattern)

or use a raw string literal and repeat the backslash 2 times:

    pattern = re.sub('%d', r'\\d+', pattern)

Since the backslash has a special meaning in the replacement string, it needs to be escaped with another backslash, i.e. doubled. But since the backslash also has a special meaning in Python string literals, each of these backslashes needs to be escaped again in a non-raw string literal, i.e. the backslash ends up repeated 4 times.
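
The two levels of escaping can be checked directly. The following is only a minimal sketch; the variable names (template, plain, raw, result) are made up for illustration, and the input string is borrowed from the examples in this message:

```python
import re

# Sample input taken from the examples in this message.
template = "DBMS_NAME: string(%d) %s"

# Level 1: the Python parser turns the literal into a str object.
plain = '\\\\d+'   # 4 backslashes in a non-raw literal -> 2 in the string
raw = r'\\d+'      # 2 backslashes in a raw literal     -> 2 in the string

# Both literals produce the same 4-character string: \ \ d +
assert plain == raw
assert len(raw) == 4

# Level 2: re.sub() processes the replacement string and collapses
# the doubled backslash into a single literal backslash.
result = re.sub('%d', raw, template)
print(result)  # DBMS_NAME: string(\d+) %s
```

Either spelling hands the same replacement string to re.sub(); the raw literal just removes one of the two escaping levels from the source code.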

Python 3.6 is more lenient. In a replacement string, it keeps a backslash that is followed by a character which doesn't form a known escape sequence. But it emits a deprecation warning, which you can see when you run Python with the corresponding -W option.

$ python3.6 -Wa -c 'import re; print(re.sub("%d", "\d+", "DBMS_NAME: string(%d) %s"))'
<string>:1: DeprecationWarning: invalid escape sequence \d
/usr/lib/python3.6/re.py:191: DeprecationWarning: bad escape \d
  return _compile(pattern, flags).sub(repl, string, count)
DBMS_NAME: string(\d+) %s

$ python3.6 -Wa -c 'import re; print(re.sub("%d", "\\d+", "DBMS_NAME: string(%d) %s"))'
/usr/lib/python3.6/re.py:191: DeprecationWarning: bad escape \d
  return _compile(pattern, flags).sub(repl, string, count)
DBMS_NAME: string(\d+) %s

$ python3.6 -Wa -c 'import re; print(re.sub("%d", "\\\d+", "DBMS_NAME: string(%d) %s"))'
<string>:1: DeprecationWarning: invalid escape sequence \d
DBMS_NAME: string(\d+) %s

$ python3.6 -Wa -c 'import re; print(re.sub("%d", "\\\\d+", "DBMS_NAME: string(%d) %s"))'
DBMS_NAME: string(\d+) %s

Here "invalid escape sequence \d" is generated by the Python parser, while "bad escape \d" is generated by the RE engine.
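
To see the RE engine's complaint in isolation, the replacement can be written as a valid string literal, so that the parser has nothing to warn about. This is a rough sketch only: the names repl and outcome are made up for illustration, and it assumes Python 3.7 or later, where (according to the re documentation) an unknown escape made of a backslash and an ASCII letter in the replacement is an error rather than a deprecation warning:

```python
import re

# A valid literal: the parser produces backslash, 'd', '+' without a warning.
repl = '\\d+'

try:
    # The RE engine still sees \d in the replacement string,
    # which is not a known replacement escape.
    outcome = re.sub('%d', repl, 'string(%d)')
except re.error as exc:
    # Assumed behaviour on Python 3.7+: "bad escape \d ..." is raised here.
    outcome = 're.error: {}'.format(exc)

print(outcome)
```

On Python 3.6 the same call would instead return the substituted string and emit only the "bad escape \d" deprecation warning, as in the second transcript above.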