classification
Title: sre_constants.error: bad escape \d
Type: behavior
Stage: resolved
Components: Regular Expressions
Versions: Python 3.6
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: Noah Petherbridge, ezio.melotti, mrabarnett, r.david.murray, serhiy.storchaka
Priority: normal
Keywords:

Created on 2016-07-08 21:57 by Noah Petherbridge, last changed 2016-10-16 08:09 by serhiy.storchaka. This issue is now closed.

Messages (5)
msg270010 - Author: Noah Petherbridge (Noah Petherbridge) Date: 2016-07-08 21:57
I found a bug in Python 3.6.0a2 that wasn't present on previous versions of Python concerning the "\d" escape sequence as used in the following regular expression:

import re
s = "hello"
s = re.sub(re.escape(r'(\d+?)'), '(?:\d+?)', s)

(The purpose of this substitution is to rewrite the literal regexp string "(\d+?)" as its non-capturing form "(?:\d+?)", so that the result can eventually be used as a re pattern.)

When running this code in 3.6.0a2 I receive the following stack traces:

- - - - - - - - - -

Traceback (most recent call last):
  File "/home/kirsle/.pyenv/versions/3.6.0a2/lib/python3.6/sre_parse.py", line 877, in parse_template
    this = chr(ESCAPES[this][1])
KeyError: '\\d'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 4, in <module>
    s = re.sub(re.escape(r'(\d+?)'), '(?:\d+?)', s)
  File "/home/kirsle/.pyenv/versions/3.6.0a2/lib/python3.6/re.py", line 181, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "/home/kirsle/.pyenv/versions/3.6.0a2/lib/python3.6/re.py", line 324, in _subx
    template = _compile_repl(template, pattern)
  File "/home/kirsle/.pyenv/versions/3.6.0a2/lib/python3.6/re.py", line 311, in _compile_repl
    p = sre_parse.parse_template(repl, pattern)
  File "/home/kirsle/.pyenv/versions/3.6.0a2/lib/python3.6/sre_parse.py", line 880, in parse_template
    raise s.error('bad escape %s' % this, len(this))
sre_constants.error: bad escape \d at position 3

- - - - - - - - - -

However, the script runs without crashing on Python 3.5.1 and 2.7.11.

% python --version
Python 3.6.0a2
msg270011 - Author: Matthew Barnett (mrabarnett) * (Python triager) Date: 2016-07-08 22:30
There's a move to treat invalid escape sequences as an error (see issue 27364). The previous behaviour was to treat them as literals.

The replacement template string contains \d, which is not a valid escape sequence (it's valid for the pattern, but not the template).
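
A minimal sketch of how the substitution could be written so that the replacement no longer contains an unknown template escape. The sample string `source` and the variable names are assumptions made for illustration; they do not come from the report.

```python
import re

# Literal text to find, and the literal text to put in its place.
capturing = r'(\d+?)'
non_capturing = r'(?:\d+?)'

# Hypothetical input: a regexp source string containing capturing groups.
source = r'^(\d+?)-(\d+?)$'

# Option 1: double each backslash so the template parser sees an
# escaped backslash instead of the unknown escape \d.
via_template = re.sub(re.escape(capturing),
                      non_capturing.replace('\\', r'\\'),
                      source)

# Option 2: pass a callable; its return value is inserted as-is,
# without any template parsing.
via_callable = re.sub(re.escape(capturing), lambda m: non_capturing, source)

# Option 3: both sides are literal strings, so no regex is needed at all.
via_str_replace = source.replace(capturing, non_capturing)

print(via_template)
```

All three options should produce the same string. When both sides are literal text, as here, `str.replace` sidesteps template parsing entirely.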
msg270012 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-07-08 22:33
It's just supposed to be a warning at this point, though, so this looks like a bug.
msg270024 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-07-09 05:23
It is a deprecation warning in 3.5.

$ python3.5 -Wd
>>> import re
>>> re.sub(re.escape(r'(\d+?)'), '(?:\d+?)', r'(\d+?)')
/usr/lib/python3.5/re.py:182: DeprecationWarning: bad escape \d
  return _compile(pattern, flags).sub(repl, string, count)
'(?:\\d+?)'

The warning was added in issue23622, and turned into an exception in issue27030.
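
To check how a particular interpreter treats an unknown escape in a replacement template, something like the following could be used. This is only a sketch: the helper name `template_escape_status` is hypothetical, and it promotes DeprecationWarning to an exception so that the warning case and the error case can be told apart.

```python
import re
import warnings


def template_escape_status(repl):
    """Classify a replacement template as 'ok', 'warning', or 'error'."""
    with warnings.catch_warnings():
        # Turn DeprecationWarning into an exception so it can be caught.
        warnings.simplefilter('error', DeprecationWarning)
        try:
            re.sub('x', repl, 'x')
        except re.error:
            return 'error'
        except DeprecationWarning:
            return 'warning'
    return 'ok'


# An escaped backslash is a valid template escape; a bare \d is not.
print(template_escape_status(r'\\d'))
print(template_escape_status(r'\d'))
```

Which of 'warning' or 'error' is reported for `\d` depends on the Python version in use.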
msg278750 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-10-16 08:09
Everything behaves as intended. But the documentation is inaccurate, and even misleading. It should be improved (issue28450).
History
Date                 User               Action  Args
2016-10-16 08:09:34  serhiy.storchaka   set     status: open -> closed; resolution: not a bug; messages: + msg278750; stage: resolved
2016-07-09 05:23:04  serhiy.storchaka   set     nosy: + serhiy.storchaka; messages: + msg270024
2016-07-08 22:33:57  r.david.murray     set     nosy: + r.david.murray; messages: + msg270012
2016-07-08 22:30:12  mrabarnett         set     type: crash -> behavior; messages: + msg270011
2016-07-08 21:57:07  Noah Petherbridge  create