Title: re.match() not matching escaped hyphens
Type: behavior Stage: resolved
Components: Regular Expressions Versions: Python 3.6
Status: closed Resolution: not a bug
Assigned To: Nosy List: ezio.melotti, josh.r, lprice, mrabarnett, steven.daprano
Created on 2019-03-06 02:39 by lprice, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (3)
msg337272 - (view) Author: Lottie Price (lprice) Date: 2019-03-06 02:39

re.match("\\-", "\\-")

returns None.
I expected a match.


I have some random text strings I want to match against. I'm using re.escape() to ensure that the text characters are not interpreted as special characters:
      re.match(re.escape("-"), re.escape("-"))

As a result, strings with hyphens are coming back as above, and are not matching.
msg337274 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2019-03-06 03:45
"\\-" is equivalent to the raw string r"\-" (that is, it has one backslash, followed by a hyphen). \X (where X is any ASCII non-letter, non-digit) matches the character itself (the escape does nothing except ensure the punctuation doesn't have any special regex meaning). So your pattern is equivalent to "-". Since re.match has an implicit anchor at the beginning of the string (making it roughly like "^-"), the string "\-", which begins with a backslash rather than a hyphen, doesn't match.

Use raw strings consistently for your regular expressions to reduce the number of rounds of deescaping. re.match(r"\\-", "\\-") works as you expected.
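
For illustration, here is a minimal sketch of the two layers of escaping described above. The variable names are made up for this example, and the re.search call is an addition that was not part of the original report; it is only there to show the effect of the implicit anchor in re.match.

```python
import re

text = "\\-"  # two characters: a backslash followed by a hyphen
assert len(text) == 2

# The pattern "\\-" is the regex \- , which matches a single hyphen.
# re.match only tries position 0, where the text has a backslash, so it fails.
no_match = re.match("\\-", text)

# The raw-string pattern r"\\-" is the regex \\- : a literal backslash, then a hyphen.
full_match = re.match(r"\\-", text)

# re.search is not anchored, so the regex \- finds the hyphen at index 1.
found_later = re.search("\\-", text)

print(no_match)             # None
print(full_match.group())   # \-
print(found_later.span())   # (1, 2)
```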
msg337275 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2019-03-06 03:54
You don't escape the text you are searching, only the pattern. Try this:

py> re.match(re.escape('-'), "-")
<_sre.SRE_Match object; span=(0, 1), match='-'>

py> re.match(re.escape('a-c'), "a-c")
<_sre.SRE_Match object; span=(0, 3), match='a-c'>
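
The same idea can be sketched as a small helper. The function name starts_with_literal and the sample strings are hypothetical, chosen only for this example; the point is that re.escape() is applied to the pattern argument and the searched text is passed through unchanged.

```python
import re

def starts_with_literal(needle, haystack):
    """Return True if haystack begins with needle, treated as plain text."""
    # Escape only the pattern side; the text being searched is left as-is.
    return re.match(re.escape(needle), haystack) is not None

# Hypothetical sample strings containing regex metacharacters.
samples = ["-", "a-c", "1+1=2", "a.b*c"]
results = [starts_with_literal(s, s) for s in samples]

# Escaping both sides reproduces the original report: the pattern still
# means a plain hyphen, but the escaped text now starts with a backslash.
double_escaped = re.match(re.escape("-"), re.escape("-"))

print(results)          # [True, True, True, True]
print(double_escaped)   # None
```

If all that is needed is a literal prefix test, str.startswith() does the same job without involving the regex engine at all.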
