classification
Title: \b requires raw strings or to be escaped. Update docs with that hint?
Type: enhancement Stage: resolved
Components: Documentation Versions: Python 3.7, Python 3.6, Python 3.3, Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Mike.Lissner, docs@python, r.david.murray
Priority: normal Keywords:

Created on 2017-01-18 20:21 by Mike.Lissner, last changed 2017-01-18 21:11 by r.david.murray. This issue is now closed.

Messages (2)
msg285751 - Author: Mike Lissner (Mike.Lissner) Date: 2017-01-18 20:21
I just ran into a funny corner case I imagine others are aware of. When you write "\b" in an ordinary (non-raw) Python string literal, it is a single character, the backspace "\x08". So if you try to write a regex like:

words = '\b(.*)\b'

That won't work. But using a raw string will:

words = r'\b(.*)\b'

As will escaping it in this horrible fashion:

words = '\\b(.*)\\b'

I believe this doesn't affect any of the other regex special sequences, so I wonder if it's worth adding something to the docs to warn about this. I just spent a bunch of time trying to figure out why it seemed like \b wasn't working. A little tip in the docs would have gone a LONG way.
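
The three spellings above can be compared directly. This is a minimal sketch, assuming the sample text "hello world", which is not part of the original report:

```python
import re

text = "hello world"  # sample input, chosen only for illustration

plain = '\b(.*)\b'      # each '\b' becomes the backspace character "\x08"
raw = r'\b(.*)\b'       # backslash and 'b' stay as two separate characters
escaped = '\\b(.*)\\b'  # builds the same pattern as the raw string

print(len('\b'), len(r'\b'))            # 1 2
print(raw == escaped)                   # True
print(re.search(plain, text))           # None: the pattern needs literal backspaces
print(re.search(raw, text).group(1))    # hello world
```

The plain string only matches text that actually contains backspace characters, which is why it looks as if \b "isn't working".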
msg285755 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-01-18 21:11
One should always use raw strings for regular expressions, and this is already documented in the introduction to the re module. Further, in 3.5 using \ in front of characters that aren't special produces a warning, which should reduce the frequency of this mistake.

I don't see that there's anything to do here.
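
The string-literal side of the warning can be checked with a short sketch. It assumes Python 3.6 or later, where an unrecognized escape such as '\d' in a non-raw literal triggers a DeprecationWarning (a SyntaxWarning from 3.12). The helper name escape_warnings is hypothetical. Note that '\b' is a recognized string escape, so it compiles silently:

```python
import warnings

def escape_warnings(source):
    """Compile `source` and return the warning categories it triggered."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        compile(source, "<example>", "exec")
    return [w.category for w in caught]

# The raw strings here hold source code that itself uses non-raw literals.
unknown = escape_warnings(r"pattern = '\d+'")          # '\d' is not a string escape
backspace = escape_warnings(r"pattern = '\b(.*)\b'")   # '\b' is a valid string escape

print(unknown)    # one or more warning categories; the exact class depends on the version
print(backspace)  # []
```

Under these assumptions the warning helps with escapes such as \d or \w, while the raw-string advice remains what protects against the \b case.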
History
Date                 User            Action  Args
2017-01-18 21:11:54  r.david.murray  set     status: open -> closed; nosy: + r.david.murray; messages: + msg285755; resolution: not a bug; stage: resolved
2017-01-18 20:21:15  Mike.Lissner    create