Author Cristian Barbarosie
Recipients Cristian Barbarosie, docs@python
Date 2017-04-06.04:40:42
Message-id <1491453642.54.0.802330198745.issue30004@psf.upfronthosting.co.za>
Content
In the Regular Expression HOWTO
https://docs.python.org/3.6/howto/regex.html#regex-howto
the last example in the "Grouping" section has a bug. The code is supposed to find repeated words, but it also matches when the second "word" is only the beginning of a longer word, so it reports false repetitions.

>>> import re
>>> p = re.compile(r'(\b\w+)\s+\1')
>>> p.search('Paris in the the spring').group()
'the the'
>>> p.search('k is the thermal coefficient').group()
'the the'

I propose adding a \b after \1, which solves the problem:

>>> p = re.compile(r'(\b\w+)\s+\1\b')
>>> p.search('Paris in the the spring').group()
'the the'
>>> print(p.search('k is the thermal coefficient'))
None
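
To compare the two patterns side by side, here is a minimal sketch that runs both of them over the two sentences above. The names ORIGINAL, PROPOSED and repeated_words are only illustrative; they do not appear in the HOWTO.

```python
import re

# Pattern as it currently appears in the HOWTO, and the proposed fix
# with a trailing \b so that \1 must end at a word boundary.
ORIGINAL = re.compile(r'(\b\w+)\s+\1')
PROPOSED = re.compile(r'(\b\w+)\s+\1\b')


def repeated_words(pattern, text):
    """Return the matched text of every match of `pattern` in `text`."""
    return [m.group() for m in pattern.finditer(text)]


samples = [
    'Paris in the the spring',
    'k is the thermal coefficient',
]

for sample in samples:
    print(repr(sample))
    print('  original:', repeated_words(ORIGINAL, sample))
    print('  proposed:', repeated_words(PROPOSED, sample))
```

With the original pattern the second sentence yields 'the the', because \1 is satisfied by the first three letters of "thermal". With the trailing \b the match is rejected, since "the" is followed by "r" and no word boundary exists there.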