classification
Title: in regex-howto, improve example on grouping
Type: enhancement Stage: resolved
Components: Documentation, Regular Expressions Versions: Python 3.7, Python 3.6, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Cristian Barbarosie, Mariatta, akuchling, docs@python, ezio.melotti, mandeepb, mrabarnett, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2017-04-06 04:40 by Cristian Barbarosie, last changed 2017-11-25 07:03 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 4443 merged mandeepb, 2017-11-17 15:48
PR 4554 merged python-dev, 2017-11-25 04:56
PR 4555 merged python-dev, 2017-11-25 04:57
Messages (13)
msg291209 - (view) Author: Cristian Barbarosie (Cristian Barbarosie) Date: 2017-04-06 04:40
In the Regular Expression HOWTO
https://docs.python.org/3.6/howto/regex.html#regex-howto
the last example in the "Grouping" section has a bug. The code is supposed to find repeated words, but it catches false repetitions.

>>> import re
>>> p = re.compile(r'(\b\w+)\s+\1')
>>> p.search('Paris in the the spring').group()
'the the'
>>> p.search('k is the thermal coefficient').group()
'the the'

I propose adding a \b after \1; this solves the problem:

>>> p = re.compile(r'(\b\w+)\s+\1\b')
>>> p.search('Paris in the the spring').group()
'the the'
>>> print(p.search('k is the thermal coefficient'))
None
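
A minimal sketch comparing the two patterns side by side, assuming Python 3. The names `LOOSE`, `STRICT`, and `find_repeat` are hypothetical helpers introduced for illustration and do not appear in the HOWTO:

```python
import re

# Pattern as currently shown in the HOWTO: nothing anchors the end of \1,
# so the backreference may match the prefix of a longer word.
LOOSE = re.compile(r'(\b\w+)\s+\1')

# Proposed pattern: the trailing \b requires \1 to end at a word boundary.
STRICT = re.compile(r'(\b\w+)\s+\1\b')


def find_repeat(pattern, text):
    """Return the matched text, or None when the pattern finds nothing."""
    match = pattern.search(text)
    return match.group() if match else None


# "the" followed by "thermal" is a false repetition.
print(find_repeat(LOOSE, 'k is the thermal coefficient'))
print(find_repeat(STRICT, 'k is the thermal coefficient'))

# A genuine repetition is still found by the stricter pattern.
print(find_repeat(STRICT, 'Paris in the the spring'))
```

Returning `None` from the helper avoids the `AttributeError` that calling `.group()` on a failed search would raise.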
msg291246 - (view) Author: Cristian Barbarosie (Cristian Barbarosie) Date: 2017-04-06 20:57
I just discovered that a nearly identical example is presented at the end of the section "Non-capturing and Named Groups". My proposal applies to this other example, too.
And, by the way, reading this HOWTO has been very useful to me.
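
The trailing \b can be sketched for a named-group variant as well. The pattern below is an assumption about the form of that second example (a group named `word` referenced with `(?P=word)`); the exact text in the HOWTO may differ:

```python
import re

# Assumed form of the named-group example, with the proposed trailing \b
# so that the backreference must end at a word boundary.
p = re.compile(r'(?P<word>\b\w+)\s+(?P=word)\b')

m = p.search('Paris in the the spring')
print(m.group())        # the full match
print(m.group('word'))  # the repeated word captured by the named group

# With the trailing \b, "the thermal" is no longer reported as a repetition.
print(p.search('k is the thermal coefficient'))
```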
msg292511 - (view) Author: Mandeep Bhutani (mandeepb) * Date: 2017-04-28 05:39
Looks like both examples need a closing \b. Is this being worked on or should I submit a PR?
msg292972 - (view) Author: Cristian Barbarosie (Cristian Barbarosie) Date: 2017-05-04 11:14
This topic seems stuck. Is there anything else I should do?
msg306361 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-11-16 14:12
Would you mind creating a pull request on GitHub, Cristian?
msg306439 - (view) Author: Cristian Barbarosie (Cristian Barbarosie) Date: 2017-11-17 14:25
I'm sorry, I have no experience at all with Git. Could you please do it for me?
The bug appears in two places; see my first two messages.
Thank you.
msg306440 - (view) Author: Mandeep Bhutani (mandeepb) * Date: 2017-11-17 14:35
Serhiy, Christian: I'll submit a PR for this later today.
msg306447 - (view) Author: Mandeep Bhutani (mandeepb) * Date: 2017-11-17 15:53
Cristian, Serhiy: I've submitted a PR for this bug. 

Cristian: I apologize for misspelling your name in a prior post.
msg306942 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-11-25 04:56
New changeset 610e5afdcbe3eca906ef32f4e0364e20e1b1ad23 by Mariatta (Mandeep Bhutani) in branch 'master':
bpo-30004: Fix the code example of using group in Regex Howto Docs (GH-4443)
https://github.com/python/cpython/commit/610e5afdcbe3eca906ef32f4e0364e20e1b1ad23
msg306943 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-11-25 05:01
New changeset c02037d62284f4d4ca6b22f2ed05165ce2014951 by Mariatta (Miss Islington (bot)) in branch '2.7':
bpo-30004: Fix the code example of using group in Regex Howto Docs (GH-4443) (GH-4555)
https://github.com/python/cpython/commit/c02037d62284f4d4ca6b22f2ed05165ce2014951
msg306944 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-11-25 05:03
New changeset 3e60747025edc34b503397ab8211be59cfdd05cd by Mariatta (Miss Islington (bot)) in branch '3.6':
bpo-30004: Fix the code example of using group in Regex Howto Docs (GH-4443) (GH-4554)
https://github.com/python/cpython/commit/3e60747025edc34b503397ab8211be59cfdd05cd
msg306945 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-11-25 05:03
Thanks, everyone. I merged the PR, and it has been backported to 3.6 and 2.7.
msg306947 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-11-25 07:03
Thank you Cristian for reporting this issue. Thank you Mandeep for your patch. Thank you Mariatta for merging.
History
Date User Action Args
2017-11-25 07:03:29 serhiy.storchaka set messages: + msg306947
2017-11-25 05:03:50 Mariatta set status: open -> closed; resolution: fixed; messages: + msg306945; stage: patch review -> resolved
2017-11-25 05:03:06 Mariatta set messages: + msg306944
2017-11-25 05:01:42 Mariatta set messages: + msg306943
2017-11-25 04:57:10 python-dev set pull_requests: + pull_request4486
2017-11-25 04:56:10 python-dev set pull_requests: + pull_request4485
2017-11-25 04:56:02 Mariatta set nosy: + Mariatta; messages: + msg306942
2017-11-17 15:53:21 mandeepb set messages: + msg306447
2017-11-17 15:48:59 mandeepb set keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request4384
2017-11-17 14:35:44 mandeepb set messages: + msg306440
2017-11-17 14:25:10 Cristian Barbarosie set messages: + msg306439
2017-11-16 14:12:24 serhiy.storchaka set versions: - Python 3.3, Python 3.4, Python 3.5; nosy: + ezio.melotti, serhiy.storchaka, mrabarnett; messages: + msg306361; components: + Regular Expressions; stage: needs patch
2017-05-04 11:14:35 Cristian Barbarosie set messages: + msg292972
2017-04-28 05:39:19 mandeepb set nosy: + mandeepb; messages: + msg292511
2017-04-06 20:57:53 Cristian Barbarosie set messages: + msg291246
2017-04-06 05:16:16 serhiy.storchaka set nosy: + akuchling
2017-04-06 04:40:42 Cristian Barbarosie create