classification
Title: RegExp Conditional Construct (?(id/name)yes-pattern|no-pattern) Problem
Type: behavior
Stage: resolved
Components: Regular Expressions
Versions: Python 3.7
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: LHampton, SilentGhost, docs@python, ezio.melotti, mrabarnett
Priority: normal
Keywords:

Created on 2020-03-22 17:44 by LHampton, last changed 2020-03-26 17:01 by mrabarnett. This issue is now closed.

Messages (6)
msg364811 - (view) Author: Leon Hampton (LHampton) Date: 2020-03-22 17:44
Hello,
In the 3.7.7 documentation on Regular Expression, the Conditional Construct, (?(id/name)yes-pattern|no-pattern), is discussed. (This is a very thorough document, by the way. Good job!)
One example given for the Conditional Construct does not work as described. Specifically, the example gives this matching pattern '(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)' and states that it will NOT MATCH the string '<user@host.com'. In my tests the pattern DOES MATCH that string.
(The other examples work as described.) 
This may be a bug in re since it seems to me that the match should fail.
Respectfully,
Leon
msg364812 - (view) Author: Leon Hampton (LHampton) Date: 2020-03-22 17:55
Hello,
There may be a bug in the implementation of the Conditional Construct of Regular Expressions, namely (?(id/name)yes-pattern|no-pattern).
In the Regular Expression documentation (https://docs.python.org/3.7/library/re.html), in the portion about the Conditional Construct, it gives this sample pattern '(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)' and states that the pattern WILL NOT MATCH this string '<user@host.com'. In my tests the pattern MATCHES the string.
I agree that the pattern should not match.
Respectfully,
Leon
msg364816 - (view) Author: Matthew Barnett (mrabarnett) * (Python triager) Date: 2020-03-22 19:05
The documentation is talking about whether it'll match at the current position in the string. It's not a bug.
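A minimal sketch of what "at the current position" means, assuming the compiled pattern's match(string, pos) method is used to pin the starting index. The variable names are illustrative and not part of the original report.

```python
import re

# Pattern copied from the re documentation example discussed in this issue.
pattern = re.compile(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)')
text = '<user@host.com'

# Try the pattern at index 0 only: '<' is captured, so the conditional
# demands a closing '>' that the string does not contain.
at_start = pattern.match(text, 0)

# Try the pattern at index 1 only: group 1 does not participate, so the
# conditional falls through to '$', which succeeds at the end of the string.
at_one = pattern.match(text, 1)

print(at_start)       # None
print(at_one.span())  # (1, 14)
```

The documented "does not match" claim holds for the attempt at index 0. An attempt that starts one character later is a separate question.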
msg364829 - (view) Author: SilentGhost (SilentGhost) * (Python triager) Date: 2020-03-22 22:40
Leon, this is most likely not a bug, not because of what's stated in the documentation, but because you're most likely not testing what you think you are. Here is the test that you should be doing:

>>> re.match(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)', '<user@host.com')
>>>

No match. If there is a different output in your setup, please provide both the output and the details of your system and Python installation.
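The same check can be extended to all four strings mentioned in the documentation example. This is a sketch only; the names PATTERN, samples and results are illustrative.

```python
import re

PATTERN = r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)'
samples = ['<user@host.com>', 'user@host.com', '<user@host.com', 'user@host.com>']

# re.match only attempts the pattern at index 0 of each string.
results = {s: re.match(PATTERN, s) is not None for s in samples}

for s, matched in results.items():
    print(f'{s!r:20} {matched}')
```

With re.match, the first two strings match and the last two do not, which agrees with the documentation.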
msg365052 - (view) Author: Leon Hampton (LHampton) Date: 2020-03-26 04:48
Matthew Barnett & SilentGhost,
Thank you for your prompt responses. (Really prompt. Amazing!)
SilentGhost,
Regarding your response, I used re.search, not re.match. When I used re.match, the regex failed. When I used re.search, it matched.
Here are my tests.

Your example (cut-and-pasted):
x = re.match(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)', '<user@host.com')
print(x)
I received 'None', the expected response.

My example using search:
x = re.search(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)', '<user@host.com')
print(x)
I received:
<re.Match object; span=(1, 14), match='user@host.com'>

I understand the re.match failing, since it always starts at the beginning of the string, but why did re.search succeed? After failing with the yes-pattern, when the regex engine backtracked to the (<)? did it decide not to match the '<' at all and skip the character? Seems like it. What do you think?

I am running Python 3.7 via Spyder 4.1.1 on Windows 10.

Respectfully,
Leon
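One way to look into the question above is to inspect the match object that re.search returns, and then to anchor the pattern so that only index 0 is eligible. This is a hedged sketch; prefixing the pattern with \A is one option among several (re.match and re.fullmatch are others), not something prescribed in this thread.

```python
import re

PATTERN = r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)'
text = '<user@host.com'

found = re.search(PATTERN, text)
# The match starts at index 1 and group 1 is unset: the '<' was never
# part of the match, so the conditional used the no-pattern '$'.
print(found.start(), found.group(1))  # 1 None

# \A matches only at the very start of the string, so re.search can no
# longer succeed from a later starting index.
anchored = re.search(r'\A' + PATTERN, text)
print(anchored)  # None
```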
msg365094 - (view) Author: Matthew Barnett (mrabarnett) * (Python triager) Date: 2020-03-26 17:01
That's what searching does!

Does the pattern match here? If not, advance by one character and try again. Repeat until a match is found or you've reached the end.
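The loop described above can be sketched in a few lines. naive_search is a hypothetical helper written for illustration. It models the behaviour only and is not how the re module implements searching internally.

```python
import re

def naive_search(pattern, string):
    """Conceptual model of searching: try each starting index in turn."""
    compiled = re.compile(pattern)
    # Include len(string) so that patterns able to match the empty
    # string at the very end are also given a chance.
    for pos in range(len(string) + 1):
        m = compiled.match(string, pos)
        if m is not None:
            return m
    return None

PATTERN = r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)'
text = '<user@host.com'

m = naive_search(PATTERN, text)
print(m.span())  # (1, 14)
```

For this input, the attempt at index 0 fails and the attempt at index 1 succeeds, giving the same span that re.search reported earlier in the thread.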
History
Date User Action Args
2020-03-26 17:01:40 mrabarnett set messages: + msg365094
2020-03-26 04:48:15 LHampton set messages: + msg365052
2020-03-22 22:40:38 SilentGhost set status: open -> closed; nosy: + SilentGhost; messages: + msg364829; stage: resolved
2020-03-22 19:05:36 mrabarnett set resolution: not a bug; messages: + msg364816
2020-03-22 17:55:41 LHampton set title: Poor RegEx example for (?(id/name)yes-pattern|no-pattern) -> RegExp Conditional Construct (?(id/name)yes-pattern|no-pattern) Problem; nosy: + mrabarnett, ezio.melotti; messages: + msg364812; components: + Regular Expressions, - Documentation
2020-03-22 17:44:39 LHampton create