classification
Title: A bug related to matching the empty string
Type: behavior Stage:
Components: Regular Expressions Versions: Python 3.2
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: lanyjie, mrabarnett, ned.deily
Priority: normal Keywords:

Created on 2010-11-25 16:39 by lanyjie, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (3)
msg122380 - Author: Yingjie (lanyjie) Date: 2010-11-25 16:39
Here are some puzzling results I got (I am using Python 3; I suppose Python 2 gives similar results).

When I do the following, I get an exception:
>>> re.findall('(d*)*', 'adb')
>>> re.findall('((d)*)*', 'adb')

When I do this, no exception is raised, but the result is wrong:
>>> re.findall('((.d.)*)*', 'adb')
[('', 'adb'), ('', '')]

Why is it wrong?

The first match of groups:
('', 'adb')
indicates the outer group ((.d.)*) captured
the empty string, while the inner group (.d.)
captured 'adb', so the outer group must have
captured the empty string at the end of the
provided string 'adb'.

Once we have matched the final empty string '',
there should be no more matches, but we got
another match ('', '')!!!

So, findall matched the empty string at
the end of the string twice!!!

Isn't this a bug?

Yingjie
msg122384 - Author: Matthew Barnett (mrabarnett) * (Python triager) Date: 2010-11-25 17:24
The spans say this:

>>> for m in re.finditer('((.d.)*)*', 'adb'):
...     print(m.span())
...
(0, 3)
(3, 3)

There's a non-empty match followed by an empty match.

IMHO, not a bug.
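
To make the two matches easier to compare, here is a minimal sketch that prints each match's span next to its captured groups. The helper name describe_matches is hypothetical (it is not part of the re module); it only wraps re.finditer. The point it illustrates is that the two rows returned by findall come from two separate matches at different spans, (0, 3) and (3, 3). Within a single match, a repeated group reports only its last capture, so the group values alone do not show which text the whole match covered.

```python
import re


def describe_matches(pattern, text):
    """Return a (span, groups) pair for every match re.finditer reports.

    Hypothetical helper for illustration only.
    """
    return [(m.span(), m.groups()) for m in re.finditer(pattern, text)]


# Pattern and subject string taken from the report above.
results = describe_matches(r'((.d.)*)*', 'adb')

for span, groups in results:
    # span covers the whole match; groups holds only the last capture
    # of each repeated group, which can be empty or None.
    print(span, groups)

# Keep only the spans so the two matches can be compared directly.
spans = [span for span, _ in results]
```

Comparing spans rather than group values shows that the empty string at the end of 'adb' is reported as a match only once, by the second match.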
msg122476 - Author: Ned Deily (ned.deily) * (Python committer) Date: 2010-11-26 20:31
Closing this issue since it appears to not be a bug.
History
Date User Action Args
2022-04-11 14:57:09 admin set github: 54741
2010-11-26 20:31:54 ned.deily set status: open -> closed; nosy: + ned.deily; messages: + msg122476; resolution: not a bug
2010-11-25 17:24:27 mrabarnett set nosy: + mrabarnett; messages: + msg122384
2010-11-25 16:41:06 lanyjie set type: behavior
2010-11-25 16:39:12 lanyjie create