Issue817234
Created on 2003-10-03 15:01 by kevinbutler, last changed 2022-04-10 16:11 by admin. This issue is now closed.
Files |||
---|---|---|
File name | Uploaded | Description |
sre.patch | niemeyer, 2004-09-03 18:13 | Applied patch. |
Messages (4) | |||
---|---|---|---|
msg18533 - (view) | Author: Kevin J. Butler (kevinbutler) | Date: 2003-10-03 15:01 | |
The iterator returned by re.finditer appears not to terminate if the final match is empty; instead it keeps returning the final (empty) match. Is this a bug in _sre? If so, I'll be happy to file it, though fixing it is a bit beyond my _sre experience level at this point.

The solution would appear to be either to check for a duplicate match in iterator.next(), or to increment the position by one after returning an empty match (which should be OK, because if a non-empty match started at that location, we would have returned it instead of the empty match).

Code to illustrate the failure:

    from re import finditer

    last = None
    for m in finditer( ".*", "asdf" ):
        if last == m.span():
            print "duplicate match:", last
            break
        print m.group(), m.span()
        last = m.span()

    ---
    asdf (0, 4)
     (4, 4)
    duplicate match: (4, 4)
    ---

findall works:

    print re.findall( ".*", "asdf" )
    ['asdf', '']

The workaround is to explicitly check for a duplicate span, as I did above, or to check for a duplicate end(), which avoids the final empty match.

Seo Sanghyeon sent the following fix to the python-dev list:

Attached one line patch fixes re.finditer bug reported by Kevin J. Butler. I read cvs log to find out why this code is introduced, and it seems to be related to SF bug #581080. But that bug didn't appear after my patch, so I wonder why it was introduced in the first place. It seems beyond my understanding. Please enlighten me.

To test:

    #581080
    import re
    list(re.finditer('\s', 'a b'))
    # expected: one item list
    # bug: hang

    #Kevin J. Butler
    import re
    list(re.finditer('.*', 'asdf'))
    # expected: two item list (?)
    # bug: hang

Seo Sanghyeon

-------------- next part --------------

    ? patch
    Index: Modules/_sre.c
    ===================================================================
    RCS file: /cvsroot/python/python/dist/src/Modules/_sre.c,v
    retrieving revision 2.99
    diff -c -r2.99 _sre.c
    *** Modules/_sre.c 26 Jun 2003 14:41:08 -0000 2.99
    --- Modules/_sre.c 2 Oct 2003 03:48:55 -0000
    ***************
    *** 3062,3069 ****
          match = pattern_new_match((PatternObject*) self->pattern,
                                     state, status);

    !     if ((status == 0 || state->ptr == state->start) &&
    !         state->ptr < state->end)
              state->start = (void*) ((char*) state->ptr + state->charsize);
          else
              state->start = state->ptr;
    --- 3062,3068 ----
          match = pattern_new_match((PatternObject*) self->pattern,
                                     state, status);

    !     if (status == 0 || state->ptr == state->start)
              state->start = (void*) ((char*) state->ptr + state->charsize);
          else
              state->start = state->ptr;
|
|||
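The second remedy proposed in msg18533 (step one position past an empty match) can be sketched in pure Python. This is a minimal illustration written for Python 3, not the `_sre.c` scanner itself: `finditer_sketch` is a hypothetical helper, and it is not claimed to reproduce every corner of the real engine (for example, patterns whose empty alternative is tried first).

```python
import re


def finditer_sketch(pattern, string):
    """Yield successive matches, stepping past empty ones.

    Hypothetical helper for illustration only; not the _sre implementation.
    """
    regex = re.compile(pattern)
    pos = 0
    # pos == len(string) is still a valid start: an empty match can sit there.
    while pos <= len(string):
        m = regex.search(string, pos)
        if m is None:
            return
        yield m
        if m.end() == m.start():
            # Empty match: advance by one so the same match is not returned again.
            pos = m.end() + 1
        else:
            pos = m.end()


print([m.span() for m in finditer_sketch(".*", "asdf")])
print([m.span() for m in finditer_sketch(r"\s", "a b")])
```

The loop ends once the start position moves past the end of the string, which is the termination the patched condition restores for the `.*` case.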
msg18534 - (view) | Author: Kevin J. Butler (kevinbutler) | Date: 2003-10-03 18:16 | |
Logged In: YES user_id=117665

The above patch does resolve the problem.

The code was introduced in rev 2.85 http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Modules/_sre.c to resolve bug 581080 http://sourceforge.net/tracker/index.php?func=detail&aid=581080&group_id=5470&atid=105470 but removing this line does not re-introduce that bug.

Thanks, and kudos to Seo... |
|||
msg18535 - (view) | Author: Fredrik Lundh (effbot) * | Date: 2004-09-03 12:04 | |
Logged In: YES user_id=38376

Still there in 2.4a3, as the following revised example shows:

    import re
    m = re.finditer(".*", "asdf")
    print m.next().span()
    print m.next().span()
    print m.next().span() # this should raise an exception

Gustavo, can you look at this patch too? |
|||
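The check in msg18535 translates to Python 3 roughly as follows, with `m.next()` written as `next(it)`. The names `it`, `spans` and `exhausted` are illustrative. On an interpreter that includes the fix, the third call raises `StopIteration` rather than returning the empty match again.

```python
import re

it = re.finditer(".*", "asdf")

# The first two calls return the full match and the trailing empty match.
spans = [next(it).span(), next(it).span()]

# The third call should signal exhaustion instead of repeating (4, 4).
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

print(spans, exhausted)
```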
msg18536 - (view) | Author: Gustavo Niemeyer (niemeyer) * | Date: 2004-09-03 18:13 | |
Logged In: YES user_id=7887

Patch applied and test cases added to check this bug and also for #581080. Kevin and Seo, thanks for the bug report and the fix. Fredrik, thanks for pointing me to the issue.

Applied as:

    Lib/test/test_re.py: 1.52
    Modules/_sre.c: 2.108

Patch attached for reference. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:11:35 | admin | set | github: 39362 |
2003-10-03 15:01:52 | kevinbutler | create |