Message307546
>>> re.findall(r'^|\w+', 'two words')
['', 'wo', 'words']
It seems the current behavior was documented incorrectly in issue732120.
It will be fixed in 3.7 (see issue1647489, issue25054), but I hesitate to backport the fix to 3.6 and 2.7 because it can break user code. For example:
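To see where the 'wo' comes from, it may help to print the span of each match. This is only a minimal sketch; match_spans is a hypothetical helper, not part of the re module. Under the 3.6 behavior shown above, the empty match at position 0 makes the scan resume at position 1, so the 't' is skipped. On 3.7 and later the second match should start at position 0 instead.

```python
import re
import sys

def match_spans(pattern, text):
    """Return (span, matched text) for every match re.finditer() reports."""
    return [(m.span(), m.group()) for m in re.finditer(pattern, text)]

PATTERN = r'^|\w+'
TEXT = 'two words'

spans = match_spans(PATTERN, TEXT)
found = re.findall(PATTERN, TEXT)

# The result depends on the interpreter version, so print it alongside.
print(sys.version_info[:2])
print(spans)
print(found)
```

Comparing the second span across versions shows whether a non-empty match is allowed to start where the preceding empty match ended.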
In Python 3.6:
>>> list(re.finditer(r'(?m)^\s*?$', 'foo\n\n\nbar'))
[<_sre.SRE_Match object; span=(4, 4), match=''>, <_sre.SRE_Match object; span=(5, 5), match=''>]
In Python 3.7:
>>> list(re.finditer(r'(?m)^\s*?$', 'foo\n\n\nbar'))
[<re.Match object; span=(4, 4), match=''>, <re.Match object; span=(4, 5), match='\n'>, <re.Match object; span=(5, 5), match=''>]
(This is a real pattern used in the doctest module, but with re.sub().)
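The effect on re.sub() can be checked with a short script. This is a sketch under stated assumptions: the names BLANK_LINE and HORIZONTAL_ONLY are made up for illustration, and the second pattern is just one possible rewrite that never matches an empty string or a newline, not necessarily the one any module uses. With the 3.6 matches listed above, both matches are empty, so the substitution leaves the text unchanged. With the 3.7 matches, the extra match at span (4, 5) consumes a newline, so one blank line should disappear.

```python
import re
import sys

BLANK_LINE = r'(?m)^\s*?$'            # pattern from the example above
HORIZONTAL_ONLY = r'(?m)^[^\S\n]+$'   # assumed alternative: whitespace except newline

text = 'foo\n\n\nbar'

spans = [m.span() for m in re.finditer(BLANK_LINE, text)]
result = re.sub(BLANK_LINE, '', text)

# The alternative only strips non-newline whitespace from blank lines,
# so it cannot remove a line break on any version.
stable = re.sub(HORIZONTAL_ONLY, '', text)
stripped = re.sub(HORIZONTAL_ONLY, '', 'foo\n  \nbar')

print(sys.version_info[:2], spans, repr(result))
print(repr(stable), repr(stripped))
```

Code that relied on the old pattern leaving line breaks alone would see different output after the fix, which is the backporting risk described above.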
The proposed PR documents the current weird behavior in 2.7 and 3.6.
Date | User | Action | Args
2017-12-04 10:08:06 | serhiy.storchaka | set | recipients: + serhiy.storchaka, rhettinger, ezio.melotti, mrabarnett, docs@python
2017-12-04 10:08:06 | serhiy.storchaka | set | messageid: <1512382086.34.0.213398074469.issue32211@psf.upfronthosting.co.za>
2017-12-04 10:08:06 | serhiy.storchaka | link | issue32211 messages
2017-12-04 10:08:05 | serhiy.storchaka | create |