
Author serhiy.storchaka
Recipients ezio.melotti, mrabarnett, probinso, serhiy.storchaka
Date 2021-02-14.09:19:37
Message-id <1613294377.69.0.561664707736.issue43222@roundup.psfhosted.org>
In-reply-to
Content
There was a bug in the regular expression engine that caused re.split() to work incorrectly with zero-width patterns. Note that in your example _DIGIT_BOUNDARY_RE.split("10.0.0") returns ['10.0.0'] on Python 2.7 -- a result you probably did not expect.

It was impossible to fix that bug without changing the behavior of other functions in corner cases and breaking existing code. So we first made re.split() raise an exception instead of returning a nonsensical result, and added warnings for some other cases to help users catch potential bugs in their code and avoid ambiguous patterns. That is what you see in 3.6. In 3.7 we fixed the underlying bug. This broke some user code, but it made regular expressions more consistent in the long run and made zero-width patterns more usable.
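As a rough illustration of the 3.7 behavior, here is a minimal sketch. The pattern below is an assumption: the original _DIGIT_BOUNDARY_RE is not quoted in this message, so DIGIT_BOUNDARY_RE and split_on_digit_boundaries are hypothetical stand-ins for a zero-width pattern that matches at the boundary between a digit and a non-digit.

```python
import re
import sys

# Assumed stand-in for the reporter's pattern (not quoted in this message):
# a zero-width match wherever a digit is adjacent to a non-digit.
DIGIT_BOUNDARY_RE = re.compile(r'(?<=\d)(?=\D)|(?<=\D)(?=\d)')


def split_on_digit_boundaries(text):
    """Split text at every digit/non-digit boundary using a zero-width pattern."""
    return DIGIT_BOUNDARY_RE.split(text)


# Splitting on empty matches only works as intended on Python 3.7 and later.
if sys.version_info >= (3, 7):
    print(split_on_digit_boundaries("10.0.0"))  # ['10', '.', '0', '.', '0']
```

On older versions the same call either leaves the string unsplit or is rejected, as described above, so the guard keeps the sketch from suggesting otherwise.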

In your particular case, if you still need to support Python 2.7 and 3.6, try re.split() with the pattern r'(\D+)' or r'(\d+)' (the parentheses are significant here: they make re.split() keep the matched separators in the result). It gives almost the same result, except that empty strings may be prepended or appended.
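A minimal sketch of that workaround, assuming the goal is to break a string into alternating digit and non-digit runs. The helper name split_digit_runs is hypothetical, and dropping the empty strings is one possible way to handle the extra leading and trailing items.

```python
import re


def split_digit_runs(text):
    """Split text into digit and non-digit runs without a zero-width pattern."""
    # The capturing group makes re.split() return the digit runs as well.
    parts = re.split(r'(\d+)', text)
    # A digit run at the very start or end leaves an empty string there; drop it.
    return [part for part in parts if part]


print(re.split(r'(\d+)', "10.0.0"))  # ['', '10', '.', '0', '.', '0', '']
print(split_digit_runs("10.0.0"))    # ['10', '.', '0', '.', '0']
```

Because two digit runs are always separated by at least one non-digit character, the empty strings can only appear at the ends of the list, so filtering them does not merge or lose any piece.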