Author serhiy.storchaka
Recipients ezio.melotti, mrabarnett, pitrou, serhiy.storchaka
Date 2014-11-08.11:01:19
Message-id <1415444480.37.0.198748558658.issue22818@psf.upfronthosting.co.za>
In-reply-to
Content
Currently re.split() does not split on zero-width matches. Several issues have been filed about this (issue852532, issue988761, issue3262, issue22817). This is definitely a bug, but fixing it will likely break existing code that uses regular expressions which can match the empty string (e.g. re.split('(:*)', 'ab')).

I propose to deprecate splitting on regular expressions that can produce a zero-width match. Such expressions either do not work as expected at all (r'\b' never splits) or can be rewritten so that they do not match the empty string ('(:*)' to '(:+)').
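A small sketch of the rewrite suggested above. The sample string is made up for illustration. Only the rewritten pattern's output is shown, because the result of splitting on the original pattern depends on how a given Python version handles empty matches.

```python
import re

text = "a::b:c"  # illustrative input, not from the report

# The rewritten pattern ':+' needs at least one colon, so it can never
# match the empty string and its split result is unambiguous.
parts = re.split("(:+)", text)
print(parts)  # ['a', '::', 'b', ':', 'c']

# The original pattern ':*' is able to match the empty string ...
matches_empty = re.fullmatch("(:*)", "") is not None
print(matches_empty)  # True

# ... and r'\b' only ever produces zero-width matches.
m = re.search(r"\b", "ab cd")
zero_width = m is not None and m.start() == m.end()
print(zero_width)  # True
```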

In the next release (3.6) we can convert the deprecation warning to an exception, and then, after a transitional period, change the behavior to handle zero-width matches correctly without breaking backward compatibility.
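To make the first step of this plan concrete, here is a rough user-level sketch of what such a warning could look like. The wrapper name `split_with_warning`, the warning text, and the detection rule are all assumptions for illustration; this is not how the re module implements or would implement the check, and the rule used here is only a heuristic that misses patterns such as r'\b'.

```python
import re
import warnings


def split_with_warning(pattern, string, maxsplit=0, flags=0):
    """Hypothetical wrapper: warn when the pattern can match the empty string."""
    compiled = re.compile(pattern, flags)
    # Heuristic only: a pattern that matches "" can certainly produce a
    # zero-width match, but assertions like r'\b' are not detected this way.
    if compiled.fullmatch("") is not None:
        warnings.warn(
            "split() pattern can match the empty string",
            DeprecationWarning,
            stacklevel=2,
        )
    return compiled.split(string, maxsplit)


# The rewritten pattern does not trigger the warning.
print(split_with_warning("(:+)", "a::b"))  # ['a', '::', 'b']
```

A real implementation would more likely inspect the compiled pattern's minimum match width rather than test against the empty string.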
History
Date User Action Args
2014-11-08 11:01:20 serhiy.storchaka set recipients: + serhiy.storchaka, pitrou, ezio.melotti, mrabarnett
2014-11-08 11:01:20 serhiy.storchaka set messageid: <1415444480.37.0.198748558658.issue22818@psf.upfronthosting.co.za>
2014-11-08 11:01:20 serhiy.storchaka link issue22818 messages
2014-11-08 11:01:19 serhiy.storchaka create