
Author Patrick Maupin
Recipients Patrick Maupin, ezio.melotti, mrabarnett
Date 2015-06-10.18:53:08
Message-id <1433962389.05.0.538082969488.issue24426@psf.upfronthosting.co.za>
In-reply-to
Content
The addition of a capturing group in a re.split() pattern, e.g. using '(\n)' instead of '\n', causes a factor of 10 performance degradation.

I use re.split() a lot, but never noticed the issue before.  It was extremely noticeable on 1000 patterns in a 5GB file, though, requiring 40 seconds instead of 4.

I have attached a script demonstrating the issue.  I have tested on 2.7 and 3.4, but have no reason to believe it doesn't exist on other versions as well.
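
The attached script is not reproduced on this page. As a rough illustration only, the comparison described above could be sketched along these lines. The helper names (make_text, time_split), the sample line, and the input size are assumptions made up for this sketch and are not taken from the attachment; the measured ratio will vary by Python version and machine.

```python
import re
import timeit


def make_text(n_lines=20000):
    """Build a hypothetical test input: n_lines identical newline-terminated lines."""
    return "some text on a line\n" * n_lines


def time_split(pattern, text, repeat=5):
    """Return the best-of-`repeat` wall-clock time (seconds) for one split of text."""
    compiled = re.compile(pattern)
    return min(timeit.repeat(lambda: compiled.split(text), number=1, repeat=repeat))


if __name__ == "__main__":
    text = make_text()
    plain = time_split(r"\n", text)      # pattern without a capturing group
    grouped = time_split(r"(\n)", text)  # same pattern with a capturing group
    print("without group: %.4f s" % plain)
    print("with group:    %.4f s" % grouped)
    if plain > 0:
        print("ratio:         %.1fx" % (grouped / plain))
```

Note that the two patterns do not return the same list: with the capturing group, re.split() also includes each matched separator in the result, so the grouped call produces roughly twice as many items.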

Thanks,
Pat
History
Date User Action Args
2015-06-10 18:53:09  Patrick Maupin  set  recipients: + Patrick Maupin, ezio.melotti, mrabarnett
2015-06-10 18:53:09  Patrick Maupin  set  messageid: <1433962389.05.0.538082969488.issue24426@psf.upfronthosting.co.za>
2015-06-10 18:53:08  Patrick Maupin  link  issue24426 messages
2015-06-10 18:53:08  Patrick Maupin  create