Message 100439 - Python tracker


Author	exarkun
Recipients	exarkun, ezio.melotti, michael.foord, mrabarnett, olivers, vstinner
Date	2010-03-04.23:37:26
Message-id	<1267745848.85.0.501782587141.issue8064@psf.upfronthosting.co.za>

Content
> So is it reasonable / unavoidable that UCS4 builds should be 1200 times slower at regex handling?

No, but it's probably reasonable / unavoidable that a more complex regex should be some number of times slower than a simpler regex.

On Linux, the regex being constructed is more complex.  On OS X, it's simpler.  The difference arises because the Linux build is UCS4, but that only matters because the Unicode character width is used inside the function that constructs the regular expression.

If you take this variable out of that function, so that it returns the same string regardless of the width of a unicode character, then performance evens out.
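
To illustrate the point, here is a minimal sketch of a pattern builder whose output depends on the character width, next to a variant that does not. The names `build_pattern` and `build_pattern_fixed` and the specific character ranges are hypothetical, not the code from this issue. The width is passed in as a parameter because `sys.maxunicode` is always 0x10FFFF on Python 3.3 and later.

```python
import re


def build_pattern(max_codepoint):
    """Hypothetical builder whose result depends on the character width."""
    parts = ["A-Za-z_", "\\u00c0-\\uffff"]
    if max_codepoint > 0xFFFF:
        # Only a wide (UCS4) build adds the range beyond the BMP,
        # so the two builds end up compiling different expressions.
        parts.append("\\U00010000-\\U0010ffff")
    return "[" + "".join(parts) + "]+"


def build_pattern_fixed():
    """Same builder with the width taken out: one string on every build."""
    return build_pattern(0xFFFF)


narrow = build_pattern(0xFFFF)    # what a UCS2 build would construct
wide = build_pattern(0x10FFFF)    # what a UCS4 build would construct
print(narrow == wide)                   # False: different regexes
print(build_pattern_fixed() == narrow)  # True: no width dependence
print(re.fullmatch(wide, "abc") is not None)
```

Any timing comparison between platforms is only meaningful once both are compiling the same pattern string, which is what the fixed variant guarantees.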
