
Author exarkun
Recipients exarkun, ezio.melotti, michael.foord, mrabarnett, olivers, vstinner
Date 2010-03-04.23:37:26
Message-id <1267745848.85.0.501782587141.issue8064@psf.upfronthosting.co.za>
Content
> So is it reasonable / unavoidable that UCS4 builds should be 1200 times slower at regex handling?

No, but it's probably reasonable / unavoidable that a more complex regex should be some number of times slower than a simpler regex.

On Linux, the regex being constructed is more complex. On OS X, it's simpler. The Linux build is UCS4, but that only makes a difference because the function that constructs the regular expression uses the Unicode character width as one of its inputs.

If you take this variable out of that function, so that it returns the same string regardless of the width of a Unicode character, then performance evens out.
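
To illustrate the point, here is a minimal sketch of how a pattern-building function can end up returning different regexes on different builds. It is not the function from the report: `build_pattern`, `build_pattern_fixed`, the character ranges, and the `NARROW_MAX` / `WIDE_MAX` constants are all hypothetical, and the width is passed in as a parameter (standing in for `sys.maxunicode`) so the sketch behaves the same on any interpreter.

```python
import re

# Hypothetical stand-ins for sys.maxunicode on the two kinds of build.
NARROW_MAX = 0xFFFF    # UCS2 ("narrow") build
WIDE_MAX = 0x10FFFF    # UCS4 ("wide") build


def build_pattern(max_unicode):
    """Build a character class whose contents depend on the build width."""
    ranges = [("a", "z"), ("\u00c0", "\u00ff")]
    if max_unicode > 0xFFFF:
        # Only taken for a wide build: adds a range beyond the BMP,
        # so the resulting regex is a different (larger) string.
        ranges.append(("\U00010000", "\U0010ffff"))
    body = "".join(f"{lo}-{hi}" for lo, hi in ranges)
    return f"[{body}]+"


def build_pattern_fixed():
    """The same function with the width variable taken out."""
    return "[a-z\u00c0-\u00ff]+"


narrow = build_pattern(NARROW_MAX)
wide = build_pattern(WIDE_MAX)

# Both patterns compile, but they are not the same regex.
re.compile(narrow)
re.compile(wide)
print(narrow == wide)
print(narrow == build_pattern_fixed())
```

With the width-dependent branch removed, every build compiles the same string, so any remaining timing difference cannot come from the pattern itself.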
History
Date                 User     Action  Args
2010-03-04 23:37:28  exarkun  set     recipients: + exarkun, vstinner, ezio.melotti, mrabarnett, michael.foord, olivers
2010-03-04 23:37:28  exarkun  set     messageid: <1267745848.85.0.501782587141.issue8064@psf.upfronthosting.co.za>
2010-03-04 23:37:27  exarkun  link    issue8064 messages
2010-03-04 23:37:26  exarkun  create