
Author steve.newcomb
Recipients steve.newcomb
Date 2016-08-30.16:27:49
Message-id <1472574470.7.0.512575721216.issue27898@psf.upfronthosting.co.za>
Content
Our most regular-expression-intensive Python 2.7 code takes 2.5x as much execution time in 2.7.12 as it did in 2.7.6. I discovered this after upgrading from Ubuntu 14.04 to Ubuntu 16.04. Basically, this code runs thousands of compiled regular expressions on thousands of texts. Both the multiprocessing module and the re module are heavily used.

See attached profiler outputs, which look quite different in several respects.  I used the profiling module to profile the same Python code, processing the same data, using the same hardware, under both Ubuntu 14.04 (Python 2.7.6) and Ubuntu 16.04 (Python 2.7.12).  
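For anyone wanting to produce comparable profiles, here is a minimal sketch of the kind of profiling run described above. It assumes cProfile and pstats from the standard library, which may not be the exact profiler used for the attached outputs, and `workload` is a hypothetical stand-in for the real production step.

```python
import cProfile
import pstats
import re

try:
    from cStringIO import StringIO  # Python 2.7
except ImportError:
    from io import StringIO  # Python 3


def workload():
    # Hypothetical stand-in for the real job: count digit runs in a
    # synthetic text, repeated 200 times.
    pat = re.compile(r"\d+")
    text = "a1 b22 c333 " * 100
    return sum(len(pat.findall(text)) for _ in range(200))


def profile_workload(top=10):
    # Profile workload() and return its result together with a text
    # report of the `top` entries sorted by cumulative time.
    profiler = cProfile.Profile()
    profiler.enable()
    result = workload()
    profiler.disable()
    stream = StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile_workload()
    print(report)
```

Running the same script under both interpreters and comparing the reports side by side shows which functions account for the difference in cumulative time.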

It is striking, for example, that cPickle.load appears so prominently in the 2.7.12 profile -- a fact which appears to implicate the multiprocessing module somehow.  But I suspect that the re module is more likely the main source of the problem, because the execution times of other production steps -- steps that do not call the multiprocessing module -- also appear to be extended to a degree that is roughly proportional to the amount of regular expression processing done in those other steps.
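One way to test that suspicion would be to time the re module alone, with no multiprocessing involved. The following is only a sketch under stated assumptions: the patterns and texts are synthetic, and the helper names (`build_patterns`, `build_texts`, `run_once`, `benchmark`) are hypothetical and not taken from our production code.

```python
import re
import sys
import time


def build_patterns(n):
    # Synthetic compiled patterns; stand-ins for the real ones.
    return [re.compile(r"\bword%d\s+(\d+)" % i) for i in range(n)]


def build_texts(n):
    # Synthetic texts; text i contains "word<i> 12345" repeated 20 times.
    return [("filler word%d 12345 more filler text " % i) * 20
            for i in range(n)]


def run_once(patterns, texts):
    # Apply every pattern to every text and count the texts it matches.
    hits = 0
    for text in texts:
        for pat in patterns:
            if pat.search(text) is not None:
                hits += 1
    return hits


def benchmark(n_patterns=50, n_texts=50, repeats=3):
    # Return (hits, best wall-clock time in seconds over `repeats` runs).
    patterns = build_patterns(n_patterns)
    texts = build_texts(n_texts)
    best = None
    hits = 0
    for _ in range(repeats):
        start = time.time()
        hits = run_once(patterns, texts)
        elapsed = time.time() - start
        if best is None or elapsed < best:
            best = elapsed
    return hits, best


if __name__ == "__main__":
    hits, best = benchmark()
    print("Python %s: %d hits, best time %.4f s"
          % (sys.version.split()[0], hits, best))
```

If the best time differs between 2.7.6 and 2.7.12 by a factor similar to the overall slowdown, that would point at re rather than multiprocessing. A result on synthetic patterns would be suggestive only, since the real patterns may exercise different parts of the regex engine.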

I will happily provide any further information I can.  Any insights about this surprisingly severe performance degradation would be welcome.
History
Date                 User           Action  Args
2016-08-30 16:27:50  steve.newcomb  set     recipients: + steve.newcomb
2016-08-30 16:27:50  steve.newcomb  set     messageid: <1472574470.7.0.512575721216.issue27898@psf.upfronthosting.co.za>
2016-08-30 16:27:50  steve.newcomb  link    issue27898 messages
2016-08-30 16:27:50  steve.newcomb  create