
Author serhiy.storchaka
Recipients brett.cannon, pitrou, serhiy.storchaka, terry.reedy
Date 2016-03-01.10:02:29
Message-id <1456826550.21.0.5273202494.issue26436@psf.upfronthosting.co.za>
Content
I used the code from the fasta and regex-dna tests almost without changes. That is, one part creates the data in standard FASTA format (with 60-character lines and headers), and the other part parses this format. The code could be simpler if it generated and consumed raw data.
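To illustrate the two halves, here is a minimal sketch, not the benchmark's actual code. It assumes a single record and uses made-up helper names (write_fasta, read_fasta) and made-up sequence data. It wraps raw bytes into 60-character lines under a ">" header, then parses them back.

```python
import io

LINE_WIDTH = 60  # line length mentioned above


def write_fasta(out, header, sequence, width=LINE_WIDTH):
    """Write one record: a '>' header line, then the sequence wrapped at `width`."""
    out.write(b">" + header + b"\n")
    for i in range(0, len(sequence), width):
        out.write(sequence[i:i + width] + b"\n")


def read_fasta(data):
    """Parse FASTA bytes into a {header: sequence} dict, joining wrapped lines."""
    records = {}
    header = None
    for line in data.splitlines():
        if line.startswith(b">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: b"".join(chunks) for name, chunks in records.items()}


raw = b"acgt" * 40  # 160 bytes of made-up sequence data (hypothetical input)
buf = io.BytesIO()
write_fasta(buf, b"ONE example", raw)
formatted = buf.getvalue()
parsed = read_fasta(formatted)
```

Working directly on `raw` would skip both helpers, which is the simplification mentioned above.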

As for the quality of the code, the tested code is pretty simple and Pythonic enough. Yes, using replace() is more idiomatic and faster, but we are testing regular expressions. bytes.translate() doesn't work with a dict, and str.translate() is slower than replace() or re.sub().
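A small sketch of the API difference, using a made-up input and a made-up replacement string ("<1>"); it shows behaviour only and makes no timing claims.

```python
import re

seq = b"agggtaaa"  # made-up sample input

# bytes.translate() expects a 256-byte table (or None), not a dict.
try:
    seq.translate({ord("a"): ord("t")})
    dict_accepted = True
except TypeError:
    dict_accepted = False

# With a table from bytes.maketrans() only byte-for-byte mapping is possible.
table = bytes.maketrans(b"a", b"t")
via_bytes_translate = seq.translate(table)

# str.translate() accepts a dict and allows multi-character replacements,
# so it can do the same job as replace() or re.sub() here.
text = seq.decode("ascii")
via_str_translate = text.translate({ord("a"): "<1>"})
via_replace = text.replace("a", "<1>")
via_re_sub = re.sub("a", "<1>", text)
```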

The code for generating the test data is not the kind of code that should be used in tutorials. It is highly optimized code that uses various optimization tricks that could be hard to understand without comments. But there is nothing unpythonic in it. It could be simplified by not formatting the data in standard FASTA format.

> I would add another kind of question: is it stressing something useful that isn't already stressed by the two other regex benchmarks we already have?

Yes, it is. The regex_v8 benchmark is 2x faster with regex than with re, but the regex_dna benchmark is 1.6x slower with regex than with re. Thus these tests stress different aspects of regular expressions.

It may also be worth testing regular expressions with unicode strings. I expect some difference between the latest Python and earlier 3.x and 2.7. The question is how to do this: add a special option to switch between bytes and unicode (like --force_bytes in regex_effbot), or just run the tests for bytes and unicode sequentially and add the results?
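A rough sketch of both variants, assuming an argparse-based runner. The functions run() and run_both(), the pattern and the data are hypothetical; only the --force_bytes name comes from regex_effbot, and this is not how that benchmark is actually implemented.

```python
import argparse
import re


def count_matches(pattern, data):
    """The work under test: count non-overlapping matches."""
    return len(re.findall(pattern, data))


def run(argv=None):
    parser = argparse.ArgumentParser()
    # Flag name borrowed from regex_effbot; its handling here is an assumption.
    parser.add_argument("--force_bytes", action="store_true",
                        help="run the patterns on bytes instead of str")
    args = parser.parse_args(argv)

    pattern = "agggtaaa|tttaccct"          # made-up pattern
    data = "agggtaaacctttaccct" * 10       # made-up input
    if args.force_bytes:
        # Pattern and data must be of the same type.
        pattern = pattern.encode("ascii")
        data = data.encode("ascii")
    return count_matches(pattern, data)


def run_both():
    """Second variant: run str and bytes sequentially and report both."""
    return {"str": run([]), "bytes": run(["--force_bytes"])}
```

With the option, each run measures one string type; run_both() corresponds to running the two sequentially, after which the timings could be summed.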