This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author serhiy.storchaka
Recipients ezio.melotti, mrabarnett, pitrou, serhiy.storchaka
Date 2013-10-23.14:36:01
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1382538962.45.0.421817949827.issue19365@psf.upfronthosting.co.za>
In-reply-to
Content
sre_parse.parse_template uses string concatenation to accumulate characters.

    def literal(literal, p=p, pappend=a):
        if p and p[-1][0] is LITERAL:
            p[-1] = LITERAL, p[-1][1] + literal
        else:
            pappend((LITERAL, literal))

This operation have quadratic calculation complexity for long replacement strings.

$ ./python -m timeit -n1 -r1 -s "from sre_parse import parse_template; repl = 'x'*100000"  "parse_template(repl, '')"
1 loops, best of 1: 3.38 sec per loop
$ ./python -m timeit -n1 -r1 -s "from sre_parse import parse_template; repl = 'x'*200000"  "parse_template(repl, '')"
1 loops, best of 1: 18.2 sec per loop

The proposed patch change amortized complexity to be linear. It also speeds up parsing shorter strings.

$ ./python -m timeit -n1 -r1 -s "from sre_parse import parse_template; repl = 'x'*100000"  "parse_template(repl, '')"
1 loops, best of 1: 915 msec per loop
$ ./python -m timeit -n1 -r1 -s "from sre_parse import parse_template; repl = 'x'*200000"  "parse_template(repl, '')"
1 loops, best of 1: 1.79 sec per loop
History
Date User Action Args
2013-10-23 14:36:02serhiy.storchakasetrecipients: + serhiy.storchaka, pitrou, ezio.melotti, mrabarnett
2013-10-23 14:36:02serhiy.storchakasetmessageid: <1382538962.45.0.421817949827.issue19365@psf.upfronthosting.co.za>
2013-10-23 14:36:02serhiy.storchakalinkissue19365 messages
2013-10-23 14:36:02serhiy.storchakacreate