Author gvanrossum
Recipients gvanrossum
Date 2013-02-09.15:59:30
Message-id <1360425571.01.0.152514473227.issue17170@psf.upfronthosting.co.za>
In-reply-to
Content
I'm trying to speed up a web template engine and I find that the code needs to do a lot of string replacements of this form:

  name = name.replace('_', '-')

Characteristics of the data: the names are relatively short (1-10 characters usually), and the majority don't contain a '_' at all.

For this combination I've found that the following idiom is significantly faster:

  if '_' in name:
      name = name.replace('_', '-')
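
As a quick sanity check that the guarded form changes only the speed and not the result, the sketch below compares both forms on a handful of names. The function names and the sample data are made up for illustration; they are not taken from the template engine.

```python
def dashify_plain(name):
    """Always call str.replace, even when there is nothing to replace."""
    return name.replace('_', '-')


def dashify_guarded(name):
    """Call str.replace only when the name actually contains '_'."""
    if '_' in name:
        name = name.replace('_', '-')
    return name


# Hypothetical sample names: short, and mostly without underscores.
samples = ['id', 'class', 'href', 'data_id', 'aria_label', '_', '']
for s in samples:
    assert dashify_plain(s) == dashify_guarded(s)
```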

I'd hate for that idiom to become popular. I looked briefly at the str.replace code (in the default branch), and it is already optimized for this case, so I am at a bit of a loss to explain the speed difference.

Some timeit experiments:

bash-3.2$ ./python.exe -m timeit -s "a = 'hundred'" "'x' in a"

bash-3.2$ ./python.exe -m timeit -s "a = 'hundred'" "a.replace('x', 'y')"

bash-3.2$ ./python.exe -m timeit -s "a = 'hundred'" "if 'x' in a: a.replace('x', 'y')"

bash-3.2$ ./python.exe -m timeit -s "a = 'hunxred'" "a.replace('x', 'y')"

bash-3.2$ ./python.exe -m timeit -s "a = 'hunxred'" "if 'x' in a: a.replace('x', 'y')"
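
The same five measurements can also be driven from a script through the timeit module, which may be handier for comparing builds. This is only a sketch: the helper names, the loop count, and the repeat count are arbitrary choices, the f-string formatting assumes Python 3.6 or later, and the numbers it prints will depend on the machine and interpreter.

```python
import timeit

# (setup, statement) pairs mirroring the command-line experiments above.
CASES = [
    ("a = 'hundred'", "'x' in a"),
    ("a = 'hundred'", "a.replace('x', 'y')"),
    ("a = 'hundred'", "if 'x' in a: a.replace('x', 'y')"),
    ("a = 'hunxred'", "a.replace('x', 'y')"),
    ("a = 'hunxred'", "if 'x' in a: a.replace('x', 'y')"),
]


def time_stmt(stmt, setup, number=200_000, repeat=5):
    """Return the best per-loop time for stmt, in microseconds."""
    # Taking the minimum over several runs reduces noise from other processes.
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number * 1e6


def run_cases():
    """Time every case and return (setup, stmt, usec_per_loop) tuples."""
    return [(setup, stmt, time_stmt(stmt, setup)) for setup, stmt in CASES]


if __name__ == "__main__":
    for setup, stmt, usec in run_cases():
        print(f"{setup:16} {stmt:36} {usec:.4f} usec per loop")
```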
History
Date                 User        Action  Args
2013-02-09 15:59:31  gvanrossum  set     recipients: + gvanrossum
2013-02-09 15:59:31  gvanrossum  set     messageid: <1360425571.01.0.152514473227.issue17170@psf.upfronthosting.co.za>
2013-02-09 15:59:30  gvanrossum  link    issue17170 messages
2013-02-09 15:59:30  gvanrossum  create