Author serhiy.storchaka
Recipients BreamoreBoy, benjamin.peterson, ezio.melotti, kushal.das, loewis, pitrou, serhiy.storchaka, thomaslee, vstinner
Date 2012-12-31.11:21:26
Message-id <201212311321.04822.storchaka@gmail.com>
In-reply-to <1356903410.95.0.414339144986.issue16061@psf.upfronthosting.co.za>
Content
> str_replace_1char.patch: why not implementing replace_1char_inplace() in
> stringlib, with one version per character type (UCS1, UCS2, UCS4)?

Because there is no benefit in doing so. The three versions (UCS1, UCS2, and
UCS4) share no common code. Each kind of string gets the implementation that
works best for it. For UCS1 it uses the fast memchr() directly (findchar() has
some overhead here), for UCS2 it uses findchar(), and for UCS4 it uses a plain
loop, because findchar() would be too inefficient here.
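
To illustrate the UCS4 case, here is a minimal sketch of what such a plain loop
could look like. It is not the code from the patch: the function name is
hypothetical, and uint32_t stands in for CPython's Py_UCS4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replace every code point `from` with `to` in s[0..len), in place.
   Illustrative only: uint32_t is used here in place of Py_UCS4. */
static void
replace_1char_ucs4_sketch(uint32_t *s, size_t len, uint32_t from, uint32_t to)
{
    for (size_t i = 0; i < len; i++) {
        if (s[i] == from)
            s[i] = to;
    }
}
```

There is nothing here that could be shared with a memchr()-based UCS1 version,
which is the point made above.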

> I prefer unicode_2.patch algorithm because it's simpler: only one loop (vs
> two loops for str_replace_1char.patch, with a threshold of 10 different
> characters).

Yes, the UCS1 implementation in str_replace_1char.patch is more complicated,
but it is faster for a wider range of input strings. memchr() is more
efficient than a simple loop when the characters to be replaced are rare, but
when they occur often, a simple loop is more efficient. The "attempts" counter
determines how many characters are checked by the simple loop before falling
back to memchr(). This speeds up replacement in strings with frequent matches,
but slightly slows down replacement in strings with rare matches. 10 is a
compromise. str_replace_1char.patch speeds up not only the case where *each*
character is replaced, but also the cases where 1/2, 1/3, 1/5, ... of the
characters are replaced.
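
The two-loop idea can be sketched roughly as follows for one-byte strings. This
is an illustration under assumptions, not the code from
str_replace_1char.patch: the function name and the MAX_ATTEMPTS macro are
hypothetical, and resetting the counter after each match is a guess at a
reasonable policy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The message calls 10 "a compromise"; the macro name is made up here. */
#define MAX_ATTEMPTS 10

/* Replace every byte `from` with `to` in s[0..len), in place. */
static void
replace_1char_inplace_sketch(unsigned char *s, size_t len,
                             unsigned char from, unsigned char to)
{
    unsigned char *end = s + len;

    while (s < end) {
        /* Jump to the next match; memchr() wins when matches are rare. */
        unsigned char *p = memchr(s, from, (size_t)(end - s));
        if (p == NULL)
            return;
        *p = to;
        s = p + 1;

        /* Matches may be dense here, so check a few bytes with a plain
           loop before paying for another memchr() call. */
        int attempts = MAX_ATTEMPTS;
        while (s < end && attempts > 0) {
            if (*s == from) {
                *s = to;
                attempts = MAX_ATTEMPTS;  /* assumption: reset on a match */
            }
            else {
                attempts--;
            }
            s++;
        }
    }
}
```

With dense matches the inner loop keeps resetting its counter and memchr() is
rarely called; with sparse matches at most MAX_ATTEMPTS bytes are checked by
hand after each match before control returns to memchr().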

> Why do you changed your algorithm? Is str_replace_1char.patch algorithm
> more efficient than unicode_2.patch algorithm? Is the speedup really
> interesting?

You can run the benchmarks and compare the results. str_replace_1char.patch
does not give the best performance in every case, but it gives the most stable
results across a wide variety of strings, and has no regressions compared
with 3.2.
History
Date User Action Args
2012-12-31 11:21:27 serhiy.storchaka set recipients: + serhiy.storchaka, loewis, pitrou, vstinner, thomaslee, benjamin.peterson, ezio.melotti, BreamoreBoy, kushal.das
2012-12-31 11:21:26 serhiy.storchaka link issue16061 messages
2012-12-31 11:21:26 serhiy.storchaka create