
Author serhiy.storchaka
Recipients BreamoreBoy, ezio.melotti, kushal.das, loewis, pitrou, serhiy.storchaka, thomaslee, vstinner
Date 2012-10-13.18:03:27
Message-id <201210132103.09737.storchaka@gmail.com>
In-reply-to <1350136906.89.0.860941780143.issue16061@psf.upfronthosting.co.za>
Content
After much experimentation, I propose a new patch.

Benchmark results (time to replace 1 of every n characters (ch1 with ch2) in a
100000-character string).

Py3.2        Py3.3        patch  n     ch1           ch2           fill

231 (-13%)   3025 (-93%)  200    1     'a'           'b'           'c'
626 (-18%)   2035 (-75%)  511    2     'a'           'b'           'c'
444 (-26%)   957 (-66%)   327    5     'a'           'b'           'c'
349 (-30%)   530 (-54%)   243    10    'a'           'b'           'c'
306 (-40%)   300 (-38%)   185    20    'a'           'b'           'c'
280 (-54%)   169 (-23%)   130    50    'a'           'b'           'c'
273 (-62%)   123 (-15%)   105    100   'a'           'b'           'c'
265 (-70%)   82 (-4%)     79     1000  'a'           'b'           'c'
230 (+4%)    3012 (-92%)  239    1     '\u010a'      '\u010b'      '\u010c'
624 (-17%)   1907 (-73%)  518    2     '\u010a'      '\u010b'      '\u010c'
442 (-16%)   962 (-62%)   370    5     '\u010a'      '\u010b'      '\u010c'
347 (-5%)    566 (-42%)   330    10    '\u010a'      '\u010b'      '\u010c'
305 (-10%)   357 (-23%)   275    20    '\u010a'      '\u010b'      '\u010c'
285 (-26%)   241 (-12%)   212    50    '\u010a'      '\u010b'      '\u010c'
280 (-33%)   190 (-2%)    187    100   '\u010a'      '\u010b'      '\u010c'
263 (-41%)   170 (-8%)    156    1000  '\u010a'      '\u010b'      '\u010c'
3355 (-85%)  3309 (-85%)  498    1     '\U0001000a'  '\U0001000b'  '\U0001000c'
2290 (-65%)  2267 (-65%)  800    2     '\U0001000a'  '\U0001000b'  '\U0001000c'
1598 (-62%)  1279 (-52%)  612    5     '\U0001000a'  '\U0001000b'  '\U0001000c'
1313 (-60%)  950 (-45%)   519    10    '\U0001000a'  '\U0001000b'  '\U0001000c'
1195 (-61%)  824 (-44%)   464    20    '\U0001000a'  '\U0001000b'  '\U0001000c'
1055 (-59%)  640 (-32%)   434    50    '\U0001000a'  '\U0001000b'  '\U0001000c'
982 (-55%)   549 (-20%)   439    100   '\U0001000a'  '\U0001000b'  '\U0001000c'
941 (-56%)   473 (-12%)   417    1000  '\U0001000a'  '\U0001000b'  '\U0001000c'
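
The percentages in parentheses appear to be the change in time with the patch relative to the value in that column. This reading is inferred from the numbers in the table, not stated in the message. A minimal sketch under that assumption (`relative_change` is a hypothetical helper, and bench-diff.py may compute or round differently):

```python
def relative_change(baseline, patched):
    """Percent change of the patched time relative to the baseline time."""
    # Negative values mean the patched build is faster than the baseline.
    return round((patched - baseline) / baseline * 100)


# First table row: Py3.2 = 231, Py3.3 = 3025, patch = 200
print(relative_change(231, 200))   # -13
print(relative_change(3025, 200))  # -93
```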

Results may differ on other platforms. I'm especially interested in
results on Windows and on 64-bit. For the test I used the script
replacebench2.py and compared the results with the script
https://bitbucket.org/storchaka/cpython-stuff/raw/default/bench/bench-diff.py.
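
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of the benchmark described above. It is not the replacebench2.py script. The helper names `make_subject` and `bench_replace`, the exact string layout, and the timing parameters are all assumptions made for illustration.

```python
import timeit


def make_subject(n, ch1, fill, size=100000):
    """Build a size-character string in which 1 of every n characters is ch1."""
    # One ch1 followed by n - 1 fill characters, repeated and trimmed to size.
    unit = ch1 + fill * (n - 1)
    return (unit * (size // n + 1))[:size]


def bench_replace(n, ch1, ch2, fill, size=100000, number=100, repeat=5):
    """Return the best observed time in seconds for one s.replace(ch1, ch2)."""
    s = make_subject(n, ch1, fill, size)
    timer = timeit.Timer(lambda: s.replace(ch1, ch2))
    # Take the minimum over several runs to reduce noise from other processes.
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    cases = [
        ("a", "b", "c"),                                 # ASCII
        ("\u010a", "\u010b", "\u010c"),                  # BMP, non-Latin-1
        ("\U0001000a", "\U0001000b", "\U0001000c"),      # non-BMP
    ]
    for ch1, ch2, fill in cases:
        for n in (1, 2, 5, 10, 20, 50, 100, 1000):
            t = bench_replace(n, ch1, ch2, fill)
            print("%10.1f us  n=%-5d %a %a %a" % (t * 1e6, n, ch1, ch2, fill))
```

The three character sets match the rows of the table, so the same script exercises strings of each character width. Absolute timings from this sketch are not comparable to the table, since the units and timing method used there are not stated.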
Files
File name Uploaded
str_replace_1char.patch serhiy.storchaka, 2012-10-13.18:03:27
History
Date                 User              Action  Args
2012-10-13 18:03:29  serhiy.storchaka  set     recipients: + serhiy.storchaka, loewis, pitrou, vstinner, thomaslee, ezio.melotti, BreamoreBoy, kushal.das
2012-10-13 18:03:28  serhiy.storchaka  link    issue16061 messages
2012-10-13 18:03:27  serhiy.storchaka  create