Currently list concatenation, slicing and repeating operations are using PyList_New() which allocates memory for the items by calloc(). malloc() could be used instead, since the allocated memory is overwritten by mentioned operations.
I made benchmarks with this script:
NAME=list-malloc-master.json
python -m perf timeit --name slice0 -s "l = [None]*1000000" "l[:0]" --duplicate=2048 --append $NAME
python -m perf timeit --name slice1 -s "l = [None]*1000000" "l[:1]" --duplicate=1024 --append $NAME
python -m perf timeit --name slice2 -s "l = [None]*1000000" "l[:2]" --duplicate=1024 --append $NAME
python -m perf timeit --name slice3 -s "l = [None]*1000000" "l[:3]" --duplicate=1024 --append $NAME
python -m perf timeit --name slice1000000 -s "l = [None]*1000000" "l[:1000000]" --append $NAME
python -m perf timeit --name cat0 -s "l = [None]*0" "l + l" --duplicate=1024 --append $NAME
python -m perf timeit --name cat1 -s "l = [None]*1" "l * 1" --duplicate=1024 --append $NAME
python -m perf timeit --name cat2 -s "l = [None]*2" "l * 1" --duplicate=1024 --append $NAME
python -m perf timeit --name cat3 -s "l = [None]*3" "l * 1" --duplicate=1024 --append $NAME
python -m perf timeit --name cat1000000 -s "l = [None]*1000000" "l * 1" --append $NAME
python -m perf timeit --name 1x0 -s "l = [None]" "l * 0" --duplicate=1024 --append $NAME
python -m perf timeit --name 1x1 -s "l = [None]" "l * 1" --duplicate=1024 --append $NAME
python -m perf timeit --name 1x2 -s "l = [None]" "l * 2" --duplicate=1024 --append $NAME
python -m perf timeit --name 1x3 -s "l = [None]" "l * 3" --duplicate=1024 --append $NAME
python -m perf timeit --name 1x1000000 -s "l = [None]" "l * 1000000" --append $NAME
Here's comparison table:
+--------------+--------------------+------------------------------+
| Benchmark | list-malloc-master | list-malloc |
+==============+====================+==============================+
| slice1 | 84.5 ns | 59.6 ns: 1.42x faster (-30%) |
+--------------+--------------------+------------------------------+
| slice2 | 71.6 ns | 61.8 ns: 1.16x faster (-14%) |
+--------------+--------------------+------------------------------+
| slice3 | 74.4 ns | 63.6 ns: 1.17x faster (-15%) |
+--------------+--------------------+------------------------------+
| slice1000000 | 4.39 ms | 4.08 ms: 1.08x faster (-7%) |
+--------------+--------------------+------------------------------+
| cat0 | 23.9 ns | 24.9 ns: 1.04x slower (+4%) |
+--------------+--------------------+------------------------------+
| cat1 | 73.2 ns | 51.9 ns: 1.41x faster (-29%) |
+--------------+--------------------+------------------------------+
| cat2 | 61.6 ns | 53.1 ns: 1.16x faster (-14%) |
+--------------+--------------------+------------------------------+
| cat3 | 63.0 ns | 54.3 ns: 1.16x faster (-14%) |
+--------------+--------------------+------------------------------+
| cat1000000 | 4.38 ms | 4.08 ms: 1.07x faster (-7%) |
+--------------+--------------------+------------------------------+
| 1x0 | 27.1 ns | 27.7 ns: 1.02x slower (+2%) |
+--------------+--------------------+------------------------------+
| 1x1 | 72.9 ns | 51.9 ns: 1.41x faster (-29%) |
+--------------+--------------------+------------------------------+
| 1x2 | 60.9 ns | 52.9 ns: 1.15x faster (-13%) |
+--------------+--------------------+------------------------------+
| 1x3 | 62.5 ns | 54.8 ns: 1.14x faster (-12%) |
+--------------+--------------------+------------------------------+
| 1x1000000 | 2.67 ms | 2.34 ms: 1.14x faster (-12%) |
+--------------+--------------------+------------------------------+
Not significant (1): slice0
|