
dict creation performance regression #60669

Closed
phihag mannequin opened this issue Nov 13, 2012 · 23 comments
Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage)

@phihag commented Nov 13, 2012

BPO 16465
Nosy @tim-one, @rhettinger, @jcea, @pitrou, @vstinner, @tiran, @benjaminp, @asvetlov, @serhiy-storchaka, @MojoVampire
Superseder:
  • bpo-23601: use small object allocator for dict key storage

Files:
  • dict_free_key_list.patch: Patch for using the free list for keys

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2016-03-11.13:02:04.158>
    created_at = <Date 2012-11-13.16:19:04.141>
    labels = ['interpreter-core', 'performance']
    title = 'dict creation performance regression'
    updated_at = <Date 2016-03-11.15:33:52.236>
    user = 'https://bugs.python.org/phihag'

    bugs.python.org fields:

    activity = <Date 2016-03-11.15:33:52.236>
    actor = 'vstinner'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2016-03-11.13:02:04.158>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2012-11-13.16:19:04.141>
    creator = 'phihag'
    dependencies = []
    files = ['27983']
    hgrepos = []
    issue_num = 16465
    keywords = ['patch', '3.3regression']
    message_count = 23.0
    messages = ['175503', '175504', '175507', '175511', '175528', '175531', '175546', '175551', '175561', '175563', '175567', '175581', '175583', '175586', '175615', '183211', '205362', '205363', '205397', '224937', '224939', '261565', '261576']
    nosy_count = 13.0
    nosy_names = ['tim.peters', 'rhettinger', 'jcea', 'pitrou', 'vstinner', 'christian.heimes', 'benjamin.peterson', 'phihag', 'asvetlov', 'pingebretson', 'jcon', 'serhiy.storchaka', 'josh.r']
    pr_nums = []
    priority = 'normal'
    resolution = 'out of date'
    stage = 'resolved'
    status = 'closed'
    superseder = '23601'
    type = 'performance'
    url = 'https://bugs.python.org/issue16465'
    versions = ['Python 3.5']

@phihag commented Nov 13, 2012

    On my system, {} has become significantly slower in 3.3:

    $ python3.2 -m timeit -n 1000000 '{}'
    1000000 loops, best of 3: 0.0314 usec per loop
    $ python3.3 -m timeit -n 1000000 '{}'
    1000000 loops, best of 3: 0.0892 usec per loop
    $ hg id -i
    ee7b713fec71+
    $ ./python -m timeit -n 1000000 '{}'
    1000000 loops, best of 3: 0.0976 usec per loop

    Is this because of the dict randomization?
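The shell one-liners above can also be reproduced from inside a single interpreter with the timeit module, which avoids per-process startup differences. A minimal sketch (absolute numbers vary by machine and build):

```python
# Time one million evaluations of an empty dict display, mirroring
# `python -m timeit -n 1000000 '{}'` from within one interpreter.
import timeit

total = timeit.timeit('{}', number=1_000_000)
per_loop_usec = total / 1_000_000 * 1e6
print(f"{per_loop_usec:.4f} usec per loop")
```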

@phihag added the interpreter-core (Objects, Python, Grammar, and Parser dirs) and performance (Performance or resource usage) labels on Nov 13, 2012
@serhiy-storchaka commented:
    I confirm that.

    $ ./python -m timeit -n 1000000 '{};{};{};{};{};{};{};{};{};{}'

    2.6: 0.62 usec
    2.7: 0.57 usec
    3.1: 0.55 usec
    3.2: 0.48 usec
    3.3: 1.5 usec

Randomization does not affect it.

@serhiy-storchaka commented:
Ah, this is an effect of PEP 412. The difference exists only when creating dicts of up to 5 items (inclusive).
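Since the observation above is that only small dicts are affected, the distinction can be made measurable by timing a small and a larger dict creation separately. A hedged sketch (the literals and iteration counts are illustrative, not from the original report):

```python
# Compare creation cost of a small dict literal (the affected case)
# against a 100-item dict built by comprehension (the unaffected case).
# Absolute numbers vary by machine; only the relative trend matters.
import timeit

small = timeit.timeit('{1: 1, 2: 2}', number=100_000)
large = timeit.timeit('{i: i for i in range(100)}', number=100_000)
print(f"small literal: {small:.3f}s  100-item dict: {large:.3f}s")
```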

@serhiy-storchaka commented:
Maybe using the free list for keys will restore performance.
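The attached patch itself is not shown in this thread; as a rough illustration of the free-list idea it proposes, here is a hypothetical Python sketch: keep a bounded stack of recently released objects and hand them back out before allocating fresh ones. All names here are illustrative, not CPython's actual C-level API.

```python
# Hypothetical sketch of a free list: reuse recently released "keys"
# objects instead of allocating new ones. Plain dicts stand in for
# CPython's internal PyDictKeysObject.
_FREE_LIST: list = []
_MAX_FREE = 80  # bounded cap, in the spirit of CPython's free lists

def acquire_keys_object() -> dict:
    # Pop a cached object if one is available, else allocate fresh.
    if _FREE_LIST:
        return _FREE_LIST.pop()
    return {}

def release_keys_object(obj: dict) -> None:
    # Return the object to the cache unless the cache is full.
    if len(_FREE_LIST) < _MAX_FREE:
        obj.clear()
        _FREE_LIST.append(obj)

k = acquire_keys_object()
release_keys_object(k)
assert acquire_keys_object() is k  # the cached object is reused
```

The point of the technique is that the allocator is skipped entirely on the hot path, which is why it can help a workload dominated by creating and destroying many short-lived small dicts.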

@pitrou commented Nov 13, 2012

    Does this regression impact any real-world program?

@rhettinger commented:
> Does this regression impact any real-world program?

That is a blow-off response. A huge swath of the language is affected by dictionary performance (keyword args, module lookups, attribute lookup, etc). Most programs will be affected to some degree -- compute-bound programs will notice more and I/O-bound programs won't notice at all.

@pitrou commented Nov 14, 2012

    > Does this regression impact any real-world program?

> That is a blow-off response. A huge swath of the language is affected
> by dictionary performance (keyword args, module lookups, attribute
> lookup, etc).

    I was merely suggesting to report actual (non-micro) benchmark numbers.
    This issue is about dict creation, not dict lookups.

@serhiy-storchaka commented:
As I understand it, a new dict is created on every call of a function with keyword arguments. This slows down every such call by about 0.1 µsec, which is about 10% of int('42', base=16). In sum, some programs can slow down by a few percent.
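The per-call dict can be demonstrated directly: keyword arguments are collected into a dict, and each call builds a fresh one. A minimal sketch (the function name is illustrative):

```python
# Each call with keyword arguments materializes a fresh dict object:
# the two results compare equal, but are distinct objects.
def grab(**kwargs):
    return kwargs

a = grab(x=1)
b = grab(x=1)
assert a == b       # same contents...
assert a is not b   # ...but a new dict was allocated for each call
```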

    Direct comparison of 3.2 and 3.3 is meaningless because of the influence of other factors (first of all PEP-393).

@pitrou commented Nov 14, 2012

> As I understand it, a new dict is created on every call of a function
> with keyword arguments. This slows down every such call by about
> 0.1 µsec, which is about 10% of int('42', base=16).

    Ok, but int('42', base=16) is about the fastest function call with
    keyword arguments one can think about :-) In other words, the overhead
    will probably not be noticeable for most function calls.

@serhiy-storchaka commented:
> Ok, but int('42', base=16) is about the fastest function call with
> keyword arguments one can think about :-)

    Not as fast as a call without keywords, int('42', 16). :-( But this is a different issue.
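The two call forms compared above are equivalent in result; only the calling convention, and hence the per-call overhead, differs:

```python
# Positional and keyword forms of the same call produce the same value;
# the keyword form additionally pays for building a kwargs dict.
assert int('42', 16) == int('42', base=16) == 0x42  # both parse to 66
```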

@serhiy-storchaka commented:
Here is a really simple patch. It speeds up empty dict creation by up to 1.9x (still 1.6x slower than 3.2). Please make your own measurements of real-world programs.

@pitrou commented Nov 14, 2012

    Ok, here are some benchmark results with your patch:

    ### call_method ###
    Min: 0.313816 -> 0.305150: 1.03x faster
    Avg: 0.316292 -> 0.305678: 1.03x faster
    Significant (t=21.07)
    Stddev: 0.00188 -> 0.00050: 3.7398x smaller
    Timeline: http://tinyurl.com/couzwbe

    ### call_method_unknown ###
    Min: 0.323125 -> 0.312034: 1.04x faster
    Avg: 0.325697 -> 0.313115: 1.04x faster
    Significant (t=17.62)
    Stddev: 0.00258 -> 0.00099: 2.6059x smaller
    Timeline: http://tinyurl.com/clwacj3

    ### call_simple ###
    Min: 0.215008 -> 0.220773: 1.03x slower
    Avg: 0.215666 -> 0.221657: 1.03x slower
    Significant (t=-23.28)
    Stddev: 0.00062 -> 0.00078: 1.2721x larger
    Timeline: http://tinyurl.com/cv6gxft

    ### fastpickle ###
    Min: 0.554353 -> 0.569768: 1.03x slower
    Avg: 0.558192 -> 0.574705: 1.03x slower
    Significant (t=-6.90)
    Stddev: 0.00363 -> 0.00393: 1.0829x larger
    Timeline: http://tinyurl.com/c8s2dqk

    ### float ###
    Min: 0.292259 -> 0.279563: 1.05x faster
    Avg: 0.296011 -> 0.286154: 1.03x faster
    Significant (t=2.99)
    Stddev: 0.00417 -> 0.00609: 1.4585x larger
    Timeline: http://tinyurl.com/c34ofe5

    ### iterative_count ###
    Min: 0.115091 -> 0.118527: 1.03x slower
    Avg: 0.116174 -> 0.119266: 1.03x slower
    Significant (t=-6.82)
    Stddev: 0.00068 -> 0.00075: 1.1148x larger
    Timeline: http://tinyurl.com/bsmuuso

    ### json_dump_v2 ###
    Min: 2.591838 -> 2.652666: 1.02x slower
    Avg: 2.599335 -> 2.687225: 1.03x slower
    Significant (t=-8.09)
    Stddev: 0.00715 -> 0.02322: 3.2450x larger
    Timeline: http://tinyurl.com/clxutkt

    ### json_load ###
    Min: 0.378329 -> 0.370132: 1.02x faster
    Avg: 0.379840 -> 0.371222: 1.02x faster
    Significant (t=13.68)
    Stddev: 0.00112 -> 0.00085: 1.3237x smaller
    Timeline: http://tinyurl.com/c7ycte2

    ### nbody ###
    Min: 0.259945 -> 0.251294: 1.03x faster
    Avg: 0.261186 -> 0.251843: 1.04x faster
    Significant (t=14.95)
    Stddev: 0.00125 -> 0.00062: 2.0274x smaller
    Timeline: http://tinyurl.com/cg8z6yg

    ### nqueens ###
    Min: 0.295304 -> 0.275767: 1.07x faster
    Avg: 0.298382 -> 0.278839: 1.07x faster
    Significant (t=10.19)
    Stddev: 0.00244 -> 0.00353: 1.4429x larger
    Timeline: http://tinyurl.com/bo2dn4x

    ### pathlib ###
    Min: 0.100494 -> 0.102371: 1.02x slower
    Avg: 0.101787 -> 0.104876: 1.03x slower
    Significant (t=-8.15)
    Stddev: 0.00103 -> 0.00159: 1.5432x larger
    Timeline: http://tinyurl.com/cb9w79t

    ### richards ###
    Min: 0.166939 -> 0.172828: 1.04x slower
    Avg: 0.167945 -> 0.174199: 1.04x slower
    Significant (t=-9.65)
    Stddev: 0.00092 -> 0.00112: 1.2241x larger
    Timeline: http://tinyurl.com/cdros5x

    ### silent_logging ###
    Min: 0.070485 -> 0.066857: 1.05x faster
    Avg: 0.070538 -> 0.067041: 1.05x faster
    Significant (t=44.64)
    Stddev: 0.00003 -> 0.00017: 5.4036x larger
    Timeline: http://tinyurl.com/buz5h9j

    ### 2to3 ###
    0.490925 -> 0.478927: 1.03x faster

    ### chameleon ###
    Min: 0.050694 -> 0.049364: 1.03x faster
    Avg: 0.051030 -> 0.049759: 1.03x faster
    Significant (t=8.45)
    Stddev: 0.00041 -> 0.00041: 1.0036x larger
Timeline: http://tinyurl.com/d5czdjt

    ### mako ###
    Min: 0.183295 -> 0.177191: 1.03x faster
    Avg: 0.184944 -> 0.179300: 1.03x faster
    Significant (t=15.54)
    Stddev: 0.00119 -> 0.00137: 1.1555x larger
Timeline: http://tinyurl.com/dx6xrvx

    In the end, it's mostly a wash.
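One way to back up a "mostly a wash" summary is a geometric mean of the per-benchmark speedup ratios. A sketch using a sample of the ratios reported above, with "1.03x slower" encoded as 1/1.03 (the exact selection of benchmarks is illustrative):

```python
# Summarize mixed benchmark results with a geometric mean of speedup
# ratios (>1 means faster with the patch). Values sampled from the
# runs above; slower results are encoded as reciprocals.
import math

ratios = [1.03, 1.04, 1/1.03, 1/1.03, 1.05, 1/1.03, 1/1.03, 1.02,
          1.03, 1.07, 1/1.03, 1/1.04, 1.05, 1.03, 1.03, 1.03]
geomean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
print(f"overall: {geomean:.3f}x")  # close to 1.0 -> "mostly a wash"
```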

@jcea commented Nov 14, 2012

    Antoine, I would consider this a performance regression to solve for 3.3.1. Small dictionary creation is everywhere in CPython.

@pitrou commented Nov 14, 2012

> Antoine, I would consider this a performance regression to solve for
> 3.3.1. Small dictionary creation is everywhere in CPython.

    Again, feel free to provide real-world benchmark numbers proving the
    regression. I've already posted some figures.

@serhiy-storchaka commented:
> In the end, it's mostly a wash.

Yes, that's predictable. The gain is too small and indistinguishable against a background of random noise. Maybe a longer-running benchmark would show more reliable results, but in any case the gain is small.

My long-running, heavily optimized simulation is about 5% faster on 3.2 or patched 3.4 compared to 3.3. But this is not a typical program.

@tiran commented Feb 28, 2013

The patch gives a measurable speedup (./python is a patched 3.3.0+). IMO we should apply it. It's small, and I see no harm in it.

    $ for PY in python2.7 python3.2 python3.3 ./python; do cmd="$PY -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'"; echo $cmd; eval $cmd; done
    python2.7 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
    10000000 loops, best of 3: 0.162 usec per loop
    python3.2 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
    10000000 loops, best of 3: 0.142 usec per loop
    python3.3 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
    10000000 loops, best of 3: 0.669 usec per loop
    ./python -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
    10000000 loops, best of 3: 0.381 usec per loop
    
    $ for PY in python2.7 python3.2 python3.3 ./python; do cmd="$PY -R -m timeit -n 10000000 'int(\"1\", base=16)'"; echo $cmd; eval $cmd; done
    python2.7 -R -m timeit -n 10000000 'int("1", base=16)'
    10000000 loops, best of 3: 0.268 usec per loop
    python3.2 -R -m timeit -n 10000000 'int("1", base=16)'
    10000000 loops, best of 3: 0.302 usec per loop
    python3.3 -R -m timeit -n 10000000 'int("1", base=16)'
    10000000 loops, best of 3: 0.477 usec per loop
    ./python -R -m timeit -n 10000000 'int("1", base=16)'
    10000000 loops, best of 3: 0.356 usec per loop

@serhiy-storchaka commented:
So what should we do with this?

@pitrou commented Dec 6, 2013

    You said it yourself: """The gain is too small and undistinguished on a background of random noise""". Closing would be reasonable, unless someone exhibits a reasonable benchmark where there is a significant difference.

    In any case, it's too late for perf improvements on 3.4.

@rhettinger commented:
    This patch looks correct and like the right thing to do. I recommend applying it to 3.5.

@serhiy-storchaka commented:
Antoine, are you still opposed to this patch?

    @serhiy-storchaka serhiy-storchaka self-assigned this Aug 6, 2014
@pitrou commented Aug 6, 2014

    The patch adds complication to the already complicated memory management of dicts. It increases maintenance burden in critical code.

    Have we found any case where it makes a tangible difference?
    (I'm not talking about timeit micro-benchmarks)

@serhiy-storchaka commented:
Closed in favor of bpo-23601.

@vstinner commented:
    Serhiy Storchaka:

> Closed in favor of bpo-23601.

    Cool!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022