classification
Title: dict creation performance regression
Type: performance Stage: resolved
Components: Interpreter Core Versions: Python 3.5
process
Status: closed Resolution: out of date
Dependencies: Superseder: use small object allocator for dict key storage
View: 23601
Assigned To: serhiy.storchaka Nosy List: asvetlov, benjamin.peterson, christian.heimes, jcea, jcon, josh.r, phihag, pingebretson, pitrou, rhettinger, serhiy.storchaka, tim.peters, vstinner
Priority: normal Keywords: 3.3regression, patch

Created on 2012-11-13 16:19 by phihag, last changed 2016-03-11 15:33 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
dict_free_key_list.patch serhiy.storchaka, 2012-11-14 13:13 Patch for using the free list for keys review
Messages (23)
msg175503 - (view) Author: Philipp Hagemeister (phihag) * Date: 2012-11-13 16:19
On my system, {}  has become significantly slower in 3.3:

$ python3.2 -m timeit -n 1000000 '{}'
1000000 loops, best of 3: 0.0314 usec per loop
$ python3.3 -m timeit -n 1000000 '{}'
1000000 loops, best of 3: 0.0892 usec per loop
$ hg id -i
ee7b713fec71+
$ ./python -m timeit -n 1000000 '{}'
1000000 loops, best of 3: 0.0976 usec per loop

Is this because of the dict randomization?
msg175504 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-13 16:43
I confirm that.

$ ./python -m timeit -n 1000000 '{};{};{};{};{};{};{};{};{};{}'

2.6: 0.62 usec
2.7: 0.57 usec
3.1: 0.55 usec
3.2: 0.48 usec
3.3: 1.5 usec

Randomization is not affecting it.
msg175507 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-13 17:03
Ah, this is an effect of PEP 412.  The difference exists only for creating dicts up to 5 items (inclusive).
msg175511 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-13 17:34
May be using the free list for keys will restore performance.
msg175528 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-13 22:32
Does this regression impact any real-world program?
msg175531 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012-11-14 01:23
> Does this regression impact any real-world program?

That is a blow-off response.  A huge swath of the language is affected by dictionary performance (keyword args, module lookups, attribute lookup, etc).  Most programs will be affected to some degree -- computationally bound programs will notice more and i/o bound programs won't notice at all.
msg175546 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-14 07:20
> > Does this regression impact any real-world program?
> 
> That is a blow-off response.  A huge swath of the language is affected
> by dictionary performance (keyword args, module lookups, attribute
> lookup, etc).

I was merely suggesting to report actual (non-micro) benchmark numbers.
This issue is about dict creation, not dict lookups.
msg175551 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-14 08:46
As I understand, the new dict created on every call of function with keyword arguments.  This slow down every such call about 0.1 µsec.  This is about 10% of int('42', base=16).  In the sum, some programs can slow down to a few percents.

Direct comparison of 3.2 and 3.3 is meaningless because of the influence of other factors (first of all PEP 393).
msg175561 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-14 11:13
> As I understand, the new dict created on every call of function with
> keyword arguments.  This slow down every such call about 0.1 µsec.
> This is about 10% of int('42', base=16).

Ok, but `int('42', base=16)` is about the fastest function call with
keyword arguments one can think about :-) In other words, the overhead
will probably not be noticeable for most function calls.
msg175563 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-14 11:20
> Ok, but `int('42', base=16)` is about the fastest function call with
> keyword arguments one can think about :-)

Not as fast as a call without keywords, `int('42', 16)`. :-(  But this is a different issue.
msg175567 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-14 13:13
Here is a really simple patch.  It speed up to 1.9x an empty dict creation (still 1.6x slower than 3.2).  Make your measurements of real-world programs.
msg175581 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-14 18:56
Ok, here are some benchmark results with your patch:

### call_method ###
Min: 0.313816 -> 0.305150: 1.03x faster
Avg: 0.316292 -> 0.305678: 1.03x faster
Significant (t=21.07)
Stddev: 0.00188 -> 0.00050: 3.7398x smaller
Timeline: http://tinyurl.com/couzwbe

### call_method_unknown ###
Min: 0.323125 -> 0.312034: 1.04x faster
Avg: 0.325697 -> 0.313115: 1.04x faster
Significant (t=17.62)
Stddev: 0.00258 -> 0.00099: 2.6059x smaller
Timeline: http://tinyurl.com/clwacj3

### call_simple ###
Min: 0.215008 -> 0.220773: 1.03x slower
Avg: 0.215666 -> 0.221657: 1.03x slower
Significant (t=-23.28)
Stddev: 0.00062 -> 0.00078: 1.2721x larger
Timeline: http://tinyurl.com/cv6gxft

### fastpickle ###
Min: 0.554353 -> 0.569768: 1.03x slower
Avg: 0.558192 -> 0.574705: 1.03x slower
Significant (t=-6.90)
Stddev: 0.00363 -> 0.00393: 1.0829x larger
Timeline: http://tinyurl.com/c8s2dqk

### float ###
Min: 0.292259 -> 0.279563: 1.05x faster
Avg: 0.296011 -> 0.286154: 1.03x faster
Significant (t=2.99)
Stddev: 0.00417 -> 0.00609: 1.4585x larger
Timeline: http://tinyurl.com/c34ofe5

### iterative_count ###
Min: 0.115091 -> 0.118527: 1.03x slower
Avg: 0.116174 -> 0.119266: 1.03x slower
Significant (t=-6.82)
Stddev: 0.00068 -> 0.00075: 1.1148x larger
Timeline: http://tinyurl.com/bsmuuso

### json_dump_v2 ###
Min: 2.591838 -> 2.652666: 1.02x slower
Avg: 2.599335 -> 2.687225: 1.03x slower
Significant (t=-8.09)
Stddev: 0.00715 -> 0.02322: 3.2450x larger
Timeline: http://tinyurl.com/clxutkt

### json_load ###
Min: 0.378329 -> 0.370132: 1.02x faster
Avg: 0.379840 -> 0.371222: 1.02x faster
Significant (t=13.68)
Stddev: 0.00112 -> 0.00085: 1.3237x smaller
Timeline: http://tinyurl.com/c7ycte2

### nbody ###
Min: 0.259945 -> 0.251294: 1.03x faster
Avg: 0.261186 -> 0.251843: 1.04x faster
Significant (t=14.95)
Stddev: 0.00125 -> 0.00062: 2.0274x smaller
Timeline: http://tinyurl.com/cg8z6yg

### nqueens ###
Min: 0.295304 -> 0.275767: 1.07x faster
Avg: 0.298382 -> 0.278839: 1.07x faster
Significant (t=10.19)
Stddev: 0.00244 -> 0.00353: 1.4429x larger
Timeline: http://tinyurl.com/bo2dn4x

### pathlib ###
Min: 0.100494 -> 0.102371: 1.02x slower
Avg: 0.101787 -> 0.104876: 1.03x slower
Significant (t=-8.15)
Stddev: 0.00103 -> 0.00159: 1.5432x larger
Timeline: http://tinyurl.com/cb9w79t

### richards ###
Min: 0.166939 -> 0.172828: 1.04x slower
Avg: 0.167945 -> 0.174199: 1.04x slower
Significant (t=-9.65)
Stddev: 0.00092 -> 0.00112: 1.2241x larger
Timeline: http://tinyurl.com/cdros5x

### silent_logging ###
Min: 0.070485 -> 0.066857: 1.05x faster
Avg: 0.070538 -> 0.067041: 1.05x faster
Significant (t=44.64)
Stddev: 0.00003 -> 0.00017: 5.4036x larger
Timeline: http://tinyurl.com/buz5h9j

### 2to3 ###
0.490925 -> 0.478927: 1.03x faster

### chameleon ###
Min: 0.050694 -> 0.049364: 1.03x faster
Avg: 0.051030 -> 0.049759: 1.03x faster
Significant (t=8.45)
Stddev: 0.00041 -> 0.00041: 1.0036x larger
Timeline: b'http://tinyurl.com/d5czdjt'

### mako ###
Min: 0.183295 -> 0.177191: 1.03x faster
Avg: 0.184944 -> 0.179300: 1.03x faster
Significant (t=15.54)
Stddev: 0.00119 -> 0.00137: 1.1555x larger
Timeline: b'http://tinyurl.com/dx6xrvx'


In the end, it's mostly a wash.
msg175583 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2012-11-14 19:44
Antoine, I would consider this a performance regression to solve for 3.3.1. Small dictionary creation is everywhere in CPython.
msg175586 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-14 20:13
> Antoine, I would consider this a performance regression to solve for
> 3.3.1. Small dictionary creation is everywhere in CPython.

Again, feel free to provide real-world benchmark numbers proving the
regression. I've already posted some figures.
msg175615 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-15 12:07
> In the end, it's mostly a wash.

Yes, it's predictable. The gain is too small and undistinguished on a background of random noise. May be more long-running benchmark will show more reliable results, but in any case, the gain is small.

My long-running hard-optimized simulation is about 5% faster on 3.2 or patched 3.4 comparing to 3.3. But this is not a typical program.
msg183211 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-02-28 11:06
The patch gives a measurable speedup (./python is a patched 3.3.0+). IMO we should apply it. It's small and I can see no harm, too.

$ for PY in python2.7 python3.2 python3.3 ./python; do cmd="$PY -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'"; echo $cmd; eval $cmd; done
python2.7 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.162 usec per loop
python3.2 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.142 usec per loop
python3.3 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.669 usec per loop
./python -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.381 usec per loop

$ for PY in python2.7 python3.2 python3.3 ./python; do cmd="$PY -R -m timeit -n 10000000 'int(\"1\", base=16)'"; echo $cmd; eval $cmd; done
python2.7 -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.268 usec per loop
python3.2 -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.302 usec per loop
python3.3 -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.477 usec per loop
./python -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.356 usec per loop
msg205362 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-06 11:44
So what we should do with this?
msg205363 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-12-06 11:48
You said it yourself: """The gain is too small and undistinguished on a background of random noise""". Closing would be reasonable, unless someone exhibits a reasonable benchmark where there is a significant difference.

In any case, it's too late for perf improvements on 3.4.
msg205397 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-12-06 19:35
This patch looks correct and like the right thing to do.  I recommend applying it to 3.5.
msg224937 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-08-06 15:04
Antoine, are you still oppose to this patch?
msg224939 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-06 15:26
The patch adds complication to the already complicated memory management of dicts. It increases maintenance burden in critical code.

Have we found any case where it makes a tangible difference?
(I'm not talking about timeit micro-benchmarks)
msg261565 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-03-11 13:02
Closed in the favor of issue23601.
msg261576 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-03-11 15:33
Serhiy Storchaka:
> Closed in the favor of issue23601.

Cool!
History
Date User Action Args
2016-03-11 15:33:52vstinnersetmessages: + msg261576
2016-03-11 13:02:04serhiy.storchakasetstatus: open -> closed
superseder: use small object allocator for dict key storage
messages: + msg261565

resolution: out of date
stage: patch review -> resolved
2014-08-06 15:26:16pitrousetnosy: + tim.peters
messages: + msg224939
2014-08-06 15:04:34serhiy.storchakasetassignee: serhiy.storchaka
messages: + msg224937
2014-03-06 22:43:15josh.rsetnosy: + josh.r
2014-02-15 06:09:43pingebretsonsetnosy: + pingebretson
2013-12-06 19:35:10rhettingersetmessages: + msg205397
2013-12-06 11:48:39pitrousetmessages: + msg205363
versions: + Python 3.5, - Python 3.4
2013-12-06 11:44:43serhiy.storchakasetmessages: + msg205362
2013-02-28 11:06:08christian.heimessetnosy: + christian.heimes
messages: + msg183211
2013-01-09 23:04:19jconsetnosy: + jcon
2012-11-17 14:15:18asvetlovsetnosy: + asvetlov
2012-11-16 23:30:29vstinnersetnosy: + vstinner
2012-11-15 12:07:28serhiy.storchakasetmessages: + msg175615
2012-11-14 20:13:21pitrousetmessages: + msg175586
2012-11-14 19:44:33jceasetmessages: + msg175583
2012-11-14 18:56:33pitrousetpriority: high -> normal

messages: + msg175581
versions: - Python 3.3
2012-11-14 17:42:59jceasetnosy: + jcea
2012-11-14 13:13:23serhiy.storchakasetfiles: + dict_free_key_list.patch
keywords: + patch
messages: + msg175567

stage: patch review
2012-11-14 11:20:07serhiy.storchakasetmessages: + msg175563
2012-11-14 11:13:32pitrousetmessages: + msg175561
2012-11-14 08:46:54serhiy.storchakasetmessages: + msg175551
2012-11-14 07:20:56pitrousetmessages: + msg175546
2012-11-14 01:23:33rhettingersetpriority: normal -> high
nosy: + rhettinger
messages: + msg175531

2012-11-13 22:32:28pitrousetmessages: + msg175528
2012-11-13 17:34:45serhiy.storchakasetnosy: + pitrou, benjamin.peterson
messages: + msg175511
2012-11-13 17:03:28serhiy.storchakasetmessages: + msg175507
2012-11-13 16:43:59serhiy.storchakasetkeywords: + 3.3regression
nosy: + serhiy.storchaka
messages: + msg175504

2012-11-13 16:19:04phihagcreate