PEP 509: Add ma_version to PyDictObject #70246

vstinner · 2016-01-09T09:30:48Z

BPO	26058
Nosy	@malemburg, @brettcannon, @pitrou, @vstinner, @bitdancer, @1st1, @jstasiak
Files	dict_version.patch dict_version-2.patch dict_version-3.patch dict_version-4.patch guard_benchmark.patch guard_benchmark.py dict_version-5.patch dict_version-6.patch dict_version-7.patch dict_version-8.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2016-09-08.20:07:22.991>
created_at = <Date 2016-01-09.09:30:48.408>
labels = ['type-feature']
title = 'PEP 509: Add ma_version to PyDictObject'
updated_at = <Date 2016-09-10.04:52:35.818>
user = 'https://github.com/vstinner'

bugs.python.org fields:

activity = <Date 2016-09-10.04:52:35.818>
actor = 'python-dev'
assignee = 'none'
closed = True
closed_date = <Date 2016-09-08.20:07:22.991>
closer = 'vstinner'
components = []
creation = <Date 2016-01-09.09:30:48.408>
creator = 'vstinner'
dependencies = []
files = ['41542', '41547', '41581', '41599', '41604', '41605', '41691', '41701', '42476', '42518']
hgrepos = []
issue_num = 26058
keywords = ['patch']
message_count = 31.0
messages = ['257809', '257810', '257814', '257818', '257819', '257820', '257848', '257875', '257880', '257961', '257992', '258138', '258139', '258151', '258820', '258824', '258826', '258867', '263414', '263525', '263726', '263727', '263728', '263729', '263730', '263731', '263732', '263787', '275132', '275133', '275566']
nosy_count = 8.0
nosy_names = ['lemburg', 'brett.cannon', 'pitrou', 'vstinner', 'r.david.murray', 'python-dev', 'yselivanov', 'jstasiak']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue26058'
versions = ['Python 3.6']

vstinner · 2016-01-09T09:30:47Z

Attached patch implements the following PEP currently discussed on python-ideas:
https://faster-cpython.readthedocs.org/pep_dict_version.html#pep-dict-version

vstinner · 2016-01-09T09:34:10Z

Result of pybench on Dict microbenchmarks:

              DictCreation:    44ms    49ms  -10.5%    44ms    49ms  -10.7%
         DictWithFloatKeys:    35ms    35ms   -0.5%    35ms    35ms   -1.0%
       DictWithIntegerKeys:    28ms    28ms   -1.2%    28ms    29ms   -2.3%
        DictWithStringKeys:    26ms    27ms   -3.2%    26ms    28ms   -4.8%
    SimpleDictManipulation:    52ms    53ms   -0.7%    53ms    53ms   -0.4%

Hum, as usuall, pybench doesn't seem reliable at all: I expect worse performance with the patch since it adds "version++" in dict.__setimte__(). I don't really trust pybench, results seem to have a lot of noise :-/

Maybe I'm not using pybench correctly? I used:

$ make distclean; ./configure && make
$ ./python Tools/pybench/pybench.py -f pybench.default
$ patch -p1 < dict_version.patch
$ ./python Tools/pybench/pybench.py -f pybench.dictversion
$ ./python Tools/pybench/pybench.py -s pybench.dictversion -c pybench.default

Full output:

-------------------------------------------------------------------------------
PYBENCH 2.1
-------------------------------------------------------------------------------

using CPython 3.6.0a0 (default:53271aa4d84c+, Jan 9 2016, 10:27:40) [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
disabled garbage collection
system check interval set to maximum: 2147483647
using timer: time.perf_counter
timer: resolution=1e-09, implementation=clock_gettime(CLOCK_MONOTONIC)

-------------------------------------------------------------------------------
Benchmark: pybench.dictversion
-------------------------------------------------------------------------------

Rounds: 10
Warp:   10
Timer:  time.perf_counter

Machine Details:
   Platform ID:    Linux-4.2.5-300.fc23.x86_64-x86_64-with-fedora-23-Twenty_Three
   Processor:      x86_64

Python:
   Implementation: CPython
   Executable:     /home/haypo/prog/python/default/python
   Version:        3.6.0a0
   Compiler:       GCC 5.1.1 20150618 (Red Hat 5.1.1-4)
   Bits:           64bit
   Build:          Jan  9 2016 10:27:40 (#default:53271aa4d84c+)
   Unicode:        UCS4

-------------------------------------------------------------------------------
Comparing with: pybench.default
-------------------------------------------------------------------------------

Rounds: 10
Warp:   10
Timer:  time.perf_counter

Machine Details:
   Platform ID:    Linux-4.2.5-300.fc23.x86_64-x86_64-with-fedora-23-Twenty_Three
   Processor:      x86_64

Python:
   Implementation: CPython
   Executable:     /home/haypo/prog/python/default/python
   Version:        3.6.0a0
   Compiler:       GCC 5.1.1 20150618 (Red Hat 5.1.1-4)
   Bits:           64bit
   Build:          Jan  9 2016 10:21:57 (#default:53271aa4d84c)
   Unicode:        UCS4

Test minimum run-time average run-time
this other diff this other diff
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 42ms 42ms +0.1% 43ms 43ms -0.9%
BuiltinMethodLookup: 26ms 25ms +5.7% 26ms 25ms +5.1%
CompareFloats: 27ms 27ms -0.9% 27ms 28ms -4.2%
CompareFloatsIntegers: 60ms 63ms -3.3% 61ms 65ms -6.8%
CompareIntegers: 41ms 38ms +7.9% 41ms 38ms +7.2%
CompareInternedStrings: 30ms 28ms +5.7% 30ms 28ms +5.0%
CompareLongs: 24ms 22ms +8.6% 24ms 22ms +8.5%
CompareStrings: 22ms 22ms +0.3% 23ms 24ms -6.6%
ComplexPythonFunctionCalls: 43ms 42ms +1.2% 43ms 45ms -5.2%
ConcatStrings: 29ms 32ms -11.0% 29ms 33ms -13.6%
CreateInstances: 45ms 45ms +0.3% 46ms 46ms -0.4%
CreateNewInstances: 34ms 34ms +0.6% 35ms 34ms +0.7%
CreateStringsWithConcat: 58ms 58ms +0.1% 58ms 58ms -0.1%
DictCreation: 44ms 49ms -10.5% 44ms 49ms -10.7%
DictWithFloatKeys: 35ms 35ms -0.5% 35ms 35ms -1.0%
DictWithIntegerKeys: 28ms 28ms -1.2% 28ms 29ms -2.3%
DictWithStringKeys: 26ms 27ms -3.2% 26ms 28ms -4.8%
ForLoops: 22ms 22ms +0.4% 22ms 22ms +0.6%
IfThenElse: 34ms 34ms +0.9% 34ms 34ms +0.8%
ListSlicing: 34ms 34ms -0.2% 34ms 34ms -0.1%
NestedForLoops: 37ms 36ms +2.1% 37ms 36ms +2.1%
NestedListComprehensions: 36ms 35ms +1.4% 36ms 36ms +1.8%
NormalClassAttribute: 75ms 77ms -2.5% 75ms 77ms -2.3%
NormalInstanceAttribute: 37ms 37ms +2.2% 38ms 37ms +2.5%
PythonFunctionCalls: 37ms 36ms +1.8% 37ms 37ms +1.6%
PythonMethodCalls: 50ms 47ms +5.5% 50ms 48ms +4.5%
Recursion: 61ms 61ms -0.2% 61ms 62ms -0.2%
SecondImport: 35ms 36ms -2.9% 35ms 37ms -3.3%
SecondPackageImport: 37ms 37ms -0.3% 37ms 37ms -0.4%
SecondSubmoduleImport: 89ms 87ms +2.1% 90ms 88ms +1.6%
SimpleComplexArithmetic: 23ms 23ms +0.0% 24ms 24ms -0.1%
SimpleDictManipulation: 52ms 53ms -0.7% 53ms 53ms -0.4%
SimpleFloatArithmetic: 25ms 25ms -1.2% 25ms 25ms -1.0%
SimpleIntFloatArithmetic: 32ms 32ms -0.3% 32ms 32ms -1.1%
SimpleIntegerArithmetic: 32ms 32ms -0.6% 32ms 32ms -0.3%
SimpleListComprehensions: 29ms 28ms +2.3% 30ms 29ms +3.2%
SimpleListManipulation: 27ms 28ms -1.3% 28ms 28ms -1.4%
SimpleLongArithmetic: 22ms 22ms +3.6% 23ms 22ms +4.4%
SmallLists: 38ms 38ms -1.9% 38ms 41ms -7.0%
SmallTuples: 47ms 44ms +7.8% 47ms 44ms +7.5%
SpecialClassAttribute: 75ms 73ms +2.3% 75ms 73ms +2.2%
SpecialInstanceAttribute: 38ms 39ms -1.8% 38ms 39ms -2.1%
StringMappings: 84ms 83ms +0.8% 84ms 84ms +0.5%
StringPredicates: 49ms 49ms -0.6% 49ms 49ms -1.0%
StringSlicing: 42ms 42ms +0.8% 42ms 42ms +0.7%
TryExcept: 25ms 25ms -0.4% 25ms 25ms -0.8%
TryFinally: 31ms 31ms +1.1% 32ms 31ms +1.4%
TryRaiseExcept: 12ms 12ms -0.8% 12ms 12ms -1.4%
TupleSlicing: 43ms 43ms +0.1% 43ms 43ms -0.6%
WithFinally: 49ms 50ms -0.9% 49ms 50ms -0.9%
WithRaiseExcept: 39ms 39ms -0.6% 39ms 40ms -1.6%
-------------------------------------------------------------------------------
Totals: 2011ms 2006ms +0.2% 2025ms 2035ms -0.5%

(this=pybench.dictversion, other=pybench.default)

vstinner · 2016-01-09T09:53:06Z

Using timeit to microbenchmark dict operations (constructor, __setitem__, __delitem__, clear), I get exactly the same performance or even better performance (???) with the patch.

./python -m timeit 'd={1: 0}; d[2]=0; d[3]=0; d[4]=0; del d[1]; del d[2]; d.clear()'

Original: 318 ns
Patched: 318 ns

./python -m timeit 'd={i:i for i in range(2**16)}' 'for i in range(2**16): d[i]=i-1' 'for i in range(2**16): d[i]=i+1' 'for i in range(2**15): del d[i]' 'd.clear()'

Original: 19.9 ms
Patched: 18.9 ms (5% faster)

vstinner · 2016-01-09T11:00:02Z

Updated patch, I forgot to include unit tests.

I added a test on integer overflow on the version. The patch adds _testcapi.dict_setversion(dict, version) and _testcapi.PY_SIZE_MAX.

pitrou · 2016-01-09T11:05:46Z

I don't understand this:

The version 0 is reserved for "missing key"

vstinner · 2016-01-09T11:09:25Z

I don't understand this:
> The version 0 is reserved for "missing key"

Oh, ignore it, it's an outdated comment :-) Dictionary version is initialized to 0, it's not more "reserved".

FYI I added the comment in my first implementation which also added a version to dictionary *entries*.

brettcannon · 2016-01-09T18:03:03Z

A better way to benchmark this is to run hg.python.org/benchmarks with a patched interpreter and see if that shows any performance impact.

vstinner · 2016-01-09T22:42:34Z

Brett:

A better way to benchmark this is to run hg.python.org/benchmarks with a patched interpreter and see if that shows any performance impact.

Ok. Here is the output. Can you please explain me the result? :-) Especially the "significant" value.

$ ./python.orig ../benchmarks/perf.py ./python.orig ./python.patched

INFO:root:Automatically selected timer: perf_counter
INFO:root:Running ./python.patched ../benchmarks/lib3/2to3/2to3 -f all ../benchmarks/lib/2to3
INFO:root:Running ./python.patched ../benchmarks/lib3/2to3/2to3 -f all ../benchmarks/lib/2to3 1 time
INFO:root:Running ./python.orig ../benchmarks/lib3/2to3/2to3 -f all ../benchmarks/lib/2to3
INFO:root:Running ./python.orig ../benchmarks/lib3/2to3/2to3 -f all ../benchmarks/lib/2to3 1 time
INFO:root:Running ./python.patched ../benchmarks/performance/bm_chameleon_v2.py -n 50 --timer perf_counter
INFO:root:Running ./python.orig ../benchmarks/performance/bm_chameleon_v2.py -n 50 --timer perf_counter
INFO:root:Running ./python.patched ../benchmarks/performance/bm_django_v3.py -n 50 --timer perf_counter
INFO:root:Running ./python.orig ../benchmarks/performance/bm_django_v3.py -n 50 --timer perf_counter
INFO:root:Running ./python.patched ../benchmarks/performance/bm_pickle.py -n 50 --timer perf_counter --use_cpickle pickle
INFO:root:Running ./python.orig ../benchmarks/performance/bm_pickle.py -n 50 --timer perf_counter --use_cpickle pickle
INFO:root:Running ./python.patched ../benchmarks/performance/bm_pickle.py -n 50 --timer perf_counter --use_cpickle unpickle
INFO:root:Running ./python.orig ../benchmarks/performance/bm_pickle.py -n 50 --timer perf_counter --use_cpickle unpickle
INFO:root:Running ./python.patched ../benchmarks/performance/bm_json_v2.py -n 50 --timer perf_counter
INFO:root:Running ./python.orig ../benchmarks/performance/bm_json_v2.py -n 50 --timer perf_counter
INFO:root:Running ./python.patched ../benchmarks/performance/bm_json.py -n 50 --timer perf_counter json_load
INFO:root:Running ./python.orig ../benchmarks/performance/bm_json.py -n 50 --timer perf_counter json_load
INFO:root:Running ./python.patched ../benchmarks/performance/bm_nbody.py -n 50 --timer perf_counter
INFO:root:Running ./python.orig ../benchmarks/performance/bm_nbody.py -n 50 --timer perf_counter
INFO:root:Running ./python.patched ../benchmarks/performance/bm_regex_v8.py -n 50 --timer perf_counter
INFO:root:Running ./python.orig ../benchmarks/performance/bm_regex_v8.py -n 50 --timer perf_counter
INFO:root:Running ./python.patched ../benchmarks/performance/bm_tornado_http.py -n 100 --timer perf_counter
INFO:root:Running ./python.orig ../benchmarks/performance/bm_tornado_http.py -n 100 --timer perf_counter
[ 1/10] 2to3...
[ 2/10] chameleon_v2...
[ 3/10] django_v3...
[ 4/10] fastpickle...
[ 5/10] fastunpickle...
[ 6/10] json_dump_v2...
[ 7/10] json_load...
[ 8/10] nbody...
[ 9/10] regex_v8...
[10/10] tornado_http...

Report on Linux smithers 4.2.8-300.fc23.x86_64 #1 SMP Tue Dec 15 16:49:06 UTC 2015 x86_64 x86_64
Total CPU cores: 8

### 2to3 ###
6.825440 -> 6.996616: 1.03x slower

### fastpickle ###
Min: 0.449710 -> 0.458054: 1.02x slower
Avg: 0.451589 -> 0.583100: 1.29x slower
Significant (t=-8.04)
Stddev: 0.00076 -> 0.11568: 152.3886x larger

### fastunpickle ###
Min: 0.549672 -> 0.545734: 1.01x faster
Avg: 0.551284 -> 0.696920: 1.26x slower
Significant (t=-6.65)
Stddev: 0.00118 -> 0.15493: 130.9694x larger

### json_load ###
Min: 0.432500 -> 0.466602: 1.08x slower
Avg: 0.433771 -> 0.471205: 1.09x slower
Significant (t=-10.63)
Stddev: 0.00495 -> 0.02440: 4.9303x larger

### nbody ###
Min: 0.230055 -> 0.234925: 1.02x slower
Avg: 0.230985 -> 0.236441: 1.02x slower
Significant (t=-7.56)
Stddev: 0.00048 -> 0.00508: 10.5095x larger

The following not significant results are hidden, use -v to show them:
chameleon_v2, django_v3, json_dump_v2, regex_v8, tornado_http.

brettcannon · 2016-01-10T01:07:19Z

So min is the fastest time in a benchmark execution, average is the average across all benchmark executions, and the t-value is some statistics thing. Anything marked insignificant had such a small average difference it isn't worth reporting.

If you want to smooth out the numbers you should do a rigorous run that uses more loops per benchmark to help smooth out outliers. And if you are doing this on Linux there is a flag to measure memory usage as well.

vstinner · 2016-01-11T14:48:17Z

The PEP is now at python.org: PEP-509.

vstinner · 2016-01-11T17:37:56Z

Patch version 3 updated for the second version of the PEP-509. Main changes:

ma_version now has the type PY_UINT64_T
remove dict.__version__ property: replace with _testcapi.dict_get_version() for tests

vstinner · 2016-01-13T10:38:27Z

Patch version 4: I forgot to include my patch to update test_sys. Now the full Python test suite pass.

vstinner · 2016-01-13T10:41:15Z

I'm not a big fan of pybench (it looks unstable and so not reliable), but here are results with dict_version-4.patch.

I used more loops and a lower warp factor to get more reliable tests (I hope):

./python Tools/pybench/pybench.py -f pybench.orig -w 2 -C 100 -n 25
./python.patched Tools/pybench/pybench.py -f pybench.dictver -w 2 -C 100 -n 25
./python Tools/pybench/pybench.py -s pybench.dictver -c pybench.orig

-------------------------------------------------------------------------------
PYBENCH 2.1
-------------------------------------------------------------------------------

using CPython 3.6.0a0 (default:77d24f51effc+, Jan 13 2016, 11:27:53) [GCC 5.3.1 20151207 (Red Hat 5.3.1-2)]
disabled garbage collection
system check interval set to maximum: 2147483647
using timer: time.perf_counter
timer: resolution=1e-09, implementation=clock_gettime(CLOCK_MONOTONIC)

-------------------------------------------------------------------------------
Benchmark: pybench.dictver
-------------------------------------------------------------------------------

Rounds: 25
Warp:   2
Timer:  time.perf_counter

Machine Details:
   Platform ID:    Linux-4.2.8-300.fc23.x86_64-x86_64-with-fedora-23-Twenty_Three
   Processor:      x86_64

Python:
   Implementation: CPython
   Executable:     /home/haypo/prog/python/default/python
   Version:        3.6.0a0
   Compiler:       GCC 5.3.1 20151207 (Red Hat 5.3.1-2)
   Bits:           64bit
   Build:          Jan 13 2016 11:27:53 (#default:77d24f51effc+)
   Unicode:        UCS4

-------------------------------------------------------------------------------
Comparing with: pybench.orig
-------------------------------------------------------------------------------

Rounds: 25
Warp:   2
Timer:  time.perf_counter

Machine Details:
   Platform ID:    Linux-4.2.8-300.fc23.x86_64-x86_64-with-fedora-23-Twenty_Three
   Processor:      x86_64

Python:
   Implementation: CPython
   Executable:     /home/haypo/prog/python/default/python
   Version:        3.6.0a0
   Compiler:       GCC 5.3.1 20151207 (Red Hat 5.3.1-2)
   Bits:           64bit
   Build:          Jan 13 2016 11:14:50 (#default:77d24f51effc)
   Unicode:        UCS4

Test minimum run-time average run-time
this other diff this other diff
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 230ms 229ms +0.4% 241ms 230ms +5.1%
BuiltinMethodLookup: 130ms 132ms -1.2% 135ms 134ms +0.4%
CompareFloats: 147ms 149ms -1.4% 151ms 149ms +1.2%
CompareFloatsIntegers: 330ms 333ms -0.8% 347ms 335ms +3.6%
CompareIntegers: 214ms 209ms +2.5% 223ms 209ms +6.5%
CompareInternedStrings: 160ms 145ms +10.8% 170ms 145ms +16.9%
CompareLongs: 121ms 120ms +0.2% 124ms 120ms +3.4%
CompareStrings: 132ms 131ms +0.7% 138ms 132ms +4.8%
ComplexPythonFunctionCalls: 233ms 235ms -0.7% 241ms 238ms +1.1%
ConcatStrings: 166ms 165ms +0.3% 177ms 167ms +6.0%
CreateInstances: 240ms 247ms -3.1% 253ms 249ms +1.5%
CreateNewInstances: 178ms 186ms -4.2% 188ms 188ms +0.1%
CreateStringsWithConcat: 315ms 316ms -0.5% 331ms 318ms +4.3%
DictCreation: 254ms 236ms +7.8% 262ms 237ms +10.5%
DictWithFloatKeys: 211ms 199ms +6.1% 219ms 201ms +8.9%
DictWithIntegerKeys: 171ms 163ms +5.4% 180ms 166ms +9.0%
DictWithStringKeys: 163ms 142ms +14.5% 170ms 144ms +17.7%
ForLoops: 121ms 121ms -0.3% 125ms 124ms +0.9%
IfThenElse: 179ms 178ms +0.7% 185ms 178ms +3.6%
ListSlicing: 194ms 193ms +0.4% 198ms 194ms +2.2%
NestedForLoops: 212ms 210ms +1.2% 220ms 210ms +4.6%
NestedListComprehensions: 205ms 212ms -3.3% 218ms 215ms +1.5%
NormalClassAttribute: 429ms 407ms +5.5% 446ms 408ms +9.3%
NormalInstanceAttribute: 212ms 206ms +2.8% 226ms 209ms +8.0%
PythonFunctionCalls: 208ms 210ms -1.4% 215ms 212ms +1.2%
PythonMethodCalls: 275ms 253ms +8.7% 293ms 255ms +14.9%
Recursion: 333ms 328ms +1.4% 366ms 329ms +11.0%
SecondImport: 190ms 188ms +0.8% 201ms 188ms +6.7%
SecondPackageImport: 195ms 192ms +1.8% 214ms 192ms +11.7%
SecondSubmoduleImport: 472ms 447ms +5.7% 502ms 455ms +10.3%
SimpleComplexArithmetic: 141ms 138ms +2.1% 146ms 139ms +5.2%
SimpleDictManipulation: 287ms 288ms -0.3% 313ms 290ms +8.0%
SimpleFloatArithmetic: 138ms 137ms +0.7% 151ms 140ms +7.9%
SimpleIntFloatArithmetic: 172ms 180ms -4.3% 182ms 181ms +0.4%
SimpleIntegerArithmetic: 172ms 169ms +1.9% 184ms 170ms +8.1%
SimpleListComprehensions: 166ms 167ms -0.7% 179ms 173ms +3.3%
SimpleListManipulation: 159ms 153ms +3.9% 166ms 155ms +7.6%
SimpleLongArithmetic: 118ms 117ms +1.2% 123ms 118ms +4.3%
SmallLists: 210ms 204ms +2.8% 221ms 205ms +7.3%
SmallTuples: 244ms 252ms -3.1% 255ms 256ms -0.2%
SpecialClassAttribute: 424ms 416ms +1.8% 442ms 419ms +5.5%
SpecialInstanceAttribute: 214ms 211ms +1.3% 223ms 214ms +4.5%
StringMappings: 442ms 446ms -0.8% 460ms 447ms +2.9%
StringPredicates: 259ms 258ms +0.2% 281ms 260ms +8.2%
StringSlicing: 218ms 220ms -1.2% 228ms 221ms +3.3%
TryExcept: 130ms 135ms -3.9% 134ms 136ms -1.4%
TryFinally: 173ms 171ms +0.9% 180ms 172ms +4.4%
TryRaiseExcept: 62ms 63ms -1.2% 63ms 63ms +0.9%
TupleSlicing: 223ms 224ms -0.3% 251ms 232ms +8.0%
WithFinally: 263ms 264ms -0.4% 280ms 265ms +5.4%
WithRaiseExcept: 205ms 203ms +0.7% 215ms 204ms +5.5%
-------------------------------------------------------------------------------
Totals: 11040ms 10900ms +1.3% 11635ms 10991ms +5.9%

(this=pybench.dictver, other=pybench.orig)

vstinner · 2016-01-13T16:12:34Z

guard_benchmark.patch: patch adding a _testcapi.guard_benchmark(), a microbenchmark on dictionary guard. The benchmark measures the cost of checking if a dictionary key was modified.

Run the benchmark with attached guard_benchmark.py. Result on my PC:

PyObject_GetItem(): 10.2 ns
PyDict_GetItem(): 9.1 ns
guard->check(): 2.9 ns

python3 -m platform:
Linux-4.2.8-300.fc23.x86_64-x86_64-with-fedora-23-Twenty_Three

You have to modify manually _testcapi.c to choose between the 3 implementations.

guard_benchmark.patch requires the issue bpo-26098 patch and the fat module which implements fat.GuardDict. The fat module can be found at:
https://github.com/haypo/fat

vstinner · 2016-01-22T16:22:07Z

Patch version 5: a global counter is now used to set ma_version field of dictionaries. The global counter is incremented each time that a dictionary is created and each time that a dictionary is modified.

A dictionary version is now unique: two dictionaries cannot have the same version. So if a guard stores a version, the check on the version will fail if a different dictionary is used.

1st1 · 2016-01-22T16:40:45Z

Patch version 5: a global counter is now used to set ma_version field of dictionaries. The global counter is incremented each time that a dictionary is created and each time that a dictionary is modified.

This is great, thank you, Victor.

vstinner · 2016-01-22T16:45:21Z

This is great, thank you, Victor.

I will update the PEP-509 later for the global counter.

vstinner · 2016-01-23T12:50:13Z

Patch version 6: remove now unused function _testcapi.dict_set_version(). I also moved tests to test_pep509.py to make it more explicit that the implementation of the PEP-509 is not part the Python dictionary type specification, other Python implementations can choose to not implement it.

vstinner · 2016-04-14T15:20:31Z

The implementation is outdated: ma_version was renamed to ma_version_tag in the latest version of the PEP-509.

vstinner · 2016-04-15T20:49:10Z

The implementation is outdated: ma_version was renamed to ma_version_tag in the latest version of the PEP-509.

Patch version 7 is updated to the latest PEP (rebased and rename ma_version to ma_version_tag).

vstinner · 2016-04-19T10:24:07Z

Patch version 8:

Update to the latest PEP: remove micro-optimization in dictobject.c if the new value is identical to the current value, dict.__setitem__() now always changes the version
Refactor test_pep509 to address Brett's comments
Add new unit tests on identical values
Add new unit tests on equal values (with special __eq__ method)

vstinner · 2016-04-19T10:38:56Z

timeit microbenchmarks on dict_version-8.patch, minimum of 10 runs.

$ ./python.orig -m timeit 'd={1: 0}; d[2]=0; d[3]=0; d[4]=0; del d[1]; del d[2]; d.clear()'
1000000 loops, best of 3: 0.292 usec per loop
$ ./python.version -m timeit 'd={1: 0}; d[2]=0; d[3]=0; d[4]=0; del d[1]; del d[2]; d.clear()'
1000000 loops, best of 3: 0.293 usec per loop

=> 1 nanosecond (0.3%) slower

$ ./python.orig -m timeit 'd={i:i for i in range(2**16)}' 'for i in range(2**16): d[i]=i-1' 'for i in range(2**16): d[i]=i+1' 'for i in range(2**15): del d[i]' 'd.clear()'
10 loops, best of 3: 21.2 msec per loop
$ ./python.version -m timeit 'd={i:i for i in range(2**16)}' 'for i in range(2**16): d[i]=i-1' 'for i in range(2**16): d[i]=i+1' 'for i in range(2**15): del d[i]' 'd.clear()'
10 loops, best of 3: 21.3 msec per loop

=> 0.1 ms (0.5%) slower

vstinner · 2016-04-19T10:47:41Z

I ran again timeit microbenchmarks with CPU isolation on dict_version-8.patch, minimum of 10 runs.

-m timeit 'd={1: 0}; d[2]=0; d[3]=0; d[4]=0; del d[1]; del d[2]; d.clear()'

Original: 287 ns
Version: 289 ns (+2 ns, +0.7%)

-m timeit 'd={i:i for i in range(2**16)}' 'for i in range(2**16): d[i]=i-1' 'for i in range(2**16): d[i]=i+1' 'for i in range(2**15): del d[i]' 'd.clear()'

Original: 21.2 msec
Version: 21.4 msec (+0.2 ms, +0.9%)

vstinner · 2016-04-19T10:52:21Z

pybench results on dict_version-8.patch with CPU isolation: http://haypo-notes.readthedocs.org/microbenchmark.html

As usual, I'm very skeptical on the pybench results which almost look like noise. I don't understand how my change can make any operation *faster*, whereas some benchmarks are faster with the patch...

Dict microbenchmarks:

       DictCreation:    38ms    36ms   +4.8%    39ms    37ms   +3.9%
  DictWithFloatKeys:    40ms    40ms   -0.8%    40ms    40ms   -0.4%
DictWithIntegerKeys:    33ms    31ms   +7.2%    33ms    31ms   +7.6%
 DictWithStringKeys:    29ms    28ms   +0.4%    29ms    29ms   +0.7%

SimpleDictManipulation: 59ms 59ms -0.4% 59ms 59ms -0.4%

Full output:

$ ./python.version Tools/pybench/pybench.py -f pybench.version
$ ./python.orig Tools/pybench/pybench.py -f pybench.orig
$ ./python.orig Tools/pybench/pybench.py -s pybench.version -c pybench.orig

PYBENCH 2.1
-------------------------------------------------------------------------------

using CPython 3.6.0a0 (default:e281a57d5b29, Apr 19 2016, 12:30:36) [GCC 5.3.1 20151207 (Red Hat 5.3.1-2)]
disabled garbage collection
system check interval set to maximum: 2147483647
using timer: time.perf_counter
timer: resolution=1e-09, implementation=clock_gettime(CLOCK_MONOTONIC)

-------------------------------------------------------------------------------
Benchmark: pybench.version
-------------------------------------------------------------------------------

Rounds: 10
Warp:   10
Timer:  time.perf_counter

Machine Details:
   Platform ID:    Linux-4.4.4-301.fc23.x86_64-x86_64-with-fedora-23-Twenty_Three
   Processor:      x86_64

Python:
   Implementation: CPython
   Executable:     /home/haypo/prog/python/default/python.version
   Version:        3.6.0a0
   Compiler:       GCC 5.3.1 20151207 (Red Hat 5.3.1-2)
   Bits:           64bit
   Build:          Apr 19 2016 12:29:16 (#default:e281a57d5b29+)
   Unicode:        UCS4

-------------------------------------------------------------------------------
Comparing with: pybench.orig
-------------------------------------------------------------------------------

Rounds: 10
Warp:   10
Timer:  time.perf_counter

Machine Details:
   Platform ID:    Linux-4.4.4-301.fc23.x86_64-x86_64-with-fedora-23-Twenty_Three
   Processor:      x86_64

Python:
   Implementation: CPython
   Executable:     /home/haypo/prog/python/default/python.orig
   Version:        3.6.0a0
   Compiler:       GCC 5.3.1 20151207 (Red Hat 5.3.1-2)
   Bits:           64bit
   Build:          Apr 19 2016 12:30:36 (#default:e281a57d5b29)
   Unicode:        UCS4

Test minimum run-time average run-time
this other diff this other diff
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 49ms 50ms -1.3% 49ms 50ms -1.2%
BuiltinMethodLookup: 26ms 26ms -0.8% 26ms 27ms -1.2%
CompareFloats: 29ms 30ms -2.0% 29ms 30ms -1.9%
CompareFloatsIntegers: 39ms 39ms +1.4% 39ms 39ms +1.5%
CompareIntegers: 43ms 43ms -1.1% 43ms 43ms -1.1%
CompareInternedStrings: 28ms 28ms -0.2% 28ms 28ms -0.3%
CompareLongs: 25ms 25ms +2.8% 25ms 25ms +2.9%
CompareStrings: 26ms 27ms -0.8% 26ms 27ms -1.6%
ComplexPythonFunctionCalls: 44ms 44ms -1.6% 44ms 45ms -1.6%
ConcatStrings: 35ms 33ms +6.4% 35ms 33ms +6.1%
CreateInstances: 49ms 48ms +2.6% 50ms 49ms +1.8%
CreateNewInstances: 37ms 36ms +2.5% 37ms 36ms +2.2%
CreateStringsWithConcat: 65ms 63ms +3.3% 66ms 64ms +2.9%
DictCreation: 38ms 36ms +4.8% 39ms 37ms +3.9%
DictWithFloatKeys: 40ms 40ms -0.8% 40ms 40ms -0.4%
DictWithIntegerKeys: 33ms 31ms +7.2% 33ms 31ms +7.6%
DictWithStringKeys: 29ms 28ms +0.4% 29ms 29ms +0.7%
ForLoops: 25ms 25ms -0.4% 26ms 26ms -0.3%
IfThenElse: 37ms 35ms +3.3% 37ms 36ms +3.0%
ListSlicing: 39ms 38ms +0.3% 39ms 39ms +0.0%
NestedForLoops: 40ms 40ms +0.1% 40ms 40ms -0.0%
NestedListComprehensions: 41ms 42ms -0.2% 42ms 42ms -0.9%
NormalClassAttribute: 82ms 78ms +4.0% 82ms 79ms +3.8%
NormalInstanceAttribute: 43ms 42ms +1.9% 44ms 43ms +1.8%
PythonFunctionCalls: 40ms 42ms -5.0% 41ms 43ms -4.8%
PythonMethodCalls: 51ms 52ms -1.1% 51ms 52ms -0.8%
Recursion: 69ms 72ms -4.8% 69ms 73ms -4.7%
SecondImport: 37ms 38ms -0.1% 37ms 38ms -0.1%
SecondPackageImport: 40ms 39ms +1.9% 40ms 39ms +1.6%
SecondSubmoduleImport: 96ms 97ms -1.0% 96ms 97ms -1.1%
SimpleComplexArithmetic: 28ms 27ms +1.7% 29ms 28ms +1.9%
SimpleDictManipulation: 59ms 59ms -0.4% 59ms 59ms -0.4%
SimpleFloatArithmetic: 27ms 27ms +2.2% 27ms 27ms +2.3%
SimpleIntFloatArithmetic: 31ms 31ms +1.6% 31ms 31ms +1.7%
SimpleIntegerArithmetic: 31ms 31ms +0.8% 31ms 31ms +0.5%
SimpleListComprehensions: 33ms 33ms +0.2% 33ms 33ms +0.4%
SimpleListManipulation: 31ms 31ms -0.0% 32ms 32ms -1.5%
SimpleLongArithmetic: 21ms 22ms -2.5% 22ms 22ms -1.8%
SmallLists: 43ms 44ms -3.3% 43ms 44ms -3.5%
SmallTuples: 51ms 53ms -2.9% 52ms 54ms -3.6%
SpecialClassAttribute: 80ms 81ms -1.0% 80ms 81ms -1.1%
SpecialInstanceAttribute: 42ms 42ms -0.5% 42ms 42ms -0.7%
StringMappings: 87ms 86ms +0.8% 87ms 86ms +0.9%
StringPredicates: 53ms 53ms +0.3% 53ms 53ms +0.2%
StringSlicing: 48ms 49ms -1.3% 48ms 49ms -0.8%
TryExcept: 25ms 25ms +0.6% 25ms 25ms +0.7%
TryFinally: 35ms 35ms +1.1% 36ms 35ms +0.7%
TryRaiseExcept: 13ms 13ms -0.5% 13ms 13ms -0.6%
TupleSlicing: 46ms 46ms +0.2% 48ms 50ms -3.2%
WithFinally: 54ms 53ms +0.8% 54ms 54ms +0.7%
WithRaiseExcept: 44ms 43ms +2.8% 44ms 43ms +2.7%
-------------------------------------------------------------------------------
Totals: 2156ms 2150ms +0.3% 2171ms 2169ms +0.1%

(this=pybench.version, other=pybench.orig)

malemburg · 2016-04-19T11:01:23Z

On 19.04.2016 12:52, STINNER Victor wrote:

As usual, I'm very skeptical on the pybench results which almost look like noise. I don't understand how my change can make any operation *faster*, whereas some benchmarks are faster with the patch...

This can easily happen as a result of different memory layout, but is
very much dependent on the machine architecture, CPU, memory type, etc.

Dict microbenchmarks:

       DictCreation:    38ms    36ms   +4.8%    39ms    37ms   +3.9%
  DictWithFloatKeys:    40ms    40ms   -0.8%    40ms    40ms   -0.4%
DictWithIntegerKeys:    33ms    31ms   +7.2%    33ms    31ms   +7.6%
 DictWithStringKeys:    29ms    28ms   +0.4%    29ms    29ms   +0.7%

SimpleDictManipulation: 59ms 59ms -0.4% 59ms 59ms -0.4%

Only dict creation and the integer keys benchmark results are relevant.

Could you perhaps check what's causing these slowdowns ?

vstinner · 2016-04-19T11:11:56Z

Could you perhaps check what's causing these slowdowns ?

It's obvious, no? My patch causes the slowdown.

On a timeit microbenchmark, I don't see such slowdown. That's also why
I suspect that pybench is unstable.

python3.6 -m timeit '{}' says 105 ns with and without the patch.

python3.6 -m timeit 'd={}; d[1]=1; d[2]=2; d[3]=3; d[4]=4; d[5]=5;
d[6]=6; d[7]=7; d[8]=8; d[9]=9; d[10]=10' says 838 ns with and without
the patch.

I have to "cheat": I run timeit enough times until I see the "minimum".

malemburg · 2016-04-19T11:25:31Z

On 19.04.2016 13:11, STINNER Victor wrote:

STINNER Victor added the comment:

> Could you perhaps check what's causing these slowdowns ?

It's obvious, no? My patch causes the slowdown.

Well, yes, of course :-) I meant whether there's anything you
can do about those slowdowns.

On a timeit microbenchmark, I don't see such slowdown. That's also why
I suspect that pybench is unstable.

python3.6 -m timeit '{}' says 105 ns with and without the patch.

python3.6 -m timeit 'd={}; d[1]=1; d[2]=2; d[3]=3; d[4]=4; d[5]=5;
d[6]=6; d[7]=7; d[8]=8; d[9]=9; d[10]=10' says 838 ns with and without
the patch.

I have to "cheat": I run timeit enough times until I see the "minimum".

Those operations are too fast for timeit. The overhead associated
with looping is much larger than the time it takes to run the
operation itself. That's why in pybench I put the operations into
blocks of repeated statements. The interpreter then doesn't spend
time on branching when going from one statement execution to the
next (inside those blocks) and you get closer to the real runtime
of the operation you're trying to measure.

vstinner · 2016-04-19T23:21:52Z

Results of the CPython benchmark suite on dict_version-8.patch.

IMHO regex_v8 can be ignored, this benchmark is unstable (issue bpo-26275).

Original python: ../pep509/python
3.6.0a0 (default:3a9b47b062b9, Apr 19 2016, 16:23:15)
[GCC 5.3.1 20151207 (Red Hat 5.3.1-2)]

Patched python: ../pep509/python
3.6.0a0 (default:96f61aab2c6e, Apr 19 2016, 15:21:15)
[GCC 5.3.1 20151207 (Red Hat 5.3.1-2)]

$ python3 -u perf.py --rigorous -b all ../default.copy/python ../pep509/python

INFO:root:Automatically selected timer: perf_counter
(...)
Report on Linux smithers 4.4.4-301.fc23.x86_64 #1 SMP Fri Mar 4 17:42:42 UTC 2016 x86_64 x86_64
Total CPU cores: 8

### 2to3 ###
Min: 6.714927 -> 6.856992: 1.02x slower
Avg: 6.726773 -> 6.984359: 1.04x slower
Significant (t=-5.16)
Stddev: 0.01160 -> 0.11111: 9.5816x larger

### call_method ###
Min: 0.312567 -> 0.323260: 1.03x slower
Avg: 0.313399 -> 0.323821: 1.03x slower
Significant (t=-284.62)
Stddev: 0.00042 -> 0.00048: 1.1430x larger

### call_simple ###
Min: 0.242091 -> 0.228922: 1.06x faster
Avg: 0.243613 -> 0.229294: 1.06x faster
Significant (t=2134.23)
Stddev: 0.00010 -> 0.00005: 1.9124x smaller

### etree_generate ###
Min: 0.255097 -> 0.265558: 1.04x slower
Avg: 0.256947 -> 0.267478: 1.04x slower
Significant (t=-67.89)
Stddev: 0.00111 -> 0.00108: 1.0250x smaller

### etree_parse ###
Min: 0.281726 -> 0.290976: 1.03x slower
Avg: 0.283626 -> 0.292642: 1.03x slower
Significant (t=-62.59)
Stddev: 0.00109 -> 0.00094: 1.1540x smaller

### etree_process ###
Min: 0.217164 -> 0.229552: 1.06x slower
Avg: 0.219401 -> 0.231242: 1.05x slower
Significant (t=-75.05)
Stddev: 0.00112 -> 0.00111: 1.0053x smaller

### fannkuch ###
Min: 1.012875 -> 0.985196: 1.03x faster
Avg: 1.014760 -> 0.992104: 1.02x faster
Significant (t=46.44)
Stddev: 0.00287 -> 0.00395: 1.3778x larger

### float ###
Min: 0.259226 -> 0.251918: 1.03x faster
Avg: 0.266530 -> 0.260176: 1.02x faster
Significant (t=10.69)
Stddev: 0.00403 -> 0.00437: 1.0836x larger

### json_load ###
Min: 0.430602 -> 0.433593: 1.01x slower
Avg: 0.431278 -> 0.486924: 1.13x slower
Significant (t=-24.86)
Stddev: 0.00045 -> 0.02238: 49.8974x larger

### normal_startup ###
Min: 0.319092 -> 0.326082: 1.02x slower
Avg: 0.320093 -> 0.326842: 1.02x slower
Significant (t=-70.52)
Stddev: 0.00067 -> 0.00069: 1.0283x larger

### regex_effbot ###
Min: 0.048969 -> 0.048313: 1.01x faster
Avg: 0.050019 -> 0.048415: 1.03x faster
Significant (t=31.41)
Stddev: 0.00051 -> 0.00006: 8.8152x smaller

### regex_v8 ###
Min: 0.051040 -> 0.043412: 1.18x faster
Avg: 0.051325 -> 0.043577: 1.18x faster
Significant (t=45.60)
Stddev: 0.00120 -> 0.00120: 1.0013x larger

### richards ###
Min: 0.157625 -> 0.161126: 1.02x slower
Avg: 0.158952 -> 0.162519: 1.02x slower
Significant (t=-34.92)
Stddev: 0.00073 -> 0.00071: 1.0351x smaller

### silent_logging ###
Min: 0.070547 -> 0.069902: 1.01x faster
Avg: 0.072246 -> 0.069934: 1.03x faster
Significant (t=65.51)
Stddev: 0.00035 -> 0.00003: 11.9828x smaller

The following not significant results are hidden, use -v to show them:
call_method_slots, call_method_unknown, chameleon_v2, chaos, django_v3, etree_iterparse, fastpickle, fastunpickle, formatted_logging, go, hexiom2, json_dump_v2, mako_v2, meteor_contest, nbody, nqueens, pathlib, pickle_dict, pickle_list, pidigits, raytrace, regex_compile, simple_logging, spectral_norm, startup_nosite, telco, tornado_http, unpack_sequence, unpickle_list.

python-dev · 2016-09-08T20:06:40Z

New changeset d43f819caea7 by Victor Stinner in branch 'default':
Add a new private version to the builtin dict type
https://hg.python.org/cpython/rev/d43f819caea7

vstinner · 2016-09-08T20:07:23Z

The PEP-509 has been approved by Guido, I just pushed the implementation.

python-dev · 2016-09-10T04:52:36Z

New changeset c3776dd858f0 by Victor Stinner in branch 'default':
Try to fix sizeof unit tests on dict
https://hg.python.org/cpython/rev/c3776dd858f0

vstinner added the type-feature A feature request or enhancement label Jan 9, 2016

vstinner changed the title ~~Add dict.__version__ read-only property~~ PEP 509: Add ma_version to PyDictObject Jan 11, 2016

vstinner closed this as completed Sep 8, 2016

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PEP 509: Add ma_version to PyDictObject #70246

PEP 509: Add ma_version to PyDictObject #70246

vstinner commented Jan 9, 2016

vstinner commented Jan 9, 2016

vstinner commented Jan 9, 2016

vstinner commented Jan 9, 2016

vstinner commented Jan 9, 2016

pitrou commented Jan 9, 2016

vstinner commented Jan 9, 2016

brettcannon commented Jan 9, 2016

vstinner commented Jan 9, 2016

brettcannon commented Jan 10, 2016

vstinner commented Jan 11, 2016

vstinner commented Jan 11, 2016

vstinner commented Jan 13, 2016

vstinner commented Jan 13, 2016

vstinner commented Jan 13, 2016

vstinner commented Jan 22, 2016

1st1 commented Jan 22, 2016

vstinner commented Jan 22, 2016

vstinner commented Jan 23, 2016

vstinner commented Apr 14, 2016

vstinner commented Apr 15, 2016

vstinner commented Apr 19, 2016

vstinner commented Apr 19, 2016

vstinner commented Apr 19, 2016

vstinner commented Apr 19, 2016

malemburg commented Apr 19, 2016

vstinner commented Apr 19, 2016

malemburg commented Apr 19, 2016

vstinner commented Apr 19, 2016

python-dev mannequin commented Sep 8, 2016

vstinner commented Sep 8, 2016

python-dev mannequin commented Sep 10, 2016

PEP 509: Add ma_version to PyDictObject #70246

PEP 509: Add ma_version to PyDictObject #70246

Comments

vstinner commented Jan 9, 2016

vstinner commented Jan 9, 2016

vstinner commented Jan 9, 2016

vstinner commented Jan 9, 2016

vstinner commented Jan 9, 2016

pitrou commented Jan 9, 2016

vstinner commented Jan 9, 2016

brettcannon commented Jan 9, 2016

vstinner commented Jan 9, 2016

brettcannon commented Jan 10, 2016

vstinner commented Jan 11, 2016

vstinner commented Jan 11, 2016

vstinner commented Jan 13, 2016

vstinner commented Jan 13, 2016

vstinner commented Jan 13, 2016

vstinner commented Jan 22, 2016

1st1 commented Jan 22, 2016

vstinner commented Jan 22, 2016

vstinner commented Jan 23, 2016

vstinner commented Apr 14, 2016

vstinner commented Apr 15, 2016

vstinner commented Apr 19, 2016

vstinner commented Apr 19, 2016

vstinner commented Apr 19, 2016

vstinner commented Apr 19, 2016

malemburg commented Apr 19, 2016

vstinner commented Apr 19, 2016

malemburg commented Apr 19, 2016

vstinner commented Apr 19, 2016

python-dev mannequin commented Sep 8, 2016

vstinner commented Sep 8, 2016

python-dev mannequin commented Sep 10, 2016