dict creation performance regression #60669
On my system, {} has become significantly slower in 3.3:

$ python3.2 -m timeit -n 1000000 '{}'
1000000 loops, best of 3: 0.0314 usec per loop
$ python3.3 -m timeit -n 1000000 '{}'
1000000 loops, best of 3: 0.0892 usec per loop
$ hg id -i
ee7b713fec71+
$ ./python -m timeit -n 1000000 '{}'
1000000 loops, best of 3: 0.0976 usec per loop

Is this because of the dict randomization? |
I confirm that.

$ ./python -m timeit -n 1000000 '{};{};{};{};{};{};{};{};{};{}'
2.6: 0.62 usec

Randomization is not affecting it. |
Ah, this is an effect of PEP-412. The difference exists only for creating dicts up to 5 items (inclusive). |
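The size threshold mentioned above can be checked with a small timeit sketch (illustrative only; the key names and loop counts are arbitrary, and absolute numbers will vary by machine and version):

```python
import timeit

# Time creation of dict literals of increasing size. The PEP 412
# machinery only changes behavior for small dicts, so any per-version
# gap should shrink once the literal grows past a handful of items.
for n in range(8):
    items = ", ".join(f"'k{i}': {i}" for i in range(n))
    stmt = "{%s}" % items  # e.g. "{'k0': 0, 'k1': 1}" for n == 2
    best = min(timeit.repeat(stmt, number=100_000, repeat=3))
    print(f"{n} items: {best:.4f} sec per 100,000 loops")
```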
Maybe using the free list for keys will restore performance. |
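For context, a free list caches recently freed objects so the allocator can hand them back without a fresh heap allocation. A toy sketch of the idea in Python (not CPython's actual implementation, which is C code in Objects/dictobject.c; the names and the cap of 80 mirror CPython's PyDict_MAXFREELIST only loosely):

```python
# Toy free list: recycle objects instead of reallocating them.
_FREE_LIST = []
_MAX_FREE = 80  # cap on cached objects, loosely like PyDict_MAXFREELIST

def acquire():
    # Reuse a cached dict if one is available, else allocate a new one.
    if _FREE_LIST:
        return _FREE_LIST.pop()
    return {}

def release(d):
    # Return the dict to the cache instead of letting it be freed.
    if len(_FREE_LIST) < _MAX_FREE:
        d.clear()
        _FREE_LIST.append(d)

d1 = acquire()
d1["x"] = 1
release(d1)
d2 = acquire()  # the same object comes back, already emptied
assert d2 is d1 and d2 == {}
```

The suggestion in the comment above is to apply this same recycling to the keys tables introduced by PEP 412, not just to the dict objects themselves.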
Does this regression impact any real-world program? |
That is a blow-off response. A huge swath of the language is affected by dictionary performance (keyword args, module lookups, attribute lookup, etc). Most programs will be affected to some degree -- computationally bound programs will notice more and i/o bound programs won't notice at all. |
I was merely suggesting to report actual (non-micro) benchmark numbers. |
As I understand it, a new dict is created on every call of a function with keyword arguments. This slows down every such call by about 0.1 µsec, which is about 10% of the cost of int('42', base=16). In sum, some programs can slow down by a few percent. A direct comparison of 3.2 and 3.3 is meaningless because of the influence of other factors (first of all PEP 393). |
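The per-call dict allocation described above is easy to observe: a **kwargs parameter materializes a brand-new dict on every call, even for identical arguments. A minimal demonstration (the function f is hypothetical):

```python
def f(**kwargs):
    # Each call allocates a fresh dict to hold the keyword arguments.
    return kwargs

d1 = f(a=1)
d2 = f(a=1)
assert d1 == d2 == {'a': 1}  # equal contents...
assert d1 is not d2          # ...but two distinct dict objects
```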
Ok, but |
Not as fast as a call without keywords, |
Here is a really simple patch. It speeds up empty dict creation by up to 1.9x (still 1.6x slower than 3.2). Make your measurements on real-world programs. |
Ok, here are some benchmark results with your patch (call_method, call_method_unknown, call_simple, fastpickle, float, iterative_count, json_dump_v2, json_load, nbody, nqueens, pathlib, richards, silent_logging, 2to3, chameleon, mako). In the end, it's mostly a wash. |
Antoine, I would consider this a performance regression to solve for 3.3.1. Small dictionary creation is everywhere in CPython. |
Again, feel free to provide real-world benchmark numbers proving the |
Yes, it's predictable. The gain is too small and indistinguishable from random noise. Maybe a longer-running benchmark would show more reliable results, but in any case the gain is small. My long-running, hard-optimized simulation is about 5% faster on 3.2 or patched 3.4 compared to 3.3. But this is not a typical program. |
The patch gives a measurable speedup (./python is a patched 3.3.0+). IMO we should apply it. It's small and I can see no harm, too.

$ for PY in python2.7 python3.2 python3.3 ./python; do cmd="$PY -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'"; echo $cmd; eval $cmd; done
python2.7 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.162 usec per loop
python3.2 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.142 usec per loop
python3.3 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.669 usec per loop
./python -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.381 usec per loop
$ for PY in python2.7 python3.2 python3.3 ./python; do cmd="$PY -R -m timeit -n 10000000 'int(\"1\", base=16)'"; echo $cmd; eval $cmd; done
python2.7 -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.268 usec per loop
python3.2 -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.302 usec per loop
python3.3 -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.477 usec per loop
./python -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.356 usec per loop |
So what should we do with this? |
You said it yourself: """The gain is too small and indistinguishable from random noise""". Closing would be reasonable, unless someone exhibits a reasonable benchmark where there is a significant difference. In any case, it's too late for perf improvements in 3.4. |
This patch looks correct and like the right thing to do. I recommend applying it to 3.5. |
Antoine, are you still opposed to this patch? |
The patch adds complication to the already complicated memory management of dicts. It increases maintenance burden in critical code. Have we found any case where it makes a tangible difference? |
Closed in favor of bpo-23601.
Serhiy Storchaka:
Cool! |