msg254746 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2015-11-16 19:41 |
If you look at bit.ly/pycon-ca-keynote you will notice that the etree_parse and etree_iterparse benchmarks were horrible for everyone. Because of how badly everyone seemed to do, I think the benchmarks should be verified to be doing reasonable things on implementations other than CPython 2.7.
|
msg254748 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2015-11-16 20:07 |
Would you have a quick summary for those not willing to watch a whole keynote?
|
msg254751 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2015-11-16 20:14 |
That link is to a Jupyter notebook so you don't have to watch anything. Plus the video is not even up yet so you can't skip the keynote even if you wanted to since you can't watch it yet. :)
|
msg254752 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2015-11-16 20:32 |
Ok, so when you say "horrible for everyone", this is really IronPython and Jython, right? :-) Other runtimes seem to do ok (perhaps not stellar, but ok).
|
msg254753 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2015-11-16 20:38 |
Well, Jython and IronPython obviously did the worst, but even Python 3 didn't do as well as I would have expected, so I still want to double-check the benchmarks to see if it's obvious why CPython 2.7 beats out everyone.
|
msg254763 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-11-16 22:23 |
I think these histograms would look better with logarithmic scale.
|
msg254764 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2015-11-16 22:33 |
Let's not pollute the issue with a critique of my notebook. You can feel free to email me personally to discuss it if you want, including why I purposefully didn't use a logarithmic scale.
|
msg255021 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-11-20 21:34 |
Sorry Brett.
How were the tests run? There are two implementations of ElementTree, accelerated and non-accelerated. xml.etree.ElementTree is accelerated by default in Python 3, but non-accelerated in Python 2.
$ python2.7 bm_elementtree.py -n 7 --take_geo_mean
0.463665158795
$ python2.7 bm_elementtree.py -n 7 --take_geo_mean --etree-module=xml.etree.ElementTree
5.46309932568
$ python3.4 bm_elementtree.py -n 7 --take_geo_mean --etree-module=xml.etree.ElementTree
0.813397633467649
$ python3.4 bm_elementtree.py -n 7 --take_geo_mean --etree-module=xml.etree.ElementTree --no-accelerator
5.31174765817514
If you run the test with the same option, --etree-module=xml.etree.ElementTree, it will use the accelerated implementation in Python 3 and the non-accelerated one in Python 2.
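A minimal sketch of how the pure-Python implementation can be loaded next to the default one in Python 3, for anyone who wants to compare the two outside the benchmark suite. It assumes the usual import layout, in which xml.etree.ElementTree tries to import the C module _elementtree and keeps its Python classes if that import fails. The helper names load_pure_python_etree and uses_c_accelerator are hypothetical, and this is not necessarily how the benchmark's --no-accelerator option works. In Python 2 the accelerated implementation is the separate xml.etree.cElementTree module instead.

```python
import importlib
import sys
import types


def load_pure_python_etree():
    """Import a fresh copy of xml.etree.ElementTree with the C accelerator blocked."""
    name = "xml.etree.ElementTree"
    missing = object()
    saved_module = sys.modules.pop(name, missing)
    saved_accel = sys.modules.get("_elementtree", missing)
    # A None entry in sys.modules makes "import _elementtree" raise ImportError,
    # so the freshly imported module keeps its pure-Python classes.
    sys.modules["_elementtree"] = None
    try:
        pure = importlib.import_module(name)
    finally:
        # Put sys.modules back the way it was so later imports are unaffected.
        if saved_accel is missing:
            del sys.modules["_elementtree"]
        else:
            sys.modules["_elementtree"] = saved_accel
        if saved_module is missing:
            sys.modules.pop(name, None)
        else:
            sys.modules[name] = saved_module
            sys.modules["xml.etree"].ElementTree = saved_module
    return pure


def uses_c_accelerator(etree_module):
    """Heuristic: a C-implemented Element has no Python-level __init__ function."""
    return not isinstance(etree_module.Element.__init__, types.FunctionType)
```

The module returned by load_pure_python_etree() can then be timed against a plain `import xml.etree.ElementTree` on the same input, for example with timeit.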
|
msg255024 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2015-11-20 22:18 |
The commands I used are in the notebook for each implementation and you can get the same result with `python3 perf.py -b etree python2 python3`.
|
msg255027 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-11-20 22:44 |
The slowdown in Python 3 may be related to the addition of XMLPullParser (issue17741).
|
msg255050 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-11-21 11:06 |
The proposed patch optimizes iterparse(). It is now only 33% slower than in 2.7 (it was 2.6 times slower).
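For context, a small sketch of the kind of incremental parsing that iterparse() provides. It is only an illustration of the public API, not the benchmark code or the patch; the function name count_tags is hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def count_tags(xml_bytes):
    """Count elements per tag while parsing incrementally."""
    counts = {}
    # "end" events fire once an element, including its children, is complete.
    for event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        counts[elem.tag] = counts.get(elem.tag, 0) + 1
        # Drop the element's children and text to keep memory use low.
        elem.clear()
    return counts
```

Calling count_tags() on a large document walks it element by element, which is the code path the etree_iterparse benchmark is meant to stress.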
|
msg255394 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-11-26 00:05 |
Updated to tip.
|
msg255936 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2015-12-05 08:33 |
Serhiy's latest patch LGTM.
|
msg256013 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-12-06 15:01 |
Thank you for your review, Brett. Before applying this optimization I want to fix the error propagation issue (issue25814). The patch for it is mainly a simplified part of the patch for this issue.
|
msg256039 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2015-12-07 00:31 |
New changeset dd67c8c53aea by Serhiy Storchaka in branch 'default':
Issue #25638: Optimized ElementTree.iterparse(); it is now 2x faster.
https://hg.python.org/cpython/rev/dd67c8c53aea
|
msg256059 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-12-07 12:18 |
The iterparse benchmark in 3.6 is still 30% slower than in 2.7. The parse benchmark is 70% slower. Hence there are other causes of the slowdown.
One of the causes is that in 3.x an empty dict, instead of None, is passed to the start handler as the attrib parameter if the start tag has no attributes. This makes parsing about 10% slower.
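To see where the attrib argument shows up, here is a sketch that records what a start handler receives, using a custom parser target. The class AttribRecorder and the function record_attribs are hypothetical names. This only shows the Python-level handler interface; it does not reproduce the internal code path inside the C accelerator that the slowdown is about.

```python
import xml.etree.ElementTree as ET


class AttribRecorder:
    """Parser target that records the (tag, attrib) pair of every start tag."""

    def __init__(self):
        self.seen = []

    def start(self, tag, attrib):
        self.seen.append((tag, attrib))

    def end(self, tag):
        pass

    def data(self, data):
        pass

    def close(self):
        # XMLParser.close() returns whatever the target's close() returns.
        return self.seen


def record_attribs(xml_text):
    """Parse xml_text and return the list of (tag, attrib) start events."""
    parser = ET.XMLParser(target=AttribRecorder())
    parser.feed(xml_text)
    return parser.close()
```

For a tag without attributes the recorded attrib value is empty; allocating such a value for every element is the per-element cost described above.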
|
msg256158 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-12-09 18:15 |
The following patch speeds up ElementTree parsing (the result of the etree parse benchmark improves by 10%). It actually restores the 2.7 code and avoids creating an empty dict for attributes when it is not needed.
|
msg256167 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2015-12-10 07:52 |
New changeset 1fe904420c20 by Serhiy Storchaka in branch 'default':
Issue #25638: Optimized ElementTree parsing; it is now 10% faster.
https://hg.python.org/cpython/rev/1fe904420c20
|
msg256172 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-12-10 09:05 |
Thank you for your review, Brett. Now the parse benchmark in 3.6 is only 50% slower than in 2.7. I will continue looking for bottlenecks.
|
msg261651 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-03-12 13:55 |
I am not able to find the cause of the slowdown.
I think this issue can be closed now. The etree_parse and etree_iterparse benchmarks are working appropriately and show a real regression in CPython 3.x. The cause of the regression is not known.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:23 | admin | set | github: 69824 |
2018-12-14 22:10:01 | vstinner | set | pull_requests:
+ pull_request10409 |
2016-03-12 17:21:48 | brett.cannon | set | status: open -> closed resolution: not a bug |
2016-03-12 13:55:48 | serhiy.storchaka | set | assignee: serhiy.storchaka -> messages:
+ msg261651 |
2015-12-10 09:05:11 | serhiy.storchaka | set | messages:
+ msg256172 stage: patch review -> |
2015-12-10 07:52:33 | python-dev | set | messages:
+ msg256167 |
2015-12-09 18:15:52 | serhiy.storchaka | set | files:
+ etree_start_handler_no_attrib.patch
messages:
+ msg256158 stage: patch review |
2015-12-07 12:18:06 | serhiy.storchaka | set | messages:
+ msg256059 stage: commit review -> (no value) |
2015-12-07 00:31:38 | python-dev | set | nosy:
+ python-dev messages:
+ msg256039
|
2015-12-06 15:01:51 | serhiy.storchaka | set | dependencies:
+ Propagate all errors from ElementTree.iterparse messages:
+ msg256013 |
2015-12-05 08:33:04 | brett.cannon | set | assignee: brett.cannon -> serhiy.storchaka messages:
+ msg255936 stage: patch review -> commit review |
2015-11-26 09:40:50 | serhiy.storchaka | link | issue25707 dependencies |
2015-11-26 00:05:54 | serhiy.storchaka | set | files:
+ etree_iterparse_2.patch
messages:
+ msg255394 |
2015-11-21 11:06:46 | serhiy.storchaka | set | files:
+ etree_iterparse.patch
type: performance components:
+ Extension Modules, Library (Lib), XML versions:
+ Python 3.6 keywords:
+ patch nosy:
+ scoder, eli.bendersky
messages:
+ msg255050 stage: patch review |
2015-11-20 22:44:24 | serhiy.storchaka | set | messages:
+ msg255027 |
2015-11-20 22:18:20 | brett.cannon | set | messages:
+ msg255024 |
2015-11-20 21:34:36 | serhiy.storchaka | set | messages:
+ msg255021 |
2015-11-16 22:33:12 | brett.cannon | set | messages:
+ msg254764 |
2015-11-16 22:23:12 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages:
+ msg254763
|
2015-11-16 20:38:33 | brett.cannon | set | messages:
+ msg254753 |
2015-11-16 20:32:54 | pitrou | set | messages:
+ msg254752 |
2015-11-16 20:14:14 | brett.cannon | set | messages:
+ msg254751 |
2015-11-16 20:07:56 | pitrou | set | messages:
+ msg254748 |
2015-11-16 19:41:33 | brett.cannon | set | assignee: brett.cannon |
2015-11-16 19:41:23 | brett.cannon | create | |