This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: sorted(range(1000)) is slower in Python 3.7 than in 3.5
Type: performance Stage:
Components: Versions: Python 3.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: ericvw, serhiy.storchaka, vstinner
Priority: normal Keywords:

Created on 2016-12-01 16:21 by vstinner, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
3.5.json.gz vstinner, 2016-12-01 16:32
Messages (3)
msg282194 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-01 16:21
Follow-up of my comment http://bugs.python.org/issue23507#msg282187 :

"sorted(list): Median +- std dev: [3.5] 17.5 us +- 1.0 us -> [3.7] 19.7 us +- 1.1 us: 1.12x slower (+12%)"

"(...) sorted(list) is slower. I don't know why sorted(list) is slower. It doesn't use a key function, and so should not be impacted by FASTCALL changes made since Python 3.6."
msg282196 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-01 16:32
Benchmark:

   ./python -m perf timeit -s 'x=list(range(1000))' 'sorted(x)'

Python 3.6 and 3.7 compared to Python 3.5:

   $ python3 -m perf compare_to 3.5.json.gz 3.6.json.gz 3.7.json.gz
   Median +- std dev: [3.5] 18.4 us +- 0.9 us -> [3.6] 20.5 us +- 0.9 us: 1.11x slower (+11%)
   Median +- std dev: [3.5] 18.4 us +- 0.9 us -> [3.7] 19.8 us +- 1.1 us: 1.08x slower (+8%)

I compiled Python with "./configure && make". The benchmark should be run again using LTO+PGO compilation to get more reliable benchmark results.

It seems like the benchmark is not very stable even with system tune (python3 -m perf system tune, isolcpus and rcu_nocbs in the Linux command line). I ran the benchmark 3 times using --append to concatenate all runs to get enough samples.

Histograms:

$ python3 -m perf hist 3.5.json.gz 3.6.json.gz 3.7.json.gz 
[ 3.5.json ]
15.0 us:  1 #
15.2 us:  0 |
15.5 us:  3 ###
15.7 us:  4 ####
16.0 us:  7 #######
16.2 us:  5 #####
16.5 us: 16 ################
16.7 us:  4 ####
17.0 us:  8 ########
17.2 us: 10 ##########
17.4 us:  7 #######
17.7 us:  5 #####
17.9 us:  5 #####
18.2 us: 23 ########################
18.4 us: 77 ###############################################################################
18.7 us:  5 #####
18.9 us:  0 |
19.2 us:  0 |
19.4 us:  0 |
19.7 us:  0 |
19.9 us:  0 |
20.1 us:  0 |
20.4 us:  0 |
20.6 us:  0 |
20.9 us:  0 |
21.1 us:  0 |

[ 3.6.json ]
15.0 us:  0 |
15.2 us:  0 |
15.5 us:  0 |
15.7 us:  0 |
16.0 us:  0 |
16.2 us:  0 |
16.5 us:  0 |
16.7 us:  0 |
17.0 us:  0 |
17.2 us:  0 |
17.4 us:  2 ###
17.7 us:  2 ###
17.9 us:  3 #####
18.2 us:  4 #######
18.4 us:  3 #####
18.7 us:  7 ############
18.9 us:  5 ########
19.2 us:  8 #############
19.4 us:  6 ##########
19.7 us:  7 ############
19.9 us:  9 ###############
20.1 us: 24 ########################################
20.4 us: 16 ###########################
20.6 us: 47 ###############################################################################
20.9 us: 27 #############################################
21.1 us: 10 #################

[ 3.7.json ]
15.0 us:  0 |
15.2 us:  0 |
15.5 us:  0 |
15.7 us:  0 |
16.0 us:  0 |
16.2 us:  1 ##
16.5 us:  0 |
16.7 us:  2 ###
17.0 us:  4 ######
17.2 us:  6 #########
17.4 us:  4 ######
17.7 us: 11 #################
17.9 us: 10 ################
18.2 us: 14 ######################
18.4 us: 10 ################
18.7 us:  5 ########
18.9 us:  3 #####
19.2 us: 10 ################
19.4 us:  6 #########
19.7 us: 13 #####################
19.9 us: 50 ###############################################################################
20.1 us: 10 ################
20.4 us: 19 ##############################
20.6 us:  2 ###
20.9 us:  0 |
21.1 us:  0 |
msg282213 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-02 02:01
On my laptop, the revision introducing the performance regression is:
---
changeset:   101858:5a62d682636e
user:        Brett Cannon <brett@python.org>
date:        Fri Jun 10 14:37:21 2016 -0700
files:       Doc/library/os.rst Lib/test/test_os.py Misc/NEWS Modules/posixmodule.c
description:
Issue #27186: Add os.PathLike support to DirEntry

Initial patch thanks to Jelle Zijlstra.
---

This change is unrelated to sorted(list). It looks more like a "random performance" caused by code placement:

* https://haypo.github.io/analysis-python-performance-issue.html
* https://haypo.github.io/journey-to-stable-benchmark-deadcode.html

According to perf record/perf report, the benchmark spends most of its time in PyObject_RichCompareBool() and long_richcompare():

Overhead  Command  Shared Object       Symbol
  41,98%  python   python              [.] PyObject_RichCompareBool
  35,36%  python   python              [.] long_richcompare
   8,52%  python   python              [.] listsort
   6,29%  python   python              [.] listextend
   5,31%  python   python              [.] list_dealloc

So I guess that the exact code placement of these two functions has a "signifiant" impact on performance. "Significant":

* rev b0be24a2f16c (fast): Median +- std dev: 15.0 us +- 0.1 us
* rev 5a62d682636e (slow): Median +- std dev: 16.3 us +- 0.0 us

The revision 5a62d682636e makes sorted(list) 9% slower.

--

Enabling PGO on compilation should help to get a more reliable code placement, and so more stable performances.

I suggest to close this issue as NOTABUG: ./configure --with-pgo should already fix this issue.
History
Date User Action Args
2022-04-11 14:58:40adminsetgithub: 73038
2016-12-05 16:31:44vstinnersetstatus: open -> closed
resolution: not a bug
2016-12-03 01:44:47terry.reedysettitle: sorted(range(1000)) is slower in Python 3.7 compared to Python 3.5 -> sorted(range(1000)) is slower in Python 3.7 than in 3.5
2016-12-02 02:01:35vstinnersetmessages: + msg282213
2016-12-01 17:50:03ericvwsetnosy: + ericvw
2016-12-01 16:32:13vstinnersetfiles: + 3.5.json.gz

messages: + msg282196
2016-12-01 16:21:27vstinnercreate