classification
Title: Idlelib.browser: stop sorting dicts created by pyclbr
Type: performance Stage: patch review
Components: IDLE Versions: Python 3.8, Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: terry.reedy Nosy List: cheryl.sabella, terry.reedy
Priority: normal Keywords: patch

Created on 2017-12-23 02:41 by terry.reedy, last changed 2018-05-20 23:22 by terry.reedy.

Pull Requests
URL Status Linked Edit
PR 5011 open cheryl.sabella, 2017-12-25 23:56
Messages (6)
msg308945 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-12-23 02:41
pyclbr does a linear scan of a file and returns a dict, and Class objects have a dict of children. For objects in the file being scanned, insertion order is the same as line number order.  idlelib.browser.transform children filters out objects not in the file, changes the .name of some objects in the file, appends to a list, and sorts by line number.  (I should have done the sort in place.)  For insertion order dicts, the sort does nothing. In 3.7, insertion order is a language guarantee, not just a (current) CPython implementation detail.  The sort does not change the order and is therefore no longer needed. We can reduce
    return sorted(obs, key=lambda o: o.lineno)
to
    return obs

However, I do want to make sure that there is at least one test that will break if there is a bug somewhere.
msg309023 - (view) Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2017-12-25 00:42
Terry,

In test_browser, would it be a good test if mock_pyclbr_tree was created as 
`mock_pyclbr_tree = {'C0': C0, 'f0': f0}` 
instead of 
`mock_pyclbr_tree = {'f0': f0, 'C0': C0}`?

The reason that I'm asking is that, under 2.7 and 3.3, this dictionary does not retain the insertion order, but changes it from (C0, f0) to (f0, C0).  

Or did you just want a test that checks that the line numbers are still in order when the code is returned from transform_children?
msg309026 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-12-25 03:57
(f0, C0) is the correct insertion and line number order for the dict that would be returned by pyclbr for a fleshed-out file.  After 'sorted' is removed, the mock must imitate that.

One could argue that previously, and even now for 3.6, the initial insertion order should be 'wrong' to test that format_children sorts the return list.  But what is the chance that there will ever be an alternate 3.6 (or even later) implementation that runs tkinter and therefore might run IDLE but has a dict that does not preserve insert order?

I probably should not worry about the dict insert order guarantee being rescinded in a future version.  Maybe instead I should just add a comment that transform_children depends on iteration order being insert order == line-number order.
msg309048 - (view) Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2017-12-26 00:02
Terry,

I'm not sure if I phrased my initial question correctly.  I've made a pull request to show you what I was thinking.

What I had tried to show with the test case change is that, prior to the guarantee of insertion order on the dictionary, the check `eq(tcl, [C0, f0])` would have failed without the sort because `transform_children` would have returned `[f0, C0]`.  With the sort in place or with the guarantee insertion order, this test doesn't fail.

Thanks!
msg317207 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-05-20 23:16
The patch is trivial, and I assume that the test change is adequate, but I want to put this issue on hold for the present because
a) the performance gain is minuscule (the code cleanup is worth a bit more);
b) I am reluctant to backport something to 3.6 that depends on an unofficial implementation change, but not doing so will make backports of changes in this area not automatic;
c) I think I want to switch the browsers to ttk.Treeview, I hope before 3.6 goes off maintenance, and I expect that doing so will change this area of the code, and possibly the line being changed.

'The present' ends when 3.6 maintenance ends in 6 months or at least the module browser is patched, whichever is first.
msg317210 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-05-20 23:22
I should mention that using Treeview could mean not using pyclbr.    Instead, we might extract the parsing loop and modify it to build tree nodes as functions and classes are encountered.
History
Date User Action Args
2018-05-20 23:22:38terry.reedysetmessages: + msg317210
2018-05-20 23:16:57terry.reedysetstage: test needed -> patch review
messages: + msg317207
versions: + Python 3.8
2017-12-26 00:02:15cheryl.sabellasetmessages: + msg309048
stage: patch review -> test needed
2017-12-25 23:56:17cheryl.sabellasetkeywords: + patch
stage: test needed -> patch review
pull_requests: + pull_request4900
2017-12-25 03:57:57terry.reedysetmessages: + msg309026
2017-12-25 00:42:23cheryl.sabellasetnosy: + cheryl.sabella
messages: + msg309023
2017-12-23 02:41:09terry.reedycreate