Use set literals instead of creating a set from a list #67012
There are many places where the old style of creating a set from a list still persists. The literal notation is idiomatic, cleaner looking, and faster.
Here's a typical change:
diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py
-_LITERAL_CODES = set([LITERAL, NOT_LITERAL])
Here are typical timings:
$ py34 -m timeit '{10, 20, 30}'
10000000 loops, best of 3: 0.145 usec per loop
$ py34 -m timeit 'set([10, 20, 30])'
1000000 loops, best of 3: 0.477 usec per loop |
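One plausible reason for the gap reported above is that the literal compiles to a dedicated set-building instruction, while the constructor form must look up the name `set`, build a list, and make a call. The sketch below inspects the bytecode of both expressions; the helper name `opnames` is hypothetical, and the exact opcode names vary between CPython versions.

```python
import dis


def opnames(source):
    """Return the bytecode operation names for a single expression."""
    code = compile(source, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


# Bytecode for the literal form and for the constructor form.
literal_ops = opnames("{10, 20, 30}")
constructor_ops = opnames("set([10, 20, 30])")

print("literal:    ", literal_ops)
print("constructor:", constructor_ops)
```

On the CPython versions this was written against, only the literal form contains a `BUILD_SET` instruction, and only the constructor form contains a call instruction.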
Note, to keep the tests stable, nothing in Lib/tests should be changed. Any update should target the rest of Lib and Doc. |
I will prepare a 3.5 patch for this. There are not many instances other than those you found (but several times as many in tests). I presume that most non-test instances were converted by the 2to3 fixer. How about frozenset([...]) to frozenset({...})? There are 4 occurrences of this. The semantic match between frozenset and {...} is better than with [...], but the visual gain is nearly nil. I will leave the one idlelib instance in CodeContext for when I am editing the file anyway (for both 3.4 and 3.5), which should be soon. |
I did not look at Docs yet. I could not repeat the timing results on my machine running from the command line, as I got '0.015 usec per loop' for both, and same for both frozenset variations. Running timeit.repeat interactively and selecting the best reproduced your timing ratio: .16 to .42. For frozenset, I get .36 to .42 in favor of changing to frozenset({...}). |
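The "run timeit.repeat and select the best" approach described above can be sketched as follows. This is only an illustration: the helper name `best_time` and the `repeat` and `number` values are assumptions, and the absolute numbers depend on the machine and Python version.

```python
import timeit

# The four forms discussed in this issue.
STATEMENTS = [
    "{10, 20, 30}",
    "set([10, 20, 30])",
    "frozenset([10, 20, 30])",
    "frozenset({10, 20, 30})",
]


def best_time(stmt, repeat=5, number=100_000):
    """Return the best per-loop time in microseconds over several runs."""
    runs = timeit.repeat(stmt, repeat=repeat, number=number)
    # The minimum is the run least disturbed by other system activity.
    return min(runs) / number * 1e6


results = {stmt: best_time(stmt) for stmt in STATEMENTS}
for stmt, usec in results.items():
    print(f"{stmt:<28} {usec:.3f} usec per loop")
```

Taking the minimum rather than the mean follows the usual timeit advice, since higher values mostly reflect interference from other processes.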
Aren't such changes considered code churn? If not, I have a huge patch which makes Python sources use more modern idioms, including replacing set constructors with set literals (I have counted three occurrences not in tests). Are you interested in looking at it, Raymond? |
[I will prepare a 3.5 patch for this.] Thanks, I will review when you're done. [How about frozenset([...]) to frozenset({...})? ] Yes, the frozenset() examples should change to match the actual repr:
>>> frozenset([10, 20, 30])
frozenset({10, 20, 30}) |
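A small check of the point above: both spellings build equal frozensets, and the repr in Python 3 uses the set-literal form. The variable names are illustrative only.

```python
# Build the same frozenset from a list and from a set literal.
from_list = frozenset([10, 20, 30])
from_set = frozenset({10, 20, 30})

# The repr uses braces regardless of how the frozenset was built.
print(repr(from_list))
print(from_list == from_set)
```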
[Aren't such changes considered code churn?] This sort of thing is always a judgment call. The patch will affect very few lines of code, give a little speed-up, and make the code easier to read. In the case of the docs, it is almost always worthwhile to update to the current, idiomatic form. Also, the set literal case is special because it has built-in language support, possible peephole optimizations, and there was a repr change as well. That said, it is rarely a good idea to change tests because we don't have tests for tests and because the end-user will never see any value. On balance, I think this one is a reasonable thing to do, but I would show a great deal more hesitancy for "a huge patch which makes Python sources use more modern idioms." |
My timing for set((1,2,3)) is .29, faster than for set([1,2,3]) (.42) but still slower than for {1,2,3} (.16). So I will change such instances also. The same timing for frozenset((1,2,3)) (.29) is faster than the best timing for frozenset({1,2,3}), (.36), so I will not change that unless discussed and agreed on. |
I don't see the tuple form used anywhere in the code.
Maybe, I should just make the patch. It's becoming harder to talk about than to just fix. |
Serhiy, about your 'huge patch' to modernize code: I am more positive than some because:
On the other hand, 'huge' patches can be too much to discuss, justify, and review all at once. Using {.. } for sets consistently is a nice-sized chunk to consider. We can identify, discuss, and decide on each sub-case (I have identified 4 so far). It has the additional benefit of being a performance enhancement. 'set((...' is used in distutils (which I will not change) and in many tests. So that is not an issue. 'frozenset((' is used 5 times in regular module code. |
Attaching a patch. Doesn't change tests for the reasons mentioned above. |
Okay, I missed the frozenset(( examples in my search. They are all in one-time set-up code. Attaching a patch for them as well. |
You have missed Parser/asdl.py and Tools/clinic/clinic.py. |
Serhiy, as I said before, please omit idlelib/CodeContext. You both skipped reprlib.py. Should it be changed to produce the standard repr() result? The existing lines: F:\Python\dev\35\lib\reprlib.py: 91: return self._repr_iterable(x, level, 'set([', '])', self.maxset) If it is, its tests will have to be changed too. |
Hmm, didn't look at those parts of the tree. I'll change the one-line in Parser and leave the little atrocities in clinic.py for Larry to fix :-) Reprlib was skipped intentionally. There is a separate tracker item for it. http://bugs.python.org/issue22824 |
If there are no objections, I would like to apply my two patches (plus the one-line asdl.py change) and leave the rest to the discretion of the module maintainers (mock, code context, clinic, and 2-to-3). |
New changeset 4480506137ed by Raymond Hettinger in branch 'default': |
Larry, would you care to apply or approve Serhiy's updates to clinic.py? |
Serhiy: set_literal_2.patch doesn't apply cleanly, so I don't get a "review" link. And apparently Raymond checked in some other changes separately. Could you redo your patch so it has the Clinic changes, and ensure I get a "review" link? |
Here is an updated patch for clinic only. |
The patch is totally fine. I wonder why it was like that in the first place! |
Serhiy, go ahead and apply the clinic.py patch. Can you also make a separate mock patch and assign it to Michael Foord for review? |
New changeset f4e75efdc7f1 by Serhiy Storchaka in branch 'default': |
Here is a patch. It also replaces constructing sets from generators with set comprehensions. |
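For readers unfamiliar with the second kind of change mentioned above, here is a minimal sketch of replacing a set built from a generator expression with a set comprehension. The data and variable names are made up for illustration and are not taken from the patch.

```python
names = ["alpha", "beta", "alpha", "gamma"]

# Old style: pass a generator expression to the set() constructor.
old_style = set(name.upper() for name in names)

# New style: a set comprehension builds the same set directly.
new_style = {name.upper() for name in names}

print(old_style == new_style)
```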
mock patch LGTM |
IMO, the _non_defaults set comprehension in mock.py ought to be replaced with a set of internable string constants. |
OK, someone can copy and paste this.
non_defaults = {
    '__get__', '__set__', '__delete__', '__reversed__', '__missing__',
    '__reduce__', '__reduce_ex__', '__getinitargs__', '__getnewargs__',
    '__getstate__', '__setstate__', '__getformat__', '__setformat__',
    '__repr__', '__dir__', '__subclasses__', '__format__',
} |
Updated Serhiy's patch. |
Patch looks good to me. |
New changeset b6e6a86a92a7 by Serhiy Storchaka in branch 'default': New changeset 86a694781bee by Serhiy Storchaka in branch '3.4': |
Docs changes were applied to 3.4 too. Here is a patch for lib2to3. |
In Python 3.5, I still found some "set([" and "frozenset([" in Lib/lib2to3, Lib/test/, Lib/stringrep.py, Lib/unittest/test/ and Lib/idlelib/CodeContext.py if someone is motivated to patch them. (Ok, Serhiy wrote a patch for lib2to3.) |
Tests are intentionally omitted; Lib/stringrep.py is a very special case (it's
New changeset ce66b65ad8d6 by Terry Jan Reedy in branch '2.7': New changeset daec40891d43 by Terry Jan Reedy in branch '3.4': |
New changeset 7c2811521261 by Victor Stinner in branch 'default': |
2to3 patch lgtm. Please apply to 3.4, too, though. |
New changeset c3f960cff3e6 by Serhiy Storchaka in branch '3.4': New changeset d3e43f7ecca8 by Serhiy Storchaka in branch 'default': |
That's all I think. Distutils is too conservative for such changes. |