classification
Title: Use set literals instead of creating a set from a list
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: benjamin.peterson, berker.peksag, ezio.melotti, haypo, larry, michael.foord, python-dev, rhettinger, serhiy.storchaka, terry.reedy
Priority: normal Keywords: easy, patch

Created on 2014-11-09 03:04 by rhettinger, last changed 2014-12-13 19:56 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
set_literal.patch rhettinger, 2014-11-09 20:53 Set literal patch review
more_set_literals.patch rhettinger, 2014-11-09 21:05 More set literals review
set_literal_2.patch serhiy.storchaka, 2014-11-09 21:18
set_literal_clinic.patch serhiy.storchaka, 2014-11-11 07:53 review
set_literal_mock.patch serhiy.storchaka, 2014-11-15 12:10 review
issue22823-mock.diff berker.peksag, 2014-12-10 00:17 review
set_literal_2to3.patch serhiy.storchaka, 2014-12-11 08:43 review
Messages (38)
msg230879 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-11-09 03:04
There are many places where the old-style of creating a set from a list still persists.  The literal notation is idiomatic, cleaner looking, and faster.

Here's a typical change:


  diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py
  --- a/Lib/sre_compile.py
  +++ b/Lib/sre_compile.py
  @@ -22,10 +22,10 @@
   else:
       MAXCODE = 0xFFFFFFFF
 
  -_LITERAL_CODES = set([LITERAL, NOT_LITERAL])
  -_REPEATING_CODES = set([REPEAT, MIN_REPEAT, MAX_REPEAT])
  -_SUCCESS_CODES = set([SUCCESS, FAILURE])
  -_ASSERT_CODES = set([ASSERT, ASSERT_NOT])
  +_LITERAL_CODES = {LITERAL, NOT_LITERAL}
  +_REPEATING_CODES = {REPEAT, MIN_REPEAT, MAX_REPEAT}
  +_SUCCESS_CODES = {SUCCESS, FAILURE}
  +_ASSERT_CODES = {ASSERT, ASSERT_NOT}

Here are typical timings:

  $ py34 -m timeit '{10, 20, 30}'
   10000000 loops, best of 3: 0.145 usec per loop

  $ py34 -m timeit 'set([10, 20, 30])'
  1000000 loops, best of 3: 0.477 usec per loop
msg230881 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-11-09 03:25
Note, to keep the tests stable, nothing in Lib/tests should be changed.  Any update should target the rest of Lib and Doc.
msg230898 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-11-09 18:56
I will prepare a 3.5 patch for this.  There are not many instances other than those you found (but several times as many in tests).  I presume that most non-test instances were converted by the 2to3 fixer.

How about frozenset([...]) to frozenset({...})?  There are 4 occurrences of this. The semantic match between frozenset and {...} is better than with [...], but the visual gain in nearly nil.

I will leave the one idlelib instance in CodeContext for when I am editing the file anyway (for both 3.4 and 3.5), which should be soon.
msg230900 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-11-09 19:10
I did not look at Docs yet.

I could not repeat the timing results on my machine running from the command line, as I got '0.015 usec per loop' for both, and same for both frozenset variations.  Running timeit.repeat interactively and selecting the best reproduced your timing ratio: .16 to .42.  For frozenset, I get .36 to .42 in favor of changing to frozenset({...}).
msg230903 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-09 20:00
Isn't such changes considered code churn?

If it is not, I have a huge patch which makes Python sources to use more modern idioms, including replacing set constructors with set literals (I have counted three occurrences not in tests). Are you interesting to look on it Raymond?
msg230904 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-11-09 20:03
[I will prepare a 3.5 patch for this.]

Thanks, I will review when you're done.

[How about frozenset([...]) to frozenset({...})? ]

Yes, the frozenset() examples should change to match the actual repr:

   >>> frozenset([10, 20, 30])
   frozenset({10, 20, 30})
msg230905 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-11-09 20:17
[Isn't such changes considered code churn?]

This sort of thing is always a judgment call.  The patch will affect very few lines of code, give a little speed-up, and make the code easier to read.  In the case of the docs, it is almost always worthwhile to update to the current, idiomatic form.  Also, the set literal case is special because it has built-in language support, possible peephole optimizations, and there was a repr change as well.  That said, it is rarely a good idea to change tests because we don't have tests for tests and because the end-user will never see any value.

On the balance, I think this one is a reasonable thing to do, but I would show a great deal more hesitancy for a "a huge patch which makes Python sources to use more modern idioms."
msg230906 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-11-09 20:22
My timing for set((1,2,3)) is .29, faster than for set([1,2,3]) (.42) but still slower than for {1,2,3} (.16).  So I will change such instances also.

The same timing for frozenset((1,2,3)) (.29) is faster than the best timing for frozenset({1,2,3}), (.36), so I will not change that unless discussed and agreed on.
msg230910 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-11-09 20:37
> The same timing for frozenset((1,2,3)) (.29) is faster than the best
> timing for frozenset({1,2,3}), (.36),

I don't see the tuple form used anywhere in the code.
The timing is a bit quicker for the tuple form because the peephole optimizer constant folds the tuple (use dis to see this).


> so I will not change that 
> unless discussed and agreed on.

Maybe, I should just make the patch.  It's becoming harder to talk about than to just fix.
msg230911 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-11-09 20:46
Serhiy, about your 'huge patch' to modernize code:

I am more positive than some because:

1) To me, a one-time gentile change is not 'churning'.

2) As we link to many, most, or even all python-coded stdlib modules (I think there is a proposal for 'all'), there is more benefit to using modern idioms.

On the other hand, 'huge' patches can be too much to discuss, justify, and review all at once.

Using {.. } for sets consistently is a nice-sized chunk to consider.  We can identify, discuss, and decide on each sub-case (I have identified 4 so far).  It has the additional benefit of being a performance enhancement.
---

'set((...' is used in distutils (which I will not change) and in many tests.  So that is not an issue.  'frozenset((' is used 5 times in regular module code.
msg230912 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-11-09 20:53
Attaching a patch.  Doesn't change tests for the reasons mentioned above.  
Leaves idle, 2-to-3, and mocking for their respective module maintainers to deal with holistically (as part of their routine maintenance).
msg230914 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-11-09 21:05
Okay, I missed the frozenset(( examples in my search.  There are all in one-time set-up code.  Attaching a patch for them as well.
msg230915 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-09 21:18
You have missed Parser/asdl.py and Tools/clinic/clinic.py.
msg230916 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-11-09 21:37
Serhiy, as I said before, please omit idlelib/CodeContext.

You both skipped reprlib.py.  Should it be changed to produce the standard repr() result? The existing lines:

F:\Python\dev\35\lib\reprlib.py: 91:         return self._repr_iterable(x, level, 'set([', '])', self.maxset)
F:\Python\dev\35\lib\reprlib.py: 95:         return self._repr_iterable(x, level, 'frozenset([', '])',

If it is, its tests will have to be changed too.
msg230917 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-11-09 21:41
Hmm, didn't look at those parts of the tree.  I'll change the one-line in Parser and leave the little atrocities in clinic.py for Larry to fix :-)

Reprlib was skipped intentionally.  There is a separate tracker item for it.  http://bugs.python.org/issue22824
msg230918 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-11-09 21:48
If there are no objections, I would like to apply my two patches (plus the one-line asdl.py change) and leave the rest to the discretion the module maintainers (mock, code context, clinic, and 2-to-3).
msg230920 - (view) Author: Roundup Robot (python-dev) Date: 2014-11-09 23:56
New changeset 4480506137ed by Raymond Hettinger in branch 'default':
Issue #22823: Use set literals instead of creating a set from a list
https://hg.python.org/cpython/rev/4480506137ed
msg230921 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-11-09 23:59
Larry, would you care to apply or approve Serhiy's updates to clinic.py?
msg231002 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-11-11 06:52
Serhiy: set_literal_2.patch doesn't apply cleanly, so I don't get a "review" link.  And apparently Raymond checked in some other changes separately.  Could you redo your patch so it has the Clinic changes, and ensure I get a "review" link?
msg231005 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-11 07:53
Here is updated patch for clinic only.
msg231009 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2014-11-11 08:35
The patch is totally fine.  I wonder why it was like that in the first place!
msg231142 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-11-14 00:19
Serhiy, go ahead and apply the clinic.py patch. Can you also make a separate mock patch and assign it to Michael Foord for review?
msg231204 - (view) Author: Roundup Robot (python-dev) Date: 2014-11-15 12:05
New changeset f4e75efdc7f1 by Serhiy Storchaka in branch 'default':
Issue #22823: Use set literals instead of creating a set from a tuple.
https://hg.python.org/cpython/rev/f4e75efdc7f1
msg231206 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-15 12:10
> Can you also make a separate mock patch and assign it to Michael Foord for review?

Here is a patch. It also replaces constructing sets from generators with set comprehensions.
msg231223 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-11-15 22:02
mock patch LGTM
msg231224 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-11-15 22:32
IMO, the _non_defaults set comprehension in mock.py ought to be replaced with a set of internable string constants.
msg231230 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-11-16 03:00
OK, someone can copy and paste this.

non_defaults = {
     '__get__', '__set__', '__delete__', '__reversed__', '__missing__',
     '__reduce__', '__reduce_ex'__, '__getinitargs__', '__getnewargs__',
     '__getstate__', '__setstate__', '__getformat__', '__setformat__',
     '__repr__', '__dir__', '__subclasses__', '__format__',
)
msg232406 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2014-12-10 00:17
Updated Serhiy's patch.
msg232453 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2014-12-10 23:28
Patch looks good to me.
msg232462 - (view) Author: Roundup Robot (python-dev) Date: 2014-12-11 08:36
New changeset b6e6a86a92a7 by Serhiy Storchaka in branch 'default':
Issue #22823: Use set literals instead of creating a set from a list.
https://hg.python.org/cpython/rev/b6e6a86a92a7

New changeset 86a694781bee by Serhiy Storchaka in branch '3.4':
Issue #22823: Fixed an output of sets in examples.
https://hg.python.org/cpython/rev/86a694781bee
msg232463 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-12-11 08:43
Docs changes were applied to 3.4 too.

Here is a patch for lib2to3.
msg232464 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-12-11 08:46
> Here is a patch for lib2to3.

In Python 3.5, I still found some "set([" and "frozenset([" in Lib/lib2to3, Lib/test/, Lib/stringrep.py, Lib/unittest/test/ and Lib/idlelib/CodeContext.py if someone is motived to patch them. (Ok, Serhiy wrote a patch for lib2to3.)
msg232465 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-12-11 09:14
Tests are intentionally omitted, Lib/stringrep.py is very special case (it's 
code is generated and outdated, see issue15239), idlelib is deferred by Terry. 
And there is yet one one-line change to Lib/distutils/msvc9compiler.py in 
set_literal_3.patch.
msg232468 - (view) Author: Roundup Robot (python-dev) Date: 2014-12-11 10:34
New changeset ce66b65ad8d6 by Terry Jan Reedy in branch '2.7':
Issue 22823: Use set literal in idlelib.
https://hg.python.org/cpython/rev/ce66b65ad8d6

New changeset daec40891d43 by Terry Jan Reedy in branch '3.4':
Issue 22823: Use set literal in idlelib.
https://hg.python.org/cpython/rev/daec40891d43
msg232496 - (view) Author: Roundup Robot (python-dev) Date: 2014-12-11 21:27
New changeset 7c2811521261 by Victor Stinner in branch 'default':
Issue #22823: Fix typo in unittest/mock.py
https://hg.python.org/cpython/rev/7c2811521261
msg232580 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2014-12-13 00:17
2to3 patch lgtm. Please apply to 3.4, too, though.
msg232618 - (view) Author: Roundup Robot (python-dev) Date: 2014-12-13 19:53
New changeset c3f960cff3e6 by Serhiy Storchaka in branch '3.4':
Issue #22823: Use set literals in lib2to3.
https://hg.python.org/cpython/rev/c3f960cff3e6

New changeset d3e43f7ecca8 by Serhiy Storchaka in branch 'default':
Issue #22823: Use set literals in lib2to3.
https://hg.python.org/cpython/rev/d3e43f7ecca8
msg232619 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-12-13 19:56
That's all I think. Distutils is too conservative for such changes.
History
Date User Action Args
2014-12-13 19:56:38serhiy.storchakasetstatus: open -> closed
resolution: fixed
messages: + msg232619

stage: patch review -> resolved
2014-12-13 19:53:18python-devsetmessages: + msg232618
2014-12-13 19:46:27serhiy.storchakasetassignee: benjamin.peterson -> serhiy.storchaka
2014-12-13 00:17:54benjamin.petersonsetmessages: + msg232580
2014-12-11 21:27:15python-devsetmessages: + msg232496
2014-12-11 10:34:18python-devsetmessages: + msg232468
2014-12-11 09:14:40serhiy.storchakasetmessages: + msg232465
2014-12-11 08:46:28hayposetnosy: + haypo
messages: + msg232464
2014-12-11 08:43:42serhiy.storchakasetfiles: + set_literal_2to3.patch

nosy: + benjamin.peterson
messages: + msg232463

assignee: serhiy.storchaka -> benjamin.peterson
stage: commit review -> patch review
2014-12-11 08:36:23python-devsetmessages: + msg232462
2014-12-11 07:05:29rhettingersetassignee: michael.foord -> serhiy.storchaka
2014-12-10 23:28:55michael.foordsetmessages: + msg232453
2014-12-10 00:17:40berker.peksagsetfiles: + issue22823-mock.diff
nosy: + berker.peksag
messages: + msg232406

2014-11-16 03:00:02terry.reedysetmessages: + msg231230
2014-11-15 22:32:08rhettingersetmessages: + msg231224
2014-11-15 22:02:50terry.reedysetmessages: + msg231223
stage: patch review -> commit review
2014-11-15 12:10:39serhiy.storchakasetfiles: + set_literal_mock.patch

nosy: + michael.foord
messages: + msg231206

assignee: serhiy.storchaka -> michael.foord
stage: needs patch -> patch review
2014-11-15 12:05:36python-devsetmessages: + msg231204
2014-11-14 00:19:20rhettingersetassignee: larry -> serhiy.storchaka
messages: + msg231142
2014-11-11 08:35:03larrysetmessages: + msg231009
2014-11-11 07:53:22serhiy.storchakasetfiles: + set_literal_clinic.patch
keywords: + patch
messages: + msg231005
2014-11-11 06:52:57larrysetmessages: + msg231002
2014-11-09 23:59:16rhettingersetassignee: rhettinger -> larry

messages: + msg230921
nosy: + larry
2014-11-09 23:56:41python-devsetnosy: + python-dev
messages: + msg230920
2014-11-09 21:48:52rhettingersetmessages: + msg230918
2014-11-09 21:41:08rhettingersetmessages: + msg230917
2014-11-09 21:37:47terry.reedysetkeywords: - patch

messages: + msg230916
2014-11-09 21:18:06serhiy.storchakasetfiles: + set_literal_2.patch

messages: + msg230915
2014-11-09 21:05:54rhettingersetfiles: + more_set_literals.patch

messages: + msg230914
2014-11-09 20:53:43rhettingersetfiles: + set_literal.patch
keywords: + patch
messages: + msg230912
2014-11-09 20:46:27terry.reedysetmessages: + msg230911
2014-11-09 20:37:29rhettingersetassignee: rhettinger
messages: + msg230910
2014-11-09 20:22:55terry.reedysetmessages: + msg230906
2014-11-09 20:17:49rhettingersetmessages: + msg230905
2014-11-09 20:03:57rhettingersetmessages: + msg230904
2014-11-09 20:00:29serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg230903
2014-11-09 19:10:52terry.reedysetmessages: + msg230900
2014-11-09 18:56:44terry.reedysetnosy: + terry.reedy
messages: + msg230898
2014-11-09 04:06:22ezio.melottisetnosy: + ezio.melotti

stage: needs patch
2014-11-09 03:25:39rhettingersetmessages: + msg230881
2014-11-09 03:04:47rhettingercreate