Use set literals instead of creating a set from a list #67012

rhettinger · 2014-11-09T03:04:47Z

BPO	22823
Nosy	@rhettinger, @terryjreedy, @vstinner, @larryhastings, @benjaminp, @ezio-melotti, @voidspace, @berkerpeksag, @serhiy-storchaka
Files	set_literal.patch: Set literal patch; more_set_literals.patch: More set literals; set_literal_2.patch; set_literal_clinic.patch; set_literal_mock.patch; issue22823-mock.diff; set_literal_2to3.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

GitHub fields:

assignee = 'https://github.com/serhiy-storchaka'
closed_at = <Date 2014-12-13.19:56:38.138>
created_at = <Date 2014-11-09.03:04:47.462>
labels = ['easy', 'type-feature', 'library']
title = 'Use set literals instead of creating a set from a list'
updated_at = <Date 2014-12-13.19:56:38.138>
user = 'https://github.com/rhettinger'

bugs.python.org fields:

activity = <Date 2014-12-13.19:56:38.138>
actor = 'serhiy.storchaka'
assignee = 'serhiy.storchaka'
closed = True
closed_date = <Date 2014-12-13.19:56:38.138>
closer = 'serhiy.storchaka'
components = ['Library (Lib)']
creation = <Date 2014-11-09.03:04:47.462>
creator = 'rhettinger'
dependencies = []
files = ['37159', '37160', '37161', '37174', '37201', '37404', '37413']
hgrepos = []
issue_num = 22823
keywords = ['patch', 'easy']
message_count = 38.0
messages = ['230879', '230881', '230898', '230900', '230903', '230904', '230905', '230906', '230910', '230911', '230912', '230914', '230915', '230916', '230917', '230918', '230920', '230921', '231002', '231005', '231009', '231142', '231204', '231206', '231223', '231224', '231230', '232406', '232453', '232462', '232463', '232464', '232465', '232468', '232496', '232580', '232618', '232619']
nosy_count = 10.0
nosy_names = ['rhettinger', 'terry.reedy', 'vstinner', 'larry', 'benjamin.peterson', 'ezio.melotti', 'michael.foord', 'python-dev', 'berker.peksag', 'serhiy.storchaka']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue22823'
versions = ['Python 3.5']

rhettinger · 2014-11-09T03:04:46Z

There are many places where the old-style of creating a set from a list still persists. The literal notation is idiomatic, cleaner looking, and faster.

Here's a typical change:

diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py
--- a/Lib/sre_compile.py
+++ b/Lib/sre_compile.py
@@ -22,10 +22,10 @@
 else:
     MAXCODE = 0xFFFFFFFF

-_LITERAL_CODES = set([LITERAL, NOT_LITERAL])
-_REPEATING_CODES = set([REPEAT, MIN_REPEAT, MAX_REPEAT])
-_SUCCESS_CODES = set([SUCCESS, FAILURE])
-_ASSERT_CODES = set([ASSERT, ASSERT_NOT])
+_LITERAL_CODES = {LITERAL, NOT_LITERAL}
+_REPEATING_CODES = {REPEAT, MIN_REPEAT, MAX_REPEAT}
+_SUCCESS_CODES = {SUCCESS, FAILURE}
+_ASSERT_CODES = {ASSERT, ASSERT_NOT}

Here are typical timings:

  $ py34 -m timeit '{10, 20, 30}'
  10000000 loops, best of 3: 0.145 usec per loop

  $ py34 -m timeit 'set([10, 20, 30])'
  1000000 loops, best of 3: 0.477 usec per loop

rhettinger · 2014-11-09T03:25:39Z

Note: to keep the tests stable, nothing in Lib/test should be changed. Any update should target the rest of Lib and Doc.

terryjreedy · 2014-11-09T18:56:44Z

I will prepare a 3.5 patch for this. There are not many instances other than those you found (but several times as many in tests). I presume that most non-test instances were converted by the 2to3 fixer.

How about frozenset([...]) to frozenset({...})? There are 4 occurrences of this. The semantic match between frozenset and {...} is better than with [...], but the visual gain is nearly nil.

I will leave the one idlelib instance in CodeContext for when I am editing the file anyway (for both 3.4 and 3.5), which should be soon.

terryjreedy · 2014-11-09T19:10:53Z

I did not look at Docs yet.

I could not repeat the timing results on my machine running from the command line, as I got '0.015 usec per loop' for both, and same for both frozenset variations. Running timeit.repeat interactively and selecting the best reproduced your timing ratio: .16 to .42. For frozenset, I get .36 to .42 in favor of changing to frozenset({...}).
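The interactive measurement described above can be sketched with `timeit.repeat`, taking the best of several runs. The statement strings mirror the forms discussed in this thread; the helper name `best_usec` and the repeat and loop counts are arbitrary choices for illustration, and absolute numbers will differ by machine and Python version.

```python
import timeit

# Forms discussed in this issue; each builds the same three-element set.
STATEMENTS = [
    "{10, 20, 30}",
    "set([10, 20, 30])",
    "set((10, 20, 30))",
    "frozenset([10, 20, 30])",
    "frozenset({10, 20, 30})",
    "frozenset((10, 20, 30))",
]


def best_usec(stmt, repeat=5, number=100_000):
    """Return the best per-loop time in microseconds over `repeat` runs."""
    runs = timeit.repeat(stmt, repeat=repeat, number=number)
    return min(runs) / number * 1e6


results = {stmt: best_usec(stmt) for stmt in STATEMENTS}
for stmt, usec in results.items():
    print(f"{stmt:28} {usec:.3f} usec per loop")
```

Taking the minimum rather than the mean reduces the influence of unrelated system activity on the comparison.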

serhiy-storchaka · 2014-11-09T20:00:30Z

Aren't such changes considered code churn?

If they are not, I have a huge patch which makes Python sources to use more modern idioms, including replacing set constructors with set literals (I have counted three occurrences not in tests). Are you interested in looking at it, Raymond?

rhettinger · 2014-11-09T20:03:58Z

[I will prepare a 3.5 patch for this.]

Thanks, I will review when you're done.

[How about frozenset([...]) to frozenset({...})? ]

Yes, the frozenset() examples should change to match the actual repr:

   >>> frozenset([10, 20, 30])
   frozenset({10, 20, 30})

rhettinger · 2014-11-09T20:17:49Z

[Aren't such changes considered code churn?]

This sort of thing is always a judgment call. The patch will affect very few lines of code, give a little speed-up, and make the code easier to read. In the case of the docs, it is almost always worthwhile to update to the current, idiomatic form. Also, the set literal case is special because it has built-in language support, possible peephole optimizations, and there was a repr change as well. That said, it is rarely a good idea to change tests because we don't have tests for tests and because the end-user will never see any value.

On balance, I think this one is a reasonable thing to do, but I would show a great deal more hesitancy for "a huge patch which makes Python sources to use more modern idioms."

terryjreedy · 2014-11-09T20:22:56Z

My timing for set((1,2,3)) is .29, faster than for set([1,2,3]) (.42) but still slower than for {1,2,3} (.16). So I will change such instances also.

The same timing for frozenset((1,2,3)) (.29) is faster than the best timing for frozenset({1,2,3}), (.36), so I will not change that unless discussed and agreed on.

rhettinger · 2014-11-09T20:37:29Z

[The same timing for frozenset((1,2,3)) (.29) is faster than the best timing for frozenset({1,2,3}), (.36),]

I don't see the tuple form used anywhere in the code.
The timing is a bit quicker for the tuple form because the peephole optimizer constant folds the tuple (use dis to see this).

[so I will not change that unless discussed and agreed on.]

Maybe, I should just make the patch. It's becoming harder to talk about than to just fix.
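The constant folding mentioned above can be checked with `dis`. This is a minimal sketch, assuming a CPython build that folds a tuple of constants into a single constant; the exact opcodes printed, and how the list form is compiled, vary between Python versions.

```python
import dis

# Compile each spelling as an expression so the bytecode can be inspected.
tuple_form = compile("frozenset((1, 2, 3))", "<tuple form>", "eval")
list_form = compile("frozenset([1, 2, 3])", "<list form>", "eval")

# With constant folding, the whole tuple is stored as a single constant
# instead of being rebuilt from three separate values on every call.
tuple_is_folded = (1, 2, 3) in tuple_form.co_consts

print("tuple folded into one constant:", tuple_is_folded)
dis.dis(tuple_form)
print()
dis.dis(list_form)
```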

terryjreedy · 2014-11-09T20:46:27Z

Serhiy, about your 'huge patch' to modernize code:

I am more positive than some because:

To me, a one-time gentle change is not 'churning'.
As we link to many, most, or even all python-coded stdlib modules (I think there is a proposal for 'all'), there is more benefit to using modern idioms.

On the other hand, 'huge' patches can be too much to discuss, justify, and review all at once.

Using {...} for sets consistently is a nice-sized chunk to consider. We can identify, discuss, and decide on each sub-case (I have identified 4 so far). It has the additional benefit of being a performance enhancement.
---

'set((...' is used in distutils (which I will not change) and in many tests. So that is not an issue. 'frozenset((' is used 5 times in regular module code.

rhettinger · 2014-11-09T20:53:44Z

Attaching a patch. Doesn't change tests for the reasons mentioned above.
Leaves idle, 2-to-3, and mocking for their respective module maintainers to deal with holistically (as part of their routine maintenance).

rhettinger · 2014-11-09T21:05:55Z

Okay, I missed the frozenset(( examples in my search. They are all in one-time set-up code. Attaching a patch for them as well.

serhiy-storchaka · 2014-11-09T21:18:06Z

You have missed Parser/asdl.py and Tools/clinic/clinic.py.

terryjreedy · 2014-11-09T21:37:47Z

Serhiy, as I said before, please omit idlelib/CodeContext.

You both skipped reprlib.py. Should it be changed to produce the standard repr() result? The existing lines:

F:\Python\dev\35\lib\reprlib.py: 91: return self._repr_iterable(x, level, 'set([', '])', self.maxset)
F:\Python\dev\35\lib\reprlib.py: 95: return self._repr_iterable(x, level, 'frozenset([', '])',

If it is, its tests will have to be changed too.

rhettinger · 2014-11-09T21:41:09Z

Hmm, didn't look at those parts of the tree. I'll change the one-line in Parser and leave the little atrocities in clinic.py for Larry to fix :-)

Reprlib was skipped intentionally. There is a separate tracker item for it. http://bugs.python.org/issue22824

rhettinger · 2014-11-09T21:48:52Z

If there are no objections, I would like to apply my two patches (plus the one-line asdl.py change) and leave the rest to the discretion of the module maintainers (mock, code context, clinic, and 2-to-3).

python-dev · 2014-11-09T23:56:42Z

New changeset 4480506137ed by Raymond Hettinger in branch 'default':
Issue bpo-22823: Use set literals instead of creating a set from a list
https://hg.python.org/cpython/rev/4480506137ed

rhettinger · 2014-11-09T23:59:17Z

Larry, would you care to apply or approve Serhiy's updates to clinic.py?

larryhastings · 2014-11-11T06:52:57Z

Serhiy: set_literal_2.patch doesn't apply cleanly, so I don't get a "review" link. And apparently Raymond checked in some other changes separately. Could you redo your patch so it has the Clinic changes, and ensure I get a "review" link?

serhiy-storchaka · 2014-11-11T07:53:22Z

Here is an updated patch for clinic only.

larryhastings · 2014-11-11T08:35:03Z

The patch is totally fine. I wonder why it was like that in the first place!

rhettinger · 2014-11-14T00:19:20Z

Serhiy, go ahead and apply the clinic.py patch. Can you also make a separate mock patch and assign it to Michael Foord for review?

python-dev · 2014-11-15T12:05:37Z

New changeset f4e75efdc7f1 by Serhiy Storchaka in branch 'default':
Issue bpo-22823: Use set literals instead of creating a set from a tuple.
https://hg.python.org/cpython/rev/f4e75efdc7f1

serhiy-storchaka · 2014-11-15T12:10:37Z

[Can you also make a separate mock patch and assign it to Michael Foord for review?]

Here is a patch. It also replaces constructing sets from generators with set comprehensions.
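The second kind of replacement can be illustrated as follows. This is only a sketch: the `names` list is made up for the example and is not taken from mock.py.

```python
# Hypothetical sample data; not the actual names used in mock.py.
names = ["get", "set", "delete", "get"]

# Older spelling: pass a generator expression to the set() constructor.
from_generator = set("__%s__" % name for name in names)

# Set comprehension: same result, without the extra constructor call.
from_comprehension = {"__%s__" % name for name in names}

print(from_comprehension == from_generator)
```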

terryjreedy · 2014-11-15T22:02:51Z

mock patch LGTM

rhettinger · 2014-11-15T22:32:09Z

IMO, the _non_defaults set comprehension in mock.py ought to be replaced with a set of internable string constants.

terryjreedy · 2014-11-16T03:00:02Z

OK, someone can copy and paste this.

non_defaults = {
     '__get__', '__set__', '__delete__', '__reversed__', '__missing__',
     '__reduce__', '__reduce_ex__', '__getinitargs__', '__getnewargs__',
     '__getstate__', '__setstate__', '__getformat__', '__setformat__',
     '__repr__', '__dir__', '__subclasses__', '__format__',
}

berkerpeksag · 2014-12-10T00:17:40Z

Updated Serhiy's patch.

voidspace · 2014-12-10T23:28:55Z

Patch looks good to me.

python-dev · 2014-12-11T08:36:23Z

New changeset b6e6a86a92a7 by Serhiy Storchaka in branch 'default':
Issue bpo-22823: Use set literals instead of creating a set from a list.
https://hg.python.org/cpython/rev/b6e6a86a92a7

New changeset 86a694781bee by Serhiy Storchaka in branch '3.4':
Issue bpo-22823: Fixed an output of sets in examples.
https://hg.python.org/cpython/rev/86a694781bee

serhiy-storchaka · 2014-12-11T08:43:41Z

Docs changes were applied to 3.4 too.

Here is a patch for lib2to3.

vstinner · 2014-12-11T08:46:28Z

[Here is a patch for lib2to3.]

In Python 3.5, I still found some "set([" and "frozenset([" in Lib/lib2to3, Lib/test/, Lib/stringrep.py, Lib/unittest/test/ and Lib/idlelib/CodeContext.py if someone is motivated to patch them. (Ok, Serhiy wrote a patch for lib2to3.)

serhiy-storchaka · 2014-12-11T09:14:41Z

Tests are intentionally omitted, Lib/stringrep.py is a very special case (its
code is generated and outdated, see bpo-15239), and idlelib is deferred by Terry.
There is also one more one-line change, to Lib/distutils/msvc9compiler.py, in
set_literal_3.patch.

python-dev · 2014-12-11T10:34:18Z

New changeset ce66b65ad8d6 by Terry Jan Reedy in branch '2.7':
bpo-22823: Use set literal in idlelib.
https://hg.python.org/cpython/rev/ce66b65ad8d6

New changeset daec40891d43 by Terry Jan Reedy in branch '3.4':
bpo-22823: Use set literal in idlelib.
https://hg.python.org/cpython/rev/daec40891d43

python-dev · 2014-12-11T21:27:16Z

New changeset 7c2811521261 by Victor Stinner in branch 'default':
Issue bpo-22823: Fix typo in unittest/mock.py
https://hg.python.org/cpython/rev/7c2811521261

benjaminp · 2014-12-13T00:17:54Z

2to3 patch lgtm. Please apply to 3.4, too, though.

python-dev · 2014-12-13T19:53:19Z

New changeset c3f960cff3e6 by Serhiy Storchaka in branch '3.4':
Issue bpo-22823: Use set literals in lib2to3.
https://hg.python.org/cpython/rev/c3f960cff3e6

New changeset d3e43f7ecca8 by Serhiy Storchaka in branch 'default':
Issue bpo-22823: Use set literals in lib2to3.
https://hg.python.org/cpython/rev/d3e43f7ecca8

serhiy-storchaka · 2014-12-13T19:56:38Z

That's all, I think. Distutils is too conservative for such changes.

rhettinger added the stdlib, easy, and type-feature labels Nov 9, 2014

rhettinger self-assigned this Nov 9, 2014

rhettinger assigned larryhastings and unassigned rhettinger Nov 9, 2014

rhettinger assigned serhiy-storchaka and unassigned larryhastings Nov 14, 2014

serhiy-storchaka assigned voidspace and unassigned serhiy-storchaka Nov 15, 2014

rhettinger assigned serhiy-storchaka and unassigned voidspace Dec 11, 2014

serhiy-storchaka assigned benjaminp and unassigned serhiy-storchaka Dec 11, 2014

serhiy-storchaka assigned serhiy-storchaka and unassigned benjaminp Dec 13, 2014

serhiy-storchaka closed this as completed Dec 13, 2014

ezio-melotti transferred this issue from another repository Apr 10, 2022
