classification
Title: enum.Flag should be more set-like
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ethan.furman Nosy List: John Belmonte, Manjusaka, ethan.furman, hauntsaninja, jbelmonte, veky
Priority: normal Keywords: patch

Created on 2019-09-22 11:14 by John Belmonte, last changed 2021-05-10 21:36 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 22221 John Belmonte, 2020-10-11 14:24
PR 22734 closed jbelmonte, 2020-10-17 01:52
PR 24215 merged ethan.furman, 2021-01-14 01:32
PR 24342 merged ethan.furman, 2021-01-26 16:42
PR 25820 open hauntsaninja, 2021-05-02 20:18
Messages (18)
msg352967 - (view) Author: John Belmonte (John Belmonte) Date: 2019-09-22 11:14
I would like Flag class instances to have more set-like abilities:
  1. iteration, to walk through each set bit of the value
  2. len corresponding to #1
  3. subset operator

I may be told "implement it yourself as instance methods", or that #3 has an idiom (a & b is b).  Ideally though, every Flag user should be able to rely on these being implemented consistently.

When trying to implement #1 without devolving into bit fiddling, naturally one might try to use the class iterator.  Unfortunately the semantics of that enumeration include 0, aliases, and compound values.  I've used Flag in several situations and projects, and so far there hasn't been a case where that was the desired semantics.  Interesting though, if #1 were implemented in the standard library, then we could enumerate all bits of the Flag via iteration of `~MyFlag(0)`... though that's obscuring things behind another idiom.

Thank you for considering.
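
For readers who want to see what the three requests amount to, here is a minimal sketch written as free functions with plain bit arithmetic. The names `Perm`, `iter_bits`, `count_bits`, and `is_subset` are hypothetical and are not part of the `enum` module; the sketch assumes every set bit corresponds to a defined single-bit member.

```python
from enum import Flag


class Perm(Flag):
    R = 1
    W = 2
    X = 4
    RW = R | W  # compound value, used below as a "set" of two bits


def iter_bits(flag):
    """Yield one single-bit member for every bit set in `flag` (request 1)."""
    cls = type(flag)
    remaining = flag.value
    bit = 1
    while remaining:
        if remaining & bit:
            yield cls(bit)
            remaining &= ~bit
        bit <<= 1


def count_bits(flag):
    """Number of set bits, i.e. the length matching iter_bits (request 2)."""
    return bin(flag.value).count("1")


def is_subset(a, b):
    """True if every bit set in `a` is also set in `b` (request 3)."""
    return a.value & b.value == a.value


bit_names = [member.name for member in iter_bits(Perm.RW)]
```

With these definitions, `bit_names` holds the two single-bit names and `is_subset(Perm.R, Perm.RW)` is true, while `is_subset(Perm.X, Perm.RW)` is false.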
msg365051 - (view) Author: Vedran Čačić (veky) * Date: 2020-03-26 04:47
1. +0 _if_ the implementation is easy to explain. If backward compatibility is an issue, we can always add a property: 
for flag in flags.set:
(though set might imply unorderedness:)
2. -1. Guido said long ago that all lens should be O(1).
(Of course, if you do make it O(1), I have no objection.)
3. +1, absolutely.
msg365072 - (view) Author: Manjusaka (Manjusaka) * Date: 2020-03-26 13:49
1. not sure I get the point

2. not sure

3. absolutely yes
msg378436 - (view) Author: John Belmonte (John Belmonte) Date: 2020-10-11 14:24
Part of this issue (#1) was intended to be addressed by https://github.com/python/cpython/pull/22221 which added an `__iter__` implementation to Flag and IntFlag.  (The PR did not reference this issue, and was already merged last month.)

However that PR seems problematic on several counts:
   1. `__iter__` diverges from the existing `__contains__`.  The latter includes 0 and compound values
   2. the implementation is fairly heavy
   3. len() on an enum instance is going to be O(n)

I've put post-merge comments on the PR.

I think it would be safer to have an explicitly named `bits()` iterator on flag instances, rather than use `__iter__()`.
msg378439 - (view) Author: Vedran Čačić (veky) * Date: 2020-10-11 15:32
Just a comment, (1) is analogous to str. iter('abc') gives only 'a', 'b' and 'c', while contains accepts '', 'ab', 'bc', and 'abc' too. At least in my mind, it's a pretty strong analogy.
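
The str behaviour referred to here can be checked directly. This is only an illustration of the analogy using built-in `str`; the variable names are made up for the example.

```python
text = "abc"

# Iteration yields only the one-character pieces.
pieces = list(text)

# Containment accepts the empty string and any contiguous run of characters.
contained = [s for s in ("", "ab", "bc", "abc") if s in text]

# A non-contiguous selection of characters is not contained.
non_contiguous = "ac" in text
```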
msg378455 - (view) Author: John Belmonte (John Belmonte) Date: 2020-10-11 21:26
> Just a comment, (1) is analogous to str. iter('abc') gives only 'a', 'b' and 'c', while contains accepts '', 'ab', 'bc', and 'abc' too. At least in my mind, it's a pretty strong analogy.

I don't agree.  The "zero" bit does not exist, so having __contains__ return True on `Foo(0) in x` is misaligned with the iterator.  And having __contains__ return True for specific compound values just because they happen to be explicitly defined, while returning False for others, is arbitrary.  __contains__ seems to be of very little use, and moreover a trap for the unwary.  Assuming we have to live with that until Python 4, it's better to make an explicit iterator like `bits()` so that the API doesn't contradict itself.
msg378456 - (view) Author: Vedran Čačić (veky) * Date: 2020-10-11 21:38
Of course, if it returns True only on _some_ bit combinations it doesn't make sense. I thought every element of a Boolean span would be _in_ the Foo.

But about "zero bit", I still think it's perfectly analogous to '' in 'abc'.
msg378461 - (view) Author: John Belmonte (John Belmonte) Date: 2020-10-11 22:23
I think https://github.com/python/cpython/pull/22221 should be reverted (considering the design issue, performance issue, and bugs), and let's have a proper design and review.

While just reading the code, I found an existing bug in Flag.  And the new __iter__ uses the buggy internal function, and so itself has bugs.

https://github.com/python/cpython/pull/22221#issuecomment-706776441
msg378552 - (view) Author: John Belmonte (John Belmonte) Date: 2020-10-13 10:13
It's completely undocumented, but today I noticed that Flag.__contains__() is actually a subset operation.


    def __contains__(self, other):
        ...
        return other._value_ & self._value_ == other._value_

It's an unfortunate departure from the `set` type, which uses `in` for membership test and issubset() / `<=` for subset test.

For set operations, the Flag individual bits should be considered the members of a set (not Flag compound values, which are themselves equivalent to a set).
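
To make the contrast concrete, here is a small sketch that puts the quoted mask expression next to the built-in `set` operators. `flag_contains` is a hypothetical stand-in that works on plain ints; it mirrors only the `return` line quoted above, not the elided part of the method.

```python
def flag_contains(container_value: int, other_value: int) -> bool:
    # Same expression as the quoted return statement: a subset test on bits.
    return other_value & container_value == other_value


members = {"RED", "BLUE"}

# set: `in` tests membership of one element, `<=` tests subset.
element_is_member = "RED" in members
subset_via_le = {"RED"} <= members
subset_via_in = frozenset({"RED"}) in members  # a set is not an element here

# The mask expression answers the subset question for every input,
# including a multi-bit value and zero.
single_bit = flag_contains(0b101, 0b001)
multi_bit = flag_contains(0b101, 0b101)
zero_value = flag_contains(0b101, 0b000)
missing_bit = flag_contains(0b101, 0b010)
```

Note that the bare expression treats zero as contained in everything, which is the point debated elsewhere in this thread.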
msg378553 - (view) Author: Vedran Čačić (veky) * Date: 2020-10-13 10:19
Again, I disagree. `str` used to work like this in Py2.0 (or somewhere around then), only 'x' was in 'xyz', not 'xy'. Then Guido came to his senses. :-)

This is not set theory, this is mereology. You don't differentiate between a digit and a one-digit number, a char and a one-char string, and in the same way you shouldn't differentiate between a bit and a one-bit flag.
msg378555 - (view) Author: John Belmonte (John Belmonte) Date: 2020-10-13 11:27
I agree that a bit and one-bit flag are the same.

> only 'x' was in 'xyz', not 'xy'

I don't understand the comparison, because str 'a in b' tests if 'a' is a substring of 'b'.  It is not a subset operation ('xz' in 'xyz' is false).

I can understand the argument that Flag has a subset operator (currently __contains__), and given that one-bit flags can be used freely with the subset operator, there is no reason to add a bit membership operator.

However, since flag values are arguably a set of enabled bits, I think the use of `in` for subset is a confusing departure from the `set` API.
msg378606 - (view) Author: Vedran Čačić (veky) * Date: 2020-10-14 07:13
Flag values are _not_ a set of enabled bits.

At least, it is a weird kind of set where a is the same as {a}. That's why I mentioned mereology... there is no reasonable "membership", just "inclusion". I understand that str is not a perfect analogy, but sets are even more wrong.
msg385062 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-01-14 04:20
The code sample:

    class Color(IntFlag):
        BLACK = 0
        RED = 1
        GREEN = 2
        BLUE = 4
        PURPLE = RED | BLUE
        WHITE = RED | GREEN | BLUE

Here's the summary of the changes:

- single-bit flags are canonical
- multi-bit and zero-bit flags are aliases
+ only canonical flags are returned during iteration

    >>> list(Color.WHITE)
    [<Color.RED: 1>, <Color.GREEN: 2>, <Color.BLUE: 4>]

- negating a flag or flag set returns a new flag/flag set with the
  corresponding positive integer value

    >>> Color.GREEN
    <Color.GREEN: 2>

    >>> ~Color.GREEN
    <Color.PURPLE: 5>

- `name`s of pseudo-flags are constructed from their members' names

    >>> (Color.RED | Color.GREEN).name
    'RED|GREEN'

- multi-bit flags, aka aliases, can be returned from operations

    >>> Color.RED | Color.BLUE
    <Color.PURPLE: 5>

    >>> Color(7)  # or Color(-1)
    <Color.WHITE: 7>

- membership / containment checking has changed slightly -- zero valued flags
  are never considered to be contained:

    >>> Color.BLACK in Color.WHITE
    False

  otherwise, if all bits of one flag are in the other flag, True is returned:

    >>> Color.PURPLE in Color.WHITE
    True

There is a new boundary mechanism that controls how out-of-range / invalid bits are handled: `STRICT`, `CONFORM`, `EJECT`, and `KEEP`:

  STRICT --> raises an exception when presented with invalid values
  CONFORM --> discards any invalid bits
  EJECT --> lose Flag status and become a normal int with the given value
  KEEP --> keep the extra bits
           - keeps Flag status and extra bits
           - they don't show up in iteration
           - they do show up in repr() and str()

The default for Flag is STRICT, the default for IntFlag is DISCARD, and the default for _convert_ is KEEP (see ssl.Options for an example of when KEEP is needed).
msg385672 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-01-25 22:26
New changeset 7aaeb2a3d682ecba125c33511e4b4796021d2f82 by Ethan Furman in branch 'master':
bpo-38250: [Enum] single-bit flags are canonical (GH-24215)
https://github.com/python/cpython/commit/7aaeb2a3d682ecba125c33511e4b4796021d2f82
msg385677 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-01-25 23:01
Thank you to everyone involved.  :-)

To answer the first three points that started this issue:

1. iteration -> each single-bit flag in the entire flag, or a
   combination of flags, is returned one at a time -- not the
   empty set, not other multi-bit values

2. length is implemented -> `len(Color.BLUE | Color.RED) == 2`

3. subset is implemented as containment checking:
   `(Color.BLUE in (Color.RED | Color.BLUE)) is True`
msg385701 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-26 09:31
test_enum fails when Python is installed:

PPC64LE Fedora Rawhide Clang Installed 3.x:
https://buildbot.python.org/all/#builders/312/builds/597

0:01:37 load avg: 8.99 [232/426/1] test_enum failed -- running: test_tokenize (1 min 37 sec), test_unparse (31.7 sec), test_concurrent_futures (37.5 sec)
Failed to call load_tests:
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang-installed/build/target/lib/python3.10/unittest/loader.py", line 130, in loadTestsFromModule
    return load_tests(self, tests, pattern)
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang-installed/build/target/lib/python3.10/test/test_enum.py", line 20, in load_tests
    tests.addTests(doctest.DocFileSuite(
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang-installed/build/target/lib/python3.10/doctest.py", line 2511, in DocFileSuite
    suite.addTest(DocFileTest(path, **kw))
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang-installed/build/target/lib/python3.10/doctest.py", line 2433, in DocFileTest
    doc, path = _load_testfile(path, package, module_relative,
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang-installed/build/target/lib/python3.10/doctest.py", line 231, in _load_testfile
    file_contents = loader.get_data(filename)
  File "<frozen importlib._bootstrap_external>", line 1023, in get_data
FileNotFoundError: [Errno 2] No such file or directory: '/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang-installed/build/target/lib/python3.10/test/../../Doc/library/enum.rst'

test test_enum crashed -- Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang-installed/build/target/lib/python3.10/test/libregrtest/runtest.py", line 272, in _runtest_inner
    refleak = _runtest_inner2(ns, test_name)
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang-installed/build/target/lib/python3.10/test/libregrtest/runtest.py", line 236, in _runtest_inner2
    test_runner()
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang-installed/build/target/lib/python3.10/test/libregrtest/runtest.py", line 210, in _test_module
    raise Exception("errors while loading tests")
Exception: errors while loading tests


Tests are loaded by Lib/test/test_enum.py with:

def load_tests(loader, tests, ignore):
    tests.addTests(doctest.DocTestSuite(enum))
    tests.addTests(doctest.DocFileSuite(
            '../../Doc/library/enum.rst',
            optionflags=doctest.ELLIPSIS|doctest.NORMALIZE_WHITESPACE,
            ))
    return tests
msg385735 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-01-26 20:53
New changeset 01faf4542a8652adfbd3b3f897ba718e8ce43f5e by Ethan Furman in branch 'master':
bpo-38250: [Enum] only include .rst test if file available (GH-24342)
https://github.com/python/cpython/commit/01faf4542a8652adfbd3b3f897ba718e8ce43f5e
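
Going by the changeset title, the .rst doctest is included only when the file is available. A minimal sketch of that kind of guard follows; it is not the actual patch. The `rst_path` parameter and the use of `module_relative=False` are assumptions made so the example is self-contained.

```python
import doctest
import enum
import os
import unittest


def load_tests(loader, tests, ignore, rst_path="Doc/library/enum.rst"):
    # Docstring examples in the enum module are always collected.
    tests.addTests(doctest.DocTestSuite(enum))
    # The documentation file is absent from an installed Python, so it is
    # added only when it can actually be found on disk.
    if os.path.exists(rst_path):
        tests.addTests(doctest.DocFileSuite(
            rst_path,
            module_relative=False,  # treat rst_path as an OS path
            optionflags=doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE,
        ))
    return tests


# With a missing file the loader still returns a usable suite.
suite = load_tests(None, unittest.TestSuite(), None,
                   rst_path="/nonexistent/enum.rst")
```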
msg385737 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-26 22:15
PPC64LE Fedora Rawhide Clang Installed 3.x is back to green, thanks for the fix ;-)
https://buildbot.python.org/all/#/builders/312/builds/600
History
Date User Action Args
2021-05-10 21:36:22 vstinner set nosy: - vstinner
2021-05-02 20:18:34 hauntsaninja set nosy: + hauntsaninja
pull_requests: + pull_request24506
2021-02-01 20:13:25 ethan.furman link issue40042 superseder
2021-01-26 22:15:42 vstinner set status: open -> closed
resolution: fixed
messages: + msg385737
stage: patch review -> resolved
2021-01-26 20:53:16 ethan.furman set messages: + msg385735
2021-01-26 16:42:52 ethan.furman set stage: resolved -> patch review
pull_requests: + pull_request23161
2021-01-26 09:31:24 vstinner set status: closed -> open
nosy: + vstinner
messages: + msg385701
resolution: fixed -> (no value)
2021-01-25 23:11:52 ethan.furman link issue42915 superseder
2021-01-25 23:01:15 ethan.furman set status: open -> closed
resolution: fixed
messages: + msg385677
stage: patch review -> resolved
2021-01-25 22:26:37 ethan.furman set messages: + msg385672
2021-01-14 04:20:53 ethan.furman set messages: + msg385062
2021-01-14 01:32:53 ethan.furman set pull_requests: + pull_request23042
2020-10-17 01:52:04 jbelmonte set nosy: + jbelmonte
pull_requests: + pull_request21698
2020-10-14 07:13:33 veky set messages: + msg378606
2020-10-13 11:27:21 John Belmonte set messages: + msg378555
2020-10-13 10:19:45 veky set messages: + msg378553
2020-10-13 10:13:12 John Belmonte set messages: + msg378552
2020-10-11 22:23:49 John Belmonte set messages: + msg378461
2020-10-11 21:38:22 veky set messages: + msg378456
2020-10-11 21:26:22 John Belmonte set messages: + msg378455
2020-10-11 15:32:33 veky set messages: + msg378439
2020-10-11 14:24:15 John Belmonte set versions: + Python 3.10, - Python 3.9
messages: + msg378436
pull_requests: + pull_request21621
keywords: + patch
stage: patch review
2020-03-26 13:49:48 Manjusaka set nosy: + Manjusaka
messages: + msg365072
2020-03-26 04:47:33 veky set nosy: + veky
messages: + msg365051
2020-03-25 19:25:38 ethan.furman set assignee: ethan.furman
nosy: + ethan.furman
2019-09-22 11:14:02 John Belmonte create