classification
Title: enum.Flag should be more set-like
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: ethan.furman Nosy List: John Belmonte, Manjusaka, ethan.furman, jbelmonte, veky
Priority: normal Keywords: patch

Created on 2019-09-22 11:14 by John Belmonte, last changed 2020-10-17 01:52 by jbelmonte.

Pull Requests
URL | Status | Linked
PR 22221 | | John Belmonte, 2020-10-11 14:24
PR 22734 | open | jbelmonte, 2020-10-17 01:52
Messages (12)
msg352967 - (view) Author: John Belmonte (John Belmonte) Date: 2019-09-22 11:14
I would like Flag class instances to have more set-like abilities:
  1. iteration, to walk through each set bit of the value
  2. len corresponding to #1
  3. subset operator

I may be told "implement it yourself as instance methods", or that #3 has an idiom (a & b is b).  Ideally though, every Flag user should be able to rely on these being implemented consistently.

When trying to implement #1 without devolving into bit fiddling, naturally one might try to use the class iterator.  Unfortunately the semantics of that enumeration include 0, aliases, and compound values.  I've used Flag in several situations and projects, and so far there hasn't been a case where those were the desired semantics.  Interestingly, though, if #1 were implemented in the standard library, then we could enumerate all bits of the Flag via iteration of `~MyFlag(0)`... though that's obscuring things behind another idiom.
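
As a rough illustration of the three requested abilities, here is a minimal sketch written as free functions rather than as changes to `enum.Flag` itself. The `Perm` class and the helper names `iter_bits`, `bit_count`, and `is_subset` are hypothetical and not part of the standard library; the sketch also assumes every bit in use has a named single-bit member.

```python
from enum import Flag, auto


class Perm(Flag):
    # Hypothetical example flag, used only for illustration.
    R = auto()
    W = auto()
    X = auto()
    RW = R | W  # compound value, should not show up as a "bit"


def iter_bits(flag):
    """Yield the named single-bit members that are set in `flag`."""
    seen = set()
    for member in type(flag).__members__.values():
        v = member.value
        is_single_bit = v > 0 and (v & (v - 1)) == 0
        # Skip zero, compound values, aliases, and bits not set in `flag`.
        if is_single_bit and (v & flag.value) and v not in seen:
            seen.add(v)
            yield member


def bit_count(flag):
    """Number of set bits, computed from the integer value."""
    return bin(flag.value).count("1")


def is_subset(a, b):
    """True if every bit set in `a` is also set in `b`."""
    return (a & b) == a


print(list(iter_bits(Perm.R | Perm.X)))
print(bit_count(Perm.R | Perm.X))
print(is_subset(Perm.R, Perm.RW), is_subset(Perm.X, Perm.RW))
```

Counting bits from the integer value avoids walking the members, which may matter given the concern that `len()` should be cheap.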

Thank you for considering.
msg365051 - (view) Author: Vedran Čačić (veky) * Date: 2020-03-26 04:47
1. +0 _if_ the implementation is easy to explain. If backward compatibility is an issue, we can always add a property: 
for flag in flags.set:
(though set might imply unorderedness:)
2. -1. Guido said long ago that all lens should be O(1).
(Of course, if you do make it O(1), I have no objection.)
3. +1, absolutely.
msg365072 - (view) Author: Manjusaka (Manjusaka) * Date: 2020-03-26 13:49
1. not sure I get the point

2. not sure

3. absolutely yes
msg378436 - (view) Author: John Belmonte (John Belmonte) Date: 2020-10-11 14:24
Part of this issue (#1) was intended to be addressed by https://github.com/python/cpython/pull/22221 which added an `__iter__` implementation to Flag and IntFlag.  (The PR did not reference this issue, and was already merged last month.)

However, that PR seems problematic on several counts:
   1. `__iter__` diverges from the existing `__contains__`.  The latter includes 0 and compound values.
   2. the implementation is fairly heavy
   3. len() on an enum instance is going to be O(n)

I've put post-merge comments on the PR.

I think it would be safer to have an explicitly named `bits()` iterator on flag instances, rather than use `__iter__()`.
msg378439 - (view) Author: Vedran Čačić (veky) * Date: 2020-10-11 15:32
Just a comment, (1) is analogous to str. iter('abc') gives only 'a', 'b' and 'c', while contains accepts '', 'ab', 'bc', and 'abc' too. At least in my mind, it's a pretty strong analogy.
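
The analogy can be checked side by side with a small sketch. The `Foo` class and its members `A`, `B`, `C` are hypothetical; only the `in` operator on flag values is exercised, since that behaves the same across Python versions.

```python
from enum import Flag, auto


class Foo(Flag):
    # Hypothetical flag with three single-bit members.
    A = auto()
    B = auto()
    C = auto()


# str: iteration yields single characters only...
chars = list("abc")
# ...while `in` also accepts the empty string and longer substrings.
str_contains = ["" in "abc", "ab" in "abc", "abc" in "abc"]

# Flag: `in` likewise accepts the zero value and compound values.
x = Foo.A | Foo.B
flag_contains = [Foo(0) in x, Foo.A in x, (Foo.A | Foo.B) in x]
flag_excluded = Foo.C in x

print(chars, str_contains, flag_contains, flag_excluded)
```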
msg378455 - (view) Author: John Belmonte (John Belmonte) Date: 2020-10-11 21:26
> Just a comment, (1) is analogous to str. iter('abc') gives only 'a', 'b' and 'c', while contains accepts '', 'ab', 'bc', and 'abc' too. At least in my mind, it's a pretty strong analogy.

I don't agree.  The "zero" bit does not exist, so having __contains__ return True on `Foo(0) in x` is misaligned with the iterator.  And having __contains__ return True for specific compound values just because they happen to be explicitly defined, while returning False for others, is arbitrary.  __contains__ seems to be of very little use, and moreover a trap for the unwary.  Assuming we have to live with that until Python 4, it's better to make an explicit iterator like `bits()` so that the API doesn't contradict itself.
msg378456 - (view) Author: Vedran Čačić (veky) * Date: 2020-10-11 21:38
Of course, if it returns True only on _some_ bit combinations it doesn't make sense. I thought every element of a Boolean span would be _in_ the Foo.

But about "zero bit", I still think it's perfectly analogous to '' in 'abc'.
msg378461 - (view) Author: John Belmonte (John Belmonte) Date: 2020-10-11 22:23
I think https://github.com/python/cpython/pull/22221 should be reverted (considering the design issue, performance issue, and bugs), and let's have a proper design and review.

While just reading the code, I found an existing bug in Flag.  And the new __iter__ uses the buggy internal function, and so itself has bugs.

https://github.com/python/cpython/pull/22221#issuecomment-706776441
msg378552 - (view) Author: John Belmonte (John Belmonte) Date: 2020-10-13 10:13
It's completely undocumented, but today I noticed that Flag.__contains__() is actually a subset operation.


    def __contains__(self, other):
        ...
        return other._value_ & self._value_ == other._value_

It's an unfortunate departure from the `set` type, which uses `in` for membership test and issubset() / `<=` for subset test.

For set operations, the Flag individual bits should be considered the members of a set (not Flag compound values, which are themselves equivalent to a set).
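
A short sketch of the difference being described, using a hypothetical `Color` flag next to a plain `set` of names. It only illustrates the behaviour of `in` and `<=` discussed above and does not propose an API.

```python
from enum import Flag, auto


class Color(Flag):
    # Hypothetical flag used for the comparison.
    RED = auto()
    GREEN = auto()
    BLUE = auto()


flags = Color.RED | Color.GREEN
names = {"RED", "GREEN"}

# set: `in` tests membership, `<=` tests subset.
set_member = "RED" in names                               # True
set_subset = {"RED", "GREEN"} <= names                    # True
set_as_member = frozenset({"RED", "GREEN"}) in names      # False

# Flag: `in` behaves like the subset test.
flag_single = Color.RED in flags                          # True
flag_compound = (Color.RED | Color.GREEN) in flags        # True
flag_not_subset = (Color.RED | Color.BLUE) in flags       # False

print(set_member, set_subset, set_as_member)
print(flag_single, flag_compound, flag_not_subset)
```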
msg378553 - (view) Author: Vedran Čačić (veky) * Date: 2020-10-13 10:19
Again, I disagree. `str` used to work like this in Py2.0 (or somewhere around then), only 'x' was in 'xyz', not 'xy'. Then Guido came to his senses. :-)

This is not set theory, this is mereology. You don't differentiate between a digit and a one-digit number, a char and a one-char string, and in the same way you shouldn't differentiate between a bit and a one-bit flag.
msg378555 - (view) Author: John Belmonte (John Belmonte) Date: 2020-10-13 11:27
I agree that a bit and one-bit flag are the same.

> only 'x' was in 'xyz', not 'xy'

I don't understand the comparison, because str 'a in b' tests if 'a' is a substring (a contiguous run) of 'b'.  It is not a subset operation ('xz' in 'xyz' is false).
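
To make the distinction concrete, here is a small sketch comparing the three behaviours on the same letters. The `Letter` flag is hypothetical and exists only for this comparison.

```python
from enum import Flag, auto


class Letter(Flag):
    # Hypothetical flag mirroring the characters 'x', 'y', 'z'.
    X = auto()
    Y = auto()
    Z = auto()


xyz = Letter.X | Letter.Y | Letter.Z

substring_result = "xz" in "xyz"               # False: 'x' and 'z' are not adjacent
subset_result = set("xz") <= set("xyz")        # True: order and adjacency are ignored
flag_result = (Letter.X | Letter.Z) in xyz     # True: Flag `in` acts like subset

print(substring_result, subset_result, flag_result)
```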

I can understand the argument that Flag has a subset operator (currently __contains__), and given that one-bit flags can be used freely with the subset operator, there is no reason to add a bit membership operator.

However, since flag values are arguably a set of enabled bits, I think the use of `in` for subset is a confusing departure from the `set` API.
msg378606 - (view) Author: Vedran Čačić (veky) * Date: 2020-10-14 07:13
Flag values are _not_ a set of enabled bits.

At least, it is a weird kind of set where a is the same as {a}. That's why I mentioned mereology... there is no reasonable "membership", just "inclusion". I understand that str is not a perfect analogy, but sets are even more wrong.
History
Date | User | Action | Args
2020-10-17 01:52:04 | jbelmonte | set | nosy: + jbelmonte; pull_requests: + pull_request21698
2020-10-14 07:13:33 | veky | set | messages: + msg378606
2020-10-13 11:27:21 | John Belmonte | set | messages: + msg378555
2020-10-13 10:19:45 | veky | set | messages: + msg378553
2020-10-13 10:13:12 | John Belmonte | set | messages: + msg378552
2020-10-11 22:23:49 | John Belmonte | set | messages: + msg378461
2020-10-11 21:38:22 | veky | set | messages: + msg378456
2020-10-11 21:26:22 | John Belmonte | set | messages: + msg378455
2020-10-11 15:32:33 | veky | set | messages: + msg378439
2020-10-11 14:24:15 | John Belmonte | set | versions: + Python 3.10, - Python 3.9; messages: + msg378436; pull_requests: + pull_request21621; keywords: + patch; stage: patch review
2020-03-26 13:49:48 | Manjusaka | set | nosy: + Manjusaka; messages: + msg365072
2020-03-26 04:47:33 | veky | set | nosy: + veky; messages: + msg365051
2020-03-25 19:25:38 | ethan.furman | set | assignee: ethan.furman; nosy: + ethan.furman
2019-09-22 11:14:02 | John Belmonte | create |