Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enum.IntFlag regression: missing values cause TypeError #88408

Closed
hroncok mannequin opened this issue May 26, 2021 · 27 comments
Closed

enum.IntFlag regression: missing values cause TypeError #88408

hroncok mannequin opened this issue May 26, 2021 · 27 comments
Assignees
Labels
3.10 only security fixes 3.11 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@hroncok
Copy link
Mannequin

hroncok mannequin commented May 26, 2021

BPO 44242
Nosy @belm0, @ethanfurman, @vedgar, @tacaswell, @hroncok, @pablogsal, @miss-islington, @Zheaoli, @belm0, @hauntsaninja, @jacobtylerwalls
PRs
  • bpo-44242: [Enum] remove missing bits test from Flag creation #26586
  • [3.10] bpo-44242: [Enum] remove missing bits test from Flag creation (GH-26586) #26635
  • bpo-44242: [Enum] improve error messages #26669
  • [3.10] bpo-44242: [Enum] improve error messages (GH-26669) #26671
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ethanfurman'
    closed_at = <Date 2021-06-14.19:05:28.255>
    created_at = <Date 2021-05-26.19:31:43.243>
    labels = ['type-bug', 'library', '3.10', '3.11']
    title = 'enum.IntFlag regression: missing values cause TypeError'
    updated_at = <Date 2021-06-14.19:05:28.255>
    user = 'https://github.com/hroncok'

    bugs.python.org fields:

    activity = <Date 2021-06-14.19:05:28.255>
    actor = 'ethan.furman'
    assignee = 'ethan.furman'
    closed = True
    closed_date = <Date 2021-06-14.19:05:28.255>
    closer = 'ethan.furman'
    components = ['Library (Lib)']
    creation = <Date 2021-05-26.19:31:43.243>
    creator = 'hroncok'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44242
    keywords = ['patch']
    message_count = 27.0
    messages = ['394457', '394461', '394464', '394466', '394467', '394468', '394469', '394470', '394471', '394472', '394473', '394475', '394479', '394486', '394487', '394500', '394593', '395134', '395136', '395431', '395539', '395572', '395616', '395628', '395629', '395810', '395838']
    nosy_count = 11.0
    nosy_names = ['jbelmonte', 'ethan.furman', 'veky', 'tcaswell', 'hroncok', 'pablogsal', 'miss-islington', 'Manjusaka', 'John Belmonte', 'hauntsaninja', 'jacobtylerwalls']
    pr_nums = ['26586', '26635', '26669', '26671']
    priority = None
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue44242'
    versions = ['Python 3.10', 'Python 3.11']

    @hroncok
    Copy link
    Mannequin Author

    hroncok mannequin commented May 26, 2021

    With the change introduced in https://bugs.python.org/issue38250 7aaeb2a I observe a regression in behavior of enum.IntFlag with missing values.

    Consider this code (from pyproj):

        from enum import IntFlag
    
        class GeodIntermediateFlag(IntFlag):
            DEFAULT = 0x0
    
            NPTS_MASK = 0xF
            NPTS_ROUND = 0x0
            NPTS_CEIL = 0x1
            NPTS_TRUNC = 0x2
             
            DEL_S_MASK = 0xF0
            DEL_S_RECALC = 0x00
            DEL_S_NO_RECALC = 0x10
          
            AZIS_MASK = 0xF00
            AZIS_DISCARD = 0x000
            AZIS_KEEP = 0x100

    This is a valid code in Python 3.9, however produces a TypeError in Python 3.10.0b1:

        Traceback (most recent call last):
          File "intflag.py", line 3, in <module>
            class GeodIntermediateFlag(IntFlag):
          File "/usr/lib64/python3.10/enum.py", line 544, in __new__
            raise TypeError(
        TypeError: invalid Flag 'GeodIntermediateFlag' -- missing values: 4, 8, 32, 64, 128, 512, 1024, 2048

    Since I don't see this behavior mentioned in https://docs.python.org/3.10/library/enum.html or https://docs.python.org/3.10/whatsnew/3.10.html or https://docs.python.org/3.10/whatsnew/changelog.html I believe this is a regression.

    @hroncok hroncok mannequin added 3.10 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels May 26, 2021
    @pablogsal
    Copy link
    Member

    We are already blocked in Python 3.10 beta 2, Ethan, could you give a look at this so we can introduce the fix when we release the next beta?

    @ethanfurman
    Copy link
    Member

    That is an intentional change. The cause is that the masks include bits that are not named in the Flag.

    The user-side fix is to add a boundary=KEEP option to the flag:

        class GeodIntermediateFlag(IntFlag, boundary=KEEP)

    The enum library fix could be one of two things:

    • automatically use the KEEP boundary when these conditions arise, and issue a DeprecationWarning; or

    • lose that particular check.

    I'm inclined to go with option 2, since boundary is designed to answer the question of what to do when Flag.A | Flag.B does not exist in Flag.

    @belm0
    Copy link
    Mannequin

    belm0 mannequin commented May 26, 2021

    To clarify, it's caused by these mask entries in the enum:

    NPTS_MASK = 0xF
    DEL_S_MASK = 0xF0
    AZIS_MASK = 0xF00

    Since the masks are not an aggregation of individual bits defined in the enum, it's an error.

    I'm inclined to go with [removing the check], since boundary is designed to answer the question of what to do when Flag.A | Flag.B does not exist in Flag.

    I think that would cause various problems in the API and implementation. For example, without underlying individual bits, repr() may be nonsensical.

    @belm0
    Copy link
    Mannequin

    belm0 mannequin commented May 26, 2021

    I wonder if CONFORM could tolerate these, since it's ultimately going to discard invalid bits. And then perhaps CONFORM is the default.

    @ethanfurman
    Copy link
    Member

    Actually, thinking about that a little bit more, KEEP was added for exactly this situation, as some stdlib flags exhibit the same behavior.

    So the real question is what should happen with, for example,

      GeodIntermediateFlag(0x80)

    ?

    The idea behind boundary is what should happen when values are created that don't have names in the Enum/Flag? The options for boundary are:

    STRICT -> an error is raised (default for Enum)
    EJECT -> the integer 0x80 is returned (not a flag)
    CONFORM -> unnamed bits are discarded (so the DEFAULT flag would be returned)
    KEEP -> an unnamed flag with value 0x80 is returned

    So KEEP is currently doing double-duty -- this reinforces my desire to go with option 2 and return KEEP to single-duty status.

    @ethanfurman
    Copy link
    Member

    Those are good points -- the difficulty is knowing which behavior the user wants. And if the desired run-time behavior doesn't match the boundary flag the user is stuck.

    @ethanfurman
    Copy link
    Member

    For example, if the default is CONFORM or KEEP, but the user wants an error if 0x80 comes up, they would have to explicitly check for that value since the Flag would happily return it instead of raising.

    @belm0
    Copy link
    Mannequin

    belm0 mannequin commented May 26, 2021

    Rather than make such masks containing unknown bits, this would be best practice if you want to use STRICT, correct?

    NPTS_ROUND = 0x0
    NPTS_CEIL = 0x1
    NPTS_TRUNC = 0x2
    NPTS_MASK = NPTS_ROUND | NPTS_CEIL | NPTS_TRUNC

    Otherwise, if your input may have unknown bits, use CONFORM.

    @ethanfurman
    Copy link
    Member

    Yes, that would be best practice.

    However, if the user is interfacing with other software/hardware, they may not have a choice on which bits make up the mask.

    @belm0
    Copy link
    Mannequin

    belm0 mannequin commented May 26, 2021

    However, if the user is interfacing with other software/hardware, they may not have a choice on which bits make up the mask.

    I think it's the same as saying "unknown bits may come on the input". An important use case is for forward compatibility, where new bits may be added but you don't want to break existing code.

    That is what CONFORM is for. The cited code should either declare the masks as I've shown if STRICT is desired, or use CONFORM if forward compatibility is desired.

    @ethanfurman
    Copy link
    Member

    That could be, and the user can set the boundary to whatever works best for their use-case, so long as the boundary they want to use does not conflict with the initial creation checks.

    Do you agree that simply removing the unnamed member check that takes place at Flag creation is a good way forward?

    @belm0
    Copy link
    Mannequin

    belm0 mannequin commented May 26, 2021

    Do you agree that simply removing the unnamed member check that takes place at Flag creation is a good way forward?

    It's uncomfortable, because then STRICT is not actually strict, and it's going to be show raw values in str/repr.

    CONFORM -> unnamed bits are discarded (so the DEFAULT flag would be returned)

    This behavior of CONFORM is a little dubious. I expect it to conform new values after the class is constructed. But the class members themselves should not have that transform applied, and raise an error on invalid bits.

    @ethanfurman
    Copy link
    Member

    I see your point about the str/repr.

    But the class members themselves should not have that transform applied, and raise
    an error on invalid bits.

    But I'm not sure I understand that. Either you are agreeing with me that we should lose the creation time check (not apply the transform), or we should still have the check (raise an error on invalid bits) which would still leave us in this situation.

    @ethanfurman
    Copy link
    Member

    I'll be offline for a couple hours, but I'll check back.

    @belm0
    Copy link
    Mannequin

    belm0 mannequin commented May 26, 2021

    Either [...] we should lose the creation time check (not apply the transform), or we should still have the check (raise an error on invalid bits) which would still leave us in this situation.

    That is only one option (which undesirable consequences are already identified). We should consider a few options carefully.

    From the start, if default Enum declarations are going to be subject to new runtime exceptions, it's certain that they'll be hit in some existing code. Anything other than the default being legacy-compatible, or defining the new stuff as Enum2, or deferring all the changes to Python 4, is going to lead to bug reports and disgruntled users. I don't think that suggests we should remove the runtime exceptions-- they are what supports the new API and implementation.

    Perhaps KEEP should be renamed to LEGACY, that should be the default, and there should be guidance on how to get the KEEP use cases you've found migrated to one of the other modes. (And changing the default could be considered for Python 4.)

    > But the [CONFORM] class members themselves should not have that transform applied, and raise
    > an error on invalid bits.

    But I'm not sure I understand that.

    It's an important point that should be fixed in the code. CONFORM should be regarding transforming external values into valid member instances. The enum author's declaration itself should be coherent and not rely on such transforms. Users will be surprised when they define a "FOO = 0xFF" and they read the value back as 0x1F or something. The runtime exception lets the enum author know about the issue in the declaration, and normalize it.

    @ethanfurman
    Copy link
    Member

    I'm very much of the practicality beats purity philosophy, so I want to support the OP's flag without making them jump through hoops.

    On the flip side, I also appreciate that there are folks that want the extra security...

    So here's my plan: remove the creation time check (which will remove the error for the OP), and then add a decorator to the enum module that does that creation time check -- just like the unique decorator enforces no aliases.

    Now I just need a good name for that decorator:

    • complete ?
    • exhaustive ?
    • all_named ?
    • check_for_unnamed ?

    @tacaswell
    Copy link
    Mannequin

    tacaswell mannequin commented Jun 4, 2021

    This change also affects PyQt6:

    Python 3.10.0b2+ (heads/3.10:d0991e2db3, Jun  1 2021, 11:42:08) [GCC 11.1.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import PyQt6.QtCore
    >>> PyQt6.QtCore.PYQT_VERSION_STR
    '6.1.1.dev2105301600'
    >>> PyQt6.QtCore.Qt.Key
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/tcaswell/.pybuild/bleeding/lib/python3.10/enum.py", line 615, in __call__
        return cls._create_(
      File "/home/tcaswell/.pybuild/bleeding/lib/python3.10/enum.py", line 762, in _create_
        return metacls.__new__(metacls, class_name, bases, classdict, boundary=boundary)
      File "/home/tcaswell/.pybuild/bleeding/lib/python3.10/enum.py", line 544, in __new__
        raise TypeError(
    TypeError: invalid Flag 'DropAction' -- missing values: 8, 16, 32, 64, 128, 32768
    >>> 

    remove the creation time check (which will remove the error for the OP), and then add a decorator to the enum module that does that creation time check

    I am very supportive of this plan!

    Could this be labeled as a higher priority / release blocker?

    @pablogsal
    Copy link
    Member

    Raising this to a release blocker for 3.10.0 beta 3 as it breaks PyQt6 and other projects.

    @ethanfurman
    Copy link
    Member

    New changeset eea8148 by Ethan Furman in branch 'main':
    bpo-44242: [Enum] remove missing bits test from Flag creation (GH-26586)
    eea8148

    @ethanfurman
    Copy link
    Member

    New changeset 7496486 by Ethan Furman in branch '3.10':
    [3.10] bpo-44242: [Enum] remove missing bits test from Flag creation (GH-26586) (GH-26635)
    7496486

    @ethanfurman
    Copy link
    Member

    Thank you everyone for your help.

    @ethanfurman ethanfurman added the 3.11 only security fixes label Jun 10, 2021
    @ethanfurman ethanfurman added 3.11 only security fixes and removed release-blocker labels Jun 10, 2021
    @ethanfurman
    Copy link
    Member

    Also changing error reporting to be less susceptible to DOS attacks.

    @ethanfurman ethanfurman reopened this Jun 11, 2021
    @ethanfurman ethanfurman reopened this Jun 11, 2021
    @ethanfurman
    Copy link
    Member

    New changeset c956734 by Ethan Furman in branch 'main':
    bpo-44242: [Enum] improve error messages (GH-26669)
    c956734

    @ethanfurman
    Copy link
    Member

    New changeset 0a186b1 by Miss Islington (bot) in branch '3.10':
    bpo-44242: [Enum] improve error messages (GH-26669)
    0a186b1

    @jacobtylerwalls
    Copy link
    Mannequin

    jacobtylerwalls mannequin commented Jun 14, 2021

    With the followup patch merged, can this be closed now?

    @ethanfurman
    Copy link
    Member

    Yup, just had to get back from the weekend. :-)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes 3.11 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants