Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enum.Flag instance creation is slow #82226

Closed
anntzer mannequin opened this issue Sep 6, 2019 · 5 comments
Closed

enum.Flag instance creation is slow #82226

anntzer mannequin opened this issue Sep 6, 2019 · 5 comments
Labels
3.8 only security fixes

Comments

@anntzer
Copy link
Mannequin

anntzer mannequin commented Sep 6, 2019

BPO 38045
Nosy @ethanfurman, @serhiy-storchaka, @anntzer
PRs
  • bpo-38045: Avoid quadratic complexity when creating composite IntFlags. #16291
  • bpo-38045: Improve the performance of _decompose() in the enum.py #16483
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-05-14.11:47:45.317>
    created_at = <Date 2019-09-06.12:18:00.704>
    labels = ['3.8']
    title = 'enum.Flag instance creation is slow'
    updated_at = <Date 2020-05-14.11:47:45.316>
    user = 'https://github.com/anntzer'

    bugs.python.org fields:

    activity = <Date 2020-05-14.11:47:45.316>
    actor = 'Antony.Lee'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-05-14.11:47:45.317>
    closer = 'Antony.Lee'
    components = []
    creation = <Date 2019-09-06.12:18:00.704>
    creator = 'Antony.Lee'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 38045
    keywords = ['patch']
    message_count = 5.0
    messages = ['351244', '351246', '351265', '357535', '368831']
    nosy_count = 3.0
    nosy_names = ['ethan.furman', 'serhiy.storchaka', 'Antony.Lee']
    pr_nums = ['16291', '16483']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue38045'
    versions = ['Python 3.8']

    @anntzer
    Copy link
    Mannequin Author

    anntzer mannequin commented Sep 6, 2019

    Consider the following example

        from enum import Flag
        F = Flag("F", list("abcdefghijklm"))
        for idx in range(2**len(F) - 1):
            F(idx)

    creating all possible combos of 13 flags, so 8192 instances (yes, I know the instances are cached later, but the initial cost is still significant). Locally, this takes over 4.5s to execute; profiling shows that ~30% of that time is spent executing enum._is_power_of_two(!):

        def _power_of_two(value):
            if value < 1:
                return False
            return value == 2 ** _high_bit(value)

    Indeed, replacing the implementation of _power_of_two by

        import math
        _powers_of_two = {
            2 ** i for i in range(math.ceil(math.log2(sys.maxsize)) + 1)}
        def _power_of_two(value):
            return (value in _powers_of_two if value <= sys.maxsize
                    else value == 2 ** _high_bit(value))

    shaves off ~30% of the runtime (obviously this won't help for huge (>sys.maxsize) flag values, but these are rarer I would think).

    An even better fix, I think, would be for Flag to cache flags_to_check (updating it at the same time as _value2member_map_) instead of recomputing it each time in _decompose

        flags_to_check = [
                (m, v)
                for v, m in list(flag._value2member_map_.items())
                if m.name is not None or _power_of_two(v)
                ]

    as this actually needlessly introduces quadratic complexity when iterating over all possible combos (because the size of _value2member_map_ is growing). (Also, it's not actually clear to me how one can end up with unnamed power-of-two instances in _value2member_map?)

    @anntzer anntzer mannequin added the 3.8 only security fixes label Sep 6, 2019
    @serhiy-storchaka
    Copy link
    Member

    The fast method to check if the value is a power of two:

    not value & (value - 1)
    

    @vstinner vstinner changed the title Flag instance creation is slow enum.Flag instance creation is slow Sep 6, 2019
    @anntzer
    Copy link
    Mannequin Author

    anntzer mannequin commented Sep 6, 2019

    Actually, microoptimizing _power_of_two is a red herring and the problem is the quadratic complexity of _decompose (value2member_map becomes bigger and bigger at each iteration). Replacing the implementation of _decompose by the following fixes the performance issue (the original snippet executes in ~150ms, which may be even more optimizable but already a 30x improvement :)), and still passes test_enum.py (from cpython 3.7.4):

        def _decompose(flag, value):
            """Extract all members from the value."""
            # _decompose is only called if the value is not named
            not_covered = value
            members = []
            for member in flag:
                member_value = member.value
                if member_value and member_value & value == member_value:
                    members.append(member)
                    not_covered &= ~member_value
            if issubclass(flag, IntFlag) and value > 0:
                while not_covered:
                    new_value = 2 ** _high_bit(not_covered)
                    if new_value not in flag._value2member_map_:
                        # construct singleton pseudo-members
                        pseudo_member = int.__new__(flag, value)
                        pseudo_member._name_ = None
                        pseudo_member._value_ = value
                        flag._value2member_map_.setdefault(value, pseudo_member)
                    members.append(flag(new_value))
                    not_covered &= ~new_value
            if not members and value in flag._value2member_map_:
                members.append(flag._value2member_map_[value])
            members.sort(key=lambda m: m._value_, reverse=True)
            if len(members) > 1 and members[0].value == value:
                # we have the breakdown, don't need the value member itself
                members.pop(0)
            return members, not_covered

    Checking for issubclass(flag, IntFlag) is a bit ugly but that's because Flag and IntFlag really need two separate implementations of _decompose -- the former wants to error out on uncovered values, the latter should create them on-the-fly.

    @ethanfurman
    Copy link
    Member

    New changeset 0b41a92 by Ethan Furman (HongWeipeng) in branch 'master':
    bpo-38045: Improve the performance of _decompose() in enum.py (GH-16483)
    0b41a92

    @anntzer
    Copy link
    Mannequin Author

    anntzer mannequin commented May 14, 2020

    Now fixed, I believe.

    @anntzer anntzer mannequin closed this as completed May 14, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants