Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an efficient popcount method for integers #74068

Closed
niklasf mannequin opened this issue Mar 22, 2017 · 27 comments
Closed

Add an efficient popcount method for integers #74068

niklasf mannequin opened this issue Mar 22, 2017 · 27 comments
Labels
3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@niklasf
Copy link
Mannequin

niklasf mannequin commented Mar 22, 2017

BPO 29882
Nosy @tim-one, @rhettinger, @mdickinson, @vstinner, @njsmith, @markshannon, @serhiy-storchaka, @vedgar, @DimitrisJim, @niklasf, @gbtami
PRs
  • bpo-29882: Add an efficient popcount method for integers #771
  • bpo-29882: Add _Py_popcount32() function #20518
  • bpo-29882: _Py_popcount32() doesn't need 64x64 multiply #30774
  • bpo-29882: Fix portability bug introduced in GH-30774 #30794
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-05-29.16:28:39.179>
    created_at = <Date 2017-03-22.17:30:42.549>
    labels = ['interpreter-core', 'type-feature', '3.10']
    title = 'Add an efficient popcount method for integers'
    updated_at = <Date 2022-01-23.09:59:43.073>
    user = 'https://github.com/niklasf'

    bugs.python.org fields:

    activity = <Date 2022-01-23.09:59:43.073>
    actor = 'mark.dickinson'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-05-29.16:28:39.179>
    closer = 'mark.dickinson'
    components = ['Interpreter Core']
    creation = <Date 2017-03-22.17:30:42.549>
    creator = 'niklasf'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 29882
    keywords = []
    message_count = 26.0
    messages = ['290003', '290013', '290014', '290015', '290016', '290017', '290073', '344215', '344216', '344223', '344224', '344225', '369860', '369878', '369879', '369881', '369887', '369966', '370323', '370423', '370447', '370456', '370987', '372497', '411211', '411357']
    nosy_count = 12.0
    nosy_names = ['tim.peters', 'rhettinger', 'mark.dickinson', 'vstinner', 'casevh', 'njs', 'Mark.Shannon', 'serhiy.storchaka', 'veky', 'Jim Fasarakis-Hilliard', 'niklasf', 'gbtami']
    pr_nums = ['771', '20518', '30774', '30794']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue29882'
    versions = ['Python 3.10']

    @niklasf
    Copy link
    Mannequin Author

    niklasf mannequin commented Mar 22, 2017

    An efficient popcount (something equivalent to bin(a).count("1")) would
    be useful for numerics, parsing binary formats, scientific applications
    and others.

    DESIGN DECISIONS

    • Is a popcount method useful enough?
    • How to handle negative values?
    • How should the method be named?

    SURVEY

    gmpy calls the operation popcount and returns -1/None for negative values:

    >>> import gmpy2
    >>> gmpy2.popcount(-10)
    -1

    >> import gmpy
    >> gmpy.popcount(-10)

    From the documentation [1]:

    If x < 0, the number of bits with value 1 is infinite
    so -1 is returned in that case.

    (I am not a fan of the arbitrary return value).

    The bitarray module has a count(value=True) method:

    >>> from bitarray import bitarray
    >>> bitarray(bin(123456789).strip("0b")).count()
    16

    Bitsets [2] exposes __len__.

    There is an SSE4 POPCNT instruction. C compilers call the corresponding
    intrinsic popcnt or popcount (with some prefix and suffix) and they
    accept unsigned arguments.

    Rust calls the operation count_ones [3]. Ones are counted in the binary
    representation of the *absolute* value. (I propose to do the same).

    Introducing popcount was previously considered here but closed for lack
    of a PEP or patch: http://bugs.python.org/issue722647

    Sensible names could be bit_count along the lines of the existing
    bit_length or popcount for gmpy compability and to distinguish it more.

    PERFORMANCE

    $ ./python -m timeit "bin(123456789).count('1')"  # equivalent
    1000000 loops, best of 5: 286 nsec per loop
    $ ./python -m timeit "(123456789).bit_count()"  # fallback
    5000000 loops, best of 5: 46.3 nsec per loop

    [1] https://gmpy2.readthedocs.io/en/latest/mpz.html#mpz-functions
    [2] https://pypi.python.org/pypi/bitsets/0.7.9
    [3] https://doc.rust-lang.org/std/primitive.i64.html#method.count_ones

    @niklasf niklasf mannequin added 3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Mar 22, 2017
    @mdickinson
    Copy link
    Member

    Can you give some examples of concrete use-cases? I've spent the last six years or so writing scientific applications and parsing all sorts of odd binary formats, and haven't needed or wanted a popcount yet.

    (I am not a fan of the arbitrary return value).

    Agreed: if this were implemented, I think raising ValueError would be the most appropriate thing to do for negative inputs.

    @niklasf
    Copy link
    Mannequin Author

    niklasf mannequin commented Mar 22, 2017

    Searching popcount in Python files on GitHub yields
    a considerable number of examples:

    https://github.com/search?utf8=%E2%9C%93&q=popcount+extension%3Apy&type=Code

    Perhaps intresting:

    • In CPython itself: See count_set_bits in
      Modules/mathmodule.c

    • Domain specific applications: Bitboards in Chess,
      fairly shuffling cards in Poker, comparing molecules

    • Size of bitsets (see bitarray and bitsets I listed above).
      Probably for this reason also as a first class citizen
      in Redis: https://redis.io/commands/bitcount.

    Probably most important:

    ---

    Btw. not a concrete application. I just stumbled upon this.
    popcnt was considered important enough to be included in the
    rather limited WebAssembly instruction set:
    https://github.com/WebAssembly/spec/raw/master/papers/pldi2017.pdf

    @mdickinson
    Copy link
    Member

    Many of those applications are really for bitstrings (chess bitboards, hamming distance), which aren't really the same thing as integers.

    Nice find for the mathmodule.c case. I'd forgotten about that one (though according to git blame, apparently I'm responsible for checking it in). It's a fairly obscure corner case, though.

    Overall, I'm -1 on adding this: I don't think it meets the bar of being useful enough to justify the extra method. I'd suggest that people needing this kind of efficient bitstring operation use a 3rd-party bitstring library instead.

    @serhiy-storchaka
    Copy link
    Member

    I think that adding bitarray or bitset (or both) in the stdlib would better satisfy the needs. There are open issues for adding ability to read or set selected bits or range of bits in int or for bitwise operations on bytes. I think that bitarray and bitset would provide better interface for these operations.

    @tim-one
    Copy link
    Member

    tim-one commented Mar 22, 2017

    See also:

    https://en.wikipedia.org/wiki/Hamming_weight
    

    As that says, there are a number of languages and processors with first class support for a popcount function. I've frequently implemented it in Python when using integers as integer bitsets (i is in the set if and only if bit 2**i is set in the integer), which often - except for finding the cardinality - runs much faster than using general Python sets.

    @casevh
    Copy link
    Mannequin

    casevh mannequin commented Mar 24, 2017

    I like the name bit_count and I'll gladly add it to gmpy2 with the appropriate changes to exceptions, etc.

    @rhettinger
    Copy link
    Contributor

    Is everyone comfortable with how negative numbers are handled by this patch? It might be better to limit the domain and raise a ValueError rather than make a presumption about what the user intends.

    @serhiy-storchaka
    Copy link
    Member

    I am going to add the imath module. If we decide to add popcount(), it may be better to add it in this module instead of int class.

    @mdickinson
    Copy link
    Member

    Is everyone comfortable with how negative numbers are handled by this patch?

    Not entirely, but it's not terribly wrong and it's consistent with how int.bit_length works. (I'm also a bit uncomfortable about int.bit_lengths behaviour on negative numbers, but it is the way it is.)

    @tim-one
    Copy link
    Member

    tim-one commented Jun 1, 2019

    I prefer that a negative int raise ValueError, but am OK with it using the absolute value instead (i.e., what it does now).

    @mdickinson
    Copy link
    Member

    I am going to add the imath module.

    Aimed at 3.8, or 3.9? This seems somewhat rushed for 3.8.

    @mdickinson
    Copy link
    Member

    I'm re-reviewing this discussion three years on. I'd be happy for this to go into 3.10. Are there other strong opinions? It would be good to either update and merge the PR, or close as rejected.

    @tim-one
    Copy link
    Member

    tim-one commented May 25, 2020

    I see I never explicitly said +1, so I will now: +1 on merging this :-)

    @vstinner
    Copy link
    Member

    Adding a function to a new hypothetical imath module sounds reasonable.

    I'm less comfortable with adding a new method to int type: it would mean that any int subtype "should" implement it.

    For example, should numpy.int64 get this method as well?

    What is the effect on https://docs.python.org/3.9/library/numbers.html?

    Does it make sense to call (True).popcount()?

    @vstinner
    Copy link
    Member

    In CPython itself: See count_set_bits in Modules/mathmodule.c

    Python/hamt.c contains an optimized function:

    static inline uint32_t
    hamt_bitcount(uint32_t i)
    {
    /* We could use native popcount instruction but that would
    require to either add configure flags to enable SSE4.2
    support or to detect it dynamically. Otherwise, we have
    a risk of CPython not working properly on older hardware.

       In practice, there's no observable difference in
       performance between using a popcount instruction or the
       following fallback code.
    
       The algorithm is copied from:
       https://graphics.stanford.edu/~seander/bithacks.html
    \*/
    i = i - ((i \>\> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i \>\> 2) & 0x33333333);
    return (((i + (i \>\> 4)) & 0xF0F0F0F) * 0x1010101) \>\> 24;
    

    }

    Python/pymath.c provides a "unsigned int _Py_bit_length(unsigned long d)" function used by math.factorial, _PyLong_NumBits(), int.__format__(), long / long, _PyLong_Frexp() and PyLong_AsDouble(), etc.

    Maybe we could add a _Py_bit_count().

    See also bpo-29782: "Use __builtin_clzl for bits_in_digit if available" which proposes to micro-optimize _Py_bit_length().

    --

    In the meanwhile, I also added pycore_byteswap.h *internal* header which provides static inline function which *do* use builtin functions like __builtin_bswap32().

    @mdickinson
    Copy link
    Member

    For example, should numpy.int64 get this method as well?

    That's for the NumPy folks to decide (and I've added Nathaniel Smith to the nosy in case he wants to comment), but I don't see any particularly strong reason that NumPy would need to add it. It looks as though the NumPy integer types have survived happily without a bit_length method, for example - I don't even see any issues in the NumPy tracker suggesting that anyone missed it. (Though perhaps that's because in the case of a NumPy int one always has at least an upper bound on the bit_length available.)

    What is the effect on https://docs.python.org/3.9/library/numbers.html?

    No effect, just as int.bit_length has no effect.

    Does it make sense to call (True).popcount()?

    It would be spelled True.bit_count() if the PR goes in as-is, but sure, why not. :-)

    @mdickinson
    Copy link
    Member

    PR is updated and mergeable.

    @mdickinson
    Copy link
    Member

    New changeset 8bd216d by Niklas Fiekas in branch 'master':
    bpo-29882: Add an efficient popcount method for integers (#771)
    8bd216d

    @mdickinson mdickinson added 3.10 only security fixes and removed 3.7 (EOL) end of life labels May 29, 2020
    @markshannon
    Copy link
    Member

    Why are calling a population count method "bit_count()"?
    That seems likely to cause confusion with "bit_length()".

    I might reasonable expect that 0b1000.bit_count() be 4, not 1. It does have 4 bits.
    Whereas 0b1000.population_count() is clearly 1.

    I have no objection to adding this method, just the choice of name.

    @mdickinson
    Copy link
    Member

    Why are calling a population count method "bit_count()"?

    Naming things is hard, but I don't think this is an unreasonable name, and it's not without precedent. Java similarly has Integer.bitCount and BigInteger.bitCount. MySQL has BIT_COUNT.

    @mdickinson
    Copy link
    Member

    A couple of other data points:

    @mark Shannon: what name would you suggest, and why? The term "population count" feels too non-obvious and specialist to me, and anything involving "Hamming" likewise.

    "count_ones" isn't obviously a bit operation.

    "count_set_bits"?

    @vstinner
    Copy link
    Member

    vstinner commented Jun 8, 2020

    New changeset c6b292c by Victor Stinner in branch 'master':
    bpo-29882: Add _Py_popcount32() function (GH-20518)
    c6b292c

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Jun 28, 2020

    Well, bit_sum is what it really is. But I agree it's a terrible name. :-)

    @vstinner
    Copy link
    Member

    New changeset cd8de40 by Victor Stinner in branch 'main':
    bpo-29882: _Py_popcount32() doesn't need 64x64 multiply (GH-30774)
    cd8de40

    @mdickinson
    Copy link
    Member

    New changeset 83a0ef2 by Mark Dickinson in branch 'main':
    bpo-29882: Fix portability bug introduced in python/issues-test-cpython#30774 (bpo-30794)
    83a0ef2

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @donhuvy
    Copy link

    donhuvy commented Nov 13, 2022

    Explicit is better than implicit.
    Simple is better than complex.
    Readability counts.
    Now is better than never.
    Although never is often better than right now.

    If the implementation is easy to explain, it may be a good idea.

    https://stackoverflow.com/questions/74417841/stupid-why-bit-count-equal-binself-count1-also-it-means-not-count-0-i/74417963#74417963

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    7 participants