[Tim]
> The justification for the shift count isn't self-evident, and
> appears to me to be an instance of the generalization of Kummer's
> theorem to multinomial coefficients.
Not sure there's any generalisation here: I think it *is* just Kummer's theorem. Though I confess I wasn't aware that this was a named theorem - I was working directly from what I now discover is called [Legendre's formula](https://en.wikipedia.org/wiki/Legendre%27s_formula), which I originally learned from "Concrete Mathematics" by Knuth et. al., where they also didn't mention any particular names. It's equation 4.24 in my edition; it may have a different number in the 2nd edition.
Kummer's theorem says that the 2-valuation of n-choose-k is the number of carries when k is added to n-k in binary.
Notation: write `bit(x, i)` for the bit at position `i` of `x` - i.e., `(x >> i) & 1`
In the absence of carries when adding `k` to `n-k`, `bit(n, i) = bit(k, i) ^ bit(n-k, i)`. We have an incoming carry whenever `bit(n, i) != bit(k, i) ^ bit(n-k, i)`; i.e., whenever `bit(n ^ k ^ (n-k), i)` is `1`. So the number of carries is the population count of `n ^ k ^ (n-k)`.
> I think it would be clearer at first sight to rely instead on that 2**i/(2**j * 2**k) = 2**(i-j-k), which is shallow.
Sounds fine to me, especially if it makes little performance difference. |