As veky noted in the review, _high_bit() is slow and can be optimized using int.bit_length(). The attached bit_length.patch implements this.

_high_bit(0) returns -1. Maybe an exception should be raised if the argument is < 1? (It would then also fail for negative numbers.)
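
For illustration only, here is a minimal sketch of the two behaviours discussed above. It is not the content of bit_length.patch; the names high_bit and high_bit_strict are hypothetical, and the choice of ValueError is an assumption.

```python
def high_bit(value):
    """Return the index of the highest set bit, or -1 if value is 0.

    Hypothetical stand-in for _high_bit(), built on int.bit_length().
    bit_length() ignores the sign, so negative input is not rejected here.
    """
    return value.bit_length() - 1


def high_bit_strict(value):
    """Same as high_bit(), but reject any argument < 1.

    Raising ValueError is an assumption; the comment above only asks
    whether an exception should be raised.
    """
    if value < 1:
        raise ValueError("expected a positive integer, got %r" % (value,))
    return value.bit_length() - 1
```

Since int.bit_length() excludes the sign, high_bit(-4) gives the same result as high_bit(4), which is one reason an explicit check for values < 1 might be worth having.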
