Message 77897 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	rhettinger
Recipients	fredrikj, loewis, mark.dickinson, pitrou, rhettinger, terry.reedy, vstinner
Date	2008-12-16.12:41:39
SpamBayes Score	5.3607507e-11
Marked as misclassified	No
Message-id	<1229431303.3.0.869291081934.issue3439@psf.upfronthosting.co.za>
In-reply-to

Content
In the whatsnew docs, add two lines showing the binary representation of 37. That will provide clarification as to why the answer is six: >>> n = 37 + >>> bin(37) + '0b100101' >>> n.bit_length() 6 Also, the main entry in the docs should be built-out just a bit. I like that the current entry provides a precise mathematical spec that is easily validated, but it should also mention the more intuitive notion of the "number of bits in a number's binary representation excluding leading zeros (for example the decimal number 37, which is 0b100101 in binary, has six binary digits)." Possibly, the BitLengthTables can be dropped in favor of the old shift-one, add-one code (to be inserted right after the shift-eight, add-eight code). The two tables consume a lot of memory and the relevant entry is not likely to be in cache when needed (cache misses are very slow). It is simpler and possibly faster to skip the table lookup unless the table is made much smaller. Plus, it makes the code a bit more maintainable and self-evidently correct (not requiring as extensive test coverage to validate the whole table). My own crack at optimization looks like this: static const char BitLengthTable[32] = {0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5}; while (n >= 32) { r += 5; n >>= 5; } r += (long)(BitLengthTable[n]); If you don't adopt mine, at least consider making a table of chars instead of longs (this will give a 4:1 or 8:1 table compression, reducing the memory footprint and increasing the likelihood of a cache hit). I would feel better about the test code if it also directly validated the definitions in docs and thoroughly exercised the tables: for x in range(-65000, 65000): k = x.numbits if x > 0: assert 2 (k-1) <= x < 2k assert k == 1 + math.floor(math.log(x) / math.log(2)) elif x == 0: assert k == 0 else: assert k == (-x).numbits() Other than that, I'm basically happy with the patch.

In the whatsnew docs, add two lines showing the binary representation of
37.  That will provide clarification as to why the answer is six:

      >>> n = 37
+     >>> bin(37)
+     '0b100101'
      >>> n.bit_length()
      6

Also, the main entry in the docs should be built-out just a bit.  I like
that the current entry provides a precise mathematical spec that is
easily validated, but it should also mention the more intuitive notion
of the "number of bits in a number's binary representation excluding
leading zeros (for example the decimal number 37, which is 0b100101 in
binary, has six binary digits)."

Possibly, the BitLengthTables can be dropped in favor of the old
shift-one, add-one code (to be inserted right after the shift-eight,
add-eight code).  The two tables consume a lot of memory and the
relevant entry is not likely to be in cache when needed (cache misses
are very slow).  It is simpler and possibly faster to skip the table
lookup unless the table is made much smaller.  Plus, it makes the code a
bit more maintainable and self-evidently correct (not requiring as
extensive test coverage to validate the whole table).

My own crack at optimization looks like this:

        static const char BitLengthTable[32] =
            {0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4,
             5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5};
        while (n >= 32) { r += 5;   n >>= 5; }
        r += (long)(BitLengthTable[n]);

If you don't adopt mine, at least consider making a table of chars
instead of longs (this will give a 4:1 or 8:1 table compression,
reducing the memory footprint and increasing the likelihood of a cache hit).

I would feel better about the test code if it also directly validated
the definitions in docs and thoroughly exercised the tables:

for x in range(-65000, 65000):
    k = x.numbits
    if x > 0:
         assert 2 ** (k-1) <= x < 2**k
         assert k == 1 + math.floor(math.log(x) / math.log(2))
    elif x == 0:
         assert k == 0
    else:
         assert k == (-x).numbits()



Other than that, I'm basically happy with the patch.

History
Date	User	Action	Args
2008-12-16 12:41:43	rhettinger	set	recipients: + rhettinger, loewis, terry.reedy, mark.dickinson, pitrou, vstinner, fredrikj
2008-12-16 12:41:43	rhettinger	set	messageid: <1229431303.3.0.869291081934.issue3439@psf.upfronthosting.co.za>
2008-12-16 12:41:42	rhettinger	link	issue3439 messages
2008-12-16 12:41:39	rhettinger	create