create a numbits() method for int and long types #47689

Closed
fredrikj mannequin opened this issue Jul 24, 2008 · 95 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@fredrikj
Mannequin

fredrikj mannequin commented Jul 24, 2008

BPO 3439
Nosy @rhettinger, @terryjreedy, @mdickinson, @orsenthil, @pitrou, @vstinner
Files
  • numbits.diff
  • numbits-6.patch
  • bit_length_pybench.patch: Temporary patch to pybench, for comparisons.
  • bit_length10.patch
  • bit_length11.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2008-12-17.16:20:52.373>
    created_at = <Date 2008-07-24.17:20:43.790>
    labels = ['interpreter-core', 'type-feature']
    title = 'create a numbits() method for int and long types'
    updated_at = <Date 2010-06-23.07:53:24.373>
    user = 'https://bugs.python.org/fredrikj'

    bugs.python.org fields:

    activity = <Date 2010-06-23.07:53:24.373>
    actor = 'mark.dickinson'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2008-12-17.16:20:52.373>
    closer = 'mark.dickinson'
    components = ['Interpreter Core']
    creation = <Date 2008-07-24.17:20:43.790>
    creator = 'fredrikj'
    dependencies = []
    files = ['10972', '12327', '12340', '12364', '12365']
    hgrepos = []
    issue_num = 3439
    keywords = ['patch', 'needs review']
    message_count = 95.0
    messages = ['70216', '70221', '70223', '70228', '70230', '70231', '70312', '71115', '71116', '71376', '71384', '74383', '74725', '74729', '74749', '74750', '74751', '74754', '74755', '74756', '74757', '74759', '74766', '75498', '75751', '75753', '75767', '75770', '75771', '77221', '77407', '77630', '77632', '77636', '77675', '77676', '77677', '77678', '77681', '77682', '77685', '77689', '77714', '77721', '77722', '77723', '77724', '77725', '77726', '77728', '77730', '77738', '77741', '77742', '77747', '77750', '77754', '77756', '77782', '77897', '77898', '77905', '77907', '77908', '77909', '77911', '77913', '77918', '77920', '77922', '77923', '77924', '77925', '77926', '77928', '77929', '77930', '77932', '77935', '77937', '77970', '77980', '77984', '78056', '78066', '78067', '78072', '108322', '108333', '108397', '108401', '108402', '108418', '108422', '108437']
    nosy_count = 8.0
    nosy_names = ['rhettinger', 'terry.reedy', 'zooko', 'mark.dickinson', 'orsenthil', 'pitrou', 'vstinner', 'fredrikj']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = 'commit review'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue3439'
    versions = ['Python 3.1', 'Python 2.7']

    @fredrikj
    Mannequin Author

    fredrikj mannequin commented Jul 24, 2008

    Python 3.0b2 (r30b2:65106, Jul 18 2008, 18:44:17) [MSC v.1500 32 bit
    (Intel)] on
     win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import math
    >>> math.frexp(10**100)
    (0.5714936956411375, 333)
    >>> math.frexp(10**1000)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OverflowError: Python int too large to convert to C double
    >>>

    (Same behavior in Python 2.5.2, and presumably 2.6 although I haven't
    checked the latter.)

    I think it should be easy to make frexp work for large integers by
    calling PyLong_AsScaledDouble and adding the exponents. It would be
    logical to fix this since math.log(n) already works for large integers.

    My reason for requesting this change is that math.frexp is the fastest
    way I know of to accurately count the number of bits in a Python integer
    (it is more robust than math.log(n,2) and makes it easy to verify that
    the computed size is exact) and this is something I need to do a lot.

    Actually, it would be even more useful to have a standard function to
    directly obtain the bit size of an integer. If I'm not mistaken,
    _PyLong_NumBits does precisely this, and would just have to be wrapped.
    Aside from my own needs (which don't reflect those of the Python
    community), there is at least one place in the standard library where
    this would be useful: decimal.py contains an inefficient implementation
    (_nbits) that could be removed.
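The frexp-based bit count described above can be sketched as follows. This is only an illustration: the function name `numbits_via_frexp` is hypothetical, and the explicit check is one possible way to "verify that the computed size is exact". It also shows the limitation that motivates the request, since `math.frexp` raises `OverflowError` once the integer no longer fits in a C double.

```python
import math

def numbits_via_frexp(n):
    """Bit count of abs(n) via math.frexp, followed by an exactness check."""
    n = abs(n)
    if n == 0:
        return 0
    # frexp returns (m, e) with 0.5 <= m < 1 and float(n) == m * 2**e.
    # It raises OverflowError when n is too large for a C double.
    e = math.frexp(n)[1]
    # Converting n to a double can round up to the next power of two
    # (2**80 - 1 becomes 2**80), which makes e one too large.
    if n < (1 << (e - 1)):
        e -= 1
    return e

print(numbits_via_frexp(10**100))   # 333, matching the exponent shown above
```

Anything beyond the range of a double, such as `10**1000`, still fails with this approach, which is why a direct method on the integer type is more robust.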

    @fredrikj fredrikj mannequin added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Jul 24, 2008
    @loewis
    Mannequin

    loewis mannequin commented Jul 24, 2008

    Would you like to work on a patch?

    @rhettinger
    Contributor

    I prefer your idea to expose _PyLong_NumBits(). IMO, frexp() is very
    much a floating point concept and should probably remain that way.

    @rhettinger
    Contributor

    Another reason to leave frexp() untouched is that it is tightly
    coupled to ldexp() as its inverse, for a lossless roundtrip:

    assert ldexp(*frexp(pi)) == pi

    This relationship is bound to get mucked-up or confused if frexp starts
    accepting large integers that are not exactly representable as floats
    (e.g. 2**100+1).

    @fredrikj
    Mannequin Author

    fredrikj mannequin commented Jul 24, 2008

    Raymond, yes, I think that a separate numbits function would be better,
    although exposing this functionality would not prevent also changing the
    behavior of frexp. As I said, math.log already knows about long
    integers, so handling long integers similarly in frexp would not be all
    that unnatural. (On the other hand, it is true that math.sqrt, math.pow,
    math.cos, etc could all theoretically be "fixed" to work with
    larger-than-double input, and that slippery slope is probably better
    avoided.)

    Good point about roundtripping, but the problem with that argument is
    that frexp already accepts integers that cannot be represented exactly,
    e.g.:

    >>> ldexp(*frexp(10**100)) == 10**100
    False

    Anyway, if there is support for exposing _PyLong_NumBits, should it be a
    method or a function? (And if a function, placed where? Should it accept
    floating-point input?)

    I'm attaching a patch (for the trunk) that adds a numbits method to the
    int and long types. My C-fu is limited, and I've never hacked on Python
    before, so the code is probably broken or otherwise bad in several ways
    (but in that case you can tell me about it and I will learn something
    :-). I did not bother to optimize the implementation for int, and the
    tests may be redundant/incomplete/placed wrongly.

    A slight problem is that _PyLong_NumBits is restricted to a size_t, so
    it raises an OverflowError on 32-bit platforms for some easily
    physically representable numbers:

    >>> (1<<3*10**9).numbits()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OverflowError: long int too large to convert to int

    This would need to be fixed somehow.

    If numbits becomes a method, should it also be added to the Integral
    ABC? GMPY's mpz type, by the way, defines a method numdigits(self,
    base). This generalization would possibly be overkill, but it's worth
    considering.

    If it's too late to add this method/function for 2.6 and 3.0, please
    update the issue version field as appropriate.

    @rhettinger
    Contributor

    numbers.Integral is already way too fat of an API. Am -1 on expanding
    it further. Recommend sticking with the simplest, least invasive,
    least pervasive version of your request, a numbits() method for ints.

    FWIW, in Py2.6 you can already write:

      def numbits(x):
          return len(bin(abs(x))) - 2

    @mdickinson
    Member

    I'd also be interested in having _PyLong_NumBits exposed to Python in some
    way or another. It's something I've needed many times before, and it's
    used in the decimal module, for example.

    My favorite workaround uses hex instead of bin:

    4*len('%x'%x) - correction_dictionary[first_hexdigit_of_x]

    but this is still O(log x) rather than O(1).
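A minimal sketch of the hex-based workaround, assuming the "correction dictionary" maps the leading hex digit to the number of unused bits in that digit. The names `CORRECTION` and `numbits_hex` are hypothetical; the thread only gives the one-line formula.

```python
# Unused high bits in the leading hex digit (hypothetical correction table).
CORRECTION = {
    '0': 4, '1': 3, '2': 2, '3': 2,
    '4': 1, '5': 1, '6': 1, '7': 1,
    '8': 0, '9': 0, 'a': 0, 'b': 0,
    'c': 0, 'd': 0, 'e': 0, 'f': 0,
}

def numbits_hex(x):
    """Bit count of abs(x) computed from its hexadecimal string."""
    digits = '%x' % abs(x)          # formatting the string is the O(log x) part
    return 4 * len(digits) - CORRECTION[digits[0]]

print(numbits_hex(37))   # 6, since 37 == 0x25 and '2' wastes two bits
```

Building the hex string touches every digit of the number, which is where the O(log x) cost comes from.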

    @mdickinson
    Member

    With the patch, the following code causes a
    non-keyboard-interruptible interpreter hang.

    >>> from sys import maxint
    >>> (-maxint-1).numbits()

    [... interpreter hang ...]

    The culprit is, of course, the statement

    if (n < 0)
        n = -n;

    in int_numbits: LONG_MIN is negated to itself (this may
    even be undefined behaviour according to the C standards).

    The patch also needs documentation, and that documentation
    should clearly spell out what happens for zero and for
    negative numbers. It's not at all clear that everyone
    will expect (0).numbits() to be 0, though I agree that this
    is probably the most useful definition in practice.

    One could make a case for (0).numbits() raising ValueError:
    for some algorithms, what one wants is an integer k such
    that 2**(k-1) <= abs(n) < 2**k; when n == 0 no such
    integer exists.

    Other than those two things, I think the patch looks fine.
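The inequality mentioned above can be checked directly. The sketch below uses `int.bit_length`, the name under which the method was eventually committed, and a hypothetical `strict_numbits` variant that raises `ValueError` for zero, to illustrate the alternative definition being weighed here.

```python
def satisfies_power_bounds(n, k):
    """True when 2**(k-1) <= abs(n) < 2**k."""
    return 2 ** (k - 1) <= abs(n) < 2 ** k

def strict_numbits(n):
    """Hypothetical variant that rejects zero instead of returning 0."""
    if n == 0:
        raise ValueError("no integer k satisfies 2**(k-1) <= 0 < 2**k")
    return n.bit_length()

# Every nonzero n has exactly one such k, and it is the bit length.
print(satisfies_power_bounds(-37, (-37).bit_length()))               # True
# For zero, no k works, because 2**(k-1) is always positive.
print(any(satisfies_power_bounds(0, k) for k in range(-10, 10)))     # False
```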

    @mdickinson
    Member

    One possible fix would be to compute the absolute value
    of n as an unsigned long. I *think* the following is
    portable and avoids any undefined behaviour coming
    from signed arithmetic overflow.

    unsigned long absn;
    if (n < 0)
        absn = 1 + (unsigned long)(-1-n);
    else
        absn = (unsigned long)n;

    Might this work?
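The arithmetic behind this fix can be modelled in Python. This is a rough sketch under stated assumptions: a 32-bit C long, and two's-complement wraparound on negation (the C standard leaves signed overflow undefined, so the wraparound is only what typical hardware does). The helper names are hypothetical.

```python
BITS = 32                          # assumed width of a C long on the 32-bit build
MASK = (1 << BITS) - 1
LONG_MIN = -(1 << (BITS - 1))

def wrapping_negate(n):
    """Model of `n = -n` in a signed BITS-wide type with wraparound."""
    result = (-n) & MASK
    return result - (1 << BITS) if result >> (BITS - 1) else result

def unsigned_abs(n):
    """Model of the proposed fix: 1 + (unsigned long)(-1 - n) for negative n."""
    if n < 0:
        return (1 + ((-1 - n) & MASK)) & MASK
    return n & MASK

print(wrapping_negate(LONG_MIN) == LONG_MIN)   # True: the value stays negative
print(unsigned_abs(LONG_MIN))                  # 2147483648
```

If the bit-counting loop shifts a signed value right until it reaches zero, which is an assumption about the patch rather than something shown in the thread, a value that stays negative would get stuck at -1, much as `-1 >> 1 == -1` in Python.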

    Perhaps it would also be worth changing the tests
    in test_int from e.g.

    self.assertEqual((-a).numbits(), i+1)

    to

    self.assertEqual(int(-a).numbits(), i+1)

    This would have caught the LONG_MIN error.

    @fredrikj
    Mannequin Author

    fredrikj mannequin commented Aug 18, 2008

    Wow, newbie error. Thanks for spotting!

    In every case I can think of, I've wanted (0).numbits() to be 0. The
    explanation in the docstring can probably be improved. What other
    documentation is needed (where)?

    @mdickinson
    Member

    > In every case I can think of, I've wanted (0).numbits() to be 0.

    Me too, in most cases, though I've encountered the occasional case where
    raising ValueError would be more appropriate and would catch some bugs
    that might otherwise pass silently.

    So I agree that (0).numbits() should be 0, but I think there's enough
    potential ambiguity that there should be a sentence in the documentation
    making this explicit. Two of the most obvious (wrong) formulas for
    numbits are: (1) numbits(n) = ceiling(log_2(n)), and (2) numbits(n) =
    len(bin(n))-2, but neither of these formulas gives the right result for
    0, or for negative numbers for that matter.
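A quick check of the two tempting formulas against `int.bit_length` (the name under which the method was eventually committed). The function names are hypothetical; the cases are chosen so that the results do not depend on floating-point rounding.

```python
import math

def numbits_ceil_log2(n):
    # Formula (1): raises ValueError for n <= 0.
    return math.ceil(math.log(n, 2))

def numbits_bin(n):
    # Formula (2): strips the '0b' prefix but counts '-' and a lone '0'.
    return len(bin(n)) - 2

print(numbits_bin(0), (0).bit_length())          # 1 versus 0
print(numbits_bin(-5), (-5).bit_length())        # 4 versus 3
print(numbits_ceil_log2(1), (1).bit_length())    # 0 versus 1
```

Formula (1) is also off by one at exact powers of two in general, and it has no value at all for zero or negative input.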

    > The explanation in the docstring can probably be improved. What other
    > documentation is needed (where)?

    The docstring looked okay to me. There should be more comprehensive
    ReST documentation in the Doc/ directory somewhere, probably in
    Doc/library/stdtypes.rst

    @terryjreedy
    Member

    To add support to the proposal: there is currently yet another thread on
    c.l.p on how to calculate numbits efficiently. The OP needs it for
    prototyping cryptographic algorithms and found Python-level code slower
    than he wanted.

    @fredrikj
    Mannequin Author

    fredrikj mannequin commented Oct 14, 2008

    Some elaboration (that perhaps could be adapted into the documentation
    or at least source comments).

    There are two primary uses for numbits, both of which justify
    (0).numbits() == 0.

    The first is that for positive k, n = k.numbits() gives the minimum
    width of a register that can hold k, where a register can hold the 2**n
    integers 0, 1, ..., 2**n-1 (inclusive). This definition continues to
    make sense for k = 0, n = 0 (the empty register holds the 2**0 = 1
    values 0).

    In Python terms, one could say that self.numbits() "returns the smallest
    n such that abs(self) is in range(2**n)". Perhaps this would make a
    clearer docstring?
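That wording translates almost literally into code. A minimal sketch, with the hypothetical name `numbits_by_definition`, compared against the committed `int.bit_length`:

```python
def numbits_by_definition(k):
    """Smallest n such that abs(k) is in range(2**n)."""
    n = 0
    while abs(k) not in range(2 ** n):
        n += 1
    return n

print(numbits_by_definition(0))     # 0: range(2**0) is range(1), which holds 0
print(numbits_by_definition(-37))   # 6
```

The zero case needs no special handling here, which supports the choice of (0).numbits() == 0.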

    Second, k.numbits() (plus/minus 1, or perhaps multiplied by a constant
    factor) measures the number of steps required to solve a problem of size
    k using various divide-and-conquer algorithms. The problem of size k = 0
    is trivial and therefore requires (0).numbits() == 0 steps.

    In particular, if L is a sorted list, then len(L).numbits() exactly
    gives the maximum number of comparisons required to find an insertion
    point in L using binary search.
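The binary-search claim can be checked by brute force. The sketch below assumes a textbook lower-bound search (halve the interval, one comparison per step); `worst_case_comparisons` is a hypothetical helper, and `int.bit_length` is the name the method was eventually committed under.

```python
def worst_case_comparisons(size):
    """Largest number of comparisons a textbook binary search makes on a
    sorted list of the given size, over every possible insertion point."""
    data = list(range(size))
    worst = 0
    for target in range(-1, size + 1):
        lo, hi, comparisons = 0, size, 0
        while lo < hi:
            mid = (lo + hi) // 2
            comparisons += 1
            if data[mid] < target:
                lo = mid + 1
            else:
                hi = mid
        worst = max(worst, comparisons)
    return worst

print(all(worst_case_comparisons(n) == n.bit_length() for n in range(100)))  # True
```

The worst case always takes the left half, so the interval size goes from s to s // 2 at each step, which is exactly the number of right-shifts needed to reach zero.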

    Finally, the convention (-k).numbits() == k.numbits() is useful in
    contexts where the number k itself is the input to a mathematical
    function. For example, in a function for multiplying two integers, one
    might want to choose a different algorithm depending on the sizes of the
    inputs, and this choice is likely to be independent of signs (if not,
    one probably needs to check signs anyway.)

    @vstinner
    Member

    I changed the title since I agree that numbits() with long integer is
    not related to floats.

    @vstinner vstinner changed the title math.frexp and obtaining the bit size of a large integer create a numbits() method for int and long types Oct 14, 2008
    @mdickinson
    Member

    Accidentally removed the following message from Victor Stinner;
    apologies. (Time to turn off tap-to-click on my trackpad, methinks.)

    See also issue bpo-3724 which proposes to support long integers for
    math.log2().

    One other note: in Fredrik's patch there's commented out code for a
    numbits *property* (rather than a method). Is there any good reason to
    make this a property? I don't have a good feeling for when something
    should be a method and when it should be a property, but in this case
    I'd be inclined to leave numbits as a method.

    Are there general guidelines for making things properties?

    @vstinner
    Member

    > Accidentally removed the following message from Victor Stinner

    No problem.

    > Is there any good reason to make this a property?

    Since the cost of numbits() is O(n), with n the number of digits, I
    prefer a method to a property because, IMHO, reading a property should
    be O(1) (reading an attribute is different from *computing* a value).

    @fredrikj
    Mannequin Author

    fredrikj mannequin commented Oct 14, 2008

    > One other note: in Fredrik's patch there's commented out code for a
    > numbits *property* (rather than a method). Is there any good reason to
    > make this a property?

    Aesthetically, I think numbits as a function would make more sense.
    (Maybe if the hypothetical imath module comes along...)

    > Since the cost of numbits() is O(n), with n the number of digits, I
    > prefer a method to a property because, IMHO, reading a property should
    > be O(1) (reading an attribute is different from *computing* a value).

    Unless I missed something, numbits() is O(1). Only the topmost word in a
    number needs to be examined.

    > reading a property should be O(1) (reading an attribute is different
    > from *computing* a value).

    O(1) is necessary but not sufficient. My sense is that an attribute
    should access an existing "part" of an object while an operation that
    involves creating a "new" object should be a method. Compare
    complex.real/.imag and complex.conjugate().

    @vstinner
    Member

    > Unless I missed something, numbits() is O(1).

    Ooops, you're right. I looked quickly at the patch and I
    read "while(n)" but n is a digit, not the number of digits! So it's
    very quick to compute number of bits.
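The point settled above can be sketched with a toy model of a multi-digit integer. The digit width `SHIFT = 15` and both helper names are assumptions for illustration only; the thread does not show the internal representation. Given the digit list, the bit count depends only on the number of digits and on the topmost digit.

```python
SHIFT = 15  # assumed bits per internal digit (hypothetical, for illustration)

def to_digits(n):
    """Little-endian base-2**SHIFT digits of abs(n); [] for zero."""
    n = abs(n)
    digits = []
    while n:
        digits.append(n & ((1 << SHIFT) - 1))
        n >>= SHIFT
    return digits

def numbits_from_digits(digits):
    """Bit count from the digit list, inspecting only the top digit."""
    if not digits:
        return 0
    top, top_bits = digits[-1], 0
    while top:                 # at most SHIFT iterations, whatever len(digits) is
        top_bits += 1
        top >>= 1
    return (len(digits) - 1) * SHIFT + top_bits

print(numbits_from_digits(to_digits(10**100)))   # 333
```

The `while top` loop runs over the bits of a single digit, not over the digits of the number, which matches the correction made in this exchange.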

    @terryjreedy
    Member

    I consider .numbits to be an internal property of ints and would prefer
    it accessed that way. To me, this sort of thing is what property() is for.

    Guido has said that the nuisance of tacking on otherwise unnecessary
    empty parens is a warning to the user that getting the answer might take
    a long time.

    Another tack is to notice that numbits is the length of the bit sequence
    representation of an int (excepting 0) and give ints a .__len__ method
    ;-). I would not expect that suggestion to fly very far, though.

    @fredrikj
    Mannequin Author

    fredrikj mannequin commented Oct 14, 2008

    > Another tack is to notice that numbits is the length of the bit sequence
    > representation of an int (excepting 0) and give ints a .__len__ method
    > ;-). I would not expect that suggestion to fly very far, though.

    FWIW, I'm one of the people who'd additionally find indexing and slicing
    of the bits of integers very useful. It's not going to happen, though!

    @vstinner
    Member

    A property /looks/ like an attribute and a user might try to change
    its value: "x=1; x.numbits = 2" (gives 3 or 4 ? :-))

    @rhettinger
    Contributor

    Properties can be made read-only. Also, there is a good precedent:
    c=4+5j; print c.real
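A small sketch of the read-only behaviour, written for Python 3. The `Int` subclass and its `numbits` property are hypothetical and exist only to illustrate the point; the method that was eventually committed is `int.bit_length()`.

```python
class Int(int):
    """Hypothetical int subclass exposing the bit count as a read-only property."""

    @property
    def numbits(self):
        return self.bit_length()

x = Int(1)
try:
    x.numbits = 2              # a property without a setter rejects assignment
    assigned = True
except AttributeError:
    assigned = False

c = 4 + 5j
try:
    c.real = 0                 # the precedent: complex.real is read-only too
    real_assigned = True
except AttributeError:
    real_assigned = False

print(assigned, real_assigned)   # False False
```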

    @mdickinson
    Member

    One more minor deficiency in the patch: it gives incorrect results for
    very large integers. For example, on a 32-bit build of the trunk:

    >>> x = 1 << 2**31-1
    >>> x <<= 2**31-1
    >>> x.numbits()  # expect 4294967295
    4294967295L
    >>> x <<= 2
    >>> x.numbits()  # expect 4294967297
    4294967295L

    It would be nicer if the OverflowError from _PyLong_NumBits were
    propagated, so that the second case raises OverflowError instead of giving
    an incorrect result.

    Alternatively, in case of OverflowError one could recompute numbits
    correctly, without overflow, by using Python longs instead of a C size_t;
    but this would mean adding little-used, and probably little-tested, extra
    code for what must be a very rare special case. Probably not worth it.

    @vstinner
    Member

    vstinner commented Nov 4, 2008

    > It would be nicer if the OverflowError from _PyLong_NumBits
    > were propagated, so that the second case raises OverflowError
    > instead of giving an incorrect result

    Why not, but I prefer your second proposition: return a long integer.
    Attached patch implements this solution.

    >>> x=1<<(2**31-1)
    >>> n=x.numbits(); n, n.numbits()
    (2147483648L, 32L)
    >>> x<<=(2**31-1)
    >>> n=x.numbits(); n, n.numbits()
    (4294967295L, 32L)
    >>> x<<=1
    >>> n=x.numbits(); n, n.numbits()
    (4294967296L, 33L) # yeah!

    With my patch, there are two functions:

    • _PyLong_NumBits(long)->size_t: may overflow
    • long_numbits(long)->long: doesn't raise OverflowError, but may raise
      other errors such as MemoryError

    @mdickinson
    Member

    Hi, Victor! Thanks for the updated patch.

    Your patch still hangs on:

    >>> from sys import maxint
    >>> (-maxint-1).numbits()

    on my 32-bit machine.

    @vstinner
    Member

    > (-maxint-1).numbits() hangs

    Ooops, nice catch! It was an integer overflow, already present in
    fredrikj's original patch. The new patch fixes this bug and also
    includes a documentation patch ;-)

    @mdickinson
    Member

    The latest patch from Victor looks good. A few comments:

    (1) the number of bits should be computed first directly using C
    arithmetic, and only recomputed using PyLong arithmetic if the C
    computations overflow. For one thing, overflow is going to be very rare
    in practice; for another, in the sort of applications that use
    .numbits(), speed of numbits() is often critical.

    (2) Just as a matter of style, I think "if (x == NULL)" is preferable
    to "if (!x)". At any rate, the former style is much more common in
    Python source.

    (3) the reference counting all looks good.

    (4) I quite like the idea of having numbits be a property rather than a
    method---might still be worth considering?

    @rhettinger
    Contributor

    Of course, the name should have been bit_length() instead of numbits().

    For the code equivalent, I'm aiming for something less process oriented
    and more focused on what it does. bit_length IS the number of bits in a
    binary representation without the sign or leading zeroes -- that
    definition IS a correct mental picture and does not require special
    cases for zero or for negatives.

    The purpose of the code equivalent is not efficiency or beauty; it is to
    help communicate what a function does. If you want it to be more
    beautiful, it can be broken into multiple lines.

    I don't think you gain ANY explanatory power with code that says:
    bit_length is the number of right-shifts (or floor divisions by two) of
    the absolute value of a number until that number becomes zero. Tell
    that description to a high school student and prepare for a blank stare.

    FWIW, I think having a mental picture that was too process oriented was
    behind last night's mistake of thinking that (16).bit_length() was 4
    instead of 5. I theorize that the error wouldn't have occurred if the
    mental picture was of len('10000') instead of a power-of-two mental model
    where people (including some of the commenters here) tend to get the edge
    cases wrong.

    It would be better to have no code equivalent at all than to present the
    repeated //2 method as a definition. That is a process, but not a
    useful mental picture to help explain what bit_length is all about.
    Think about it, bit_length() is about the length in bits of a binary
    representation -- any code provided needs to translate directly to that
    definition.

    def bit_length(x):
        '''Length of a binary representation without the sign or leading zeroes:
        >>> bit_length(-37)
        6
        '''
        s = bin(x)          # binary representation:  bin(-37) --> '-0b100101'
        s = s.lstrip('-0b') # remove leading zeros and sign
        return len(s)
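For comparison, here are the two candidate "code equivalents" being debated side by side. Both function names are hypothetical; the shift-based version is only a reading of the "repeated //2" description, not code taken from the patch. They agree on every input, so the disagreement in the thread is about which one explains the method better, not about the result.

```python
def bit_length_bin(x):
    """Representation-based definition: length of the binary digit string."""
    return len(bin(x).lstrip('-0b'))

def bit_length_shift(x):
    """Process-based definition: right-shifts of abs(x) until it reaches zero."""
    x = abs(x)
    count = 0
    while x:
        x >>= 1
        count += 1
    return count

print(bit_length_bin(16), bit_length_shift(16))   # 5 5, as in len('10000')
print(bit_length_bin(0), bit_length_shift(0))     # 0 0
```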

    @mdickinson
    Member

    Okay; I don't have strong feelings about the form the Python code takes;
    I'll let you guys argue it out and then fix things accordingly.

    @rhettinger
    Contributor

    IMO, the choices are something like my version or none at all. The
    repeated floor division by two of abs(x) has ZERO explanatory power and
    may even detract from a beginner's ability to understand what the method
    does. Show that code to most finance people and they will avoid the
    method entirely.

    Anyone who disagrees needs to show both code fragments to some junior
    programmers and see which best leads to understanding the method and
    being able to correctly predict the edge cases bordering powers of two,
    the zero case, and how negatives are handled.

    No fair trying this experiment on assembly language programmers ;-)

    @pitrou
    Member

    pitrou commented Dec 16, 2008

    > Show that code to most finance people and they will avoid the
    > method entirely.

    Why would finance people be interested in bit_length()?

    I think this discussion begins to feel like bikeshedding. Documentation
    can always be improved afterwards.

    @rhettinger
    Contributor

    Antoine, it's not bike-shedding at all. Communicative docs are
    important to users other than assembly language programmers. BTW, I am
    a finance person (a CPA).

    Yes, you're correct, I can fix-up the docs after the patch is posted.

    @mdickinson
    Member

    Updated patch.

    @mdickinson
    Member

    Bah. Fix test_int so that it actually works.

    @mdickinson
    Member

    ...and use proper unittest methods instead of asserts...

    @rhettinger
    Contributor

    Looks good. Marking as accepted.

    Before applying, consider adding back the part of the docs with the '1 +
    floor(...' definition. I think it provides a useful alternative way to
    look at what the method does. Also, it gives a useful mathematical
    expression that can be used in reasoning about invariants. More
    importantly, we should provide it because it is so easy to make a
    mistake when rolling your own version of the formula (i.e. using ceil
    instead of floor or having an off by one error).

    @mdickinson
    Member

    > Before applying, consider adding back the part of the docs with the '1 +
    > floor(...' definition.

    My only (minor) objection to this definition is that a straight Python
    translation of it doesn't work, thanks to roundoff error and
    the limited precision of floating-point:

    >>> from math import floor, log
    >>> n = 2**101
    >>> n.bit_length()
    102
    >>> 1 + floor(log(n)/log(2))
    101.0
    >>> n = 2**80-1
    >>> n.bit_length()
    80
    >>> 1 + floor(log(n)/log(2))
    81.0

    But as you say, it provides another perspective; I'm fine with
    putting it back in.

    @rhettinger
    Contributor

    Also, consider writing it in the two argument form:

    1 + floor(log(n, 2))

    and using the word "approximately" or somesuch.

    @mdickinson
    Member

    Committed to trunk in r67822, py3k in r67824.

    @vstinner
    Member

    > Committed

    Really? YEAH! Great job everybody ;-) I read the code and it looks
    valid. Micro optimisation (for smaller memory footprint): would it be
    possible to share the 32 int table (BitLengthTable) between int and
    long (in Python 2.x)?

    @rhettinger
    Contributor

    32 bytes isn't worth sharing between modules.
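For readers wondering what a 32-entry table buys, here is a rough Python sketch of a table-driven bit count. It is an assumption-laden illustration, not the committed C code: the table contents, the 5-bit step, and the names `BIT_LENGTH_TABLE` and `table_bit_length` are all chosen here for clarity.

```python
# Bit lengths of 0..31, one small entry per value (hypothetical reconstruction).
BIT_LENGTH_TABLE = [0, 1, 2, 2, 3, 3, 3, 3] + [4] * 8 + [5] * 16

def table_bit_length(n):
    """Bit count of abs(n): shift in 5-bit steps, then finish with a lookup."""
    n = abs(n)
    bits = 0
    while n >= 32:      # each step discards exactly five low bits
        bits += 5
        n >>= 5
    return bits + BIT_LENGTH_TABLE[n]

print(table_bit_length(37))   # 6
```

With one byte per entry the whole table is 32 bytes, which is the figure quoted in the reply above.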

    @rhettinger
    Contributor

    Posted some doc cleanups in r67850 and r67851.

    @mdickinson
    Member

    About the footnote:

    floor(log(n, 2)) is poor code. This is not supposed to be a dramatic
    statement, just a statement of fact. Its correctness is dependent on
    minute details of floating point. It is poor code in exactly the same way
    that "while x < 1.0: x += 0.1" is poor code---behaviour in boundary cases
    is almost entirely unpredictable.

    If 1 + floor(log(n, 2)) happens to give the correct result in the common
    corner case where n is a power of 2, then that's due to little more than
    sheer luck. Correct rounding by itself is nowhere near enough to
    guarantee correct results.

    In the case of IEEE 754 doubles, a large part of the luck is that the
    closest double to log(2) just happens to be *smaller* than log(2) itself,
    so that the implicit division by log(2) in log(x, 2) tends to give a
    larger result than the true one; if things were the other way around, the
    formula above would likely fail for many (even small) n.

    So I don't like seeing this poor code in the Python reference manual, for
    two reasons: (1) it might get propagated to real world code, and (2) its
    presence in the docs reflects poorly on the numerical competence of the
    Python developers.

    IMO, either: (1) the warning needs to be stronger, or (2) the formulation
    should be given purely mathematically, without any explicit code, or (3)
    the formula should be left out of the docs altogether.

    Mark
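The fragility described above is easy to observe. The sketch below scans values next to powers of two and collects those where the floating-point formula disagrees with `int.bit_length`. The helper names are hypothetical, and exactly which small values fail may vary by platform, so no specific list is claimed.

```python
import math

def float_formula(n):
    """The footnote's formula, evaluated in floating point."""
    return 1 + math.floor(math.log(n, 2))

def mismatches(candidates):
    """Values for which the float formula disagrees with int.bit_length."""
    return [n for n in candidates if float_formula(n) != n.bit_length()]

# Powers of two and their immediate neighbours, where rounding matters most.
candidates = [2**k + d for k in range(1, 200) for d in (-1, 0, 1)]
bad = mismatches(candidates)
print(len(bad) > 0)   # True
```

At least some mismatches are guaranteed in CPython 3: for k >= 54, `2**k - 1` and `2**k` convert to the same double, so the formula returns the same value for both even though their bit lengths differ by one.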

    @rhettinger
    Contributor

    Other possible wording:

    ... so that k is approximately 1 + int(log(abs(x), 2)).

    @rhettinger rhettinger assigned rhettinger and unassigned mdickinson Dec 19, 2008
    @mdickinson
    Member

    > ... so that k is approximately 1 + int(log(abs(x), 2)).

    I guess that could work.

    One other thing: the docs for the trunk seem to suggest that we should
    be using trunc here, rather than int. I'm looking at:

    http://docs.python.org/dev/library/stdtypes.html#numeric-types-int-float-long-complex

    and particularly the note (2), that says of int(x) and long(x):

    "Deprecated since version 2.6: Instead, convert floats to long
    explicitly with trunc()."

    @zooko
    Mannequin

    zooko mannequin commented Jun 21, 2010

    There is a small mistake in the docs:

    def bit_length(x):
        'Number of bits necessary to represent self in binary.'
        s = bin(x)          # binary representation:  bin(-37) --> '-0b100101'
        s = s.lstrip('-0b') # remove leading zeros and minus sign
        return len(s)       # len('100101') --> 6

    is probably supposed to be:

    def bit_length(x):
        'Number of bits necessary to represent x in binary.'
        s = bin(x)          # binary representation:  bin(-37) --> '-0b100101'
        s = s.lstrip('-0b') # remove leading zeros and minus sign
        return len(s)       # len('100101') --> 6

    @orsenthil
    Member

    > There is a small mistake in the docs:

    Yes there was. Fixed in 82146.

    @terryjreedy
    Member

    Minor addendum to Mark's last message: in the near release version of 2.7 (rc2), note 2 in 5.4. "Numeric Types — int, float, long, complex" now starts "Conversion from floats using int() or long() truncates toward zero like the related function, math.trunc()" and no longer marks the usage of long as deprecated.

    @mdickinson
    Member

    So there are two issues here:

    • deprecation of int(my_float) and long(my_float)
    • removal of long in 3.x

    I'm not sure which Terry is referring to here.

    On the first, I don't think use of int() with float arguments actually *is* deprecated in any meaningful way. At one point there was a push (related to PEP-3141) to deprecate truncating uses of int and introduce a new builtin trunk, but it never really took hold (and trunc ended up being relegated to the math module); I certainly don't expect to see such deprecation happen within the lifetime of Python 3.x, so I don't think it would be appropriate to mention it in the 2.x docs.

    On the second, it's possible that there should be a mention somewhere in the 2.x docs that long() no longer exists in 3.x, and that for almost all uses int() works just as well. A separate issue should probably be opened for this.

    @mdickinson
    Member

    Aargh!
    'a new builtin trunk' -> 'a new builtin trunc'

    @vstinner
    Member

    Please open a new issue for the documentation problems; it's no longer related to the "numbits()" method, and this issue is closed.

    @terryjreedy
    Member

    Whoops, sorry to create confusion when I was trying to put this issue to rest completely. Let me try better.

    In his last message, Raymond said "Other possible wording:
    ... so that k is approximately 1 + int(log(abs(x), 2))."

    That is what the current 2.7rc2 doc says.

    In response, Mark said "I guess that could work." but quoted footnote 2, which implied that 'int' should be changed to 'trunc' in the example above.

    This implied to me that there was a lingering .numbits doc issue.
    But footnote 2 is now changed, so I no longer think there is a doc issue, and I reported that so no one else would think so.

    I hope this is the end of this.

    @mdickinson
    Member

    Ah; sorry for misunderstanding. Thanks for the explanation, Terry!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022