Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop support for 15-bit PyLong digits? #89732

Closed
mdickinson opened this issue Oct 22, 2021 · 16 comments
Closed

Drop support for 15-bit PyLong digits? #89732

mdickinson opened this issue Oct 22, 2021 · 16 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@mdickinson
Copy link
Member

BPO 45569
Nosy @tim-one, @terryjreedy, @mdickinson, @scoder
PRs
  • bpo-45569: Discover which buildbots are using 15-bit PyLong digits #30306
  • bpo-45569: Change PYLONG_BITS_IN_DIGIT default to 30 #30497
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-10-22.10:21:47.461>
    labels = []
    title = 'Drop support for 15-bit PyLong digits?'
    updated_at = <Date 2022-01-14.18:55:08.328>
    user = 'https://github.com/mdickinson'

    bugs.python.org fields:

    activity = <Date 2022-01-14.18:55:08.328>
    actor = 'mark.dickinson'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = []
    creation = <Date 2021-10-22.10:21:47.461>
    creator = 'mark.dickinson'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 45569
    keywords = ['patch']
    message_count = 11.0
    messages = ['404746', '404794', '404852', '409365', '409385', '409415', '410139', '410421', '410425', '410511', '410589']
    nosy_count = 4.0
    nosy_names = ['tim.peters', 'terry.reedy', 'mark.dickinson', 'scoder']
    pr_nums = ['30306', '30497']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue45569'
    versions = []

    @mdickinson
    Copy link
    Member Author

    Looking at issue bpo-35037 (which is a compatibility issue having to do with PYLONG_BITS_IN_DIGIT), I'm wondering whether it would make sense to drop support for 15-bit PyLong digits altogether. This would simplify some of the code, eliminate a configuration option, and eliminate the scope for ABI mismatches like that occurring in bpo-35037.

    There were a couple of reasons that we kept support for 15-bit digits when 30-bit digits were introduced, back in bpo-4258.

    • At the time, we wanted to use long long for the twodigits type with 30-bit digits, and we couldn't guarantee that all platforms we cared about would have long long or another 64-bit integer type available.

    • It wasn't clear whether there were systems where using 30-bit digits in place of 15-bit digits might cause a performance regression.

    Now that we can safely rely on C99 support on all platforms we care about, the existence of a 64-bit integer type isn't an issue (and indeed, we already rely on the existence of such a type in dtoa.c and elsewhere in the codebase).

    As to performance, I doubt that there are still platforms where using 15-bit digits gives a performance advantage, but I admit I haven't checked.

    @tim-one
    Copy link
    Member

    tim-one commented Oct 22, 2021

    +1 "in theory". But I too don't know whether any platforms would be adversely affected, or how to find out :-(

    @terryjreedy
    Copy link
    Member

    How about: create a fake test file test/test_xintperf with test case and test method(s) that run timeit with a suite of int operations and print report. Create 'draft' PRs for main and 3.10 (and 3.9?). Run just this test on buildbots. (I believe I have seen this done.)

    Improvement: write script to find buildbot results and consolidate.

    Is there already a template for something like this? If not, make one.

    @mdickinson
    Copy link
    Member Author

    I posted a request for information on usage of 15-bit digits to python-dev: https://mail.python.org/archives/list/python-dev@python.org/thread/ZICIMX5VFCX4IOFH5NUPVHCUJCQ4Q7QM/

    @mdickinson
    Copy link
    Member Author

    Terry:

    create a fake test file test/test_xintperf [...]

    Sounds reasonable, though I'm not sure I know what exact timings I'd want to try. Maybe some of the stock integer-heavy Python benchmarks (pidigits, etc.).

    I realised that I have no idea whether any of the buildbots actually use 15-bit digits. I've created PR #74492 to find out.

    @mdickinson
    Copy link
    Member Author

    I've created PR #74492 to find out.

    Results: we have two Gentoo/x86 buildbots, and a 32-bit Windows build in GitHub Actions: those machines use 15-bit digits, as a result of the logic in pyport.h that chooses 15-bit digits if SIZEOF_VOID_P < 8. Everything else seems to be using 30-bit digits.

    GPS pointed out in the python-dev discussion that the other platform we should be thinking about is 32-bit ARM.

    @mdickinson
    Copy link
    Member Author

    First step in #74682, which changes the default to 30-bit digits unconditionally, instead of having the default be platform dependent.

    @mdickinson
    Copy link
    Member Author

    Adding Stefan Behnel to the nosy, since Cython is one of the few projects that might be directly affected by this change. Stefan: can you see any potential problems with changing the default to 30 here?

    @scoder
    Copy link
    Contributor

    scoder commented Jan 12, 2022

    Cython should be happy with whatever CPython uses (as long as CPython's header files agree with CPython's build ;-) ).

    I saw the RasPi benchmarks on the ML. That would have been my suggested trial platform as well.
    https://mail.python.org/archives/list/python-dev@python.org/message/5RJGI6THWCDYTTEPXMWXU7CK66RQUTD4/

    The results look ok. Maybe the slowdown for pickling is really the increased data size of integers. And it's visible that some compute-heavily benchmarks like pyaes did get a little slower. I doubt that they represent a real use case on such a platform, though. Doing any kind of number crunching on a RasPi without NumPy would appear like a rather strange adventure.

    That said, if we decide to keep 15-bit digits in the end, I wonder if "SIZEOF_VOID_P" is the right decision point. It seems more of a "has reasonably fast 64-bit multiply or not" kind of decision – however that translates into code. I'm sure there are 32-bit platforms that would actually benefit from 30-bit digits today.

    If we find a platform that would be fine with 30-bits but lacks a fast 64-bit multiply, then we could still try to add a platform specific value size check for smaller numbers. Since those are common case, branch prediction might help us more often than not.

    But then, I wonder how much complexity this is even worth, given that the goal is to reduce the complexity. Platform maintainers can still decide to configure the digit size externally for the time being, if it makes a difference for them. Maybe switching off 15-bits by default is just good enough for the next couple of years to come. :)

    @mdickinson
    Copy link
    Member Author

    Thanks, Stefan. I think I'm going to go ahead with the first step of making 30-bit digits the default, then, but leaving the 15-bit digit option present.

    That said, if we decide to keep 15-bit digits in the end, I wonder if "SIZEOF_VOID_P" is the right decision point. It seems more of a "has reasonably fast 64-bit multiply or not" kind of decision

    Agreed. And most platforms we care about _do_ seem to have such an instruction, so "30-bit digits unless the builder explicitly indicates otherwise - e.g., via configure options or pyconfig.h edits" seems reasonable.

    My other worry is division. It's less important than multiplication in the sense that I'd expect division operations to be rarer than multiplications in typical code, but the potential impact for code that _does_ make heavy use of division is greater. With 30-bit digits, all the longobject.c source actually *needs* is a 64-bit-by-32-bit unsigned division for cases where the result is guaranteed to fit in a uint32_t. If we're on x86, there's an instruction for that (DIVL), so you'd think that we'd be fine. But without using inline assembly, it seems impossible to persuade current versions of either of GCC or Clang[*] to generate that DIVL instruction - instead, they both want to do a 64-bit-by-64-bit division, and on x86 that involves making a call to a dedicated __udivti3 intrinsic, which is potentially multiple times slower than a simple DIVL.

    The division problem affects x64 as well: GCC and Clang currently generate a DIVQ instruction when all we need is a DIVL.

    If we find a platform that would be fine with 30-bits but lacks a fast 64-bit multiply, then we could still try to add a platform specific value size check for smaller numbers. Since those are common case, branch prediction might help us more often than not.

    Yes, I think that could work, both for multiplication and division.

    [*] Visual Studio 2019 _does_ apparently provide a _udiv64 intrinsic, which we should possibly be attempting to use: https://docs.microsoft.com/en-us/cpp/intrinsics/udiv64?view=msvc-170

    @mdickinson
    Copy link
    Member Author

    New changeset 025cbe7 by Mark Dickinson in branch 'main':
    bpo-45569: Change PYLONG_BITS_IN_DIGIT default to 30 (GH-30497)
    025cbe7

    @markshannon
    Copy link
    Member

    @mdickinson

    I'd like to flip this around and make sure that digits are always half words.
    In other words, digits should be 30 bits on 64 machines and 15 bits on 32 bit machines.

    This is desirable because it gives us some nice invariants with more efficient integers, #101291:

    • The long (non-compact) form would always have at least 2 digits, since a single digit value would fit into the compact form.
    • 2 * sizeof(digit) == sizeof(intptr_t)
    • digit * digit always fits in intptr_t.

    I'm assume the primary motivation for this change is to simplify the code, not performance.

    @markshannon
    Copy link
    Member

    Having have 30 bit digits on 32 bits platforms is awkward for #101291 as the compact ints have only 29 bits.

    @mdickinson
    Copy link
    Member Author

    @markshannon Sounds reasonable to me. I think the recent change in direction in Objects/longobject.c invalidates much of the discussion in this issue - it could be closed as out of date. I trust you to do whatever makes sense here. :-)

    @iritkatriel iritkatriel added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Nov 30, 2023
    @erlend-aasland
    Copy link
    Contributor

    @markshannon Sounds reasonable to me. I think the recent change in direction in Objects/longobject.c invalidates much of the discussion in this issue - it could be closed as out of date. I trust you to do whatever makes sense here. :-)

    Based on the last comment from @mdickinson, it seems to me we can now close this issue. Correct me if I'm wrong.

    @tim-one
    Copy link
    Member

    tim-one commented Apr 11, 2024

    @erlend-aasland, yes, I too believe this issue has in fact been resolved.

    @gpshead gpshead closed this as not planned Won't fix, can't repro, duplicate, stale Apr 11, 2024
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs)
    Projects
    None yet
    Development

    No branches or pull requests

    8 participants