Drop support for 15-bit PyLong digits? #89732

mdickinson · 2021-10-22T10:21:47Z

BPO	45569
Nosy	@tim-one, @terryjreedy, @mdickinson, @scoder
PRs	bpo-45569: Discover which buildbots are using 15-bit PyLong digits #30306 bpo-45569: Change PYLONG_BITS_IN_DIGIT default to 30 #30497

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-10-22.10:21:47.461>
labels = []
title = 'Drop support for 15-bit PyLong digits?'
updated_at = <Date 2022-01-14.18:55:08.328>
user = 'https://github.com/mdickinson'

bugs.python.org fields:

activity = <Date 2022-01-14.18:55:08.328>
actor = 'mark.dickinson'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = []
creation = <Date 2021-10-22.10:21:47.461>
creator = 'mark.dickinson'
dependencies = []
files = []
hgrepos = []
issue_num = 45569
keywords = ['patch']
message_count = 11.0
messages = ['404746', '404794', '404852', '409365', '409385', '409415', '410139', '410421', '410425', '410511', '410589']
nosy_count = 4.0
nosy_names = ['tim.peters', 'terry.reedy', 'mark.dickinson', 'scoder']
pr_nums = ['30306', '30497']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue45569'
versions = []

mdickinson · 2021-10-22T10:21:47Z

Looking at issue bpo-35037 (which is a compatibility issue having to do with PYLONG_BITS_IN_DIGIT), I'm wondering whether it would make sense to drop support for 15-bit PyLong digits altogether. This would simplify some of the code, eliminate a configuration option, and eliminate the scope for ABI mismatches like that occurring in bpo-35037.

There were a couple of reasons that we kept support for 15-bit digits when 30-bit digits were introduced, back in bpo-4258.

At the time, we wanted to use long long for the twodigits type with 30-bit digits, and we couldn't guarantee that all platforms we cared about would have long long or another 64-bit integer type available.
It wasn't clear whether there were systems where using 30-bit digits in place of 15-bit digits might cause a performance regression.

Now that we can safely rely on C99 support on all platforms we care about, the existence of a 64-bit integer type isn't an issue (and indeed, we already rely on the existence of such a type in dtoa.c and elsewhere in the codebase).

As to performance, I doubt that there are still platforms where using 15-bit digits gives a performance advantage, but I admit I haven't checked.

tim-one · 2021-10-22T16:45:20Z

+1 "in theory". But I too don't know whether any platforms would be adversely affected, or how to find out :-(

terryjreedy · 2021-10-23T05:52:58Z

How about: create a fake test file test/test_xintperf with test case and test method(s) that run timeit with a suite of int operations and print report. Create 'draft' PRs for main and 3.10 (and 3.9?). Run just this test on buildbots. (I believe I have seen this done.)

Improvement: write script to find buildbot results and consolidate.

Is there already a template for something like this? If not, make one.

mdickinson · 2021-12-30T12:45:12Z

I posted a request for information on usage of 15-bit digits to python-dev: https://mail.python.org/archives/list/python-dev@python.org/thread/ZICIMX5VFCX4IOFH5NUPVHCUJCQ4Q7QM/

mdickinson · 2021-12-30T19:57:23Z

Terry:

create a fake test file test/test_xintperf [...]

Sounds reasonable, though I'm not sure I know what exact timings I'd want to try. Maybe some of the stock integer-heavy Python benchmarks (pidigits, etc.).

I realised that I have no idea whether any of the buildbots actually use 15-bit digits. I've created PR #74492 to find out.

mdickinson · 2021-12-31T11:32:43Z

I've created PR #74492 to find out.

Results: we have two Gentoo/x86 buildbots, and a 32-bit Windows build in GitHub Actions: those machines use 15-bit digits, as a result of the logic in pyport.h that chooses 15-bit digits if SIZEOF_VOID_P < 8. Everything else seems to be using 30-bit digits.

GPS pointed out in the python-dev discussion that the other platform we should be thinking about is 32-bit ARM.

mdickinson · 2022-01-09T11:29:38Z

First step in #74682, which changes the default to 30-bit digits unconditionally, instead of having the default be platform dependent.

mdickinson · 2022-01-12T18:44:50Z

Adding Stefan Behnel to the nosy, since Cython is one of the few projects that might be directly affected by this change. Stefan: can you see any potential problems with changing the default to 30 here?

scoder · 2022-01-12T19:51:45Z

Cython should be happy with whatever CPython uses (as long as CPython's header files agree with CPython's build ;-) ).

I saw the RasPi benchmarks on the ML. That would have been my suggested trial platform as well.
https://mail.python.org/archives/list/python-dev@python.org/message/5RJGI6THWCDYTTEPXMWXU7CK66RQUTD4/

The results look ok. Maybe the slowdown for pickling is really the increased data size of integers. And it's visible that some compute-heavily benchmarks like pyaes did get a little slower. I doubt that they represent a real use case on such a platform, though. Doing any kind of number crunching on a RasPi without NumPy would appear like a rather strange adventure.

That said, if we decide to keep 15-bit digits in the end, I wonder if "SIZEOF_VOID_P" is the right decision point. It seems more of a "has reasonably fast 64-bit multiply or not" kind of decision – however that translates into code. I'm sure there are 32-bit platforms that would actually benefit from 30-bit digits today.

If we find a platform that would be fine with 30-bits but lacks a fast 64-bit multiply, then we could still try to add a platform specific value size check for smaller numbers. Since those are common case, branch prediction might help us more often than not.

But then, I wonder how much complexity this is even worth, given that the goal is to reduce the complexity. Platform maintainers can still decide to configure the digit size externally for the time being, if it makes a difference for them. Maybe switching off 15-bits by default is just good enough for the next couple of years to come. :)

mdickinson · 2022-01-13T19:17:01Z

Thanks, Stefan. I think I'm going to go ahead with the first step of making 30-bit digits the default, then, but leaving the 15-bit digit option present.

That said, if we decide to keep 15-bit digits in the end, I wonder if "SIZEOF_VOID_P" is the right decision point. It seems more of a "has reasonably fast 64-bit multiply or not" kind of decision

Agreed. And most platforms we care about _do_ seem to have such an instruction, so "30-bit digits unless the builder explicitly indicates otherwise - e.g., via configure options or pyconfig.h edits" seems reasonable.

My other worry is division. It's less important than multiplication in the sense that I'd expect division operations to be rarer than multiplications in typical code, but the potential impact for code that _does_ make heavy use of division is greater. With 30-bit digits, all the longobject.c source actually *needs* is a 64-bit-by-32-bit unsigned division for cases where the result is guaranteed to fit in a uint32_t. If we're on x86, there's an instruction for that (DIVL), so you'd think that we'd be fine. But without using inline assembly, it seems impossible to persuade current versions of either of GCC or Clang[*] to generate that DIVL instruction - instead, they both want to do a 64-bit-by-64-bit division, and on x86 that involves making a call to a dedicated __udivti3 intrinsic, which is potentially multiple times slower than a simple DIVL.

The division problem affects x64 as well: GCC and Clang currently generate a DIVQ instruction when all we need is a DIVL.

If we find a platform that would be fine with 30-bits but lacks a fast 64-bit multiply, then we could still try to add a platform specific value size check for smaller numbers. Since those are common case, branch prediction might help us more often than not.

Yes, I think that could work, both for multiplication and division.

[*] Visual Studio 2019 _does_ apparently provide a _udiv64 intrinsic, which we should possibly be attempting to use: https://docs.microsoft.com/en-us/cpp/intrinsics/udiv64?view=msvc-170

mdickinson · 2022-01-14T18:55:08Z

New changeset 025cbe7 by Mark Dickinson in branch 'main':
bpo-45569: Change PYLONG_BITS_IN_DIGIT default to 30 (GH-30497)
025cbe7

markshannon · 2023-04-27T12:12:06Z

@mdickinson

I'd like to flip this around and make sure that digits are always half words.
In other words, digits should be 30 bits on 64 machines and 15 bits on 32 bit machines.

This is desirable because it gives us some nice invariants with more efficient integers, #101291:

The long (non-compact) form would always have at least 2 digits, since a single digit value would fit into the compact form.
2 * sizeof(digit) == sizeof(intptr_t)
digit * digit always fits in intptr_t.

I'm assume the primary motivation for this change is to simplify the code, not performance.

markshannon · 2023-04-27T12:14:55Z

Having have 30 bit digits on 32 bits platforms is awkward for #101291 as the compact ints have only 29 bits.

mdickinson · 2023-04-27T16:52:19Z

@markshannon Sounds reasonable to me. I think the recent change in direction in Objects/longobject.c invalidates much of the discussion in this issue - it could be closed as out of date. I trust you to do whatever makes sense here. :-)

erlend-aasland · 2024-04-11T21:05:41Z

@markshannon Sounds reasonable to me. I think the recent change in direction in Objects/longobject.c invalidates much of the discussion in this issue - it could be closed as out of date. I trust you to do whatever makes sense here. :-)

Based on the last comment from @mdickinson, it seems to me we can now close this issue. Correct me if I'm wrong.

tim-one · 2024-04-11T21:09:15Z

@erlend-aasland, yes, I too believe this issue has in fact been resolved.

ezio-melotti transferred this issue from another repository Apr 10, 2022

sweeneyde mentioned this issue Jun 24, 2022

Py311 is 10-20% slower in loops of empty and variant assignments than Py38, Py39 and Py310 faster-cpython/ideas#420

Closed

iritkatriel added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Nov 30, 2023

gpshead closed this as not planned Won't fix, can't repro, duplicate, stale Apr 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Drop support for 15-bit PyLong digits? #89732

Drop support for 15-bit PyLong digits? #89732

mdickinson commented Oct 22, 2021

mdickinson commented Oct 22, 2021

tim-one commented Oct 22, 2021

terryjreedy commented Oct 23, 2021

mdickinson commented Dec 30, 2021

mdickinson commented Dec 30, 2021

mdickinson commented Dec 31, 2021

mdickinson commented Jan 9, 2022

mdickinson commented Jan 12, 2022

scoder commented Jan 12, 2022

mdickinson commented Jan 13, 2022

mdickinson commented Jan 14, 2022

markshannon commented Apr 27, 2023

markshannon commented Apr 27, 2023

mdickinson commented Apr 27, 2023

erlend-aasland commented Apr 11, 2024

tim-one commented Apr 11, 2024

Drop support for 15-bit PyLong digits? #89732

Drop support for 15-bit PyLong digits? #89732

Comments

mdickinson commented Oct 22, 2021

mdickinson commented Oct 22, 2021

tim-one commented Oct 22, 2021

terryjreedy commented Oct 23, 2021

mdickinson commented Dec 30, 2021

mdickinson commented Dec 30, 2021

mdickinson commented Dec 31, 2021

mdickinson commented Jan 9, 2022

mdickinson commented Jan 12, 2022

scoder commented Jan 12, 2022

mdickinson commented Jan 13, 2022

mdickinson commented Jan 14, 2022

markshannon commented Apr 27, 2023

markshannon commented Apr 27, 2023

mdickinson commented Apr 27, 2023

erlend-aasland commented Apr 11, 2024

tim-one commented Apr 11, 2024