Author gregsmith
Recipients christian.heimes, gregsmith, loewis, mark.dickinson, pitrou, rhettinger, sopoforic, tim.peters
Date 2009-10-24.14:05:25
SpamBayes Score 8.32667e-16
Marked as misclassified No
Message-id <>
Antoine, "most uses of it are for small ints (< 2**32 or 2**64), ",
that's my point exactly, the current long type is quite slow for those
cases because of sign-magnitude implementation.

I don't see a problem with sign-magnitude for old long ints, or for GMP,
since in those cases the assumption is that you have a fair percentage
of large #s, this is not valid in Py3.0 where most are small. Most
operations are add and subtract, and in each such operation you need to
look at both signs, and decide if you want to really add or subtract,
and if you are subtracting, you then have to do a magnitude test to see
which way - all of that before you do any actual computation. That's a
lot of overhead for e.g. 'i -= 1'. Especially given that the sign &
length information are rolled together into a single value; you need a
fair number of tests & conditional branches. With a 2's complement
implementation you would test for the most common case where both values
are 1 digit (which would include 0 value) and in that case do an add (or
subtract) followed by a check for overflow, and that's the whole thing.
I'm guessing this is 10% - 25% as much time as in the current
implementation - this is one of those cases where the time for tests and
setup dominate over the actual computation.

Mark, what you've written looks fine to me. It would be a bit faster
with an add-with-carry, but there's no conditional branches in the
generated code, so it's not bad, it's still faster than it would be if
you were extracting carries from the msb of each result and masking them
afterwards. Bear in mind that some RISC processors don't have an
add-with-carry (lacking condition codes altogether) so the method of
comparing the sum to the inputs is the only method available. Given that
in Python the operation is in a loop, I think it's unreasonable to to
expect a compiler to identify an add-with-carry application, but at the
same time it's not necessary. Now that the long int is the *only* int
type, multi-digit ops will be relatively uncommon, and you want to focus
on reducing the overhead before and after the operation.

(And, for speed freaks who want to use long ints to implement large bits
arrays and want fast bit ops - my original motivation for this issue -
it would be possible to use SSE2 vector instructions on specific
processors. That can also be done with the 15-bit implementation, but it
works much better with 32).

The 'algorithmic C' package
is a C++ template class for arbitrary-length signed/unsigned integers.
It's not suitable for use in Python because of license issues and
because its integers are controlled by templates and are fixed-size at
compile time - but it uses Kx32-bit 2's complement format, and
implements adds, shifts, multiplies, logic ops, and limited division
ops. It works very well, with extremely low overhead on small values -
often the code is exactly the same as if you used int type  - that would
not be possible with a sign/magnitude approach.

As for multiplies and divides - it's certainly possible to proceed
through a multiply entirely in 2's complement, so no overhead is needed
for abs value, or changing sign at the end. For division, it's possible
if the denominator is > 0, and it may be possible if <0 (but probably
not worth the hassle).

I would be happy to do a design doc for this and write some of the inner
loops, but if (a) it's already being done or (b) there's no chance of it
being deployed then it would be a waste of time...
if there was definite interest in it (and reasonable schedule
expectations) I could write a lot more of it.
Date User Action Args
2009-10-24 14:05:30gregsmithsetrecipients: + gregsmith, tim.peters, loewis, rhettinger, mark.dickinson, pitrou, christian.heimes, sopoforic
2009-10-24 14:05:28gregsmithsetmessageid: <>
2009-10-24 14:05:27gregsmithlinkissue1087418 messages
2009-10-24 14:05:25gregsmithcreate