classification
Title: True division of integers could be more accurate
Type: behavior
Components: Interpreter Core Versions: Python 3.0, Python 2.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: tim_one Nosy List: christian.heimes, marketdickinson, tim_one, tjreedy
Priority: normal Keywords: patch

Created on 2008-01-12 05:20 by marketdickinson, last changed 2008-05-18 17:04 by marketdickinson.

Files
File name Uploaded Description Edit Remove
int_truediv.py marketdickinson, 2008-01-12 05:20 More accurate truediv for integers: example code
long_division.patch marketdickinson, 2008-05-18 17:04 patch for Objects/longobject.c
Messages
msg59798 (view) Author: Mark Dickinson (marketdickinson) Date: 2008-01-12 05:20
Division of two longs can produce results that are needlessly 
inaccurate:

>>> from __future__ import division
>>> 10**40/10**39
10.000000000000002

The correct result is, of course, 10.0, which is exactly representable 
as a float.

The attached snippet of Python code shows that things don't have to be 
this way.
msg59824 (view) Author: Christian Heimes (christian.heimes) Date: 2008-01-12 16:57
How fast is your algorithm compared to the current algorithm and where
starts the break even zone? Most users use only small integers so it
might be a good idea to use a simpler algorithm for longs with Py_SIZE()
== 1. This is important for py3k.
msg59830 (view) Author: Mark Dickinson (marketdickinson) Date: 2008-01-12 17:18
It would be easy and safe to just use a/b = float(a)/float(b) if both a and b are less 
than 2**53 (assuming IEEE doubles).  Then there wouldn't be any loss of speed for 
small integers.

For large integers the algorithm I posted should run in time linear in the number of 
digits of max(a, b), at least in the worst case (though with appropriate optimizations 
it could be made much faster for 'random' inputs).  The current algorithm has 
essentially O(1) runtime.  

To get proper timings I'd have to code this up properly.  I'll do this, unless there's 
a consensus that it would be a waste of time :)
msg59838 (view) Author: Christian Heimes (christian.heimes) Date: 2008-01-12 20:49
Mark Dickinson wrote:
> To get proper timings I'd have to code this up properly.  I'll do this, unless there's 
> a consensus that it would be a waste of time :)

Why don't you ask Tim? He is the right person for the discussion. I'm
only an interested amateur mathematician.

Christian
msg59839 (view) Author: Mark Dickinson (marketdickinson) Date: 2008-01-12 21:05
Tim:  is this worth fixing?
msg59988 (view) Author: Mark Dickinson (marketdickinson) Date: 2008-01-16 00:22
A related problem is that float(n) isn't always correctly rounded for an integer 
n.  A contrived example:

>>> n = 2**68 + 2**16 - 1
>>> float(n)
2.9514790517935283e+20

Here the difference between float(n) and the true value of n is around 0.99998 
ulps;  a correctly rounded float() would have error at most 0.5 ulps.

I don't regard this as terribly serious: from looking at the code, I *think* 
it's always true that the error is strictly less than 1 ulp, which is just 
enough to guarantee that float(n) == n whenever n is exactly representable as a 
float.

In contrast, the division of two integers can produce results that are up to 3.5 
ulps out from the true value.  This is, in my opinion, a worryingly large error 
for a simple calculation.
msg64034 (view) Author: Terry J. Reedy (tjreedy) Date: 2008-03-19 04:28
To my mind, the inaccurate result is a bug that should be fixed.

Note: (3.0a3)
>>> 10e40/10e39
10.0

The rationale for the division change is that (as far as reasonably
possible) arithmetic operations with same values should give same result
regardless of types.

I have not looked at either algorithm, but if long/long started by
finding divmod(), but added fractional value when remainer is non-zero
instead of tossing it, exact quotients would easily be exact (unless too
large).
msg67031 (view) Author: Mark Dickinson (marketdickinson) Date: 2008-05-18 17:04
Here's a patch that fixes the rounding of integer division.  It includes a 
fast path for the case where both integers are small (less than about 
3.5e12).
History
Date User Action Args
2008-05-18 17:04:25marketdickinsonsetfiles: + long_division.patch
keywords: + patch
messages: + msg67031
2008-03-19 04:28:40tjreedysetnosy: + tjreedy
messages: + msg64034
2008-01-16 00:22:46marketdickinsonsetmessages: + msg59988
2008-01-12 23:33:04rhettingersetassignee: tim_one
2008-01-12 21:05:34marketdickinsonsetnosy: + tim_one
messages: + msg59839
2008-01-12 20:49:42christian.heimessetmessages: + msg59838
2008-01-12 17:18:54marketdickinsonsetmessages: + msg59830
2008-01-12 16:57:16christian.heimessetpriority: normal
nosy: + christian.heimes
messages: + msg59824
2008-01-12 05:20:56marketdickinsonsetcomponents: + Interpreter Core
versions: + Python 2.6, Python 3.0
2008-01-12 05:20:28marketdickinsoncreate