classification
Title: timedelta.total_seconds needlessly inaccurate, especially for negative timedeltas
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: belopolsky, mark.dickinson, pitrou
Priority: normal Keywords: patch

Created on 2010-05-07 12:24 by mark.dickinson, last changed 2010-05-09 09:36 by mark.dickinson. This issue is now closed.

Files
File name Uploaded Description Edit
issue8644-py3k.diff belopolsky, 2010-05-08 01:47
issue8644.diff belopolsky, 2010-05-08 22:56 Backport to trunk
Messages (11)
msg105197 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-05-07 12:24
I just noticed (while responding to issue 8643) that timedelta.total_seconds() has some accuracy problems, especially for negative timedeltas:

Python 3.2a0 (py3k:80840:80842, May  7 2010, 12:29:35) 
[GCC 4.2.1 (Apple Inc. build 5659)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import timedelta
>>> td = timedelta(microseconds = -123)
>>> td.total_seconds() # expect -0.000123
-0.0001230000052601099

This could be easily fixed by using integer arithmetic internally instead of float arithmetic:

>>> 1e-6 * td.microseconds + td.seconds + 86400 * td.days
-0.0001230000052601099
>>> (td.microseconds + 1000000 * (td.seconds + 86400 * td.days)) / 1000000
-0.000123

This works especially nicely in combination with the new float repr and the improved accuracy of true division in 2.7 / 3.2, to give a total_seconds() whose repr is free of noise digits.  (Well, for small timedeltas, anyway.)

Since total_seconds is new in 2.7 and 3.2, I think it would be good to fix this before the 2.7 release.
msg105198 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-05-07 12:30
P.S.  This change would also make total_seconds() consistent with division by timedelta(seconds=1):

>>> td / timedelta(seconds=1)
-0.000123

In fact, this might even be a good way to re-implement total_seconds internally.
msg105209 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-05-07 16:36
I am attaching a patch for py3k branch.  I am +1 for backporting to 2.7 and I avoided relying on py3k timedelta/timedelta in the patch.  The tests  and docs will need to be modified for the backport.

Technically speaking, this is a change in documented behavior for 2.7 because according to current docs, td.total_seconds() is equivalent to ``td.microseconds / 1000000 + td.seconds + td.days * 24 * 3600``.  Therefore I would appreciate if reviewer made a decision on backport.
msg105232 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-05-07 21:33
Thanks for the patch!  Comments, in increasing order of fussiness:

(1) there should be a Py_DECREF(total_microseconds) in the case that the "PyLong_FromLong(1000000L)" call fails

(2) in the docstring, replace 'loose' by 'lose'

(3) also in the docstring, I suggest replacing "280 years" by "270 years":   "td == timedelta(seconds = td.total_seconds())" starts failing at 2**33 seconds (around 272.2 Julian years) rather than 2**53 microseconds.
msg105233 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-05-07 21:38
Grr.  s/docstring/docs/
msg105251 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-05-08 01:45
> (1) there should be a Py_DECREF(total_microseconds) ...

Yes, regrtest -R is not a substitute for thinking.

> (2) in the docstring, replace 'loose' by 'lose'
>

Neither is spellcheck. :-)

> (3) also in the docstring, I suggest replacing "280 years" by "270 years" ...

Yes, and precision is not accuracy.
msg105290 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-05-08 14:07
Looks good to me.

I definitely think this should go into 2.7 as well;  total_seconds has only seen the light of day in the beta releases so far, so it's unlikely we'd break anyone's code with this change.  (And IMO this is a bugfix, not a feature.)

It might be polite to wait a day or two for the 2.7 backport, though:  2.7 beta 2 is supposed to be being released today.
msg105291 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-05-08 14:08
If you want a second (third?) opinion, we could ask Antoine.
msg105294 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-05-08 14:36
[Antoine declines to offer an opinion.]

Committed to py3k in r80979.  I moved the Misc/NEWS entry from the 'Library' section to the 'Extension Modules' section.

Leaving open for the 2.7 backport.
msg105342 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-05-08 22:50
On Sat, May 8, 2010 at 10:07 AM, Mark Dickinson <report@bugs.python.org> wrote:
..
> I definitely think this should go into 2.7 as well;  total_seconds has only seen the light of day
> in the beta releases so far, so it's unlikely we'd break anyone's code with this change.
>  (And IMO this is a bugfix, not a feature.)

Tracker web site seems to be down, so I am trying my luck attaching a
patch to an e-mail reply.  It did work in the past.

The true division by 10**6 could probably be replaced with division by
1e6 in the tests and docs, but I am not sure if the two are always
equivalent.
msg105378 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-05-09 09:36
Perfect!  Applied in r81020.

You're correct that n/10**6 and n/1e6 aren't the same thing, at least for n large enough:

Python 2.7b2+ (trunk:81019:81020, May  9 2010, 10:33:17) 
[GCC 4.2.1 (Apple Inc. build 5659)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import division
[35402 refs]
>>> (2**53+1)/10**6
9007199254.740993
[35404 refs]
>>> (2**53+1)/1e6
9007199254.740992
[35404 refs]

In the second case, 2**53+1 first gets converted to a float, and then the division is performed, so there are two points at which a rounding error can be introduced.  The first case only involves one rounding error.
History
Date User Action Args
2010-05-09 09:36:28mark.dickinsonsetstatus: open -> closed
resolution: accepted -> fixed
messages: + msg105378

stage: patch review -> resolved
2010-05-08 22:56:37belopolskysetfiles: + issue8644.diff
2010-05-08 22:54:27belopolskysetfiles: - issue8644.diff
2010-05-08 22:50:20belopolskysetfiles: + issue8644.diff

messages: + msg105342
2010-05-08 14:36:57mark.dickinsonsetresolution: accepted
versions: - Python 3.2
2010-05-08 14:36:40mark.dickinsonsetmessages: + msg105294
2010-05-08 14:08:12mark.dickinsonsetnosy: + pitrou
messages: + msg105291
2010-05-08 14:07:33mark.dickinsonsetmessages: + msg105290
2010-05-08 01:47:56belopolskysetfiles: - issue8644-py3k.diff
2010-05-08 01:47:47belopolskysetfiles: + issue8644-py3k.diff
2010-05-08 01:45:44belopolskysetfiles: - issue8644-py3k.diff
2010-05-08 01:45:30belopolskysetfiles: + issue8644-py3k.diff

messages: + msg105251
2010-05-07 21:38:27mark.dickinsonsetmessages: + msg105233
2010-05-07 21:33:17mark.dickinsonsetmessages: + msg105232
2010-05-07 16:36:03belopolskysetfiles: + issue8644-py3k.diff
keywords: + patch
messages: + msg105209

stage: needs patch -> patch review
2010-05-07 12:30:03mark.dickinsonsetmessages: + msg105198
2010-05-07 12:24:53mark.dickinsoncreate