Message 249300 - Python tracker



Author	tim.peters
Recipients	aconrad, belopolsky, larry, mark.dickinson, r.david.murray, tbarbugli, tim.peters, trcarden, vivanov, vstinner
Date	2015-08-28.22:02:46
Message-id	<1440799367.82.0.530507114927.issue23517@psf.upfronthosting.co.za>

Content

> I wish we could use the same algorithm in
> datetime.utcfromtimestamp as we use in float
> to string conversion.  This may allow the
> following chain of conversions to round trip in most cases:
>
> float literal -> float -> datetime -> seconds.microseconds string

I don't follow.  float->string produces the shortest string that reproduces the float exactly.  Any flavor of changing a timestamp to a microsecond-precision datetime is essentially converting a float * 1e6 to an integer - there doesn't seem to be a coherent concept of "shortest integer" that could apply.  We have to fill every bit a datetime has.
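To make the "shortest string" point concrete, here is a small sketch using only `repr`, `float` and `decimal.Decimal`. The timestamp value is just an illustrative one picked for this sketch, not something taken from the message above.

```python
from decimal import Decimal

ts = 1424817268.274          # illustrative timestamp (an assumption of this sketch)

# repr() picks the shortest decimal string that converts back to the same float.
shortest = repr(ts)
round_trips = float(shortest) == ts

# The float itself is a binary fraction; Decimal(ts) shows its exact value,
# which is not the same number as the decimal literal we typed.
exact_binary_value = Decimal(ts)
typed_decimal_value = Decimal(shortest)
literal_is_exact = exact_binary_value == typed_decimal_value
```

The shortest-string idea works because many decimal strings map to one float and we may pick any of them. Going from a float to an integer count of microseconds offers no such freedom: there is exactly one integer slot to fill.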

A variant of the code I posted could be "good enough":  take the result we get now (truncate float*1e6).  Also add 1 ulp to the float and do that again.  If the results are the same, we're done.  If the results are different, and the difference is 1, take the second result.  Else keep the first result.  What this "means" is that we're rounding up if and only if the original is so close to the boundary that the tiniest possible amount of floating-point noise is all that's keeping it from giving a different result - but also that the float "has enough bits" to represent a 1-microsecond difference (which is true of current times, but in a couple centuries will become false).
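The 1-ulp variant described above could be sketched roughly as follows. This is not the code posted earlier in the issue; the function names are made up for illustration, splitting off the fractional second with `math.modf` before scaling is an assumption about how the conversion is structured, `math.nextafter` needs Python 3.9 or later, and only non-negative timestamps are considered.

```python
import math

def total_micros_truncated(ts):
    """Whole microseconds in ts, dropping any fractional microsecond.

    Hypothetical helper: splits off the fractional second first, then
    truncates fraction * 1e6 toward zero.
    """
    frac, whole = math.modf(ts)
    return int(whole) * 1_000_000 + int(frac * 1e6)

def total_micros_ulp_nudged(ts):
    """Truncate, but take the next microsecond if a 1-ulp bump reaches it."""
    first = total_micros_truncated(ts)
    # "Add 1 ulp": move to the next representable float above ts.
    second = total_micros_truncated(math.nextafter(ts, math.inf))
    # Only accept the bumped result when it differs by exactly one microsecond.
    # A larger gap means the float can no longer resolve single microseconds.
    return second if second - first == 1 else first

ts = 1424817268.274                      # illustrative value, slightly below .274 as a float
plain = total_micros_truncated(ts)       # ends in ...273999
nudged = total_micros_ulp_nudged(ts)     # ends in ...274000
```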

But that's all nuts compared to just rounding float*1e6 to the nearest int, period.  There's nothing wrong with doing that.  Truncating is propagating the tiniest possible binary fp representation error all the way into the microseconds.  It would be defensible _if_ we were using base-10 floats (in which "representation error" doesn't occur for values expressed _in_ base 10).  But we're not.  Truncating a base-2 float _as if_ it were a base-10 float is certain to cause problems.  Like the one this report is about ;-)
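A minimal sketch of the difference between truncating and rounding to nearest, under the same assumptions as before: the helper name is hypothetical, the fractional second is split off with `math.modf`, timestamps are non-negative, and the sample value is only illustrative. Python's built-in `round` is used here, which resolves exact ties to the even integer.

```python
import math

def to_seconds_micros(ts, to_int):
    """Split ts into (seconds, microseconds) using to_int on fraction * 1e6.

    Hypothetical helper: pass int to truncate, round to round to nearest.
    """
    frac, whole = math.modf(ts)
    seconds, micros = int(whole), to_int(frac * 1e6)
    if micros == 1_000_000:
        # Rounding up can overflow the microsecond field; carry into seconds.
        seconds, micros = seconds + 1, 0
    return seconds, micros

ts = 1424817268.274                       # stored as a binary float just below .274
truncated = to_seconds_micros(ts, int)    # representation error leaks into the result
rounded = to_seconds_micros(ts, round)    # recovers the microsecond that was typed
```

Truncation turns an error far smaller than a microsecond into a full microsecond, while rounding keeps the error within half a microsecond of the float's true value.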

History
Date	User	Action	Args
2015-08-28 22:02:47	tim.peters	set	recipients: + tim.peters, mark.dickinson, belopolsky, vstinner, larry, r.david.murray, aconrad, vivanov, tbarbugli, trcarden
2015-08-28 22:02:47	tim.peters	set	messageid: <1440799367.82.0.530507114927.issue23517@psf.upfronthosting.co.za>
2015-08-28 22:02:47	tim.peters	link	issue23517 messages
2015-08-28 22:02:46	tim.peters	create