Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

timedelta multiply and divide by floating point #42364

Closed
agthorr mannequin opened this issue Sep 12, 2005 · 33 comments
Closed

timedelta multiply and divide by floating point #42364

agthorr mannequin opened this issue Sep 12, 2005 · 33 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@agthorr
Copy link
Mannequin

agthorr mannequin commented Sep 12, 2005

BPO 1289118
Nosy @tim-one, @rhettinger, @mdickinson, @abalkin, @vstinner
Files
  • dt.diff
  • issue1289118-py3k.diff
  • timedelta_arith.py: Python reference implementation for timedelta * float, timedelta / float
  • issue1289118-nodoc.diff: Code + tests, round to nearest even
  • issue1289118+issue8817-nodoc.diff: Use _PyLong_Divmod_Near
  • issue1289118+issue8817-withdoc.diff: Updated "Use _PyLong_Divmod_Near"
  • issue1289118-withdoc.diff: Updated patch after issue8817 patch commited
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/abalkin'
    closed_at = <Date 2010-05-31.19:01:14.720>
    created_at = <Date 2005-09-12.21:41:10.000>
    labels = ['type-feature', 'library']
    title = 'timedelta multiply and divide by floating point'
    updated_at = <Date 2010-05-31.19:01:14.719>
    user = 'https://bugs.python.org/agthorr'

    bugs.python.org fields:

    activity = <Date 2010-05-31.19:01:14.719>
    actor = 'belopolsky'
    assignee = 'belopolsky'
    closed = True
    closed_date = <Date 2010-05-31.19:01:14.720>
    closer = 'belopolsky'
    components = ['Library (Lib)']
    creation = <Date 2005-09-12.21:41:10.000>
    creator = 'agthorr'
    dependencies = []
    files = ['8372', '17419', '17426', '17450', '17460', '17464', '17476']
    hgrepos = []
    issue_num = 1289118
    keywords = ['patch']
    message_count = 33.0
    messages = ['26263', '26264', '26265', '26266', '26267', '26268', '55569', '55570', '78185', '103761', '103764', '103770', '103772', '103774', '103775', '106166', '106186', '106216', '106220', '106223', '106378', '106381', '106383', '106387', '106388', '106389', '106421', '106432', '106435', '106439', '106492', '106733', '106800']
    nosy_count = 7.0
    nosy_names = ['tim.peters', 'mcherm', 'rhettinger', 'mark.dickinson', 'belopolsky', 'vstinner', 'stutzbach']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue1289118'
    versions = ['Python 3.2']

    @agthorr
    Copy link
    Mannequin Author

    agthorr mannequin commented Sep 12, 2005

    In python 2.4.1, the datetime.timedelta type allows for
    the multiplication and division by integers. However,
    it raises a TypeError for multiplication or division by
    floating point numbers. This is a counterintutive
    restriction and I can't think of any good reason for it.

    For example:

    >>> import datetime
    >>> datetime.timedelta(minutes=5)/2
    datetime.timedelta(0, 150)
    >>> datetime.timedelta(minutes=5)*0.5
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: unsupported operand type(s) for *:
    'datetime.timedelta' and 'float'

    @agthorr agthorr mannequin added stdlib Python modules in the Lib dir labels Sep 12, 2005
    @rhettinger
    Copy link
    Contributor

    Logged In: YES
    user_id=80475

    Tim, do you prefer the current behavior?

    @mcherm
    Copy link
    Mannequin

    mcherm mannequin commented Sep 15, 2005

    Logged In: YES
    user_id=99874

    I, too, would like to know what Tim thinks, but for what it's
    worth (not much) I find Daniel's point fairly convincing...
    multiplication by floats is an operation that makes sense, has
    only one possible obvious meaning, and is not particularly
    likely to cause errors (the way multiplying Decimal's with
    floats does). So IF it's easy to implement, I say go for it.

    @tim-one
    Copy link
    Member

    tim-one commented Sep 15, 2005

    Logged In: YES
    user_id=31435

    timedelta arithmetic is 100% portable now, and wholly
    explainable in terms of universally understood integer
    arithmetic. Throw floats into it, and that's lost.

    That said, I don't have a strong objection to complicating the
    implementation if there _are_ strong use cases. The OP's
    example isn't "a use case": it's not worth anything to let
    someone multiply a timedelta by 0.5 instead of dividing by 2.
    I don't have a use case to offer in its place (never felt a need
    here).

    If someone wants to work on it, note that a timedelta can
    contain more than 53 bits of information, so, e.g., trying to
    represent a timedelta as an IEEE double-precision number of
    microseconds can lose information. This makes a high-
    qualty "computed as if to infinite precision with one rounding
    at the end" implementation of mixed datetime/float arithmetic
    tricky to do right.

    @agthorr
    Copy link
    Mannequin Author

    agthorr mannequin commented Sep 15, 2005

    Logged In: YES
    user_id=6324

    Let me elaborate on the use-case where I originally ran into
    this.

    I'm conducting a series of observation experiments where I
    measure the duration of an event. I then want to do various
    statistical analysis such as computing the mean, median,
    etc. Originally, I tried using standard functions such as
    lmean from the stats.py package. However, these sorts of
    functions divide by a float at the end, causing them to fail
    on timedelta objects. Thus, I have to either write my own
    special functions, or convert the timedelta objects to
    integers first (then convert them back afterwards).

    Basically, I want timedelta objects to look and act like
    fixed-point arithmetic objects so that I can reuse other
    functions on them that were originally developed to operate
    on floats. More importantly, I'd rather not have to
    maintain two different versions of the functions to deal
    with different types.

    For implementation, why not multiply the float times .day,
    .seconds, and .microseconds separately, then propagate and
    fraction parts into the next smallest group (e.g., 0.5 days
    becomes 24*60*60*0.5 seconds).

    I agree it'd be possible to lose information with the wrong
    sequence of operations, but that's always the case when
    using floating point numbers. In other words, that, too, is
    what I would expect from the timedelta implementation.

    @smontanaro
    Copy link
    Contributor

    Logged In: YES
    user_id=44345

    > Thus, I have to either write my own special functions, or convert
    > the timedelta objects to integers first (then convert them back
    > afterwards).

    How about adding tolong() that returns the number of microseconds
    in the timedelta and fromlong() that accepts a long representing
    microseconds and returns a timedelta object? That way the timedelta
    object does a reasonably simple thing and the user is still responsible
    for overflow the normal arithmetic stuff. You can do any sort of
    arithmetic operations on the long (including converting to other
    numeric types) with all the attendant caveats, then convert back to
    a timedelta object at the end.

    @smontanaro
    Copy link
    Contributor

    Attached is a diff to the datetime module that
    implements floating point division. Comments?
    Is it worthwhile to pursue? If so, I'll
    implement the other floating point arithmetic
    operations.

    @smontanaro
    Copy link
    Contributor

    Ummm... make that: "I'll implement multiplication."

    @vstinner
    Copy link
    Member

    I like this idea, it's the opposite of the issue bpo-2706.

    @AlexanderBelopolsky
    Copy link
    Mannequin

    AlexanderBelopolsky mannequin commented Apr 20, 2010

    This is in a way more similar to bpo-1083 than to bpo-2706. I am -1 on this RFE for the same reason as I am opposing allowing true division of timedelta by an int. The timedelta type is fundamentally an integer type. A type delta is just a certain number of microseconds. A timedelta divided by a number or multiplied by a float is logically a fractional number of microseconds and python does not have a type to represent it.

    Daniel's use case of passing timedeltas to a statistical packages is neatly addressed by bpo-2706's timedelta / timedelta (true) division. Just strip the dimensionality from your data by dividing each time delta by a chosen unit interval (depending on the problem, a second, a microsecond or even a day may be appropriate). The result will be a set of floats that your number crunching package will be happy to process.

    Another advantage of this approach is that floats can be processed more efficiently than timedeltas with FP arithmetics and intermediate results will be more accurate in most cases.

    I recommend accepting bpo-2706 and rejecting this issue together with bpo-2706.

    @AlexanderBelopolsky AlexanderBelopolsky mannequin added type-feature A feature request or enhancement labels Apr 20, 2010
    @AlexanderBelopolsky
    Copy link
    Mannequin

    AlexanderBelopolsky mannequin commented Apr 20, 2010

    I meant rejecting bpo-1083, of course.

    @mdickinson
    Copy link
    Member

    The timedelta type is fundamentally an integer type.

    I disagree strongly with this, and find this a bizarre point of view. Regardless of how the timedelta is stored internally, it's used to represent physical times. I doubt there are many applications that care about the fact that each timedelta is an integral number of microseconds.

    Multiplication or division of a time by a float or int makes perfect sense physically, and I think it should be a legal operation here.

    @mdickinson
    Copy link
    Member

    Not sure why this is marked for 3.3.

    @mdickinson
    Copy link
    Member

    I'll take a look at Skip's patch.

    @mdickinson
    Copy link
    Member

    Whoops. I meant to assign this to me, not Skip.

    @mdickinson mdickinson assigned mdickinson and unassigned smontanaro Apr 20, 2010
    @mdickinson
    Copy link
    Member

    Sorry, dropping this again. I've got caught up with too many non-datetime related issues.

    @abalkin
    Copy link
    Member

    abalkin commented May 20, 2010

    dt.diff does not apply to current SVN version anymore. I am attaching a quick update that does not change the actual calculation performed. See bpo-1289118-py3k.diff.

    I am still -1 for the reason I stated before, but I would like to review a working patch first before proposing a resolution.

    @mdickinson
    Copy link
    Member

    Alexander, I still don't understand your objection. What's the downside of allowing the multiplication or division of a timedelta by a float?

    Perhaps it's true that there are applications where timedeltas are best viewed as integers (with an implicitt 'microsecond' unit), but I think it's also true that there are plenty of applications where they're just regarded as a representation of a physical quantity, and there this proposal seems entirely appropriate.

    I *would* want the timedelta * float and timedelta / float operations to be correctly rounded, so that behaviour is entirely predictable; the current patch doesn't do that. But it wouldn't be hard to implement: there are functions available to express a float as a quotient of two integers, and after that the computation can be performed in integer arithmetic.

    @mdickinson
    Copy link
    Member

    Python reference implementation showing how to do correct rounding.

    @mdickinson
    Copy link
    Member

    N.B. There's already logic for doing div_nearest (i.e., divide one integer by another, returning the closest integer to the result) in the long_round function in Objects/longobject.c. It might be worth pulling that logic out and making it available in a _Py function so that it can be reused in other modules.

    @abalkin
    Copy link
    Member

    abalkin commented May 24, 2010

    I am attaching a patch that implements Mark's timedelta_arith.py algorithms in C.

    With rounding well defined, I am close to withdrawing my opposition to supporting mixed timedelta with float operations. There is still one issue that I believe is worth discussing before this feature is accepted. Time, unlike most other physical quantities has a non-arbitrary notion of direction. Therefore, in many applications, rounding towards past or towards future may be preferable to rounding to nearest.

    For example, one of the likely applications of floating point division would be to construct time series from continuous or differently sampled functions. If such series are used to measure correlations between cause and effect, it is important that effect is measured at a time following the cause and not at an early or the same moment.

    As Mark noted in private correspondence, this issue is mitigated by the fact that "with correct rounding, for timedeltas t and s, and a positive float x, it is guaranteed that t <= s implies t op x <= s op x" (where op is either * or /). It is still possible however, that even the case of t < s and t op x == s op x present a problem in some applications.

    Despite this issue, I would support round to nearest even choice over round to past or to future mainly because it is less likely to lead to surprises where (d1/d2) * d2 != d1. This choice also conforms with the round() builtin definition and is somewhat more difficult to implement right using existing means.

    Daniel, would you like to chime in on the questions of how the results of these operations should be rounded?

    If I don't hear principle objections from the "nosy" list, I'll add a documentation patch.

    @stutzbach
    Copy link
    Mannequin

    stutzbach mannequin commented May 24, 2010

    I don't have a strong feeling about the method of rounding. My thinking is: If my application is sensitive to how the last microsecond is rounded, then I shouldn't be using a type that only gives me 1-microsecond precision.

    (Likewise, if my application is sensitive to how the last binary digital of the floating point mantissa is rounded ... I'm in trouble)

    That said, round-to-nearest strikes me as the least-surprising approach.

    @mdickinson
    Copy link
    Member

    Re rounding: I'll just note that timedelta / timedelta -> float currently does round to nearest; I'd find it quite surprising if float * timedelta -> timedelta didn't round to nearest.

    @abalkin
    Copy link
    Member

    abalkin commented May 24, 2010

    It looks like we have a consensus on the rounding mode. Note, however
    that timedelta constructor rounds away from zero at least on
    Intel/MacOS X:

    [-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

    Should this be considered a bug?

    For comparison,

    [-10, -8, -8, -6, -6, -4, -4, -2, -2, 0, 0, 2, 2, 4, 4, 6, 6, 8, 8, 10]

    [-10, -8, -8, -6, -6, -4, -4, -2, -2, 0, 0, 2, 2, 4, 4, 6, 6, 8, 8, 10]

    @mdickinson
    Copy link
    Member

    Alexander, it looks like Roundup ate some of your message there. :)

    Yes, ideally I'd say that the constructor should be doing round-half-to-even. Though last time I looked, the constructor looked quite complicated (especially for float inputs); it may not be feasible to fix this easily.

    At any rate, we should open a separate issue for this.

    @abalkin
    Copy link
    Member

    abalkin commented May 24, 2010

    Indeed. Here is what I intended:

    """
    >>> from datetime import timedelta as d
    >>> [d(microseconds=i + .5)//d.resolution for i in range(-10,10)]
    [-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

    Should this be considered a bug?

    For comparison,

    >>> [d.resolution*(i+0.5)//d.resolution for i in range(-10,10)]
    [-10, -8, -8, -6, -6, -4, -4, -2, -2, 0, 0, 2, 2, 4, 4, 6, 6, 8, 8, 10]
    
    and
    >>> [round(i+0.5) for i in range(-10,10)]
    [-10, -8, -8, -6, -6, -4, -4, -2, -2, 0, 0, 2, 2, 4, 4, 6, 6, 8, 8, 10]
    """

    I checked the documentation and while it says: "If any argument is a float and there are fractional microseconds, the fractional microseconds left over from all arguments are combined and their sum is rounded to the nearest microsecond." it does not specify how half-integers should be handled.

    While it may not be a bug in strict sense, it looks like the code in question can be improved. I'll open a separate issue for this.

    @mdickinson
    Copy link
    Member

    By the way, does your patch do the right thing for

    timedelta(microseconds=1) / -4.0

    ? Because my Python code doesn't. :) [If n is negative, then the 2*r > n condition in div_nearest should be 2*r < n instead.]

    @mdickinson
    Copy link
    Member

    There's a patch in bpo-8817 that exposes a round-to-nearest form of divmod in a function called _PyLong_Divmod_Near; this would save on duplication of code.

    @abalkin
    Copy link
    Member

    abalkin commented May 25, 2010

    By the way, does your patch do the right thing for

    timedelta(microseconds=1) / -4.0

    No.

    >>> timedelta(microseconds=1) / -4.0
    datetime.timedelta(-1, 86399, 999999)

    (I just copied your python algorithm ...)

    I will merge with bpo-8817 patch and that should fix the problem.

    @abalkin
    Copy link
    Member

    abalkin commented May 25, 2010

    Attaching a combined bpo-1289118 + bpo-8817 patch. Datetime code now uses bpo-8817's _PyLong_Divmod_Near.

    @abalkin
    Copy link
    Member

    abalkin commented May 25, 2010

    Attaching a new patch with documentation changes, additional tests, updated bpo-8817 patch and a reference leak fix.

    @mdickinson
    Copy link
    Member

    The patch looks good to me. Please replace the tab characters in datetimemodule.c with spaces, though. :)

    @abalkin
    Copy link
    Member

    abalkin commented May 31, 2010

    Committed in r81625. Fixed white space and added a note to "new in 3.2" section of the RST doc.

    @abalkin abalkin closed this as completed May 31, 2010
    @abalkin abalkin closed this as completed May 31, 2010
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants