Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fractions benchmark #66648

Closed
scoder opened this issue Sep 22, 2014 · 22 comments
Closed

Add fractions benchmark #66648

scoder opened this issue Sep 22, 2014 · 22 comments
Labels
performance Performance or resource usage type-feature A feature request or enhancement

Comments

@scoder
Copy link
Contributor

scoder commented Sep 22, 2014

BPO 22458
Nosy @pitrou, @scoder, @vstinner, @asvetlov, @skrah, @serhiy-storchaka
Files
  • telco_fractions.py: telco benchmark adapted to use Fraction instead of Decimal
  • bm_telco_fractions.py: telco benchmark adapted to use Fraction, with option to use Decimal
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-09-13.09:04:06.663>
    created_at = <Date 2014-09-22.07:10:30.707>
    labels = ['type-feature', 'performance']
    title = 'Add fractions benchmark'
    updated_at = <Date 2016-09-13.09:04:06.661>
    user = 'https://github.com/scoder'

    bugs.python.org fields:

    activity = <Date 2016-09-13.09:04:06.661>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-09-13.09:04:06.663>
    closer = 'vstinner'
    components = ['Benchmarks']
    creation = <Date 2014-09-22.07:10:30.707>
    creator = 'scoder'
    dependencies = []
    files = ['36684', '36685']
    hgrepos = []
    issue_num = 22458
    keywords = []
    message_count = 22.0
    messages = ['227252', '227256', '227264', '227265', '227266', '227269', '227276', '227280', '227285', '227291', '273928', '274135', '274184', '274185', '274186', '274187', '274234', '274244', '274256', '274264', '274268', '276217']
    nosy_count = 6.0
    nosy_names = ['pitrou', 'scoder', 'vstinner', 'asvetlov', 'skrah', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'third party'
    stage = None
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue22458'
    versions = []

    @scoder
    Copy link
    Contributor Author

    scoder commented Sep 22, 2014

    Fractions are great for all sorts of exact computations (including money/currency calculations), but are quite slow due to the need for normalisation at instantiation time.

    I adapted the existing telco benchmark to use Fraction instead of Decimal to make this problem more visible. One change I made was to take the data reading out of the measured loop. I/O is not part of what the benchmark should measure.

    Please consider adding it to the benchmark suite.

    @scoder scoder added performance Performance or resource usage type-feature A feature request or enhancement labels Sep 22, 2014
    @scoder
    Copy link
    Contributor Author

    scoder commented Sep 22, 2014

    I just thought that it might also be nice to have a direct comparison with Decimal, so here's an updated benchmark that has an option "--use-decimal" to run the same code with Decimal instead of Fraction.

    Decimal is about 66x faster with Py3.4 on my side (due to the C implementation).

    @pitrou
    Copy link
    Member

    pitrou commented Sep 22, 2014

    Where are Fractions used in the real world?

    @scoder
    Copy link
    Contributor Author

    scoder commented Sep 22, 2014

    As I said, where ever exact calculations are needed. I use them for currency calculations, for example, as they inherently avoid rounding errors during the calculations regardless of the relative size of values. They are basically like Decimal but with arbitrary precision.

    @pitrou
    Copy link
    Member

    pitrou commented Sep 22, 2014

    Le 22/09/2014 14:51, Stefan Behnel a écrit :

    I use them for currency calculations, for example,
    as they inherently avoid rounding errors during the
    calculations regardless of the relative size of values.

    Do other people use them for that purpose, or are you
    the only one?

    @scoder
    Copy link
    Contributor Author

    scoder commented Sep 22, 2014

    I admit that I keep meeting developers who are not aware of their merits, simply because many other programming languages don't have them available. Specifically, many developers with a Java background firmly believe that BigDecimal is the only way to do money calculations. That's clearly wrong, and the heap of (Big-)Decimal bugs in their software that I've been fixing over the years shows just how wrong it is. Using Fractions would have avoided most of them. Decimal is surprisingly difficult to use right, Fractions are actually difficult to get wrong. They are a major feature of the standard library.

    Your question is another good indication that the general awareness for them still needs to be raised. I'm not entirely sure how this relates to this ticket, though. I doubt that you are seriously suggesting that I am the only person using a module in the stdlib, so what exactly are you trying to say?

    @pitrou
    Copy link
    Member

    pitrou commented Sep 22, 2014

    My point is that if fractions are little used right now, there's little point in adding a benchmark about them in the "official" benchmark suite. The benchmark suite does not aim at measuring every possible aspect of Python performance, but at showcasing representative workloads.

    @mdickinson
    Copy link
    Member

    I've always viewed the Fraction type as more of a teaching tool than something intended for real-world calculations. If that's not how others see it, then it might be worth investing some effort in reimplementing Fractions.gcd in C: the Fraction type is almost unusably slow for serious work, and the gcd computations are a clear bottleneck. I've experimented in the past with a C-based implementation of Lehmer's algorithm for gcd computation, getting around a ten-fold speedup.

    @scoder
    Copy link
    Contributor Author

    scoder commented Sep 22, 2014

    Speed improvements for fractions should have their own ticket(s). I created bpo-22464 for this.

    @serhiy-storchaka
    Copy link
    Member

    Where are Fractions used in the real world?

    For example Andrew Svetlov uses them in his online game [1]. May be it will explain this in detail.

    [1] http://asvetlov.blogspot.com/2012/08/numerics.html (on Russian).

    @vstinner
    Copy link
    Member

    Can you please send a pull request to the new https://github.com/python/performance project?

    @scoder
    Copy link
    Contributor Author

    scoder commented Sep 1, 2016

    Done:
    python/pyperformance#10

    Note that I didn't replace the existing telco benchmark as it is more specific to Decimal. The new benchmark makes it possible to compare the decimal and fractions modules for similar operations, though.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Sep 1, 2016

    I think string conversion should be part of this benchmark, or it should be renamed to fraction-arith or something.

    The formatting function can be a significant bottleneck, and if you use Fractions for financial calculations the result will still need to be printed in decimal floating point form.

    That said, is the purpose of this benchmark to show that fractions are slow and generate interest in changing the situation?

    @vstinner
    Copy link
    Member

    vstinner commented Sep 1, 2016

    I think string conversion should be part of this benchmark, or it should be renamed to fraction-arith or something.

    The "telco" benchmark seems to be standard and well defined. I agree
    that we should use a different name.

    > That said, is the purpose of this benchmark to show that fractions
    are slow and generate interest in changing the situation?

    I'm not really convinced that the benchmark is useful in the Python
    benchmark suite, but I'm not strongly opposed to adding it neither :-)

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Sep 1, 2016

    I'm also not opposed to adding it (-0.000) as long as we rename it to bm_fractions.py and change the docstring to "based on the telco benchmark".

    @vstinner
    Copy link
    Member

    vstinner commented Sep 1, 2016

    @Stefan Krah: Please review the pull request.

    python/pyperformance#10

    I suggested to only run the benchmark with the fractions module. I don't see the point of running the benchmark with the decimal module.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Sep 2, 2016

    I've left comments on GitHub.

    [scoder]

    As I said, where ever exact calculations are needed..

    Even if the formatting comment is addressed, the main problem with
    this benchmark is that *both* fraction and decimal calculations
    are in fact exact here.

    You can check that by printing out the values for "t" in the loop
    like I suggested on GitHub.

    This is of course one of the main features of Decimal: If you're
    willing to take the penalty of a large precision, calculations
    can often be exact; similar to taking the penalty of potentially
    huge numerators and denominators in fractions.

    So this benchmark cannot be used to show the superiority of exact
    fractions.

    @scoder
    Copy link
    Contributor Author

    scoder commented Sep 2, 2016

    So this benchmark cannot be used to show the superiority of exact fractions.

    I don't see how a benchmark would be a way to show that. It's certainly not the goal of this benchmark to show that one is computationally better than the other. But if a benchmark for one part of Python allows a direct, reasonable speed comparison with another, why would you object to make use of that?

    is the purpose of this benchmark to show that fractions are slow and generate interest in changing the situation?

    There have been a couple of attempts to make them faster already. The speed improvements in Py3.5 and 3.6 directly result from those changes. Thus, making these improvements visible and comparable across Python versions and different implementations is one of the goals of this benchmark.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Sep 2, 2016

    The reason is that the benchmark is a softball for fractions. Using the fastest fraction implementation (gmpy.mpq) and the best printing method for both types, gmpy.mpq seems to be about 2.5 to 3 times slower than decimal.

    It is however easy to generate benchmarks where mpq is 3000 times slower, depending on how many prime factors accumulate in the denominator.

    Fractions will shine in most benchmarks where decimal is exact anyway.

    @vstinner
    Copy link
    Member

    vstinner commented Sep 2, 2016

    Stefan Krah added the comment:

    Fractions will shine in most benchmarks where decimal is exact anyway.

    According to Stefan's latest message, the purpose of the benchmark is
    not the compare decimal to fractions, but measure the performance of
    fractions for one Python implementation. For example, it can be used
    to check that fractions is faster in Python 3.6 than Python 3.4
    fractions.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Sep 2, 2016

    Which Stefan? :)

    Anyway, the first half of this issue was centered around the
    proposition that fractions are a "better decimal", and the
    latest pull request still has the "decimal backend" in it. :)

    Fast fractions have been around for a long time (Lisp/sbcl),
    but apparently they have never been widely used.

    A fraction-only benchmark is okay, but comparing fractions/decimals
    in one particular "official" benchmark is a bit strange.

    @vstinner
    Copy link
    Member

    The issue is not closed, but moved to
    python/pyperformance#10

    Please continue the discussion there. I'm now slowly trying to move benchmark issues to GitHub.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    performance Performance or resource usage type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants