New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fractions instantiation revisited #72902
Comments
I've spent a bit of time lately trying to optimize the instantiation of Fractions. This is related to bpo-22464, but instead of focusing on constructing Fractions from ints, my attempts revolve around improving instantiation from strings, floats and decimals. CHANGES TO INSTANTIATION FROM STRINGS:
combined this gives me the following performance changes for instantiation of Fractions from different arguments (profiled with 1,000,000 random inputs each): 'x/y'-type of strings: scientific notation (e.g., '1.234E-7'): integer strings (e.g. '123456'): from other Fractions (to test effect on instantiation of numbers.Rational): from int subclass (as another numbers.Rational): SUMMARY of this part + very significant speedup for instatiation from strings of different forms with (near) negligible effects on instantiation from numbers.Rational. + correct parsing of PEP-515-like number strings
CHANGES TO INSTANTIATION FROM FLOAT AND DECIMAL:
Resulting performance changes: standard constructor with float: from_float: standard constructor with decimal: from_decimal: SUMMARY of this part:
I hope at least some of this is interesting. |
Profiling give you only approximate results. In normal execution the effect of your changes can be opposite. Could you please provide benchmarks without using profiling? |
sure, I just happened to have the profiling available since I used it to optimize things. Here's similar microbenchmarks using perf: STRINGS $ python -m perf timeit -s "from fractions import Fraction" "Fraction('1.23456e-7')"
.....................
Median +- std dev: 17.0 us +- 0.3 us
$ python -m perf timeit -s "from frc import Fraction" "Fraction('1.23456e-7')"
.....................
Median +- std dev: 8.95 us +- 0.16 us
$ python -m perf timeit -s "from fractions import Fraction" "Fraction('234/567')"
.....................
Median +- std dev: 12.6 us +- 0.1 us
$ python -m perf timeit -s "from frc import Fraction" "Fraction('234/567')"
.....................
Median +- std dev: 5.45 us +- 0.16 us
$ python -m perf timeit -s "from fractions import Fraction" "Fraction('123456')"
.....................
Median +- std dev: 12.4 us +- 0.6 us
$ python -m perf timeit -s "from frc import Fraction" "Fraction('123456')"
.....................
Median +- std dev: 5.77 us +- 0.12 us
$ python -m perf timeit -s "from fractions import Fraction; f=Fraction(3/4)" "Fraction(f)"
.....................
Median +- std dev: 4.36 us +- 0.06 us
$ python -m perf timeit -s "from frc import Fraction; f=Fraction(3/4)" "Fraction(f)"
.....................
Median +- std dev: 4.59 us +- 0.07 us
$ python -m perf timeit -s "from fractions import Fraction" -s "class myInt(int): pass" -s "i=myInt(123456)" "Fraction(i)"
.....................
Median +- std dev: 4.04 us +- 0.07 us
$ python -m perf timeit -s "from frc import Fraction" -s "class myInt(int): pass" -s "i=myInt(123456)" "Fraction(i)"
.....................
Median +- std dev: 4.27 us +- 0.06 us FLOATS $ python -m perf timeit -s "from fractions import Fraction" "Fraction(1.23456e-7)"
.....................
Median +- std dev: 6.30 us +- 0.28 us
$ python -m perf timeit -s "from frc import Fraction" "Fraction(1.23456e-7)"
.....................
Median +- std dev: 8.64 us +- 0.13 us
$ python -m perf timeit -s "from fractions import Fraction" "Fraction.from_float(1.23456e-7)"
.....................
Median +- std dev: 8.68 us +- 0.14 us
$ python -m perf timeit -s "from frc import Fraction" "Fraction.from_float(1.23456e-7)"
.....................
Median +- std dev: 3.37 us +- 0.17 us DECIMALS (using C implementation this time) $ python -m perf timeit -s "from fractions import Fraction; from decimal import Decimal; d=Decimal('123456')" "Fraction(d)".....................
Median +- std dev: 6.95 us +- 0.21 us
$ python -m perf timeit -s "from frc import Fraction; from decimal import Decimal; d=Decimal('123456')" "Fraction(d)"
.....................
Median +- std dev: 8.43 us +- 0.17 us
$ python -m perf timeit -s "from fractions import Fraction; from decimal import Decimal; d=Decimal('123456')" "Fraction.from_decimal(d)"
.....................
Median +- std dev: 11.6 us +- 0.2 us
$ python -m perf timeit -s "from frc import Fraction; from decimal import Decimal; d=Decimal('123456')" "Fraction.from_decimal(d)"
.....................
Median +- std dev: 4.14 us +- 0.28 us |
Might it make sense to make instantiation from numbers.Rational duck typing based as well? Just try to get the numerator and denominator attributes, on AttributeError, skip it and move on. Unless there is some concern that a non-Rational type might have both attributes and not intend them to mean it's a Rational? |
Similarly, type checking for int might be replaced with calling operator.index on the input and handling the TypeError; that way, anything that has declared itself logically int-like is handled without explicit type checking at all. |
No, I don't think the numeric tower ABC should be replaced by duck-typing. One of the very reasons the fractions module exists is that it showcases how to use the numeric tower. If you want a class to be picked up as a Rational it should be registered as such. Pretty much the same goes for integer-like things. The direct type check for ints only exists to speed up the most obvious case, but anything integer-like should be registered as a numbers.Integral and it will just work with fractions. Beautiful stuff! This is not the case for as_integer_ratio because that method is not defined in the numeric tower and this is why I think duck-typing could be of interest here (I'm not sure though whether its worth the somewhat decreased performance). My reasoning is that the standard constructor is anyway slower for floats and ints than the specialized classmethods for the two so people who really care about speed here (probably not too many) can just use the classmethods. |
This would allow the numpy As far as I can tell, Perhaps an |
Not sure if it's relevant for this specific change, but here's a benchmark that you could use for Fractions: bpo-22458 |
Just FYI and as further motivation, I reimplemented this dedicated parser for quicktions (in Cython, so the timings and speedups are not comparable). I was able to get another quite visible improvement by caching the values of "10 ** shift" for 0 <= shift < 58 in a tuple. Higher values are very unlikely in practice, and the memory size of a tuple with 58 values gives a nice multiple of the usual cache line size. (I originally stored 64 values in my commit but then cut it down later.) I suspect that the difference won't be as big for the Python implementation, but it still seems worth a try. The overall speedup that I got, compared to the initial regex implementation, is 50-70%. [regex] $ python3.7 -m timeit -s 'from quicktions import Fraction as F' 'F("153456/789344")' [regex] $ python3.7 -m timeit -s 'from quicktions import Fraction as F' 'F("15.3456789E+4")' It could be even higher if I additionally moved the int() integer parsing into Cython. Might try that at some point. But that's also not something that concerns the implementation in CPython. |
[Eric Wieser]
I'm confused: as far as I can tell, the NumPy floating-point types don't implement |
Yes, exactly. Conversely, there's little gain in implementing it until |
FYI, adding int.as_integer_ratio() is being taken care in a separate issue: https://bugs.python.org/issue33073 . I've set it aside for a beginning core-dev mentee to work on because it is simple and self-contained. Pretty much all the related work can be done here. |
See https://discuss.python.org/t/pep-3141-ratio-instead-of-numerator-denominator/2037/24?u=jdemeyer for a proposal to define a new dunder __ratio__ (instead of as_integer_ratio) for this. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: