Issue1682

Created on **2007-12-21 17:41** by **jyasskin**, last changed **2014-09-25 11:53** by **wolma**. This issue is now **closed**.

Files | ||||
---|---|---|---|---|

File name | Uploaded | Description | Edit | |

rational.patch | jyasskin, 2007-12-21 17:41 | |||

rational.patch | jyasskin, 2008-01-09 07:36 | V2, ported to 2.6 | ||

rational.patch | jyasskin, 2008-01-13 22:57 | V3, still minimal | ||

rational_tweaks.patch | jyasskin, 2008-01-18 04:05 | Some tweaks on top of the original implementation | ||

lehmer_gcd.py | mark.dickinson, 2008-02-19 19:31 | Revised Lehmer gcd algorithm | ||

lehmer_gcd_2.py | mark.dickinson, 2008-02-22 01:02 | Benchmark gcd implementations | ||

lehmer_gcd.patch | mark.dickinson, 2008-02-22 01:05 | |||

fraction_inline_arith.patch | jyasskin, 2008-02-29 06:37 |

Messages (92) | |||
---|---|---|---|

msg58949 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2007-12-21 17:41 | |

This is written against the 3.0 branch, but before submitting it, I need to backport numbers.py and submit it against 2.6 instead. I'm posting it now to get a head start on comments. The numbers.py part is necessary for this to work but will be a separate patch. |
|||

msg59424 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-01-07 01:31 | |

Some random comments: take with a large pinch of salt (1) In __init__ I think you want something like: self._numerator = int(numerator)//g instead of self._numerator = int(numerator / g) and similarly for the denominator. At the moment I get, for example: >>> Rational(10**23) rational.Rational(99999999999999991611392,1) (2) What's the reason for repr of a Rational being "rational.Rational(...)", while repr of a Decimal is just "Decimal(...)"? I'm not suggesting that either is wrong; just wondering whether there's some sort of guiding principle that would suggest one or the other. (3) My first thought on looking at the _gcd function was: "Shouldn't there be an abs() somewhere in there"; since the gcd of two (possibly negative) integers is often (usually?) defined as the largest *nonnegative* common divisor. But on closer examination it's clear that the result of _gcd(a, b) has the same sign as that of b (unless b == 0). So _gcd very cleverly chooses exactly the right sign so that the denominator after rescaling is positive. I'm not sure whether this is a happy accident or clever design, but either way it probably deserves a comment. (4) the __ceil__ operation could be spelt more neatly as return -(-a.numerator // a.denominator) ...but perhaps that just amounts to obfuscation :) (5) It looks as though two-argument round just truncates when the second argument is negative. Presmably this isn't what's wanted here? >>> round(Rational(26), -1) # expecting Rational(30, 1) rational.Rational(20,1) (6) The behaviour of the equality test is questionable when floats are involved. For example: >>> 10**23 == float(10**23) # expect False False >>> Rational(10**23) == float(10**23) # I'd also expect False here True Ideally, if x is a Rational and y is a float then I'd suggest that x == y should return True only when the numeric values really are equal. Coding this could be quite tricky, though. One option would be to convert the float to an (exactly equal) Rational first-- -there's code to do this in the version of Rational.py in the sandbox. (7) For purely selfish reasons, I for one will be very happy if/when this makes it into 2.6/3.0: I use Python a lot for teaching (geometry, number theory, linear algebra, ...); it's natural to work with rational numbers in this context; and it'll be much easier to tell an interested student to just download Python than to tell them they also need gmp and gmpy, or some other third party package, just to try out the code examples I've given them. |
|||

msg59425 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-01-07 02:21 | |

Two more questions: (1) should a Rational instance be hashable? (2) Should "Rational(5,2) == Decimal('2.5')" evaluate to True or False? If Rationals are hashable and 2.5 == Rational(5, 2) == Decimal('2.5') then the rule that objects that compare equal should have equal hash is going to make life *very* interesting... |
|||

msg59582 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-01-09 07:36 | |

Thanks again for the excellent comments. __init__: good catch. repr(Rational): The rule for repr is "eval(repr(object)) == object". Unfortunately, that doesn't decide between the two formats, since both assume some particular import statements. I picked the one more likely to be unique, and I assume Decimal picked the shorter one. I can go either way. _gcd's sign: It's a happy accident for me. Possibly Sjoerd Mullender designed it that way. I've added a comment and a test. __ceil__: I like that implementation better. 2-argument round: Fixed and tested. equality: Very good point. I've stolen the sandbox code and added Rational.from_float() using it. I think I also need to make this change to the comparisons. hashing: oops, yes these should be hashable. Decimal cheats by comparing != to even floats that it's equal to, so I'm going to assume that they also want Rational(5,2) != Decimal('2.5'). The new patch is against 2.6. |
|||

msg59583 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2008-01-09 09:04 | |

FWIW, I'm -1 on moving this module to the standard library. There has been basically *zero* demand for something like this. Because rational implementations seem to be more fun to write than to use, I think there are some competing implementations. |
|||

msg59608 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-01-09 16:09 | |

Raymond, do you *always* have to be such a grinch? The mere existance of competing implementations proves that there's a need. Hopefully having one in the stdlib will inspire others to contribute improvements rather than starting from scratch. |
|||

msg59614 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2008-01-09 17:48 | |

Not sure I'm the grinch on this one. I thought PEPs on this were rejected long ago and no one complained afterwards. After years of watching newsgroup discussions and feature requests, I do not recall a single request or seeing a single problem that was better solved with a rational package. On python-help and the tutorial newsgroup, I've never seen a question about the existing module in the Tools directory or even a question on the topic. I think rational modules are ubiquitous for the same reason as Sudoku solvers -- they are cute, an easy exercise, and fun to write. There is some variation in how each chooses to make compromises so the denominators do not get unusably large. Also, there is some variation in sophistication of the GCD/LCD algorithm. I had thought the standards for inclusion in the standard library had risen quite a bit (best-in-class category killers and whatnot). ISTM, this is aspiring cruft -- I cannot remember encountering a real business or engineering problem that needed rational arithmetic (the decimal module seems to meet the rare need for high precision arithmetic). All that being said, maybe the module with serve some sort of educational purpose or will serve to show-off the numeric tower abstract base classes. |
|||

msg59615 - (view) | Author: Facundo Batista (facundobatista) * | Date: 2008-01-09 17:55 | |

The PEP 239 is Rejected (http://www.python.org/dev/peps/pep-0239/). If a Rational data type would be included in the stdlib, my recommendation is that first that PEP would be reopened and pushed until get accepted. Also, note the kind of questions Mark is asking here: the analysis and answer of those questions belongs to a PEP, not to a patch. We risk to not be very rational (1/2 wink) in these decisions. I'd close this patch as Rejected (as for the PEP), at least until the PEP gets accepted (if ever). |
|||

msg59633 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-01-09 23:10 | |

The rejection of PEP 239 was years ago. PEP 3141 was accepted; it includes adding a rational type to the stdlib, and Jeffrey is doing this with my encouragement. The motivation is to have at least one example of each type in our modest numeric tower in the stdlib. I'm sure there will be plenty of uses for rational.py, e.g. in education. Of course, if someone has a better candidate or a proposed improvement, speak up now! It looks like Mark's suggestions can be treated as code review; I don't think we need a new PEP. Note that the mere existence of PEP 239 again points out that the demand is not zero. |
|||

msg59635 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2008-01-10 00:04 | |

If this goes in, I have some recommendations: * Follow the decimal module's lead in assiduously avoiding float->rational conversions. Something like Rat.from_float(1.1) is likely to produce unexpected results (like locking in an inexact input and having an unexpectedly large denominator). * Likewise, follow decimal's lead in avoiding all automatic coercisions from floats: Rational(3,10) + 0.3 for example. The two don't mix. * Consider following decimal's lead on having a context that can limit the maximum size of a denominator. There are various strategies for handling a limit overflow including raising an exception or finding the nearest rational upto the max denominator (there are some excellent though complex published algorithms for this) or rounding the nearest fixed-base (very easy). I'll dig out my HP calculator manuals at some point -- they had worked-out a number of challenges with fractional arithmetic (including their own version of an Inexact tag). * Consider adding Decimal.from_rational and Rational.from_decimal. I believe these are both easy and can be done losslessly. * Automatic coercions to/from Decimal need to respect the active decimal context. For example the result of Rational(3,11) + Decimal('3.1415926') is context dependent and may not be commutative. * When in doubt, keep the API minimal so we don't lock-in design mistakes. * Test the API by taking a few numerical algorithms and seeing how well they work with rational inputs (for starters, try http://docs.python.org/lib/decimal-recipes.html ). * If you do put in a method that accepts floats, make sure that it can accept arguments to control the rational approximation. Ideally, you would get something something like this Rational.approximate(math.pi, 6) --> 355/113 that could produce the smallest rationalal approximation to a given level of accuracy. |
|||

msg59636 - (view) | Author: Facundo Batista (facundobatista) * | Date: 2008-01-10 00:15 | |

2008/1/9, Raymond Hettinger said: > * Consider adding Decimal.from_rational and Rational.from_decimal. I > believe these are both easy and can be done losslessly. If it's lossless, why not just allow Decimal(Rational(...)) and Rational(Decimal(...))? |
|||

msg59639 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2008-01-10 00:29 | |

> If it's lossless, why not just allow >Decimal(Rational(...)) and Rational(Decimal(...))? Those conversions are fine. It's the float<-->rational conversions that are problematic. There are exact float to rational conversions using math.frexp() but I don't think the results tend to match what users expect (since 1.1 isn't exactly represented). Also, there may be overflow issues with trying to go from rationals to floats when the denomintor is very large. I think the history of rational modules is that simplistic implementations get built and then the writers find that exploding denominators limit their usefulness for anything other than trivial problems. The solution is to limit denominators but that involves less trivial algorithms and a more sophisticated API. |
|||

msg59640 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-01-10 00:38 | |

On Jan 9, 2008 4:29 PM, Raymond Hettinger <report@bugs.python.org> wrote: > I think the history of rational modules is that simplistic > implementations get built and then the writers find that exploding > denominators limit their usefulness for anything other than trivial > problems. The solution is to limit denominators but that involves less > trivial algorithms and a more sophisticated API. A "rational" type that limits denominators (presumably by rounding) isn't needed -- we alreay have Decimal. The proposed rational type is meant for "pure" mathematical uses, just like Python's long ints. Regarding float->rational, I propose to refuse Rational(1.1) for the same reasons as Decimal(1.1) is refused, but to add a separate constructor (a class method?) that converts a float to a rational exactly (as long as it isn't nan or inf), large denominators be damned. This can help people who are interested in taking floating point numbers apart. float(Rational()) should be fine. |
|||

msg59645 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-01-10 01:23 | |

> * Follow the decimal module's lead in assiduously avoiding > float->rational conversions. Something like Rat.from_float(1.1) is > likely to produce unexpected results (like locking in an inexact input > and having an unexpectedly large denominator). I agree that it's not usually what people ought to use, and I don't think it's an essential part of the API. It might be a useful starting point for the approximation methods though. .trim() and .approximate() in http://svn.python.org/view/sandbox/trunk/rational/Rational.py?rev=40988&view=auto and Haskell's approxRational (http://www.haskell.org/ghc/docs/latest/html/libraries/base/src/Data-Ratio.html) start from rationals instead of floats. On the other hand, it might make sense to provide explicit methods to approximate from floats instead of asking people to execute the two-phase process. I'm happy to give it a different name or drop it entirely. > * Likewise, follow decimal's lead in avoiding all automatic coercisions > from floats: Rational(3,10) + 0.3 for example. The two don't mix. I'm reluctant to remove the fallback to float, unless the consensus is that it's always a bad idea to support mixed-mode arithmetic (which is possible: I was surprised by the loss of precision of "10**23/1" while writing this). Part of the purpose of this class is to demonstrate how to implement cross-type operations. Note that it is an automatic coercion _to_ floats, so it doesn't look like you've gotten magic extra precision. > * Consider following decimal's lead on having a context that can limit > the maximum size of a denominator. There are various strategies for > handling a limit overflow including raising an exception or finding the > nearest rational upto the max denominator (there are some excellent > though complex published algorithms for this) or rounding the nearest > fixed-base (very easy). I'll dig out my HP calculator manuals at some > point -- they had worked-out a number of challenges with fractional > arithmetic (including their own version of an Inexact tag). Good idea, but I'd like to punt that to a later revision if you don't mind. If we do punt, that'll force the default context to be "infinite precision" but won't prevent us from adding explicit contexts. Do you see any problems with that? > * Consider adding Decimal.from_rational and Rational.from_decimal. I > believe these are both easy and can be done losslessly. Decimal.from_rational(Rat(1, 3)) wouldn't be lossless, but Rational.from_decimal is easier than from_float. Then Decimal.from_rational() could rely on just numbers.Rational, so it would be independent of this module. Is that a method you'd want on Decimal anyway? The question becomes whether we want the rational to import decimal to implement the typecheck, or just assume that .as_tuple() does the right thing. These are just as optional as .from_float() though, so we can also leave them for future consideration. > * Automatic coercions to/from Decimal need to respect the active decimal > context. For example the result of Rational(3,11) + > Decimal('3.1415926') is context dependent and may not be commutative. Since I don't have any tests for that, I don't know whether it works. I suspect it currently returns a float! :) What do you want it to do? Unfortunately, giving it any special behavior reduces the value of the class as an example of falling back to floats, but that shouldn't necessarily stop us from making it do the right thing. > * When in doubt, keep the API minimal so we don't lock-in design mistakes. Absolutely! > * Test the API by taking a few numerical algorithms and seeing how well > they work with rational inputs (for starters, try > http://docs.python.org/lib/decimal-recipes.html ). Good idea. I'll add some of those to the test suite. > * If you do put in a method that accepts floats, make sure that it can > accept arguments to control the rational approximation. Ideally, you > would get something something like this Rational.approximate(math.pi, 6) > --> 355/113 that could produce the smallest rationalal approximation to > a given level of accuracy. Right. My initial plan was to use Rational.from_float(math.pi).simplest_fraction_within(Rational(1, 10**6)) but I'm not set on that, and, because there are several choices for the approximation method, I'm skeptical whether it should go into the initial revision at all. |
|||

msg59650 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2008-01-10 01:58 | |

> Decimal.from_rational(Rat(1, 3)) wouldn't be lossless It should be implemented as Decimal(1)/Decimal(3) and then let the context handle any inexactness. > Rational.from_decimal is easier than from_float. Yes. Just use Decimal.as_tuple() to get the integer components. > Then Decimal.from_rational() could rely on just numbers. > Rational, so it would be independent of this module. >Is that a method you'd want on Decimal anyway? Instead, just expand the decimal constructor to accept a rational input. > Regarding float->rational, I propose to refuse Rational(1.1) > for the same reasons as Decimal(1.1) is refused, +1 > but to add a separate constructor (a class method?) that > converts a float to a rational exactly (as long as it > isn't nan or inf), large denominators be damned. This > can help people who are interested in taking floating > point numbers apart. Here's how to pick the integer components out of a float: mant, exp = math.frexp(x) mant, exp = int(mant * 2.0 ** 53), exp-53 >> * Likewise, follow decimal's lead in avoiding all >> automatic coercions from floats: >> Rational(3,10) + 0.3 for example. The two don't mix. > I'm reluctant to remove the fallback to float, You will need some plan to scale-down long integers than exceed the range of floats (right shifting the numerator and denominator until they fit). |
|||

msg59651 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2008-01-10 02:01 | |

One other thought, the Decimal and Rational constructors can be made to talk to each other via a magic method so that neither has to import the other (somewhat like we do now with __int__ and __float__). |
|||

msg59656 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-01-10 03:51 | |

Allowing automatic coercion to and from Decimal sounds dangerous and complicated to me. Mightn't it be safer just to disallow this (at least for now)? I think something like trim() (find the closest rational approximation with denominator bounded by a given integer) would be useful to have as a Rational method, but probably only for explicit use. I'm still worried about equality tests: is it possible to give a good reason why Decimal("2.5") == Rational(5, 2) should return False, while Rational(5, 2) == 2.5 returns True. Can someone articulate some workable principle that dictates when an int/float/complex/Rational/Decimal instance compares equal to some other int/float/complex/Rational/Decimal instance of possibly different type but the same numerical value? |
|||

msg59657 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-01-10 04:35 | |

All in all, Decimal is the odd man out -- it may never become a full member of the Real ABC. The built-in types complex, float and int (and long in 2.x) all support mixed-mode arithmetic, and this is one of the assumptions of the numeric tower -- and of course of mathematics. The new Rational type can be designed to fit in this system pretty easily, because it has no pre-existing constraints; but the Decimal type defies coercion, and is perhaps best left alone. It's already breaking the commonly understood properties of equality, e.g. 1.0 == 1 == Decimal("1") != 1.0. --Guido On Jan 9, 2008 7:51 PM, Mark Dickinson <report@bugs.python.org> wrote: > > Mark Dickinson added the comment: > > Allowing automatic coercion to and from Decimal sounds dangerous and > complicated to me. Mightn't it be safer just to disallow this (at least for > now)? > > I think something like trim() (find the closest rational approximation with > denominator bounded by a given integer) would be useful to have as a Rational > method, but probably only for explicit use. > > I'm still worried about equality tests: is it possible to give a good reason > why Decimal("2.5") == Rational(5, 2) should return False, while Rational(5, 2) > == 2.5 returns True. Can someone articulate some workable principle that > dictates when an int/float/complex/Rational/Decimal instance compares equal to > some other int/float/complex/Rational/Decimal instance of possibly different > type but the same numerical value? > > > __________________________________ > Tracker <report@bugs.python.org> > <http://bugs.python.org/issue1682> > __________________________________ > |
|||

msg59659 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2008-01-10 05:18 | |

Would it be reasonable then to not have any of the numeric tower stuff put in the decimal module -- basically leave it alone (no __ceil__, etc)? |
|||

msg59665 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-01-10 16:10 | |

> Would it be reasonable then to not have any of the numeric tower stuff > put in the decimal module -- basically leave it alone (no __ceil__, > etc)? If that's the preference of the decimal developers, sure. |
|||

msg59747 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-01-11 20:50 | |

If the consensus is that Decimal should not implement Real, I'll reply to that thread and withdraw the patch. Raymond, do you want me to add Decimal.__init__(Rational) in this patch or another issue/thread? I don't understand the comment about scaling down long integers. It's already the case that float(Rational(10**23, 10**24 + 7))==0.1. Mark, I agree that .trim() and/or .approximate() (which I think is the same as Haskell's approxRational) would be really useful. Do you have particular reasons to pick .trim? Are those the best names for the concepts? I'd also really like to be able to link from their docstrings to a proof that their implementations are correct. Does anyone know of one? Finally, Decimal("2.5") != Rational(5, 2) because Decimal("2.5") != 2.5 (so it'd make equality even more intransitive) and hash(Decimal("2.5")) != hash(2.5) so we couldn't follow the rule about equal objects implying equal hash codes, which is probably more serious. I don't have a principled explanation for Decimal's behavior, but given that it's fixed, I think the behavior of non-integral Rationals is determined too. On the other hand, I currently have a bug where Rational(3,1) != Decimal("3") where the hash code could be consistent. I'll fix that by the next patch. |
|||

msg59755 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2008-01-11 21:26 | |

> If the consensus is that Decimal should not implement Real, > I'll reply to that thread and withdraw the patch. Thanks. That would be nice. > Raymond, do you want me to add Decimal.__init__(Rational) > in this patch How about focusing on the rational module and when you've done, I'll adapt the Decimal constructor to accept a rational input. > I don't understand the comment about scaling down long integers. My understanding is that you're going to let numerators and denominators grow arbitrarily large. When they get over several hundred digits each, you will have to scale the down before converting to a float. For example when numerator=long('2'*400+'7') and denominator=long('3'*400+'1'), the long-->float conversion will overflow, so it is necessary to first scale-down the two before computing the ratio: scale=325; float_ratio=float(numerator>>scale)/float(denominator>>scale) |
|||

msg59772 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-01-12 01:40 | |

More comments, questions, and suggestions on the latest patch: 1. In _binary_float_to_ratio, the indentation needs adjusting. Also 'top = 0L' could be replaced with 'top = 0', unless you need this code to work with Python 2.3 and earlier, in which case top needs to be a long so that << behaves correctly. Otherwise, this looks good. 2. Rational() should work, and it should be possible to initialize from a string. I'd suggest that Rational(' 1729 '), Rational('-3/4') and (' +7/18 \n') should be acceptable: i.e. leading and trailing whitespace, and an optional - or + sign should be permitted. But space between the optional sign and the numerator, or on either side of the / sign, should be disallowed. Not sure whether the numerator and denominator should be allowed to have leading zeros or not---probably yes, for consistency with int(). 3. I don't think representing +/-inf and nan as Rationals (1/0, -1/0, 0/0) is a good idea---special values should be kept out of the Rational type, else it won't be an implementation of the Rationals any more---it'll be something else. 4. hash(n) doesn't agree with hash(Rational(n)) for large integers (I think you already mentioned this above). 5. Equality still doesn't work for complex numbers: >>> from rational import * >>> Rational(10**23) == complex(10**23) # expect False here True >>> Rational(10**23) == float(10**23) False >>> float(10**23) == complex(10**23) True 6. Why the parentheses around the str() of a Rational? 7. How about having hash of a Rational (in the case that it's not equal to an integer or a float) be given by hash((self.numerator, self.denominator))? That is, let the tuple hash take care of avoiding lots of hash collisions. |
|||

msg59773 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-01-12 01:47 | |

About .trim and .approximate: it sounds like these are different, but quite closely related, methods: one takes a positive integer and returns the best approximation with denominator bounded by that integer; the other returns the 'smallest' rational in a given interval centered at the original rational. I guess we probably don't need both of these, but I can't give any good reason for preferring one over the other. I don't have anything to offer about names, either. I can try to find out whether the algorithms are published anywhere on the web---certainly, neither algorithm should be particularly hard to implement and prove the correctness of; they both essentially rely on computing the continued fraction development of the given rational. Almost any not-too-basic elementary number theory text should contain proofs of the relevant results about continued fractions. Am willing to help out with implementing either of these, if that's at all useful. |
|||

msg59870 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-01-13 22:57 | |

_binary_float_to_ratio: Oops, fixed. Rational() now equals 0, but I'd like to postpone Rational('3/2') until there's a demonstrated need. I don't think it's as common a use as int('3'), and there's more than one possible format, so some real world experience will help (that is, quite possibly not in 2.6/3.0). I'm also postponing Rational(instance_of_numbers_Rational). +/-inf and nan are gone, and hash is fixed, at least until the next bug. :) Good idea about using tuple. Parentheses in str() help with reading things like "%s**2"%Rational(3,2), which would otherwise format as "3/2**2". I don't feel strongly about this. Equality and the comparisons now work for complex, but their implementations make me uncomfortable. In particular, two instances of different Real types may compare unequal to the nearest float, but equal to each other and have similar but inconsistent problems with <=. I can trade off between false ==s and false !=s, but I don't see a way to make everything correct. We could do better by making the intermediate representation Rational instead of float, but comparisons are inherently doomed to run up against the fact that equality is uncomputable on the computable reals, so it's probably not worthwhile to spend too much time on this. I've added a test that float(Rational(long('2'*400+'7'), long('3'*400+'1'))) returns 2.0/3. This works without any explicit scaling on my part because long.__truediv__ already handles it. If there's something else I'm doing wrong around here, I'd appreciate a failing test case. The open issues I know of are: * Is it a good idea to have both numbers.Rational and rational.Rational? Should this class have a different name? * trim and approximate: Let's postpone them to a separate patch (I do think at least one is worth including in 2.6+3.0). So that you don't waste time on them, we already have implementations in the sandbox and (probably) a good-enough explanation at http://en.wikipedia.org/wiki/Continued_fraction#Best_rational_approximations. Thanks for the offer to help out with them. :) * Should Rational.from_float() exist and with the current name? If there's any disagreement, I propose to rename it to Rational._from_float() to discourage use until there's more consensus. * Rational.from_decimal(): punted to a future patch. I favor this for 2.6+3.0. * Rational('3/2') (see above) I think this is close enough to correct to submit and fix up the remaining problems in subsequent patches. If you agree, I'll do so. |
|||

msg59884 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-01-14 01:45 | |

The latest version of rational.py looks good to me---nice work! (I haven't looked properly at numbers.py or test_rational.py, yet. I do plan to, eventually.) I do think it's important to be able to create Rational instances from strings: e.g., for easy reading from and writing to files. But maybe I'm alone in this opinion. You say there's more than one possible format---what other formats were you considering? And since you pointed it out, I think Rational(Rational(a, b)) should work too. There's also the not-entirely-insignificant matter of documentation :) Other than that, I don't see why this shouldn't go in. Other comments: I have a weak preference for no parentheses on the str() of a Rational, but it's no big deal either way. I agree that equality and comparisons are messy. This seems almost inevitable: one obvious cause is that the existing int <-> float comparisons already break the `numeric tower' model (push both operands to the highest common type before operating). So I'm not sure there can be an easy and elegant solution here :( I like the name Rational for this class. Maybe change the name of numbers.Rational instead? Postponing trim, approximate, from_decimal sounds fine to me. Finally: the very first line of rational.py is, I think, no longer accurate. Please add your name so everyone knows who to blame/credit/assign bug reports to :) |
|||

msg59886 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-01-14 03:07 | |

Just had a quick look at numbers.py. Only two comments: 1. I think there's a typo in the docstring for Inexact: one of those == should be a != 2. Not that it really matters now, but note that at least for Decimal, x-y is not the same as x+(-y) (I'm looking at Complex.__sub__), and +x is not a no-op (Real.real, Real.conjugate). In both cases signs of zeros can be problematic: >>> x = Decimal('-0') >>> y = Decimal('0') >>> x-y Decimal("-0") >>> x+(-y) Decimal("0") >>> x Decimal("-0") >>> +x Decimal("0") Of course the first point wouldn't matter anyway since Decimal already implements __sub__; the second means that if Decimal were to join Real, something would need to be done to avoid Decimal("-0").real becoming Decimal("0"). |
|||

msg59958 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-01-15 07:47 | |

Thanks! I've added some minimal documentation and construction from other Rationals. The other formats for string parsing center around where whitespace is allowed, and whether you can put parens around the whole fraction. Parens would, of course, go away if str() no longer produces them. So they're not significantly different. I guess my objection to construction from strings is mostly that I'm ambivalent, and especially for a core library, that means no. If there's support from another core developer or two, I'd have no objections. Inexact is saying that one thing could be ==3 and the other ==0, so I think it's correct. Negative zeros are a problem. :) I think the default implementations are still fine, but you're right that classes like Decimal will need to think about it, and the numbers module should mention the issue. It's not related to the Rational implementation though, so in another patch. I've submitted this round as r59974. Onto the next patch! :) |
|||

msg59982 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-01-15 18:46 | |

> Inexact is saying that one thing could be ==3 and the other ==0, so I > think it's correct. You're right, of course :) |
|||

msg60080 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-01-18 04:05 | |

Here's a patch that adds construction from strings (Guido favored them) and .from_decimal(), changes __init__ to __new__ to enforce immutability, and removes "rational." from repr and the parens from str. I don't expect this to be contentious, so I'll commit it tomorrow unless I hear objections. |
|||

msg60081 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-01-18 04:28 | |

After this come the two approximation methods. Both are implemented using the continued fraction representation of the number: http://en.wikipedia.org/wiki/Continued_fraction#Best_rational_approximations. The first, currently named "trim", takes the closest rational whose denominator is less than a given number. This seems useful for computations in which you want to sacrifice some accuracy for speed. At one point in this discussion, Guido suggested that Decimal removed the need for a method like this, and this type isn't optimized for speed anyway. The other, currently named "approximate", returns the "simplest" rational within a certain distance of the real value. This seems useful for converting from float and displaying results to users, where we prefer readability over accuracy ("yes, I'll take '1/3' even though it's farther away than '1234/3690'."). We could provide 0, 1, or both of them, or an accessor for the continued fraction representation of the number, which might help with third-party implementations of both. I've never actually used either of these, so I can't say which is actually more useful. It's probably a good question to send to the full python-dev list. Even if we decide against including them in the class, we might put the implementations into the docs or the test as a demonstration. |
|||

msg60131 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-01-19 01:59 | |

(About the latest patch): this all looks good to me. The comment that "Decimal provides no other public way to detect nan and infinity." is not true (though it once was). Decimal has public methods is_nan and is_infinite, added as part of updating to the most recent specification. (Yes, it also has private methods _isnan and _isinfinity, dating from long ago; I'm working on a patch that gets rid of the duplication.) (About the approximation methods): I agree that these aren't a necessary part of a Rational module---just something that might be nice to have around. So my vote would be for adding either 0 or 1 of these; adding two such similar methods with similar use-cases just seems like a cause of possible confusion to me. I'd also vote against a method for providing the convergents of the continued-fraction, but that's just me. See what python- dev says! One interesting use-case for approximate is to recover a simple rational from a float, in a case where the float was rational to begin with, but lost a little accuracy in conversion; approximate works well here because you generally have some idea how close the float is to the rational. |
|||

msg60132 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2008-01-19 02:17 | |

Were the new methods part of the spec update? If so that's great. If not, we need to take them out. We want zero API creep that isn't mandated by the spec (no playing fast and loose with this module). |
|||

msg60133 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-01-19 02:28 | |

Raymond: > Were the new methods part of the spec update? If so that's great. Yes. See http://www2.hursley.ibm.com/decimal/damisc.html > If not, we need to take them out. We want zero API creep that isn't > mandated by the spec (no playing fast and loose with this module). Understood. |
|||

msg60137 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-01-19 09:56 | |

I coulda sworn I looked for is_nan and is_infinite. Oh well, switched to using .is_finite, which is even better and checked in as r60068. Thanks for the pointer. |
|||

msg61737 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-01-27 15:20 | |

I noticed that the ability to type Rational("2.3") has been added (and I think this is a very useful addition). One minor nit: would it make sense to have Rational("2.") and Rational(".3") also work? These strings work for float() and Decimal(), and 2. and .3 work as float literals, so it would seem to make sense to allow this for Rational as well. |
|||

msg61995 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-02-02 07:24 | |

Whoops, sorry for taking a while to answer. +0 on adding support for '2.' and '.3', given that they're allowed for float and Decimal. Not +1 because they'll make the regular expression more complicated, and they're not exactly necessary, but if you want to add them, go ahead. |
|||

msg62001 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-02 13:17 | |

About Rational('3.') and Rational('.2'): It's not so much to do with consistency with float and Decimal; more to do with the fact that some people like to write floats these ways (which presumably is why float and Decimal allow such input). It occurs to me that if you also allow things like Rational('1e+100') then conversions from Decimal to Rational could simply be spelled Rational(str(decimal_instance)) eliminating the need for the special Decimal -> Rational conversion method. Anyway, I'll change the regex to allow '3.' and '.3'. |
|||

msg62005 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-02 17:21 | |

Regex changed in r60530. |
|||

msg62010 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-02-02 20:27 | |

I think '1e+100' would constitute feature creep at this point, but thanks for the suggestion, and the improvement in the readability of the regex! |
|||

msg62011 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2008-02-02 21:17 | |

The rational constructor should accept decimals directly. Rational(Decimal('3.1')) does not suffer the same representation error problems as Rational(float('3.1')). The conversion from decimal is lossless and does not depend on the decimal context. There is no need for a separate constructor. |
|||

msg62013 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-02-02 23:34 | |

I think Rational should handle all floating types as consistently as possible, whatever their radix or precision. It's unlikely that people will convert from them often anyway, especially from Decimal, so the shorter conversion from Decimal doesn't seem to outweigh the extra complexity in the constructor's behavior. If I turn out to be wrong about this, we can revisit it in 3.1. |
|||

msg62028 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-02-03 23:52 | |

FWIW, if Rational(Decimal(...)) is to be accepted, then Decimal(Rational(...)) should also be accepted, and arguably mixed binary operations as well (Rational(...) + Decimal(...) etc.). |
|||

msg62029 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2008-02-04 00:35 | |

I would rather drop it than see that mess. FWIW, there is a difference. Rational(Decimal(...)) takes place without reference to a decimal.Context and is always lossless. In contrast, Decimal(Rational(...)) is context sensitive (the division is subject to rounding and precision limits) and is typically lossy as would be the case with Decimal(Rational(1, 3)) which like most rationals cannot be exactly represented in Decimal. |
|||

msg62099 - (view) | Author: Facundo Batista (facundobatista) * | Date: 2008-02-06 15:23 | |

I'm +0 to make Decimal(Rational(x,y)) available. I'd make it something like: Decimal(R) == Decimal(R.numerator) / Decimal(R.denominator) Note, as Raymond says, that this division implies the use of the context. So the following will NOT be true: R == Rational(Decimal(R)) |
|||

msg62189 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-08 01:03 | |

from Guido: > I have one minor nit on the current rational.py code: all internal > accesses to the numerator and denominator should use self._numerator > and self._denominator -- going through the properties makes it *much* > slower. Remember that Python function/method calls are slow, and never > optimized away. :-) This isn't quite as simple as doing s/.numerator/._numerator, since the current code makes use of the fact that the int and long types also implement .numerator and .denominator. Can we follow the approach that Decimal takes: convert subclasses of int and long to Rational before operating? At first sight it seems possible that this might actually slow down code that does a lot of mixed-mode int/long + Rational arithmetic, but I think this is unlikely. I'll implement this unless there are objections. I'm also wondering what the policy should be on return types: if a and b are instances of a subclass of Rational, should a+b have return type Rational, or return type equal to that of a and b? Current behaviour of various builtin types and Decimal suggests that a Rational should be returned. |
|||

msg62192 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-02-08 01:23 | |

> > I have one minor nit on the current rational.py code: all internal > > accesses to the numerator and denominator should use self._numerator > > and self._denominator -- going through the properties makes it *much* > > slower. Remember that Python function/method calls are slow, and never > > optimized away. :-) > > This isn't quite as simple as doing s/.numerator/._numerator, since the > current code makes use of the fact that the int and long types also > implement .numerator and .denominator. Well, but *self.numerator* certainly doesn't need to worry about self being an int or long. :-) > Can we follow the approach that Decimal takes: convert subclasses of > int and long to Rational before operating? At first sight it seems > possible that this might actually slow down code that does a lot of > mixed-mode int/long + Rational arithmetic, but I think this is unlikely. > I'll implement this unless there are objections. It had never occurred to me to do it otherwise. ;-) > I'm also wondering what the policy should be on return types: if a and > b are instances of a subclass of Rational, should a+b have return type > Rational, or return type equal to that of a and b? Current behaviour of > various builtin types and Decimal suggests that a Rational should be > returned. Correct -- the thing is that you can't know the signature of the subclass's constructor so you can't very well blindly call that. |
|||

msg62218 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-02-09 05:46 | |

I figured I'd time the difference before we change the code: $ ./python.exe -m timeit -s 'import rational; r=rational.Rational(3, 1)' 'r.numerator' 1000000 loops, best of 3: 0.696 usec per loop $ ./python.exe -m timeit -s 'import rational; r=rational.Rational(3, 1)' 'r._numerator' 10000000 loops, best of 3: 0.155 usec per loop $ ./python.exe -m timeit -s '(3).numerator' 10000000 loops, best of 3: 0.0324 usec per loop $ ./python.exe -m timeit -s '(3L).numerator' 10000000 loops, best of 3: 0.0356 usec per loop $ ./python.exe -m timeit -s 'from rational import Rational' 'Rational(3, 1)' 10000 loops, best of 3: 80.4 usec per loop $ ./python.exe -m timeit -s 'from rational import Rational; r=Rational(3, 1)' 'isinstance(r, Rational)' 100000 loops, best of 3: 10.6 usec per loop So every time we change .numerator to ._numerator we save half a microsecond. If we construct a new Rational from the result, that's totally drowned out by the Rational() call. Even checking whether we're looking at a Rational costs 20 times as much as the difference, although that can probably be optimized. I think this means that we shouldn't bother changing the arithmetic methods since it makes the code more complicated, but changing unary methods, especially ones that don't return Rationals, can't hurt. |
|||

msg62222 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-09 14:52 | |

Guido: > Correct -- the thing is that you can't know the signature of the > subclass's constructor so you can't very well blindly call that. Jeffrey, is it okay for me to change the three class methods (from_float, from_decimal, from_continued_fraction) to static methods, and call Rational instead of cls as the constructor? |
|||

msg62227 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-02-09 16:37 | |

Jeffrey: Yay for measurements! I was going to say that __add__ is inefficient because it makes so many function calls, but it turns out that, as you say, the cost of constructing a new Rational instance drowns everything else. On my 2.8GHz iMac, Rational(2,3) costs ~80 usec. This goes down to 50 usec if I make it inherit from object -- the ABC machinery costs 30 usecs! If I then also comment out all the typechecking of numerator and denominator and go straight into the gcd(), the constructor costs go down to 6 (six!) usecs. Beyond that it's slim pickings; replacing super() with object or inlining gcd wins perhaps half an usec. But once the constructor is down to 5-6 usec, the half usec for going through the constructor (times 6 for 6 constructor calls in _add()!) might be a significant gain to also try and inline the common case of the binary operators. In the mean time I have two recommendations if you want to make the constructor faster without losing functionality: (a) remove the direct inheritance from RationalAbc (using virtual inheritance instead); (b) special-case the snot out of the common path in the constructor (called with two ints). An alternative might be to have a private class or static method to construct a Rational from two ints that is used internally; it could use object.__new__ instead of super() so as to save the ABC overhead. But I'm not sure if this will always work. Unrelated issue: I just realized for the first time that we have two classes named 'Rational': the ABC and the concrete implementation. That's going to be awfully confusing. Perhaps it's not too late to rename rational.py/Rational to fractions.py/Fraction? |
|||

msg62236 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-02-09 20:59 | |

Mark: Coming from C++, I don't have any intuition on static vs. class methods. It might be strange to write MyRationalSubclass.from_float() and get a Rational back, but it might also be strange to write a subclass with a different constructor and get an error. So go ahead. Guido: It would be a shame to decide that classes shouldn't inherit from ABCs for performance reasons. Issue 1762 tracks the problem, but I haven't paid much attention to it. Let's see if we can fix that before using virtual inheritance for Rational. Special-casing ints in the constructor sounds like a good idea, and we can cache the results of .numerator and .denominator in _add, etc, without having to change the overall logic, which should save 2μs (leaving 1 on the table). It could be useful to declare a private constructor that expects ints that are already in lowest terms, sort of like decimal._dec_from_triple(). __add__ couldn't use it directly, but __abs__ could. I don't think it's too late to rename one of the classes. I'm using "RationalAbc" inside of rational.py to refer to numbers.Rational, which is one reason I was positive on adding a suffix to the ABCs, but renaming this class is fine with me too. |
|||

msg62249 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-10 15:36 | |

staticmethod substituted for classmethod in r60712. I'm not sure I like the idea of names Rational and Fraction; the two classes numbers.Rational and rational.Rational are quite different beasts, and using two almost-synonyms for their names seems like a bad idea. Is there some more descriptive name for numbers.Rational that might give a hint of its ABC-ness? |
|||

msg62251 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-10 15:55 | |

We still need to sort out the trim/approximate/convergents decisions. Currently, we have: from_continued_fraction to_continued_fraction approximate (what we've been calling trim: limit the denominator) At this point I'm not sure how much I care about what is or is not included, but here are a few thoughts: (1) if to_continued_fraction is kept it should be a generator instead of returning a list. (2) from_continued_fraction would be better replaced by convergents, since a user is just as (more?) likely to be interested in the whole sequence of convergents than just the final convergent. If from_continued_fraction is kept in addition to convergents then it should work forwards instead of backwards, so that it doesn't need to use reversed (and hence works on the output of to_continued_fraction). (3) approximate needs finishing up and possibly renaming to trim. Can we remove {from,to}_continued_fraction and just leave trim? |
|||

msg62252 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-02-10 16:19 | |

> I'm not sure I like the idea of names Rational and Fraction; the two > classes numbers.Rational and rational.Rational are quite different beasts, > and using two almost-synonyms for their names seems like a bad idea. > Is there some more descriptive name for numbers.Rational that might give a > hint of its ABC-ness? I'd rather not change the Rational ABC's name -- it is the mathematically accepted term (Complex, Real, Rational, Integer, and then natural numbers I believe). I'd rather not add a suffix or prefix like A or Abc to the ABCs either. To be honest, we have a rather hodge-podge of mappings from numeric ABCs to concrete implementations: Complex/complex, Real/float, Rational/?, Integer/int. I think that, given that fraction is the *common* name for rationals (see wikipedia), it fits relatively well. We can introduce fractions without ever mentioning the ABCs and users will immediately know what they mean even if they haven't got more than grade school math. |
|||

msg62253 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-10 16:53 | |

Fair enough. Should it be fractions.Fraction or fraction.Fraction? |
|||

msg62254 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-02-10 17:04 | |

> Fair enough. Should it be fractions.Fraction or fraction.Fraction? I think "from fractions import Fraction" is linguistically more correct -- the concept is always mentioned in plural, but a fractional number is of course singular. We also have "numbers". |
|||

msg62256 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-10 20:05 | |

Any reason not to make the name change? Is further discussion/time required, or can we go ahead and rename Rational to Fraction and rational.py to fractions.py? It seems like everybody's happy with the idea. I note that the name change affects Lib/test/test_builtin.py as well as Doc/whatsnew/2.6.rst. Any other non-obvious places that I've missed? |
|||

msg62257 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-02-10 20:20 | |

Go for it! |
|||

msg62258 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-10 21:31 | |

Name change in r60721. |
|||

msg62271 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-11 03:18 | |

I just checked in another very minor change in r60723. The repr of a Fraction now looks like Fraction(1, 2) instead of Fraction(1,2). Let me know if this change is more controversial than I think it is. |
|||

msg62305 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-02-12 04:28 | |

Re convergents: In the interest of minimality, I don't think from_continued_fraction and to_continued_fraction need to stick around. I think the other thread established pretty conclusively that .convergents() is not a particularly good building block for either nearby-fraction method, so I'd let people who want it implement it themselves. Neal Norwitz suggested and I agree that .trim() isn't descriptive enough, so let's follow Raymond's alternative of .limit_denominator(). Would you like to commit your implementation? |
|||

msg62311 - (view) | Author: Nick Coghlan (ncoghlan) * | Date: 2008-02-12 12:07 | |

I mentioned my dislike of the classmethod->staticmethod change on the python-checkins list, but given the lack of response there, I figure I should bring it up here. It is well established that the way to implement an alternate constructor in Python is to write a classmethod. This means the alternate constructor will behave in a way similar to __new__: invocation via a subclass will result in an instance of that subclass rather than of the base type. Now, this does usually mean the parent class is placing certain expectations on the signature of the subclass constructor. However, this is a pretty common Python idiom, and a lot friendlier than coercing the result to the base type. It makes the common case (constructor signature unchanged or at least compatible) simple, while still permitting the rarer case of changing the signature in an incompatible way (by overriding the class methods as well as the __new__ and __init__ methods) If you want to make this more convenient for users that do want to subclass and change the constructor signature, the expected interface can be factored out to a single method, similar to the way it is done for collections.Set and its _from_iterable class method (i.e. if a Set subclass constructor doesn't accept an iterable directly, it can just override _from_iterable to adapt the supplied iterable to the new interface). |
|||

msg62318 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-12 15:17 | |

Nick Coghlan wrote: > I mentioned my dislike of the classmethod->staticmethod change on the > python-checkins list, but given the lack of response there, I figure I > should bring it up here. Yes, I missed it. Apologies. I'll rethink this (and likely-as-not revert it, but I want to get my head around the issues first). One problem I'm having is imagining any real-life examples of subclasses of Rational. An example or two might help inform the discussion. |
|||

msg62320 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-12 16:09 | |

Okay, Nick; you've got me convinced. For some reason I wasn't really thinking of these methods as alternative constructors---in this light, your arguments about doing the same as __new__, and about established patterns, make perfect sense. Looking back at the discussion, Jeffrey looked like he was +/-0 on the change, and Guido was answering a slightly different question (about __add__ instead of constructors); I then took his answer out of context to justify the change :( So I'll change this back unless there's further discussion. |
|||

msg62324 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-02-12 18:07 | |

> Okay, Nick; you've got me convinced. For some reason I wasn't really > thinking of these methods as alternative constructors---in this light, > your arguments about doing the same as __new__, and about established > patterns, make perfect sense. > > Looking back at the discussion, Jeffrey looked like he was +/-0 on the > change, and Guido was answering a slightly different question (about > __add__ instead of constructors); I then took his answer out of context > to justify the change :( > > So I'll change this back unless there's further discussion. Sounds good. BTW I think the next goal should be to reduce the cost of constructing a Fraction out of to plain ints by at least an order of magnitude. I believe this is possible. |
|||

msg62335 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-12 21:51 | |

limit_denominator implemented in r60752 from_decimal and from_float restored to classmethods in r60754 |
|||

msg62337 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-12 22:15 | |

> BTW I think the next goal should be to reduce the cost of constructing > a Fraction out of to plain ints by at least an order of magnitude. I > believe this is possible. I certainly hope so! Here's a very simple (and simplistic) benchmark: # start benchmark from fractions import Fraction from cProfile import run def test1(): return sum(Fraction(1, n*n-1) for n in xrange(2, 100000)) run("test1()") #end benchmark On my MacBook this reports a total time of 38.072 seconds, with 22.731 of those (i.e. around 60%) being spent in abc.__instancecheck__ and its callees. |
|||

msg62340 - (view) | Author: Nick Coghlan (ncoghlan) * | Date: 2008-02-12 22:39 | |

Thanks for adding the class methods back Mark. On the constructor front, we got a fair payoff in the early Decimal days just by reordering the type checks in the constructor. Other than that, I'd have to dig into the code a bit more to offer any useful suggestions. |
|||

msg62380 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-02-14 07:21 | |

r60785 speeds the benchmark up from 10.536s to 4.910s (on top of whatever my __instancecheck__ fix did). Here are the remaining interesting-looking calls: ncalls tottime percall cumtime percall filename:lineno(function) ... 1 0.207 0.207 4.999 4.999 {sum} 199996 1.587 0.000 3.419 0.000 fractions.py:58(__new__) 99997 0.170 0.000 3.236 0.000 fractions.py:295(forward) 99998 0.641 0.000 2.881 0.000 fractions.py:322(_add) 99999 0.202 0.000 1.556 0.000 benchmark.py:3(<genexpr>) 199996 0.829 0.000 0.829 0.000 fractions.py:17(gcd) 199996 0.477 0.000 0.757 0.000 abc.py:63(__new__) 399993 0.246 0.000 0.246 0.000 abc.py:164(__instancecheck__) 199996 0.207 0.000 0.207 0.000 {method 'get' of 'dictproxy' objects} 100071 0.185 0.000 0.185 0.000 {isinstance} 399990 0.109 0.000 0.109 0.000 fractions.py:200(denominator) 200004 0.073 0.000 0.073 0.000 {built-in method __new__ of type object at 0xfeea0} 199995 0.065 0.000 0.065 0.000 fractions.py:196(numerator) |
|||

msg62401 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-02-14 16:58 | |

Yay! And my benchmark went from 70 usec to 15 usec. Not bad! PS. Never use relative speedup numbers based on profiled code -- the profiler skews things tremendously. Not saying you did, just stating the obvious. :) |
|||

msg62403 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-02-14 17:08 | |

PS. I can shave off nearly 4 usec of the constructor like this: - self = super(Fraction, cls).__new__(cls) + if cls is Fraction: + self = object.__new__(cls) + else: + self = super().__new__(cls) This would seem to give an upper bound for the gain you can make by moving the check for instantiating an abstract class to C. Your call. I also found that F(2,3)+F(5,7) takes about 22 usecs (or 18 using the above hack). Inlining the common case for addition (Fraction+Fraction) made that go down to 14 usec (combined with the above hack): - __add__, __radd__ = _operator_fallbacks(_add, operator.add) + __xadd__, __radd__ = _operator_fallbacks(_add, operator.add) + def __add__(self, other): + if type(other) is Fraction: + na = self._numerator + da = self._denominator + nb = other._numerator + db = other._denominator + return Fraction(na*db + nb*da, da*db) + return self.__xadd__(other) + |
|||

msg62406 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-02-14 17:47 | |

Thanks for writing the __add__ optimization down. I hadn't realized how simple it was. I think both optimizations are worth it. 22% on a rarely used class is worth 24 lines of python, and I think nearly eliminating the overhead of ABCs (which will prevent them from getting a bad performance reputation) is worth some C. |
|||

msg62547 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-18 22:15 | |

Two things: (1) Speedup. I haven't been helping much here; apologies. Christian suggested that a C implementation of gcd might help. Is this true, or are we not yet at the stage where the gcd calls are significant? There are some basic tricks that can speed up the usual Euclidean algorithm by a constant factor, and I'll try to implement them if that's of interest (unless Christian beats me to it). There also some serious super-fast recursive gcd algorithms, with running time O(n**(1+epsilon)) for a pair of n-bit integers, but those are complicated and probably best left to GMP. (2) PEP 3101: Should Fraction get a __format__ method? I'm looking at implementing Decimal.__format__, and I think there's quite likely to be a fair amount of common code (e.g. for parsing the format specifier) between Decimal.__format__ and Fraction.__format__, so it would probably make sense for me to do both of these at once. |
|||

msg62550 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-02-18 22:51 | |

1) No worries. Even without inlining the common case of __add__, etc., Fraction is now faster than Decimal for smallish fractions [sum(Fraction(1, i) for i in range(1, 100))], and for large ones [sum(Fraction(1, i) for i in range(1, 1000))] gcd takes so much of the time that I can't see the effects of any of the optimizations I've made. Since I wasn't thinking of this as a high-performance class, I'm taking that as a sign to stop optimizing, but gcd is definitely the function to tackle if anyone wants to keep going. 2) I haven't been following __format__, but adding it to Rational sounds fine to me. |
|||

msg62564 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-02-19 16:38 | |

A C reimplementation of gcd probably makes little sense -- for large numbers it is dominated by the cost of the arithmetic, and that will remain the same. |
|||

msg62566 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-19 17:03 | |

> A C reimplementation of gcd probably makes little sense -- for large > numbers it is dominated by the cost of the arithmetic, and that will > remain the same. That's true for a direct translation of the usual algorithm. But there's a variant due to Lehmer which reduces the number of multi- precision operations, and dramatically reduces the number of full- precision divmod operations---they're mostly replaced by n-limb-by-1- limb multiplications and full-precision additions. I'd expect at least a factor of 2 speedup from using this; possibly more. Of course it's impossible to tell without implementing it. I've attached a Python version; note that: * the operations to find x and y at the start of the outer while loop are simple lookups when written in C * the else branch is taken rarely: usually once at the beginning of any gcd() calculation, and then less than 1 time in 20000 on average during the reduction. * all the arithmetic in the inner while loop is done with numbers <= 2**15 in absolute value. Mark |
|||

msg62567 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-02-19 17:20 | |

I think this is definitely premature optimization. :-) In any case, before going any further, you should design a benchmark and defend it. |
|||

msg62568 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-19 19:31 | |

Replacing lehmer_gcd.py with a revised version. Even in Python, this version is faster than the current one, on my machine, once both numbers are greater than 10**650 or so (your crossover points may vary). It's over four times faster for very large inputs (over 10**20000). > In any case, before going any further, you should design a benchmark > and defend it. Okay. I'll stop now :) |
|||

msg62671 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-22 01:02 | |

> Okay. I'll stop now :) So I lied. In a spirit of enquiry, I implemented math.gcd, and wrote a script (lehmer_gcd.py, attached) to produce some relative timings. Here are the results, on my MacBook Pro (OS X 10.5): An explanation of the table: I'm computing gcd(a, b) for randomly chosen a and b with a_digits and b_digits decimal digits, respectively. The times in the 5th and 6th columns are in seconds; the last column gives the factor by which math.gcd is faster than the direct Euclidean Python implementation. Even for quite small a and b, math.gcd(a, b) is around 10 times faster than its Python counterpart. For large, roughly equal-sized, a and b, the C version runs over 30 times faster. The smallest difference occurs when the two arguments are very different sizes; then both versions of gcd essentially do one time-consuming divmod, followed by a handful of tiny operations, so they take essentially the same time. For Fraction, the arguments will tend to be around equal size for many applications: this assumes that the actual real number represented by the Fraction is within a sensible range, even when the numerator and denominator have grown huge. For random a and b, gcd(a, b) tends to be very small. So for the last few entries in the table, the gcds have been artifically inflated (by calling gcd(g*a, g*b) for some largish g). On to the results. a_digits b_digits g_digits ntrials C_Lehmer Py_Euclid Factor -------- -------- -------- ------- -------- --------- ------ 1 1 0 100000 0.066856 0.240867 3.602780 2 2 0 100000 0.070256 0.314639 4.478466 4 4 0 100000 0.075708 0.466596 6.163087 7 7 0 100000 0.082714 0.681443 8.238537 10 10 0 10000 0.022336 0.318501 14.259532 20 20 0 10000 0.045209 0.752662 16.648436 50 50 0 10000 0.128269 2.167890 16.901128 100 100 0 1000 0.028053 0.511769 18.242906 1000 1000 0 1000 0.709453 18.867336 26.594197 10000 10000 0 100 4.215795 148.597223 35.247736 Lopsided gcds ------------- 20 100 0 100 0.000889 0.007659 8.616953 100 20 0 100 0.000887 0.007665 8.642473 1000 3 0 100 0.000727 0.001387 1.908167 3 1000 0 100 0.000754 0.001457 1.932027 10000 1 0 100 0.004689 0.004948 1.055219 1 10000 0 100 0.004691 0.005038 1.073948 500 400 0 100 0.020289 0.388080 19.127665 400 500 0 100 0.020271 0.389301 19.204768 Large gcd --------- 10 10 10 1000 0.005235 0.043507 8.310880 20 20 20 1000 0.008505 0.095732 11.256167 100 100 100 1000 0.041734 0.812656 19.472284 |
|||

msg62672 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-22 01:05 | |

And here's the diff for math.gcd. It's not production quality---it's just there to show what's possible. |
|||

msg63087 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-02-28 02:16 | |

I'm wondering what there is left to do here. Are we close to being able to close this issue? Some thoughts: (1) I think the docs could use a little work (just expanding, and giving a few examples). But it doesn't seem urgent that this gets done before the next round of alphas. (2) Looking at it, I'm really not sure that implementing Rational.__format__ is worthwhile; it's not obvious that we need a way to format a Rational as a float. So I'm -0 on this. If others think Rational should have a __format__ method I'm still happy to implement it. Is there much else that needs to be done here? |
|||

msg63126 - (view) | Author: Jeffrey Yasskin (jyasskin) * | Date: 2008-02-29 06:37 | |

I agree that we're basically done here. I'm back to -1 on inlining the common case for arithmetic (attached here anyway). Simple cases are already pretty fast, and bigger fractions are dominated by gcd time, not function call overhead. Since duplicating the definitions of arithmetic will make it harder to do tricky things that shrink the arguments to gcd(), we shouldn't do it. I was always +0 to __format__, so not doing it is fine with me. Thanks for everyone's help! |
|||

msg68725 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-06-25 12:11 | |

I'm reopening this to propose a last-minute minor change: Can we remove the trailing 'L' from the numerator and denominator in the repr of a Fraction instance? E.g., >>> x = Fraction('9.876543210') >>> x Fraction(987654321L, 100000000L) >>> x * (1/x) Fraction(1L, 1L) I'd rather see Fraction(987654321, 100000000) and Fraction(1, 1). Does the 'L' serve any useful purpose? |
|||

msg68732 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2008-06-25 15:57 | |

+1 on removing the L. Also, it would be nice if reduced fractions were restored to ints after the gcd step: >>> Fraction(1*10**18, 3*10**18) Fraction(1L, 3L) |
|||

msg68757 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-06-25 20:52 | |

+1 on removing the trailing L from the repr. +0 on trying to reduce the values to ints; that would be dead-end code since in 3.0 it's a non-issue. And it doesn't solve the problem with repr. |
|||

msg68822 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-06-27 10:21 | |

Trailing 'L's removed in r64557. A couple more minor issues: (1) fractions.gcd is exported but not documented. I'll add some documentation for it, unless anyone feels that it shouldn't have been exported in the first place. If so, please speak up! (2) The Fraction constructor is a little more lenient than it ought to be. For example: >>> Fraction('1.23', 1) Fraction(123, 100) I think this should really be a TypeError: a second argument to the constructor should only be permitted when the first argument is an integer. However, it seems fairly harmless, so I'm inclined to just leave this as it is and not mess with it. |
|||

msg68823 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-06-27 10:26 | |

Hmm. I don't much like this, though: >>> Fraction('1.23', 1.0) Fraction(123, 100) >>> Fraction('1.23', 1+0j) Fraction(123, 100) I'd say these should definitely be TypeErrors. |
|||

msg68843 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-06-27 18:51 | |

I think it's okay to accept Fraction('1.23', 1) -- if only because rejecting it means changing the default for the 2nd arg to None and that would slow it down again. I agree that if the type of the 2nd arg isn't int or long it should be rejected. That should not slow down the common path (two ints). |
|||

msg68851 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-06-27 20:36 | |

> I agree that if the type of the 2nd arg isn't int or long it should be > rejected. That should not slow down the common path (two ints). Sounds good. Some care might be needed to avoid invalidating Fraction(n, d) where n and d are instances of numbers.Integral but not of type int or long. |
|||

msg69020 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-06-30 21:37 | |

> I agree that if the type of the 2nd arg isn't int or long it should be > rejected. That should not slow down the common path (two ints). I'm having second thoughts about this; it doesn't seem worth adding an extra check that has to be applied every time a Fraction is created from a string (or another Rational instance). The condition being checked for (a denominator equal to 1, but not of type int or long) is unlikely to occur in practice, and fairly harmless even if it does occur. So I'm thinking that PBP says leave it alone. |
|||

msg69027 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2008-06-30 23:23 | |

That's fine. |
|||

msg69069 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-07-01 20:35 | |

Well, I can't find anything more to fuss about here. :-) Reclosing. |

History | |||
---|---|---|---|

Date | User | Action | Args |

2014-09-25 11:53:54 | wolma | set | nosy:
+ wolma |

2008-07-01 20:35:22 | mark.dickinson | set | status: open -> closed messages: + msg69069 |

2008-06-30 23:23:17 | gvanrossum | set | messages: + msg69027 |

2008-06-30 21:37:36 | mark.dickinson | set | messages: + msg69020 |

2008-06-27 20:36:20 | mark.dickinson | set | messages: + msg68851 |

2008-06-27 18:51:08 | gvanrossum | set | messages: + msg68843 |

2008-06-27 10:26:41 | mark.dickinson | set | messages: + msg68823 |

2008-06-27 10:21:03 | mark.dickinson | set | messages: + msg68822 |

2008-06-25 20:52:46 | gvanrossum | set | messages: + msg68757 |

2008-06-25 15:57:23 | rhettinger | set | messages: + msg68732 |

2008-06-25 12:11:09 | mark.dickinson | set | status: closed -> open messages: + msg68725 |

2008-02-29 06:37:27 | jyasskin | set | status: open -> closed resolution: fixed messages: + msg63126 files: + fraction_inline_arith.patch |

2008-02-28 02:16:21 | mark.dickinson | set | messages: + msg63087 |

2008-02-22 01:05:11 | mark.dickinson | set | files:
+ lehmer_gcd.patch messages: + msg62672 |

2008-02-22 01:02:56 | mark.dickinson | set | files:
+ lehmer_gcd_2.py messages: + msg62671 |

2008-02-19 19:31:25 | mark.dickinson | set | files:
+ lehmer_gcd.py messages: + msg62568 |

2008-02-19 19:25:51 | mark.dickinson | set | files: - lehmer_gcd.py |

2008-02-19 17:20:00 | gvanrossum | set | messages: + msg62567 |

2008-02-19 17:03:30 | mark.dickinson | set | files:
+ lehmer_gcd.py messages: + msg62566 |

2008-02-19 16:38:21 | gvanrossum | set | messages: + msg62564 |

2008-02-18 22:51:05 | jyasskin | set | messages: + msg62550 |

2008-02-18 22:15:42 | mark.dickinson | set | messages: + msg62547 |

2008-02-14 17:47:18 | jyasskin | set | messages: + msg62406 |

2008-02-14 17:08:54 | gvanrossum | set | messages: + msg62403 |

2008-02-14 16:58:46 | gvanrossum | set | messages: + msg62401 |

2008-02-14 07:21:58 | jyasskin | set | messages: + msg62380 |

2008-02-12 22:39:42 | ncoghlan | set | messages: + msg62340 |

2008-02-12 22:15:58 | mark.dickinson | set | messages: + msg62337 |

2008-02-12 21:51:16 | mark.dickinson | set | messages: + msg62335 |

2008-02-12 18:07:07 | gvanrossum | set | messages: + msg62324 |

2008-02-12 16:09:48 | mark.dickinson | set | messages: + msg62320 |

2008-02-12 15:17:50 | mark.dickinson | set | messages: + msg62318 |

2008-02-12 12:07:36 | ncoghlan | set | nosy:
+ ncoghlan messages: + msg62311 |

2008-02-12 04:28:45 | jyasskin | set | messages: + msg62305 |

2008-02-11 03:18:04 | mark.dickinson | set | messages: + msg62271 |

2008-02-10 21:31:04 | mark.dickinson | set | messages:
+ msg62258 title: Move Demo/classes/Rat.py to Lib/rational.py and fix it up. -> Move Demo/classes/Rat.py to Lib/fractions.py and fix it up. |

2008-02-10 20:20:39 | gvanrossum | set | messages: + msg62257 |

2008-02-10 20:05:23 | mark.dickinson | set | messages: + msg62256 |

2008-02-10 17:04:51 | gvanrossum | set | messages: + msg62254 |

2008-02-10 16:53:07 | mark.dickinson | set | messages: + msg62253 |

2008-02-10 16:19:20 | gvanrossum | set | messages: + msg62252 |

2008-02-10 15:55:48 | mark.dickinson | set | messages: + msg62251 |

2008-02-10 15:36:43 | mark.dickinson | set | messages: + msg62249 |

2008-02-09 20:59:57 | jyasskin | set | messages: + msg62236 |

2008-02-09 16:37:29 | gvanrossum | set | messages: + msg62227 |

2008-02-09 14:52:40 | mark.dickinson | set | messages: + msg62222 |

2008-02-09 05:46:48 | jyasskin | set | messages: + msg62218 |

2008-02-08 01:23:33 | gvanrossum | set | messages: + msg62192 |

2008-02-08 01:03:06 | mark.dickinson | set | messages: + msg62189 |

2008-02-06 15:23:17 | facundobatista | set | messages: + msg62099 |

2008-02-04 00:36:00 | rhettinger | set | messages: + msg62029 |

2008-02-03 23:52:33 | gvanrossum | set | assignee: gvanrossum -> jyasskin messages: + msg62028 |

2008-02-02 23:34:37 | jyasskin | set | messages: + msg62013 |

2008-02-02 21:17:21 | rhettinger | set | messages: + msg62011 |

2008-02-02 20:27:50 | jyasskin | set | messages: + msg62010 |

2008-02-02 17:21:48 | mark.dickinson | set | messages: + msg62005 |

2008-02-02 13:18:00 | mark.dickinson | set | messages: + msg62001 |

2008-02-02 07:24:52 | jyasskin | set | messages: + msg61995 |

2008-01-27 15:20:36 | mark.dickinson | set | messages: + msg61737 |

2008-01-19 09:56:33 | jyasskin | set | messages: + msg60137 |

2008-01-19 02:28:36 | mark.dickinson | set | messages: + msg60133 |

2008-01-19 02:17:40 | rhettinger | set | messages: + msg60132 |

2008-01-19 01:59:33 | mark.dickinson | set | messages: + msg60131 |

2008-01-18 04:28:47 | jyasskin | set | messages: + msg60081 |

2008-01-18 04:05:51 | jyasskin | set | files:
+ rational_tweaks.patch messages: + msg60080 |

2008-01-15 18:46:58 | mark.dickinson | set | messages: + msg59982 |

2008-01-15 07:47:58 | jyasskin | set | messages: + msg59958 |

2008-01-14 03:07:30 | mark.dickinson | set | messages: + msg59886 |

2008-01-14 01:45:48 | mark.dickinson | set | messages: + msg59884 |

2008-01-13 22:57:31 | jyasskin | set | files:
+ rational.patch messages: + msg59870 |

2008-01-12 01:47:45 | mark.dickinson | set | messages: + msg59773 |

2008-01-12 01:40:07 | mark.dickinson | set | messages: + msg59772 |

2008-01-11 21:26:05 | rhettinger | set | messages: + msg59755 |

2008-01-11 20:50:32 | jyasskin | set | messages: + msg59747 |

2008-01-10 16:10:51 | gvanrossum | set | messages: + msg59665 |

2008-01-10 05:18:50 | rhettinger | set | messages: + msg59659 |

2008-01-10 04:35:37 | gvanrossum | set | messages: + msg59657 |

2008-01-10 03:51:15 | mark.dickinson | set | messages: + msg59656 |

2008-01-10 02:01:12 | rhettinger | set | messages: + msg59651 |

2008-01-10 01:58:46 | rhettinger | set | messages: + msg59650 |

2008-01-10 01:23:22 | jyasskin | set | messages: + msg59645 |

2008-01-10 00:38:34 | gvanrossum | set | messages: + msg59640 |

2008-01-10 00:29:16 | rhettinger | set | messages: + msg59639 |

2008-01-10 00:15:34 | facundobatista | set | messages: + msg59636 |

2008-01-10 00:04:54 | rhettinger | set | messages: + msg59635 |

2008-01-09 23:11:00 | gvanrossum | set | messages: + msg59633 |

2008-01-09 17:55:33 | facundobatista | set | nosy:
+ facundobatista messages: + msg59615 |

2008-01-09 17:48:33 | rhettinger | set | messages: + msg59614 |

2008-01-09 16:09:20 | gvanrossum | set | messages: + msg59608 |

2008-01-09 09:04:42 | rhettinger | set | nosy:
+ rhettinger messages: + msg59583 |

2008-01-09 07:36:24 | jyasskin | set | keywords:
+ patch files: + rational.patch messages: + msg59582 |

2008-01-07 02:21:46 | mark.dickinson | set | messages: + msg59425 |

2008-01-07 01:31:49 | mark.dickinson | set | nosy:
+ mark.dickinson messages: + msg59424 |

2007-12-21 19:54:19 | gvanrossum | set | priority: normal assignee: gvanrossum nosy: + gvanrossum |

2007-12-21 17:41:30 | jyasskin | create |