Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFE] Add math.lcm() function: Least Common Multiple #83660

Closed
ananthan-123 mannequin opened this issue Jan 28, 2020 · 31 comments
Closed

[RFE] Add math.lcm() function: Least Common Multiple #83660

ananthan-123 mannequin opened this issue Jan 28, 2020 · 31 comments
Labels
3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@ananthan-123
Copy link
Mannequin

ananthan-123 mannequin commented Jan 28, 2020

BPO 39479
Nosy @malemburg, @tim-one, @rhettinger, @mdickinson, @vstinner, @stevendaprano, @serhiy-storchaka, @vedgar, @ananthan-123
PRs
  • bpo-39479:Add math.lcm() function: Least Common Multiple #18328
  • bpo-39479:Add math.lcm() function: Least Common Multiple #18547
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-02-19.18:22:25.773>
    created_at = <Date 2020-01-28.13:52:43.571>
    labels = ['type-feature', 'library', '3.9']
    title = '[RFE] Add math.lcm() function: Least Common Multiple'
    updated_at = <Date 2020-02-19.18:22:25.772>
    user = 'https://github.com/ananthan-123'

    bugs.python.org fields:

    activity = <Date 2020-02-19.18:22:25.772>
    actor = 'mark.dickinson'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-02-19.18:22:25.773>
    closer = 'mark.dickinson'
    components = ['Library (Lib)']
    creation = <Date 2020-01-28.13:52:43.571>
    creator = 'Ananthakrishnan'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 39479
    keywords = ['patch']
    message_count = 31.0
    messages = ['360875', '360876', '360877', '360879', '360880', '360888', '360895', '360898', '360900', '360934', '360941', '360989', '361021', '361029', '361033', '361035', '361037', '361080', '361084', '361086', '361098', '361101', '361102', '361210', '361220', '361221', '361251', '361262', '361263', '361270', '362287']
    nosy_count = 10.0
    nosy_names = ['lemburg', 'tim.peters', 'rhettinger', 'mark.dickinson', 'vstinner', 'stutzbach', 'steven.daprano', 'serhiy.storchaka', 'veky', 'Ananthakrishnan']
    pr_nums = ['18328', '18547']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue39479'
    versions = ['Python 3.9']

    @ananthan-123
    Copy link
    Mannequin Author

    ananthan-123 mannequin commented Jan 28, 2020

    can we add an lcm and gcd function that can work as:

    lcm(4,6) # returns 12
    
    gcd(4,6) # returns 2

    @ananthan-123 ananthan-123 mannequin added 3.8 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Jan 28, 2020
    @vstinner
    Copy link
    Member

    There is math.gcd():
    https://docs.python.org/dev/library/math.html#math.gcd

    You can use numpy.lcm():
    https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.lcm.html

    Is it common to need lcm()? Do you have examples of applications which need lcm()? It's trivial to implement lcm():
    https://stackoverflow.com/questions/51716916/built-in-module-to-calculate-least-common-multiple

    I suggest to reject this feature request, since I never needed lcm() in 10 years of Python, whereas I use gcd() time to time.

    @vstinner vstinner added 3.9 only security fixes and removed 3.8 only security fixes labels Jan 28, 2020
    @vstinner vstinner changed the title can we add a lcm and gcd function. [RFE] Add math.lcm() function: Least Common Multiple Jan 28, 2020
    @vstinner vstinner added 3.9 only security fixes and removed 3.8 only security fixes labels Jan 28, 2020
    @vstinner vstinner changed the title can we add a lcm and gcd function. [RFE] Add math.lcm() function: Least Common Multiple Jan 28, 2020
    @stevendaprano
    Copy link
    Member

    Uses for lcm are common enough that it is provided by Excel and the C++ boost. You can use it for working out problems like:

    • if event A happens every 14 days, and event B happens every 6 days, then A and B will occur together even lcm(14, 6) days.

    By the way, the "trivial" implementation given in the Stackoverflow link has a bug: if both arguments are zero, it raises instead of returning zero.

    I wish that gcd took an arbitrary number of arguments, I often need to find the gcd of three or more numbers, and this is a pain:

        gcd(a, gcd(b, gcd(c, gcd(d, e)))))

    when I could just say gcd(a, b, c, d, e) and have it work. Likewise of lcm. (For that matter, the gcd of a single number a is just a.)

    @serhiy-storchaka
    Copy link
    Member

    reduce(gcd, [a, b, c, d, e])

    @ananthan-123
    Copy link
    Mannequin Author

    ananthan-123 mannequin commented Jan 28, 2020

    I created this issue as i came across the following question:

    There are n students in class A,and m students in class B.each class divides into teams for a competition.What is the biggest possible team size that can be divided,such that each team has same number of members.

    We can solve this type of problems easily if we have lcm() in math library.And there are lots of real life applications where we have to use lcm.

    @ananthan-123
    Copy link
    Mannequin Author

    ananthan-123 mannequin commented Jan 28, 2020

    Should i proceed with adding a pull request for adding a 'lcm' function in python's math module.

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Jan 28, 2020

    I must say that the problem (with two classes divided into teams) seems to me to be exactly one that can be solved with gcd, and lcm itself is mostly useless for it.

    @ananthan-123
    Copy link
    Mannequin Author

    ananthan-123 mannequin commented Jan 28, 2020

    some problems that needs lcm function:

    1:find the least number which when divided by 'a','b','c','d' leaves remainder 'e' in each case.

    2:person A exercises every 'n' days and person B every 'm' days. A and B both exercised today. How many days will it be until they exercise together again?

    3:The LCM is important when adding fractions which have different denominators

    we have to use the lcm function when,

    1. an event that is or will be repeating over and over.
    2. To purchase or get multiple items in order to have enough.
    3. To figure out when something will happen again at the same time.

    All these shows lcm function should be included in the math library.

    So can i proceed with adding pull request to add lcm function in python's math module.

    @rhettinger
    Copy link
    Contributor

    -1 Given that we had gcd(), I don't see any value to adding *lcm()* as well.

    Once you have gcd(), getting the least common multiple is trivial.

    Also, it is rare to ever need a lcm() function. I don't think I've ever seen it in real code.

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Jan 29, 2020

    I agree with Raymond that it's really seldom needed. However, I'd like to point out that the "trivial" implementation might not be so trivial after all: as Steven said, it mishandles (0,0) case. And even Tim fell into that trap, so it can't be said it's easily avoided. I agree that this case doesn't really appear in "real world" tasks, but that's not really an excuse: imagine a factorial that didn't work for 0.

    Also, a bit more often used case: seeing the code for lcm of 2 arguments, people might (and do; I've seen it) generalize to 3 or more arguments in a way that seems logical and is often correct, but not always (a*b*c//gcd(a,b,c)).

    And one more tidbit: the usual formula for lcm doesn't really work for the case of fraction inputs. I know that currently math.gcd doesn't handle fractions, but it could. It's imaginable that that feature will some day be added (just like pow recently started supporting modular inverses), and then suddenly lcd implementations will silently give the wrong result for fractions.

    A smart man;-) once said that the main criteria for inclusion in stdlib is that the function is often needed by users, _and_ it's often implemented wrong. I think lcm doesn't really satisfy the first criterion, but it does satisfy the second.

    @ananthan-123
    Copy link
    Mannequin Author

    ananthan-123 mannequin commented Jan 29, 2020

    I agree with Vedran Čačić.As the modules are not made for one person but it is for the ease of coding.There are so many functions which i don't use but used by other people.We are using functions to make coding easy and if lcm function is added many people will find it usefull.

    @tim-one
    Copy link
    Member

    tim-one commented Jan 29, 2020

    +0 from me.

    Another use is computing the Carmichael function for composite numbers (like an RSA encryption modulus, in which context the Carmichael function is routinely used).

    But only +0 instead of +1 because it's so easy to build from gcd().

    I don't agree it's tricky at all. While lcm(0, 0) undoubtedly should return 0 in a general-purpose library function, in my own code I've never supplied that. Because in every application I've ever had for it, I would rather get an exception if I ever passed two zeroes - that would always have been a mistake.

    @mdickinson
    Copy link
    Member

    +0 from me as well; agreed with everything that Tim said (except that I've never had a need for the Carmichael function; my RSA implementations do the inefficient thing based on (p-1)(q-1)).

    This is somewhat reminiscent of comb and perm: lcm is often taught as a natural counterpart to gcd, so despite the fact that it's less fundamental and has less utility, people often expect to see the two together.

    @ananthakrishnan: do you want to put together a PR? I'll commit to reviewing it if you do.

    @ananthan-123
    Copy link
    Mannequin Author

    ananthan-123 mannequin commented Jan 30, 2020

    Yes,I want to put together a PR.

    @mdickinson
    Copy link
    Member

    Great. For clarity, here's a Python function giving the behaviour I'd expect from a 2-arg lcm:

        from math import gcd
    
        def lcm(a, b):
            if a == 0:
                return 0
            return abs(a // gcd(a, b) * b)

    @serhiy-storchaka
    Copy link
    Member

    If a < b, what is better,

    a // gcd(a, b) * b
    

    or

    b // gcd(a, b) * a
    

    ? Or there is no difference?

    @mdickinson
    Copy link
    Member

    I'd guess a // gcd(a, b) * b would be faster, on the basis that division is slower than multiplication. But I doubt it's worth worrying about for this implementation, given that the gcd call is likely to be the bottleneck as a and b get large.

    @tim-one
    Copy link
    Member

    tim-one commented Jan 30, 2020

    And I in turn agree with everything Mark said ;-) But I'll add that while the analogy to comb/perm is a good one, I think the case for lcm is stronger than for perm. Not only is lcm more common in real life (although, no, not at all common overall!), perm was added at the same time as prod, and perm is essentially a special case of prod.

    @rhettinger
    Copy link
    Contributor

    Perhaps add, is_prime() as well. Conceptually, it is in the same family of functions. Unlike lcm(), it is one that I would use and would improve Python's value as a teaching tool.

    @tim-one
    Copy link
    Member

    tim-one commented Jan 31, 2020

    Raymond, there was a thread a while back about adding an imath module, to package the number-theoretic functions that frequently come up. All these things should really go into that instead, IMO. math started as a thin wrapper around C's libm, and has been losing its once-exclusive focus on functions for working with Python floats. I think that focus was valuable.

    In that older thread, I suggested a probable_prime(n) predicate function, and posted my pure-Python Miller-Rabin implementation. Simple (as such things go), but I wouldn't aim to compete with (say) gmp.

    I don't think is_prime(n) is suitable for Python. Proving that a large integer absolutely is prime is either quite slow or quite complicated. In practice, even professionals in critical applications are happy to settle for probabilistic assurances, because an arbitrarily tiny probability of error can be obtained very much faster than a deterministic proof.

    Anyway ;-) , ya, I like the idea, but I'd sure like it to go into a module where it obviously belongs. Also a function, e.g., to generate primes, and ...

    @vstinner
    Copy link
    Member

    I dislike the idea of adding a is_prime() function in Python since users will likely stress Python with huge numbers and then complain that Python is too slow. Correct is_prime() is very slow for large numbers, and statistically approach is not 100% correct... Or we may have to add both approaches. But then you have to pick an algorithm which would fit "most use cases".

    I concur with Tim, it would be better to put such opinionated code on PyPI ;-)

    --

    I'm -0 on adding math.lcm(). For example, I failed to find any request to add such function on python-ideas archives. This issue seems to be the first user request.

    I'm not opposed to add it. Some people seem to like the idea of the completeness of the stdlib (gcd & lcm go together). So if someone wants to add it, go ahead and propose a PR :-)

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Jan 31, 2020

    is_prime that's always correct is probably not the right thing to go into math. Besides, now we have isqrt, it's just

    n>1 and n&1 and all(n%d for d in range(3,isqrt(n)+1,2))
    

    -- yes, it's damn slow, but so is everything else you want to be absolutely correct. :-]

    is_probable_prime is another matter, but there is an enormous amount of bikeshedding about the API of that one.

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Jan 31, 2020

    And yeah, I managed to leave out 2. Speaking about "often implemented wrong"... :-))

    @tim-one
    Copy link
    Member

    tim-one commented Feb 2, 2020

    I'd have to hear back from Raymond more on what he had in mind - I may well have been reading far too much in the specific name he suggested.

    Don't much care about API, etc - pick something reasonable and go with it. I'm not overly ;-) concerned with being "newbie friendly". If someone is in a context where they need to use probabilistic solutions, there is no substitute for them learning something non-trivial about them. The usual API for a Miller-Rabin tester supports passing in the number of bases to try, and it's as clear as anything of this kind _can_ be then that the probability of getting back True when the argument is actually composite is no higher than 1 over 4 to the power of the number of bases tried. Which is also the way they'll find it explained in every reference. It's doing nobody a real favor to make up our own explanations for a novel UI ;-)

    BTW, purely by coincidence, I faced a small puzzle today, as part of a larger problem:

    Given that 25 is congruent to 55 mod 10, and also mod 15, what's the largest modulus we can be certain of that the congruence still holds? IOW, given

        x = y (mod A), and
        x = y (mod B)

    what's the largest C such that we can be certain

        x = y (mod C)

    too? And the answer is C = lcm(A, B) (which is 30 in the example).

    @stevendaprano
    Copy link
    Member

    On Fri, Jan 31, 2020 at 12:15:37PM +0000, Vedran Čačić wrote:

    Vedran Čačić <vedgar@gmail.com> added the comment:

    is_prime that's always correct is probably not the right thing to go into math. Besides, now we have isqrt, it's just

    n\>1 and n&1 and all(n%d for d in range(3,isqrt(n)+1,2))
    

    -- yes, it's damn slow, but so is everything else you want to be absolutely correct. :-]

    Lots of things are fast and still absolutely correct.

    And I think that you probably are underestimating just how slow trial
    division is for testing primes. I can imagine you've probably tested it
    on "large numbers" like a trillion-trillion (1e18) and thought that's
    acceptably fast. Try it on numbers like this:

    P = int('29674495668685510550154174642905332730771991'
            '79985304335099507553127683875317177019959423'
            '8596428121188033664754218345562493168782883')
    Q = P*(313*(P-1)+1)*(353*(P-1)+1)

    Q is a 397 digit Carmichael Number. Its smallest factor is P, which has
    131 digits. If your computer is fast enough to perform a thousand
    trillion (1e15) divisions per second, your trial division will take more
    than 3e107 years to complete. That's 10 million trillion trillion
    trillion trillion trillion trillion trillion trillion trillion times
    longer than the universe has existed.

    In a practical sense, your algorithm is never going to terminate. The
    sun will burn out and die first.

    Upgrading your CPU is not going to help.

    My old and slow PC can prove that Q is a composite number, using pure
    Python, in less than six seconds. And I'm sure that a better programmer
    than me would be able to shave off some of that time.

    is_probable_prime is another matter, but there is an enormous amount
    of bikeshedding about the API of that one.

    What is there to bikeshed about the API?

    The API should be a function that takes a single integer, and returns
    False if that number is certainly composite[1] otherwise True.

    I will argue that any other details -- including whether it is
    probabilistic or not -- are *quality of implementation* issues and not
    part of the API.

    For the same reason, the name should be just "is_prime" (or "isprime")
    and not disturb naive users by calling it "probably prime".

    Experts will read the documentation and/or the source code, and everyone
    else can happily ignore the microscopic[2] chance of a false positive
    with really huge numbers. (Sometimes ignorance really is bliss.)

    We can argue on technical grounds just what the implementation should
    be, but that's not part of the API, and the implementation could change:

    • Miller-Rabin, or Baillie–PSW, or AKS, or something else;
    • whether probabilistic or deterministic;
    • what error rate (if any) we are willing to accept;

    etc.

    If we choose the fast, simple Miller-Rabin test, with just 30 random
    iterations the error rate is less than approximately one in a trillion
    trillion. If we tested a million numbers ever second, it would take over
    18,000 years (on average) to find one false positive.

    If that isn't good enough, increase the number of iterations to 50, in
    which case you would expect one false positive every 20,000 trillion
    years.

    In comparison, it is estimated that cosmic rays cause memory bit
    flips as often one error per 4GB of RAM per day. This probabilistic
    algorithm is more reliable than your determinstic computer.

    https://blogs.oracle.com/linux/attack-of-the-cosmic-rays-v2

    https://www.johndcook.com/blog/2019/05/20/cosmic-rays-flipping-bits/

    I don't have much time for worries about Miller-Rabin being
    "probabilistic". When I was young and naive I worried about it a lot,
    and maybe that was a legitimate worry for a naive Miller-Rabin
    implementation that did, say, five iterations (probability of a false
    positive about 0.1%).

    But with 30 or 50 rounds, I am confident that nobody will ever
    experience such a false positive, not if they spend their entire
    lifetime doing nothing but calling is_prime.

    Let's be real: if the banks are willing to trust the transfer of
    billions of dollars to probabilistic primality testing, why are we
    worring about this?

    (By the way: for smallish numbers, under 2**64, no more than twelve
    rounds of Miller-Rabin are sufficient to deterministically decide
    whether a number is prime or not. Likewise for Baillie–PSW, which is
    also completely deterministic for numbers up to 2**64.)

    [1] For ease of discussion, we'll count zero and one as composite.

    [2] More like nanoscopic or picoscopic.

    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Feb 2, 2020

    Tim: Considering that congruence is _defined_ as x=y(mod m) :<=> m|y-x, it's really not so surprising. :-)

    Steven: It seems that we completely agree about inclusion of is_probabilistic_prime in stdlib. And we agree that it should be called isprime (or is_prime if Raymond names it;). About the bikeshedding, see Tim's comment. :-P

    About absolutely correct algorithms: first, what exactly is your claim? If it's "I can write an algorithm in pure Python that can for every number of 397 digits mathematically exactly determine whether it is prime in under 6 seconds", I'd very much like to see that algorithm. (I guess it must be publicly available, since we're speaking about inclusion in Python stdlib.) I really don't have much expertise in number theory that I can be convinced it doesn't exist, but I very much doubt it.

    @tim-one
    Copy link
    Member

    tim-one commented Feb 2, 2020

    [Steven]

    ... Try it on numbers like this:
    ...
    Q = P*(313*(P-1)+1)(353(P-1)+1)

    Q is a 397 digit Carmichael Number. Its smallest factor is P,
    which has 131 digits.
    ...
    My old and slow PC can prove that Q is a composite number,
    using pure Python, in less than six seconds.

    And I'm sure that a better programmer than me would be
    able to shave off some of that time.

    The pure-Python Miller-Rabin code i posted in the aforementioned thread is typically at least 100 times faster than that. But it's not deterministic. Because it tries (pseudo)random bases, it all depends on how many it needs to try on each run before finding a witness to that Q is composite. It usually (at least 75% of runs) finds one on the first try.

    BTW, it doesn't much matter that this is "pure Python" - for ints of this size the bulk of the time regardless is spent in CPython's C-coded bigint arithmetic. I expect that your code must be doing more than _just_ Miller-Rabin, and in the Q case is paying through the nose for "optimizations" that all fail before getting to Miller-Rabin.

    About the API, I can't agree to the one you propose. Different applications have different appropriate tradeoffs between degree of certainty and time consumed - "one size fits all" doesn't fly here.

        def probably_prime(n, maxsteps=20)

    supplies a _default_ instead. I don't want an API that's useful _only_ to people who don't know what they're doing ;-)

    (By the way: for smallish numbers, under 2**64, no more than
    twelve rounds of Miller-Rabin are sufficient to
    deterministically decide whether a number is prime or not.

    But only if you use a fixed set of 12 specific bases for which the claim has been proved. "Real" Miller-Rabin picks bases at random, relying only on properties that have been proved independent of the argument size.

    [Vedran]

    Tim: Considering that congruence is _defined_ as
    x=y(mod m) :<=> m|y-x, it's
    really not so surprising. :-)

    Didn't say it was ;-) Was just noting the odd coincidence that I just happened to stumble into a real use for lcm(), not previously mentioned in this report, while doing something else.

    @stevendaprano
    Copy link
    Member

    [Steven]
    > ... Try it on numbers like this:
    > ...
    > Q = P*(313*(P-1)+1)(353(P-1)+1)
    >
    > Q is a 397 digit Carmichael Number. Its smallest factor is P,
    > which has 131 digits.

    [Tim]

    The pure-Python Miller-Rabin code i posted in the aforementioned
    thread is typically at least 100 times faster than that.

    This is exactly the sort of argument about quality of implementation
    which isn't, or shouldn't be, part of the argument about the API, IMO.
    Just as the choice of Timsort over Quicksort or Bubblesort *wink* isn't
    part of the list.sort() API, let alone the implementation choices in
    Timsort such as MIN_GALLOP.

    I'm happy to have a discussion about implementation, here or off-list,
    I'm sure I will learn a lot. But briefly, the Q I quoted above was
    carefully designed (not by me, I hasten to add!) to be a preudoprime to
    the first 307 prime bases, so it's something of a worst-case scenario
    for my version.

    BTW, it doesn't much matter that this is "pure Python" - for ints of
    this size the bulk of the time regardless is spent in CPython's
    C-coded bigint arithmetic.

    A fair point, thank you.

    About the API, I can't agree to the one you propose. Different
    applications have different appropriate tradeoffs between degree of
    certainty and time consumed - "one size fits all" doesn't fly here.

    def probably_prime(n, maxsteps=20)
    

    supplies a _default_ instead. I don't want an API that's useful
    _only_ to people who don't know what they're doing ;-)

    It's not just for people who don't know what they're doing. It is for
    people who don't want to sweat the small details, they just want an
    answer True or False and are prepared to trust that the function author
    knows what they are doing.

    If someone cares about the small details like how many bases to
    try, they might also care about details like:

    • which specific bases are used;
    • whether to use Miller-Rabin at all;
    • how many trial divisions to do first;
    • which specific primality test to use;

    etc. What if the implementation shifts away from Miller-Rabin to (say)
    Baillie-PSW? Then your maxsteps parameter becomes meaningless. Do we
    deprecate it or leave it there in case the implementation changes again
    to something that has a concept of number of steps?

    I think that if someone cares sufficiently about the details, then they
    probably won't be satisfied with a single isprime function, but may want
    is_fermat_prime, is_miller_rabin_prime, is_lucas_prime etc.

    > (By the way: for smallish numbers, under 2**64, no more than
    > twelve rounds of Miller-Rabin are sufficient to
    > deterministically decide whether a number is prime or not.

    But only if you use a fixed set of 12 specific bases for which the
    claim has been proved.

    Correct. The first twelve primes make up such a minimal set.

    http://oeis.org/A014233

    @tim-one
    Copy link
    Member

    tim-one commented Feb 3, 2020

    [Tim]

    > The pure-Python Miller-Rabin code i posted in the
    > aforementioned thread is typically at least 100
    > times faster than that.

    [Steven]

    This is exactly the sort of argument about quality of
    implementation which isn't, or shouldn't be, part of
    the argument about the API, IMO.

    I wasn't at all talking about API at that point. I was backing the argument _you_ were making, that trial division is a hopelessly inefficient implementation, compared to what's possible with probabilistic methods, regardless of API. You were in fact underselling that argument, because it's straightforward to get an order or two of magnitude faster than you demonstrated.

    the Q I quoted above was carefully designed (not by me, I hasten
    to add!)

    I know the paper it came from.

    to be a preudoprime to the first 307 prime bases, so it's
    something of a worst-case scenario for my version.

    Which is why I have no problem picturing how this "should be" approached: the original Miller-Rabin (which uses random bases, not succumbing to the "premature optimization" catastrophe magnet) has no problem with Q (or with anything else!), and hasn't really been improved on for general-purpose use since it was published. It's a darned-near perfect mix of simplicity, brevity, clarity, robustness, generality, and speed. "Good enough" by a long shot on all counts.

    > def probably_prime(n, maxsteps=20)
    >
    > supplies a _default_ instead. I don't want an API that's useful
    > _only_ to people who don't know what they're doing ;-)

    It's not just for people who don't know what they're doing. It
    is for people who don't want to sweat the small details,

    Then they can accept the default. In what conceivable sense is that a burden?

    they just want an nswer True or False and are prepared
    to trust that the function author knows what they are
    doing.

    But the function author can't possibly know what the _user_ needs this for. In some apps degree of certainty is far more important than speed, while in other apps it's the opposite. Be realistic here? Your argument here makes sense for always-right functions, but you're not willing to accept one of those for this specific purpose (& neither am I).

    If someone cares about the small details like how
    many bases to try,

    It's not a "small detail" where it matters: it is THE way to trade off computation time against confidence in the results. It's a _fundamental_ aspect of how Miller-Rabin works.

    they might also care about details like:

    • which specific bases are used;
    • whether to use Miller-Rabin at all;
    • how many trial divisions to do first;
    • which specific primality test to use;

    Then they should go use a special-purpose library ;-) Letting them fiddle the single most important parameter isn't down in the weeds, it's a top-level control knob.

    My answers to all the above:

    • bases are picked via randrange(2, n-1)
    • yes, because no better general-purpose algorithm is known
    • none - although I'll allow that there may be a
      speed advantage in some apps if a gcd is done with a
      relatively small primorial first (more efficient than
      one-at-a-time trial divisions by small primes)
    • Miller-Rabin

    What if the implementation shifts away from Miller-Rabin
    to (say) Baillie-PSW?

    It can't ;-) I would absolutely document that Miller-Rabin _is_ the algorithm being used, with the random-bases implementation choice. Then the curious can search the web for a mountain of good information about it.

    Then your maxsteps parameter becomes meaningless. Do we
    deprecate it or leave it there in case the implementation
    changes again to something that has a concept of number
    of steps?

    All moot, given that I have no interest in hiding the specific algorithm in use. YAGNI.

    I think that if someone cares sufficiently about the details,
    then they probably won't be satisfied with a single isprime
    function, but may want is_fermat_prime, is_miller_rabin_prime,
    is_lucas_prime etc.

    Again, go to a special-purpose library if that's what they want. And, again, I don't agree with your characterization of the Miller-Rabin maxsteps parameter as a "detail". It's a fundamental aspect of what the algorithm does. Which casual users can certainly ignore, but at their own risk.

    ...
    Correct. The first twelve primes make up such a minimal set.

    And if you don't care about picking the fixed bases from a prefix of the primes, you only need 7 bases for a 100% reliable test through 2**64: 2, 325, 9375, 28178, 450775, 9780504, and 1795265022. Or so this table claims:

    https://miller-rabin.appspot.com/

    But I don't care here. Using a fixed set of bases is begging for embarrassment (for every fixed finite set of bases, there are an infinite number of composites they'll _claim_ are prime). There are no systemic failure modes when using random bases.

    @vstinner
    Copy link
    Member

    vstinner commented Feb 3, 2020

    If someone wants to continue the discussion on is_prime(), I suggest to open a separated issue.

    @mdickinson
    Copy link
    Member

    New changeset f2ee21d by ananthan-123 in branch 'master':
    bpo-39479:Add math.lcm() function: Least Common Multiple (bpo-18547)
    f2ee21d

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants