classification
Title: Expand math.gcd() and math.lcm() to accept multiple arguments
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.9
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Ananthakrishnan, BernardoSulzbach, Dennis Sweeney, SilentGhost, mark.dickinson, rhettinger, serhiy.storchaka, steven.daprano, tim.peters
Priority: normal Keywords: patch

Created on 2020-02-16 13:38 by Ananthakrishnan, last changed 2020-02-23 11:23 by mark.dickinson. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 18590 closed Ananthakrishnan, 2020-02-21 09:38
PR 18604 merged serhiy.storchaka, 2020-02-22 10:55
Messages (27)
msg362069 - (view) Author: Ananthakrishnan (Ananthakrishnan) * Date: 2020-02-16 13:38
If we have to find the gcd of three or more numbers, now we should use 
gcd(a, gcd(b, gcd(c, gcd(d, e)))))
which will create lot of problems.

math.gcd should take "n" number of arguments,like:

gcd(a,b,c,....)

gcd(4,6,8)    //returns 2
gcd(2,5,8,6)    //returns 1
gcd(6,30,40,60,20,40)    //returns 2
msg362071 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-02-16 13:50
What problems it will create?
msg362073 - (view) Author: Ananthakrishnan (Ananthakrishnan) * Date: 2020-02-16 13:59
It will take more time to write the statement itself and there are chances of getting mistakes as the statement is complicated.
msg362074 - (view) Author: SilentGhost (SilentGhost) * (Python triager) Date: 2020-02-16 14:00
You could (should?) use reduce, then:

>>> functools.reduce(math.gcd, (6, 30, 40, 60, 20, 40))
2
msg362077 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-02-16 16:27
+1  This isn't hard for us to do and may be occasionally useful.
msg362079 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-02-16 17:02
Should it take variable number of positional arguments or a single iterable argument as max() and min()?

What should it return for 0 arguments?
msg362080 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-02-16 17:08
Python as programming language provides builtin blocks. This is a good example of using functools.reduce().

But if you can provide an algorithm for calculating the GCD and LCM more efficient than sequential applying gcd() and lcm() for two arguments, it would be an argument for implementing it in the stdlib.
msg362082 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2020-02-16 17:12
> Should it take variable number of positional arguments or a single iterable argument as max() and min()?

Good question: the obvious extension to the current function would be to allow it to take multiple scalar arguments. But I don't much like the mismatch that introduces with `sum` and `math.prod`, which accept a single iterable argument.

I *definitely* don't want to give `gcd` a combination API like the one `min` and `max` have.

> What should it return for 0 arguments?

That one's easy. It should return `0`. `0` is an identity for the binary `gcd` operation, and mathematically, the `gcd` is defined for *any* set of integers (finite or infinite), including the empty set. The gcd of the empty set is zero.
msg362083 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2020-02-16 17:13
What's slightly more interesting is the return value for a single argument  `a`, where we should be careful to return `abs(a)` rather than simply `a`.
msg362092 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2020-02-16 19:04
This is almost all down to pragmatics for me.

For sum() and prod(), if there are only two operands then there are trivial other ways to spell that (+ and *).  So it makes most sense for them to accept iterables instead.  Essentially, e.g., nobody would ever _want_ to write sum(2, 3) instead of 2+3.

max() and min() differ in that there is no other simple way to write the 2-argument forms.  max(x, y) _is_ commonly wanted - but so is max(iterable).  It's a weird function signature, but nearly always does what's intended.

gcd() (and lcm()!) differ from both in that they're nearly always wanted with 2 arguments.  "0, 1, or >= 3 arguments" are possibilities, but rarely wanted.  It would be a PITA to need to write, e.g., gcd((x, y)) instead of gcd(x, y) for the overwhelmingly most common 2-argument case.  So they should be *args functions - provided we want to cater directly to "0, 1, or >= 3 arguments" at all.  Fine by me if we don't.  I'm only +0 on generalizing beyond "exactly 2 arguments".

For the rest, I agree with Mark.  In particular, gcd() must be 0.

Serhiy, a "bulk" gcd can be more efficient than nested function calls by exploiting a common case:  it can get out early when the gcd-so-far becomes 1 (since gcd(1, anything) == 1, it doesn't matter what the remaining arguments are).  For "random" integers, it's already the case that gcd(i, j) == 1 over 60% of the time.
msg362098 - (view) Author: Dennis Sweeney (Dennis Sweeney) * Date: 2020-02-16 20:04
I think the behavior of gcd() == 0 is correct, but it should be documented, because it isn't completely obvious.

Arguments for gcd() == 0:

- Preserves the invariant gcd(itertools.chain(iterables)) == gcd(itertools.starmap(gcd, iterables)) in the case that some iterables are empty.

- Return a "neutral element" (not technically true for negative integers because gcd(0, a) == abs(a) != a, but it does get "ignored")

Other considerations:

- The "common divisors" of the empty set are the integers d such that for all x in the empty set, d divides x, which is to say, all of the integers. Then there is no greatest of these common divisors. The same is true if we try gcd(0, 0, ... 0).

- The above is only fixed if we interpret "greatest" in the sense of divisibility, where 0 > everything because everything divides 0. In particular, we can say that g is a greatest common divisor of a_1, ..., a_n if g | a_i for all i, and for any d with d | a_i for all i, we also have d | g. Using this definition also requires that we specify that if there are two gcds, we return the positive one.
I believe this is the correct interpretation, but it may not be obvious, since it assumes that proof that such a number even exists.

- Another common definition of the greatest common divisor of {a_1, ..., a_n} is the smallest positive integer expressible as an integer linear combination x_1 a_1 + ... + x_1 a_2, where x_i are integers. But this definition breaks on gcd(), gcd(0), gcd(0, 0), etc, since no positive integers are expressible.

In short, while I agree with the behavior, I think there should be a clarifying sentence in the docs saying "returns 0 if all of the arguments are 0".
msg362099 - (view) Author: Dennis Sweeney (Dennis Sweeney) * Date: 2020-02-16 20:06
Correction: gcd(itertools.chain(iterables)) == gcd(*map(gcd, iterables))
msg362100 - (view) Author: Dennis Sweeney (Dennis Sweeney) * Date: 2020-02-16 20:26
Correction correction: returning zero preserves the invariant below.

from math import gcd as GCD
from functools import reduce
from itertools import starmap, chain

def gcd(*args):
    return reduce(GCD, args, 0)

iterables = [[10, 20, 30],
             [],
             [0, 0, 0],
             [5],
             [-15, -10000000]]

a = gcd(*chain.from_iterable(iterables))
b = gcd(*starmap(gcd, iterables))
assert a == b == 5
msg362139 - (view) Author: Ananthakrishnan (Ananthakrishnan) * Date: 2020-02-17 12:17
It should take variable number of positional arguments.
it should return:
>>math.gcd()
0
>>math.gcd(8)
8
>>math.gcd(-8)
8
msg362159 - (view) Author: Bernardo Sulzbach (BernardoSulzbach) * Date: 2020-02-17 22:47
I think this might make sense because gcd() could have some optimizations for n > 2, such as sorting the numbers and starting by the smallest elements.

However, I don't think gcd() and lcm() with more than two arguments are frequent use cases.
msg362223 - (view) Author: Ananthakrishnan (Ananthakrishnan) * Date: 2020-02-18 15:36
Can I put together a PR for this issue?
msg362343 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2020-02-20 18:49
@Ananthakrishnan: Sure, go ahead.

Let's make the process a bit more streamlined than last time around: see if you can put together a PR that passes all the CI checks before asking for review. If you get stuck, make a work-in-progress PR and ask for help in a comment on that PR rather than via email.

A couple of pointers:

- Take a look at the existing math.hypot implementation to see how to deal with multiple arguments. (Note that argument clinic doesn't currently work for *args functions.)

- For the first working version, don't bother making special cases for the zero-argument or one-argument gcd (or even the case of two arguments). You just want the equivalent of the following Python code, which is perfectly general:

    def gcd(*args):
        g = 0
        for arg in args:
            g = gcd_2arg(g, operator.index(arg))
        return g

where gcd_2arg is the original 2-argument gcd (in C, _PyLong_GCD). We can always optimise later.
msg362383 - (view) Author: Ananthakrishnan (Ananthakrishnan) * Date: 2020-02-21 06:41
This is my code for math.gcd:

static PyObject *
math_gcd(PyObject *module, PyObject *args, Py_ssize_t n)
{
    PyObject *g = 0, *item, *in;
    
    Py_ssize_t i;
    for (i = 0; i < n; i++) {
        item = args[i];
        in = PyNumber_Index(item);
        if (in == NULL) {
            return NULL;
        }
        g = _PyLong_GCD(g, in);
    }
    Py_DECREF(in);
    return g;
}



This is showing an error:

error: incompatible types when assigning to type ‘PyObject * {aka struct _object *}’ from type ‘PyObject {aka struct _object}’
       item = args[i];
           ^^^
msg362386 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2020-02-21 07:52
@Ananthakrishnan: Do you know how arrays and pointers work in C?
msg362392 - (view) Author: Ananthakrishnan (Ananthakrishnan) * Date: 2020-02-21 08:53
Yes I know.
msg362399 - (view) Author: Ananthakrishnan (Ananthakrishnan) * Date: 2020-02-21 09:35
Thanks for the hint.Made changes.
msg362457 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-02-22 10:54
Sorry, Ananthakrishnan, but I think this problem is too difficult to you. Adding math.lcm() taken 2 weeks and produced 200 messages. It is simpler to implement this feature myself to me.
msg362459 - (view) Author: Ananthakrishnan (Ananthakrishnan) * Date: 2020-02-22 11:11
>>Sorry, Ananthakrishnan, but I think this problem is too difficult to you. Adding math.lcm() taken 2 weeks and produced 200 messages. It is simpler to implement this feature myself to me.


I'm a beginner.Not everyone is perfect at begenning.
I am starting this.So I took some problems myself as i saw the problems here are difficult.You should have given me a chance.

I think you should have at the same situations once,like the situations that I'm facing now.
msg362461 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-02-22 11:29
I'm always ready to help a beginner, but this is the first time that it takes so much effort. It would be easier if you started with the simple problems you could handle. Even in simple cases you would be able to get tips that you can understand and gradually improve your skills.
msg362462 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-02-22 11:39
If expand math.gcd() to accept multiple arguments, it is worth to do the same with math.lcm().
msg362502 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2020-02-23 11:21
New changeset 559e7f165ad03731e6bc2211c0e6d8d9c02fb549 by Serhiy Storchaka in branch 'master':
bpo-39648: Expand math.gcd() and math.lcm() to handle multiple arguments. (GH-18604)
https://github.com/python/cpython/commit/559e7f165ad03731e6bc2211c0e6d8d9c02fb549
msg362503 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2020-02-23 11:23
Merged; thank you to all involved.
History
Date User Action Args
2020-02-23 11:23:14mark.dickinsonsetstatus: open -> closed

messages: + msg362503
stage: patch review -> resolved
2020-02-23 11:21:42mark.dickinsonsetmessages: + msg362502
2020-02-22 11:39:22serhiy.storchakasetmessages: + msg362462
title: Update math.gcd() to accept "n" arguments. -> Expand math.gcd() and math.lcm() to accept multiple arguments
2020-02-22 11:29:44serhiy.storchakasetmessages: + msg362461
2020-02-22 11:11:39Ananthakrishnansetmessages: + msg362459
2020-02-22 10:55:32serhiy.storchakasetpull_requests: + pull_request17970
2020-02-22 10:54:53serhiy.storchakasetmessages: + msg362457
2020-02-21 09:38:27Ananthakrishnansetkeywords: + patch
stage: patch review
pull_requests: + pull_request17959
2020-02-21 09:35:47Ananthakrishnansetmessages: + msg362399
2020-02-21 08:53:08Ananthakrishnansetmessages: + msg362392
2020-02-21 07:52:32mark.dickinsonsetmessages: + msg362386
2020-02-21 06:41:40Ananthakrishnansetmessages: + msg362383
2020-02-20 18:49:36mark.dickinsonsetmessages: + msg362343
2020-02-18 15:36:18Ananthakrishnansetmessages: + msg362223
2020-02-17 22:47:13BernardoSulzbachsetnosy: + BernardoSulzbach
messages: + msg362159
2020-02-17 12:17:27Ananthakrishnansetmessages: + msg362139
2020-02-16 20:26:50Dennis Sweeneysetmessages: + msg362100
2020-02-16 20:06:36Dennis Sweeneysetmessages: + msg362099
2020-02-16 20:04:47Dennis Sweeneysetnosy: + Dennis Sweeney
messages: + msg362098
2020-02-16 19:04:57tim.peterssetmessages: + msg362092
2020-02-16 17:13:57mark.dickinsonsetmessages: + msg362083
2020-02-16 17:12:45mark.dickinsonsetmessages: + msg362082
2020-02-16 17:08:16serhiy.storchakasetmessages: + msg362080
2020-02-16 17:02:27serhiy.storchakasetmessages: + msg362079
2020-02-16 16:27:31rhettingersetnosy: + tim.peters, rhettinger
messages: + msg362077
2020-02-16 14:00:28SilentGhostsetnosy: + SilentGhost
messages: + msg362074
2020-02-16 13:59:15Ananthakrishnansetmessages: + msg362073
2020-02-16 13:50:53serhiy.storchakasetmessages: + msg362071
2020-02-16 13:38:42Ananthakrishnancreate