This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author serhiy.storchaka
Recipients PedanticHacker, Stefan Pochmann, mark.dickinson, mcognetta, rhettinger, serhiy.storchaka, tim.peters
Date 2022-01-19.08:14:28
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1642580068.77.0.124366391776.issue37295@roundup.psfhosted.org>
In-reply-to
Content
All this should be tested with the C implementation because relative cost of operations is different in C and Python.

I have tested Raymond's idea about using iterative algorithm for small k.

$ ./python -m timeit -s 'from math import comb' "comb(3329023, 3)"
recursive: Mean +- std dev: 173 ns +- 9 ns
iterative: Mean +- std dev: 257 ns +- 13 ns

$ ./python -m pyperf timeit -s 'from math import comb' "comb(102571, 4)"
recursive: Mean +- std dev: 184 ns +- 10 ns
iterative: Mean +- std dev: 390 ns +- 29 ns

$ ./python -m pyperf timeit -s 'from math import comb' "comb(747, 8)"
recursive: Mean +- std dev: 187 ns +- 10 ns
iterative: Mean +- std dev: 708 ns +- 39 ns

Recursive algorithm is always faster than iterative one for k>2 (they are equal for k=1 and k=2).

And it is not only because of division, because for perm() we have the same difference.

$ ./python -m pyperf timeit -s 'from math import perm' "perm(2642247, 3)"
recursive: Mean +- std dev: 118 ns +- 7 ns
iterative: Mean +- std dev: 147 ns +- 8 ns

$ ./python -m pyperf timeit -s 'from math import perm' "perm(65538, 4)"
recursive: Mean +- std dev: 130 ns +- 9 ns
iterative: Mean +- std dev: 203 ns +- 13 ns

$ ./python -m pyperf timeit -s 'from math import perm' "perm(260, 8)"
recursive: Mean +- std dev: 131 ns +- 10 ns
iterative: Mean +- std dev: 324 ns +- 16 ns

As for the idea about using a table for fixed k=20, note that comb(87, 20) exceeds 64 bits, so we will need to use a table of 128-bit integers. And I am not sure if this algorithm will be faster than the recursive one.

We may achieve better results for lesser cost if extend Mark's algorithm to use 128-bit integers. I am not sure whether it is worth, the current code is good enough and cover the wide range of cases. Additional optimizations will likely have lesser effort/benefit ratio.
History
Date User Action Args
2022-01-19 08:14:28serhiy.storchakasetrecipients: + serhiy.storchaka, tim.peters, rhettinger, mark.dickinson, PedanticHacker, mcognetta, Stefan Pochmann
2022-01-19 08:14:28serhiy.storchakasetmessageid: <1642580068.77.0.124366391776.issue37295@roundup.psfhosted.org>
2022-01-19 08:14:28serhiy.storchakalinkissue37295 messages
2022-01-19 08:14:28serhiy.storchakacreate