Message 361262 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	steven.daprano
Recipients	Ananthakrishnan, lemburg, mark.dickinson, rhettinger, serhiy.storchaka, steven.daprano, stutzbach, tim.peters, veky, vstinner
Date	2020-02-03.01:28:06
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<20200203012802.GJ6362@ando.pearwood.info>
In-reply-to	<1580668005.33.0.405916279908.issue39479@roundup.psfhosted.org>

Content
> [Steven] > > ... Try it on numbers like this: > > ... > > Q = P(313(P-1)+1)(353(P-1)+1) > > > > Q is a 397 digit Carmichael Number. Its smallest factor is P, > > which has 131 digits. [Tim] > The pure-Python Miller-Rabin code i posted in the aforementioned > thread is typically at least 100 times faster than that. This is exactly the sort of argument about quality of implementation which isn't, or shouldn't be, part of the argument about the API, IMO. Just as the choice of Timsort over Quicksort or Bubblesort wink isn't part of the list.sort() API, let alone the implementation choices in Timsort such as MIN_GALLOP. I'm happy to have a discussion about implementation, here or off-list, I'm sure I will learn a lot. But briefly, the Q I quoted above was carefully designed (not by me, I hasten to add!) to be a preudoprime to the first 307 prime bases, so it's something of a worst-case scenario for my version. > BTW, it doesn't much matter that this is "pure Python" - for ints of > this size the bulk of the time regardless is spent in CPython's > C-coded bigint arithmetic. A fair point, thank you. > About the API, I can't agree to the one you propose. Different > applications have different appropriate tradeoffs between degree of > certainty and time consumed - "one size fits all" doesn't fly here. > > def probably_prime(n, maxsteps=20) > > supplies a _default_ instead. I don't want an API that's useful > _only_ to people who don't know what they're doing ;-) It's not just for people who don't know what they're doing. It is for people who don't want to sweat the small details, they just want an answer True or False and are prepared to trust that the function author knows what they are doing. If someone cares about the small details like how many bases to try, they might also care about details like: - which specific bases are used; - whether to use Miller-Rabin at all; - how many trial divisions to do first; - which specific primality test to use; etc. What if the implementation shifts away from Miller-Rabin to (say) Baillie-PSW? Then your maxsteps parameter becomes meaningless. Do we deprecate it or leave it there in case the implementation changes again to something that has a concept of number of steps? I think that if someone cares sufficiently about the details, then they probably won't be satisfied with a single isprime function, but may want is_fermat_prime, is_miller_rabin_prime, is_lucas_prime etc. > > (By the way: for smallish numbers, under 2**64, no more than > > twelve rounds of Miller-Rabin are sufficient to > > deterministically decide whether a number is prime or not. > > But only if you use a fixed set of 12 specific bases for which the > claim has been proved. Correct. The first twelve primes make up such a minimal set. http://oeis.org/A014233

> [Steven]
> > ... Try it on numbers like this:
> > ...
> > Q = P*(313*(P-1)+1)*(353*(P-1)+1)
> >
> > Q is a 397 digit Carmichael Number. Its smallest factor is P,
> > which has 131 digits.

[Tim]
> The pure-Python Miller-Rabin code i posted in the aforementioned 
> thread is typically at least 100 times faster than that.

This is exactly the sort of argument about quality of implementation 
which isn't, or shouldn't be, part of the argument about the API, IMO. 
Just as the choice of Timsort over Quicksort or Bubblesort *wink* isn't 
part of the list.sort() API, let alone the implementation choices in 
Timsort such as MIN_GALLOP.

I'm happy to have a discussion about implementation, here or off-list, 
I'm sure I will learn a lot. But briefly, the Q I quoted above was 
carefully designed (not by me, I hasten to add!) to be a preudoprime to 
the first 307 prime bases, so it's something of a worst-case scenario 
for my version.

> BTW, it doesn't much matter that this is "pure Python" - for ints of 
> this size the bulk of the time regardless is spent in CPython's 
> C-coded bigint arithmetic.

A fair point, thank you.

> About the API, I can't agree to the one you propose.  Different 
> applications have different appropriate tradeoffs between degree of 
> certainty and time consumed - "one size fits all" doesn't fly here.
> 
>     def probably_prime(n, maxsteps=20)
> 
> supplies a _default_ instead.  I don't want an API that's useful 
> _only_ to people who don't know what they're doing ;-)

It's not just for people who don't know what they're doing. It is for 
people who don't want to sweat the small details, they just want an 
answer True or False and are prepared to trust that the function author 
knows what they are doing.

If someone cares about the small details like how many bases to 
try, they might also care about details like:

- which specific bases are used;
- whether to use Miller-Rabin at all;
- how many trial divisions to do first;
- which specific primality test to use;

etc. What if the implementation shifts away from Miller-Rabin to (say) 
Baillie-PSW? Then your maxsteps parameter becomes meaningless. Do we 
deprecate it or leave it there in case the implementation changes again 
to something that has a concept of number of steps?

I think that if someone cares sufficiently about the details, then they 
probably won't be satisfied with a single isprime function, but may want 
is_fermat_prime, is_miller_rabin_prime, is_lucas_prime etc.

> > (By the way: for smallish numbers, under 2**64, no more than
> > twelve rounds of Miller-Rabin are sufficient to
> > deterministically decide whether a number is prime or not.
> 
> But only if you use a fixed set of 12 specific bases for which the 
> claim has been proved.

Correct. The first twelve primes make up such a minimal set.

http://oeis.org/A014233

History
Date	User	Action	Args
2020-02-03 01:28:08	steven.daprano	set	recipients: + steven.daprano, lemburg, tim.peters, rhettinger, mark.dickinson, vstinner, stutzbach, serhiy.storchaka, veky, Ananthakrishnan
2020-02-03 01:28:08	steven.daprano	link	issue39479 messages
2020-02-03 01:28:06	steven.daprano	create