Score:1

Rabin-Miller Primality Test - Elaboration needed

pa flag

In short, my question is:

What exactly do people mean when they say that "The more you apply the Rabin-Miller test to a number, the more certain you can be that the number you're testing is prime."?

To clarify what I'm asking, let's look at an example I was working through:

Testing if N = 78007 is prime or not (spoiler, it is).

Rabin-Miller procedure:

  1. Find N - 1 = 2K * M

In this case, 78007 - 1 = 21*39003, which means K = 1 and M = 39003

  1. Pick an A, such that 2 <= A <= N-2. We pick A = 2.

  2. Compute B0 = AM % N.

    If B0 = 1 or B0 = (N-1), then N is prime (probably). If it's anything else, we stop and conclude that N is not prime.

    Else, begin computing terms of Bi = (Bi-1)2 % N, and: If Bi = 1, N is not prime. If Bi = N - 1, N is prime (probably).

    Now, for our running example:

    B0 = 239003 % 78007 = 1.

    Which means N is probably prime.

    HERE IS MY QUESTION: What would I proceed to do next if I want to be more certain that N is prime?

Would I try steps 2 and 3 with different A's?

Or would I keep the same A and proceed with the Bi terms until one of the Bi's turns out to be equal to 1? In which case the test says that N is not prime? If I reach such a Bi term, would that mean N is definitely not prime, or the more i's it took to get there, the more likely it is that N is actually prime?

Since I read a statement that said "If you do the Rabin-Miller test about 20 times on a big-ish number, you can be pretty certain it's prime." What exactly did they mean?

Thanks everyone in advance for any answers that clears it up for me.

Side note: I came across this primality test while looking for ways to generate a random 4000-bit prime number as part of implementing the Diffie-Hellman key exchange. I have written my own custom Big Integer library in C which I will be implementing Diffie-Hellman in. I implemented the Sieve of Eratosthenes but it was so damn slow. People say this primality test is way faster and can lead to high enough certainy that a number is prime for said number to be used in encryption schemes. If anyone knows of a better way, please let me know. I am no expert in encryption, but I really love implementing stuff on my own, it teaches me a lot more than simply using existing libraries for everything.

I tried reading explanations on several websites and asked on programming discord servers, watched several youtube videos but nobody was able to clarify this particular question to me. It's still unclear to me what exactly it means to "do the test 20 times on a number to be more certain that said number is prime".

fgrieu avatar
ng flag
Hint: do the same experiment with N=2047 (which is not prime: 2047=23×89), or any N in OEIS'[A001262](https://oeis.org/A001262) or [A014233](https://oeis.org/A014233).
Richard Thiessen avatar
mx flag
First off, consider not implementing DH in a large prime field. Elliptic curve cryptography works, is well studied and has small code size. Secondly, re-using a prime isn't actually a bad thing. You can just hard-code a large prime as the DH parameter to use. Here's an RFC defining some good ones https://www.ietf.org/rfc/rfc3526.txt .
cn flag
The answer to "here is my question" is "more A's".
Kevin Stefanov avatar
pa flag
Thank you everyone! I did some more N's and kinda answered my own question after several hours of trying different N's, A's and B's. Now the thing that struck me is, if I want to apply Rabin-Miller to a 4000-bit number, I'd have to compute 2 to the power of a 2000-bit number (in the worst case). Wondering whether that's gonna be practical or if it's gonna turn into a disaster like Sieve of Eratosthenes did. Might do what you guys said and simply hard-code a large prime for DH, or if things get desperate enough, not implement DH at all and look at elliptic curve cryptography.
ye flag
@KevinStefanov: When calculating $a^e \bmod m$ for a big exponent $e$, be sure to reduce modulo $m$ after each squaring or multiplication during the exponentiation. (It's a good exercise; if you use a "modern" programming language like python, modular exponentiation is already built in.)
Kevin Stefanov avatar
pa flag
@garfunkel That was actually a worry I had. So what you're saying is, I can check the result after each multiplication and if it got bigger than M, subtract M from it?
ye flag
@KevinStefanov: By multiplying two 1000-bit numbers you will get a 2000-bit number, so you will have to subtract the 1000-bit modulus M quite often to get the intermediate result down to 1000 bit again... (you have to reduce it modulo M).
Kevin Stefanov avatar
pa flag
@j.p. How many A's would be sufficient to check to be pretty sure a number is prime if the number has 5000 bits? For numbers of a few digits I noticed just a few A's are enough. Is that also the case for numbers with thousands of digits?
cn flag
The longer the primes are you are looking for, the less MR-tests you will have to perform. For primes of length 2048 (or longer), passing two MR-tests gives you an error probability of less than $2^{-100}$. For details look at table B.1 or Appendix C.1 in [this norm](https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.186-5.pdf).
Kevin Stefanov avatar
pa flag
@j.p. Passing two different MR-tests means the algorithm said "it's probably prime" for two different A values? Also, side question, is my understanding correct that for Diffie Hellman to be secure, the big prime you feed it has to have a large prime factor itself?
cn flag
Yes, repeating MR-tests with the same prime wouldn't make much sense ;-). About DH: A prime has itself as its only prime factor. (You phrased your question wrongly.) But if you subtract one from the big prime, then you get the order of the multiplicative group mod the prime, in which you are working, and this order has to have a big prime factor.
Kevin Stefanov avatar
pa flag
@j.p. wait, A has to be prime? Also, is the converse true as well, like for 3000-bit numbers, if it says "composite" for two different A's BEFORE it says "probably prime" for 2 different A's, then we can be really sure the number is composite?
cn flag
@KevinStefanov: Sorry! No! No clue, why I wrote prime. Just forget it!
Score:2
my flag

I'll answer your questions about Rabin-Miller, as well as put in a few comments about Diffie-Hellman. You have other questions in your comments; you can ask those in other questions (or just search through crypto.stackexchange for the answers - you aren't the first one to ask them)

As for Rabin-Miller, it has this property:

  • If $N$ is prime, then the result of the algorithm is always "probably prime", no matter what value of $A$ you selected

  • If $N$ is composite, then the result of the algorithm is "composite" for at least 75% (3/4) of the possible $A$ values within the range.

That is, if your $N$ is composite, and you selected $A$ randomly, then with probability at least 0.75, the algorithm will say "composite".

And, if you run the algorithm $k$ times, and select a random $A$ each time, then with probability at least $1 - 1/4^k$, the algorithm will say "composite" at least once.

These hold for any composite. Note that this is strictly true if you use random $A$ values; if you use a fixed value for $A$ (say, 2) the proofs do not apply.

That is what is meant by "running Rabin-Miller multiple times, you get a higher assuredness of primality".

That said, it is sometimes overly conservative. If you were handed a value $N$ from an adversary, and you are asked to test it for primality, and he "wins" if he gets you to accept a composite value, then this is the best we can do with Rabin-Miller; there exist composite numbers for which the probability of a false acceptance is very close to 1/4. On the other hand, such values are actually fairly rare - if you are testing 'random' values for primality, the vast majority of composite values will hardly ever come up with 'probability prime', and so it's safe in that scenario to use a far smaller number of iterations.

Now, on to Diffie-Hellman: it is often not safe to just use a random prime. The reason is that the security of Diffle-Hellman depends on the factorization of $N-1$; if you just pick a random prime $N$, $N-1$ may have a number of small factors (in addition to the factor 2 which is always present), and (depending on how you use it; how you select $G$ and the size of the private exponents you use), this can subvert security.

If you don't know what you're doing, you're better off using the RFC 3526 primes (which are known not to have any small factors other than 2)

Kevin Stefanov avatar
pa flag
Thank you for the detailed answer! One worry I have currently is, during Rabin Miller, I have to compute (2^N mod K) where N is a 3000-bit number itself. So apparently there's a way called Modular Exponentiation, which saves you the effort of having to compute 2 to the power of a 3000-bit number. Can you explain please? Im currently watching videos on that and asking around. Do I simply check if 2^N has gotten larger than K after each multiplication, and if it has, subtract M from it?
poncho avatar
my flag
@KevinStefanov For an efficient way to do modexp, one place to start is https://en.wikipedia.org/wiki/Modular_exponentiation#Right-to-left_binary_method
I sit in a Tesla and translated this thread with Ai:

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.