Score:2

A variation of the Sieve of Eratosthenes for random pseudoprime number generation


I wasn't sure if this question is more suited for SE.Math or not; please tell me if I should move it.

For its mpz_nextprime() function (which finds the next pseudoprime after a given number), GMP uses an interesting variation of the sieve. To weed out multiples of small primes before applying the slower tests (Miller-Rabin and Lucas), it first builds a table of residues of the "base" number modulo each small prime. The base number can be thousands of bits long, but the residues are small since the primes are small. It then iterates over small increments to the base, checking only (increment + residue) mod prime instead of base mod prime, which would be much slower for a big base. The algorithm can be seen here: https://github.com/alisw/GMP/blob/master/mpz/nextprime.c#L85
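
To illustrate, here is roughly how I understand the technique; a minimal sketch of my own, not GMP's actual code (the small-prime table is truncated and the names are invented):

    /* Sketch of the residue-table trick: compute base mod p once per small
       prime p, then trial-divide candidates base + inc using only
       word-sized arithmetic. */
    #include <gmp.h>
    #include <stddef.h>

    static const unsigned long small_primes[] = {3, 5, 7, 11, 13, 17, 19, 23};
    #define NPRIMES (sizeof small_primes / sizeof small_primes[0])

    /* Search base, base+2, base+4, ... for a probable prime; `base` is
       assumed odd, and `incr_limit` plays the role of INCR_LIMIT. */
    int find_probable_prime(mpz_t result, const mpz_t base,
                            unsigned long incr_limit)
    {
        unsigned long residues[NPRIMES];
        size_t i;
        unsigned long inc;

        /* One expensive big-number division per small prime, done once. */
        for (i = 0; i < NPRIMES; i++)
            residues[i] = mpz_tdiv_ui(base, small_primes[i]);

        for (inc = 0; inc < incr_limit; inc += 2) {
            int divisible = 0;
            /* (base + inc) mod p == (residues[i] + inc) mod p, so no
               big-number arithmetic is needed to reject a candidate. */
            for (i = 0; i < NPRIMES; i++)
                if ((residues[i] + inc) % small_primes[i] == 0) {
                    divisible = 1;
                    break;
                }
            if (divisible)
                continue;

            mpz_add_ui(result, base, inc);       /* candidate = base + inc */
            if (mpz_probab_prime_p(result, 25))  /* Miller-Rabin rounds */
                return 1;
        }
        return 0; /* window exhausted: caller picks a new base */
    }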

I have two questions regarding this algorithm:

  1. Does anyone know whether this is a known method that can be attributed to some paper? I looked through various sieve variations but couldn't quite find a match.
  2. In the code there is a limit on the maximum increment, after which a new starting number is sampled and the residues recalculated. It is described only as #define INCR_LIMIT 0x10000 /* deep science */. Where does this value come from? I would have thought the only limit is when increment + residue overflows the data type, which is a 4-byte unsigned integer; since the maximum residue is about 1000, there should be plenty of headroom left.
Score:3
  1. This appears to be a memory-efficient variation of the wheel sieve, which, although it builds on older ideas, is usually attributed to Pritchard's paper "A Sublinear Additive Sieve for Finding Prime Numbers". Pritchard's sieve has the advantage over the sieve of Eratosthenes that composite numbers are only "marked" once. Both Eratosthenes and Pritchard require memory to store the candidates; the approach quoted trades this memory away by enumerating wheel increments computationally (for example, a 2·3·5 wheel reduces the candidates to the eight residues 1, 7, 11, 13, 17, 19, 23, 29 mod 30). Each composite is still "marked" only once, but more work is expended on the "no mark" operation.

  2. Some people have noted that choosing (probable) primes via a nextprime function introduces a bias into the selection: a prime is selected with probability proportional to the size of the preceding (probable) prime gap, since every starting point inside that gap maps to the same prime; a toy simulation below illustrates this. The programmers may be trying to limit this bias.
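
Concretely (my own sketch, unrelated to GMP's code): map uniformly random starting points to the next prime at or above them and count how often each prime is selected; the counts track the size of the preceding gap.

    /* Simulate nextprime-style selection over small numbers.  Primes that
       follow large gaps (e.g. 97 after 89) are chosen far more often than
       primes that follow small gaps (e.g. 61 after 59). */
    #include <stdio.h>
    #include <stdlib.h>

    static int is_prime(unsigned n) {
        if (n < 2) return 0;
        for (unsigned d = 2; d * d <= n; d++)
            if (n % d == 0) return 0;
        return 1;
    }

    static unsigned next_prime(unsigned n) {
        while (!is_prime(n)) n++;
        return n;
    }

    int main(void) {
        enum { LIMIT = 100, TRIALS = 1000000 };
        static unsigned counts[LIMIT + 50]; /* room for overshoot past LIMIT */
        srand(12345);
        for (int t = 0; t < TRIALS; t++)
            counts[next_prime(2 + rand() % LIMIT)]++;
        for (unsigned p = 2; p < LIMIT + 50; p++)
            if (counts[p])
                printf("%3u chosen %u times\n", p, counts[p]);
        return 0;
    }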

Score:2

Does anyone know if it is a known method that can be attributed to some paper?

Dunno. I remember doing this circa 25 years ago (and I didn't read it anywhere); it's quite likely that multiple people made the same observation independently.

In the code, there is a limit on the maximum increment, after which a new prime is sampled and residues recalculated.

I can think of two reasons:

  • They just don't trust the math that says that $ai+b$ will always eventually be prime for some $i$ (Dirichlet's theorem, which holds as long as $a, b$ are relatively prime; a rough estimate of how unlikely that failure is follows this list); or

  • They're worried about potential bugs which might cause an infinite loop (e.g. if $a, b$ aren't relatively prime; although, because they use $a=2$ and assuming they always pick the starting point $b$ odd, this cannot happen).
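
For scale, a rough back-of-the-envelope estimate (mine, not from the GMP source): by the prime number theorem, the average gap between primes near an $n$-bit number is about $n \ln 2$, so roughly $2048 \cdot 0.693 \approx 1420$ for a 2048-bit base. A window of 0x10000 = 65536 increments therefore spans about $65536 / 1420 \approx 46$ average gaps, so failing to find a prime before hitting the limit is rare (though not impossible, since individual gaps can be much larger than average), and resampling a new base is a cheap way to handle it.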
