The
prime generator program simply generates prime numbers until you get bored and stop it (likely), the maximum integer size is reached (unlikely), or the program runs out of memory (much more likely than the last one). It is based on the sieve method, and is quite fast.
To present the algorithm used by pgen, we will start with the simple-minded prime generation method. It is based on a double loop like this: this:
- Code:
for n := 2 to infinity do
for d := 2 to n / 2 do
if d divides n, proceed to the next n
print n
Now, we will try to improve the efficiency. First, we observe that the inner loop needs only to check the d values which are themselves prime. Since it has already identified these values, we need only remember them. So now we have:
- Code:
primes := { }
for n := 2 to infinity do
for each d in primes do
if d divides n, proceed to the next n
print n
primes := primes + {n}
This version is faster because the inner loop performs fewer iterations. It is possible to make it still faster, however, by using a better data structure for primes. For this, first off, we observe that any given prime number eliminates a series of its multiples in increasing order. For instance, as n increases, the prime number 2 will eliminate 4, 6, 8, 10, . . ., in that order. The prime number 7 eliminates 14, 21, 28, and so forth. This means that, for any prime, we can predict which n value will be eliminated next. This leads to the notion of an eliminator object. It is a pair of numbers, a prime and one of its multiples. The multiple is the next n value which it will eliminate. It also has a method advance which moves it to its next victim. That simply means adding the prime to the multiple to create the next multiple. We keep a collection of these. For each n value, we check to see if a member of our eliminator collection eliminates n. If so, we know that that n is not prime, and we advance the eliminator. The algorithm becomes:
- Code:
eliminators := { }
for n := 2 to infinity do
if eliminators contains some element e = (p, n) then
advance e to (p, n + p)
if there are any other e' = (p', n), advance them, too.
proceed to the next n
print n
eliminators := eliminators + {(n, 2n)}
Here's what that looks like:
- Code:
n eliminators Action
2 Print n
3 (2,4) Print n
4 (2,4), (3,6) Eliminate n
5 (2,6), (3,6) Print n
6 (2,6), (3,6), (5,10) Eliminate n
7 (2,8), (3,9), (5,10) Print n
8 (2,8), (3,9), (5,10), (7,14) Eliminate n
9 (2,10), (3,9), (5,10), (7,14) Eliminate n
10 (2,10), (3,12), (5,10), (7,14) Eliminate n
11 (2,12), (3,12), (5,15), (7,14) Print n
12 (2,12), (3,12), (5,15), (7,14), (11, 22) Eliminate n
. . .
Finally, we want to answer the question "if eliminators contains some eliminator e = (p, eliminators)" quickly. First observe that, at each step, if there is an eliminator for n in eliminators, it is always one with the smallest multiple. Therefore we organize eliminators as a heap. A heap is a data structure from which is it is efficient to find the smallest (or largest) item.
I use the vector class from the C++ standard libraries to hold eliminators. The standard libraries also provide the methods to order the contents of a vector as a heap.
Source: This algorithm is hardly original. The Sieve of Eratosthenes is a classical approach. This particular implementation was suggested by a data structures book I have by
Augenstein and
Tenenbaum, which is nearly as ancient. (The application language is PL/I, which gives a clue.) My implementation is rather different from theirs (better, I think), but omits some optimizations which they suggest.