This Page started on: 2007 July 25 10:10:23

Back in the early 1950's when electronic computers were just being invented everybody thought they would be used for computing. The ENIAC, was not a stored program computer and I don't believe it had any capability to handle anything but decimal numbers; much like the adding machines and mechanical calculators of the day. The motivation for developing ENIAC was to compute Ballistic Tables. After it was working well a volunteers effort used it on a week end to compute Pi to over 2000 decimal places. (Actually, they practiced on e, then did Pi over a three day week end.)

Historically, generating a table of Prime Numbers has been an interesting programming task for many programmers. The challenge seems simple: Generate numbers and test them to be prime. An easy test is to simply divide the number to be tested by successive prime numbers. As with most programming tasks a little cleverness can make things go much faster. The "tricks" I know are described below.

I don't know if ENIAC was ever used to compute a table of prime numbers. I rather suspect not, because there were tables of primes up to 10 million already published. (Probably the largest was by D. N. Lehmer who lived from 1867 to 1938.)

- It should be obvious, the interesting primes are all odd; so one
should not even generate even numbers to test. But, we can do better
than that. Note, of the odd numbers: 3, 5, 7, 9, 11, 13, 15, 17, 19, 21,
etc. There is a multiple of three, then there are two which are not a
multiple of three, and this repeats. It should be obvious, when you add
2 three times you have added 6 which is a multiple of 3. An easy way to
generate numbers which are not a multiple of 3 is to start with 1 and
alternately add four then two.
*In binary: each time you add the increment XOR it with 6 to create the next increment.*With this little trick we don't generate test numbers that are a multiple of two or three. Thus, we only need to test 1/3 of the numbers--still a third of ten million is a lot of tests. - Because all the interesting primes are odd,
*in binary, there is no need to keep the last bit--it would always be one*. This doubles the size of numbers we can work with in a byte or larger register. - Because the difference in successive primes is small until we get
to very large primes we can vastly reduce the size of the output by just
outputting the successive differences. (By not keeping the low order
bit, a byte will hold differences up to and including 512.
*The program will stop if the value ever exceeds what can be held in one byte.*) This significantly speeds up the output. Further, when converting the output to decimal it is not necessary to re-compute the same leading digits.*Note for prime twins, four out of five times only the last digit changes!*Most of the time only the last one or two digits change for successive primes. - When a number is not divisible by any prime less than its
square root, it is a prime. However, for those generating primes using a
language such as Fortran or C, I should point out that you should never
use the Square Root function--it takes far too much time. If you are
using division by successive primes for your test: You need to see if
the remainder on the division is zero, and you can quit when the
Quotient is less than the divisor. But unfortunately algebraic languages
such as Fortran or C don't often give you both the quotient and the
remainder without performing two divides!
*Still a second divide is much faster than a Square Root.* - In almost all computers division takes more cycles than an add, subtract or compare. Thus, good prime number generation techniques will not use division to test for a remainder of zero.
- It turns out it is faster to generate all the composite numbers (Numbers which are not prime.) and test each number against a table of composites. If it is not one of the composites it is prime.
- To do this we generate a table in memory that has all composite numbers which are not a multiple of two or three. Here is the trick: Starting with 5, we use one memory location and in it we put in it 25, when the number we are testing gets larger than 25 we add to that location two times 5 making the value 35; when the number we are testing gets larger than 35 we increase it to 45, and so on. Thus in just one memory location during the running of the program we will have every multiple of the prime number 5, all in just one memory location. We do similarly for 7, 11 and successive primes. This dynamic multiples table I call the "Q" table. Another table I call the "P" table contains twice the value of the corresponding prime, its' value is used each time we bump a value in the Q table.
- We search the Q table for each prime candidate. The Intel 8086 and its following all have an instruction which is ideal for the task. It is the Compare String Byte/Word instruction, which compares the value in AX with a value in memory; and points the index register to the next memory location.
- If we put the value we are testing at the top of the Q table, we do not have to test for "end of table" within our search loop. Our entire search loop is only two 8086 instructions: A SCASW and a conditional jump back to it, when the test value in AX is less than the memory value it jumps back. When it doesn't jump back we have to test if AX was equal or greater. If it is greater we need to bump the value in the Q table. If it is equal we need check to see if it is at the top of the Q table, if it is then the number is a prime. Otherwise the number is not prime.
- The number we are testing and the values in the Q table are all
near the same size, thus we do not keep the high order word(s) in
either AX or the values in the Q table. Yes, this does require some
clever programming.
*And, of course we are not keeping the low order bit anywhere.* - I save the prime differences and output them in blocks of 16,384 bytes each, keeping the number of calls to the operating system pretty low.
- I use a separate program:
POC.COM **Prime Output in Columns**to convert the file of differences to decimal. It reads each byte and increments it and shifts it left once; thus a byte of zero is converted to two, a byte of one becomes 4, etc. This value is converted to decimal*which will always be less than 512*and added to the previous value using a decimal digit add loop. The result is moved to an output buffer, along with appropriate spaces or tabs between values, and a CR-LF at the end of each line. POC has parameters to specify the number of columns per line, and if the columns are separated by a space or a TAB.- I have other programs which use the "prime difference" file, one simply counts the number of times each byte occurs in the file. This gives the number of prime twins, as well as the maximum difference between two consecutive primes.

On an older computer using a Cyrex processor running DOS 6.2 at 200
MegHz. I used a 32 Meg RAM disk to run a .BAT file that read the clock
then executed the prime program, which output the prime difference file
to the RAM disk, and read the clock when it was done, outputting clock
the times to another file. It generated 1,638,400 primes in 29.82
seconds. *That was the number of bytes in the output file, and did not
count any time for conversion and output of decimal values.*

The program is a bit less than 500 bytes, so in an 8086 64K segment
there is over 60,000 bytes available for the P and Q tables, and stack.
I chose to make each table large enough to hold 15,000 words. The entire
program and tables easily fits in a single 64K byte segment. The program
could generate all primes up to somewhat over 2 x 10^{10},
before the limit of the P and Q tables is reached.

Go to:
My Home Page |
Go to:
This page TOP |