Laboratory Exercise

The Sieve of Eratosthenes

Purpose

This exercise provids practice working with lists, specifically recursive list traversals, and with defining subclasses of a collection class.

Prerequisites

Understanding of chapter 11 of Algorithms and Data Structures: The Science of Computing—through section 11.5.2.

Background

More than 2000 years ago, the Greek mathematician Eratosthenes described a simple algorithm for finding prime numbers. This exercise involves using Eratosthenes’s algorithm in a “list of prime numbers” class, and then using an instance of that class to conduct an experiment concerning the “density” of prime numbers. The “list of prime numbers” class will be a subclass of the List class from chapter 11 of Algorithms and Data Structures: The Science of Computing.

Eratosthenes’s Algorithm

Eratosthenes’s original algorithm was based on the idea of removing non-primes from a list of integers. The key insight was to do this in such a way that the next unremoved integer must be prime, because it was not divisible by any of the smaller primes. To find all the primes between 2 and n, Eratosthenes would proceed as follows:

Make a list of the integers from 2 to n
For each integer i, 2 <= i <= n
    If i has not been crossed out
        Cross out all multiples of i (except i itself).

This algorithm is called “the Sieve of Eratosthenes” because it works by screening numbers through progressively coarser filters (the multiples of each i). At completion, the numbers remaining (i.e., not crossed out) are all the primes <= n.

A variation on this algorithm is a little bit more convenient for end users — it doesn’t require storage for numbers that aren’t prime, and it produces a list of primes with no crossed out numbers. Specifically, rather than start with a list that contains many numbers that won’t be prime and then cross out numbers found to be non-prime, start with a list of primes, and add numbers to it as they are found to be prime:

Initialize the list, “primes,” to be empty
For each integer i, 2 <= i <= n
     primes.addIfIndivisible( i )

where addIfIndivisible is a message to the list that adds i to the list if no element already in the list exactly divides i. (Note that the first iteration of the loop will add 2 to the list, because an empty list necessarily contains no element that divides 2.)

The “Density” of Prime Numbers

“Density” refers to the number of natural numbers that are prime. If most numbers are prime, then primes are very dense. On the other hand, if most numbers non-prime, then primes are very sparse (i.e., “undense”).

This exercise examines the density of primes by asking how many primes are less than or equal to a given natural number, n, and then further asking how this number of primes relates (if at all) to the value of n. The larger the number of primes less than or equal to n, the denser primes are. For example, there are three primes less than or equal to 5 (2, 3, and 5 itself), which suggests that primes are quite dense. But doubling n to 10 only adds one more prime (7) — does this mean the number of primes less than or equal to n levels off quickly as n increases? Maybe, but maybe not — letting n = 15 adds two more primes (11 and 13), so maybe primes are just unusually sparse between 5 and 10. The discussion under “The Density Experiment” below says more about this issue and how to address it empirically.

The List Class

Chapter 11 of Algorithms and Data Structures: The Science of Computing thoroughly describes the List class. A complete implementation of this class is available, in a file named List.java. The “Final Details” section of this document explains where to find this file.

Any Java source file that uses the List class should “import” it via a statement of the form

    import geneseo.cs.sc.List;

Exercise

This exercise has two phases:

Create a PrimeList subclass of List that incorporates the variation on Eratosthenes’s Algorithm described above.
Use that subclass in an experiment regarding the density of prime numbers.

The PrimeList Subclass

Design and code a class PrimeList that maintains a list of prime numbers. This class should be a subclass of List, and should at least be able to handle the addIfIndividisible message used in the variation on Eratosthenes’s algorithm. It will also need a makeNewList method (as all subclasses of List do), and other methods may be helpful — include any that are.

Also write a main method that uses an instance of PrimeList to make a list of prime numbers. The largest possible number in the list, n, should be easy to change, as the experiment will require making lists for many different n values.

The Density Experiment

The length of the list of primes for a particular n is simply the number of primes less than or equal to that n. Measuring the length of the list for various n values is thus a way to see how dense primes are.

Even more interesting, the length of the list can also be used to test hypotheses about how the number of primes less than or equal to n relates to n (equivalently, how the density of primes depends on n). While the simple examples given under “The ‘Density’ of Prime Numbers” above suggest that there may not be a perfect correlation between n and the number of primes less than or equal to it, there may be a relation that holds on average or asymptotically (i.e., as n gets very large). For example, perhaps on average about half to one third of the numbers less than or equal to n are prime (roughly what the examples in “The ‘Density’ of Prime Numbers” would suggest). Such a hypothesis can be tested simply by measuring the length of the lists of primes for different n values and looking for a constant of proportionality between list length and n. Similarly, any other functional relationship between n and the number of primes less than or equal to n can be tested by looking for a constant of proportionality between list lengths and the appropriate function of n (e.g., to see if the number of primes less than or equal to n is roughly the square root of n, see if list lengths are approximately proportional to the square roots of the n values).

Form a hypothesis about how the number of primes less than or equal to n relates to n. Then design and conduct an experiment that uses the PrimeList class to test the hypothesis.

Those with a strong math background may have hypotheses in mind; others may not. For those who want hypotheses, here are two possibilities:

While it is known that there is no largest prime, primes might become so rare that there might as well be a largest one (i.e., at some point the n values needed to produce new primes become so extravagant that no-one would ever encounter them). In this case, there comes a point where the number of primes appears essentially constant as n increases.
Primes can’t be denser than proportional to n (i.e., the number of primes less than or equal to n can’t be greater than n itself), but they might be that dense. The constant of proportionality between the number of primes less than or equal to n and n would certainly be less than 1, but it might nonetheless exist.

Clearly these hypotheses can’t both be correct, and indeed it’s possible that both are wrong. However, the beauty of experiments is that they expose incorrect hypotheses, and in demonstrating that one’s expectations are wrong they can teach a lot. So it is not as important that a hypothesis for this experiment be right, as that it be at least somewhat plausible, and that you are prepared to learn something from testing it.

When analyzing data from this experiment, keep in mind that the actual density of primes may vary about some average rather than exactly following some relationship to n. So the constants of proportionality may themselves vary about some average, as long as they don’t decline systematically towards zero or rise systematically towards infinity. Section 9.2.3 of Algorithms and Data Structures: The Science of Computing contains some ideas on testing hypotheses about the asymptotic behavior of functions that are relevant here.

Final Details

Finding the List Class

File List.java can be Downloaded from the World Wide Web.

Documentation for the List class is also available on the Web. The main documentation page is an index to documentation for all the Java classes written for use with Algorithms and Data Structures: The Science of Computing. To see the documentation for a specific class, click on that class’s name in the left-hand panel of the page.

Submitting Your Work

This lab is due at the beginning of your lab session on Monday, April 5. Turn in a printout of your PrimeList class and main program, and a report on your experiment. Be sure the report includes a clear statement of your hypothesis as well as the usual information for an experiment report (procedure, all data collected, data analysis, conclusions, etc.)

Revised March 29, 2004