Laboratory Exercise

The Sieve of Eratosthenes

Supplemental Material for Baldwin and Scragg, Algorithms and Data Structures: The Science of Computing; Charles River Media, 2004 (Now published by Cengage Learning)

Purpose

This exercise provides practice working with lists, specifically recursive list traversals, and with defining subclasses of a collection class.

Prerequisites

Understanding of chapter 11 of Algorithms and Data Structures: The Science of Computing through section 11.5.2.

Background

More than 2000 years ago, the Greek mathematician Eratosthenes described a simple algorithm for finding prime numbers. This exercise involves using Eratosthenes’s algorithm in a “list of prime numbers” class, and then using an instance of that class to conduct an experiment concerning the number of prime numbers. The “list of prime numbers” class will be a subclass of the List class from chapter 11 of Algorithms and Data Structures: The Science of Computing.

Eratosthenes’s Algorithm

Eratosthenes’s original algorithm was based on the idea of removing non-primes from a list of integers. The key insight was to do this in such a way that the next unremoved integer must be prime, because it was not divisible by any of the smaller primes. To find all the primes between 2 and n, Eratosthenes would proceed as follows:

This algorithm is called “the Sieve of Eratosthenes” because it works by screening numbers through progressively coarser filters (the multiples of each i). At completion, the numbers remaining (i.e., not crossed out) are all the primes less than or equal to n.

A variation on this algorithm is a little bit more convenient for computer implementation — it doesn’t require storage for numbers that aren’t prime, and it produces a list of primes with no crossed out numbers. Specifically, rather than start with a list that contains many numbers that won’t be prime and then cross out numbers found to be non-prime, start with a list of primes, and add numbers to it as they are found to be prime:

where addIfIndivisible is a message that adds i to a list if no element already in the list exactly divides i.

The Number of Prime Numbers

One mathematically interesting question about prime numbers is “how many of them are there?” This exercise examines this question by asking how (if at all) the number of primes less than or equal to a given natural number, n, relates to the value of n. For example, there are three primes less than or equal to 5 (2, 3, and 5 itself), so maybe the number of primes less than or equal to n is approximately the same as n. But doubling n to 10 only adds one more prime (7), so perhaps the number of primes less than or equal to n levels off quickly as n increases. But then again, maybe not — letting n = 15 adds two more primes (11 and 13), so maybe primes are just unusually sparse between 5 and 10. The discussion under “The Experiment” below says more about this issue and how to address it empirically.

The List Class and Its Subclasses

Chapter 11 of Algorithms and Data Structures: The Science of Computing thoroughly describes the List class. A complete implementation of this class is available in a file named List.java. The “Final Details” section of this document explains where to find this file.

Any Java source file that uses the List class should “import” it via a statement of the form

This exercise requires defining a subclass of List, something that is subtle in several ways. Section 11.5.1 of Algorithms and Data Structures: The Science of Computing discusses the subtleties in detail. Pay particular attention to how to write makeNewList methods, and to the use of casts in recursive methods within subclasses of List.

Wrapper Classes

The “list of prime numbers” class should be a list whose elements are (prime) integers. Unfortunately, integers, i.e., values of Java type int, are not objects in Java, and so cannot be stored directly in lists. (As explained in appendix section A.2.1 of Algorithms and Data Structures: The Science of Computing, Java has two kinds of data: objects and non-objects; List objects (and most other collections) can hold any kind of object, but not non-object values.) Fortunately, the designers of Java anticipated such problems, and provided “wrapper” classes for all the non-object data types. A wrapper is a very simple object that basically does nothing except contain one value of a non-object type. For example, instances of the wrapper class Integer each hold one int, instances of Character hold a char, etc. To store a non-object value in a collection, create a wrapper object to hold the non-object value, then store the wrapper in the collection. For example

To retrieve the non-object value from the collection, retrieve the wrapper and then extract the non-object from the wrapper. For example

(The notation (Integer) before someList.getFirst() in this example is a cast that tells the Java compiler that this particular use of getFirst will return an Integer object. Casts are often used when retrieving data from collections, because the retrieval message is usually declared to return some very generic class— e.g., Object — but client programmers often know much more precisely what classes their application stores in the collection. See appendix section A.5.2 of Algorithms and Data Structures: The Science of Computing for more information on casts.)

Here are some common wrapper classes from the Java class library. Every wrapper class provides a constructor that takes a single value of the corresponding non-object type as its only parameter. Every wrapper class also handles a parameterless message for extracting the non-object value from a wrapper, as indicated in the following table (the examples above of wrapping and unwrapping an int illustrate how one might use these constructors and extractors).

Non-Object Type	Wrapper Class	Extraction Message
int	Integer	intValue
long	Long	longValue
double	Double	doubleValue
float	Float	floatValue
char	Character	charValue
boolean	Boolean	booleanValue

Exercise

The PrimeList Subclass

Design and code a class PrimeList that maintains a list of prime numbers. This class should be a subclass of List, and should at least be able to handle the addIfIndivisible message used in the variation on Eratosthenes’s algorithm. Other methods may also be helpful — include any that are.

Also write a main method that uses an instance of PrimeList to make a list of prime numbers. The largest possible number in the list, n, should be easy to change, as the experiment will require making lists for many different n values.

The Experiment

The length of the list of primes for a particular n is simply the number of primes less than or equal to that n. Measuring the length of the list for various n values is thus a way to test hypotheses about how the number of primes less than or equal to n relates to n. While the simple examples given under “The Number of Prime Numbers” above suggest that there may not be a perfect correlation between n and the number of primes less than or equal to it, there may be a relation that holds on average or asymptotically (i.e., as n gets very large). For example, perhaps on average about half to one third of the numbers less than or equal to n are prime (roughly what the examples in “The Number of Prime Numbers” would suggest). Such a hypothesis can be tested simply by measuring the length of the lists of primes for different n values and looking for a constant of proportionality between list length and n. Similarly, any other functional relationship between n and the number of primes less than or equal to n can be tested by looking for a constant of proportionality between list lengths and the appropriate function of n (e.g., to see if the number of primes less than or equal to n is roughly the square root of n, see if list lengths are approximately proportional to the square roots of the n values).

Form a hypothesis about how the number of primes less than or equal to n relates to n. Then design and conduct an experiment that uses the PrimeList class to test the hypothesis.

Those with a strong math background may have hypotheses in mind; others may not. For those who want hypotheses, here are two possibilities:

Clearly these hypotheses can’t both be correct, and indeed it’s possible that both are wrong. The important thing about both is that they are at least plausible. Scientific hypotheses should always be plausible, because science isn’t simply about capricious whims, but the beauty of experiments is that they expose plausible hypotheses that are nonetheless incorrect. In doing so, experiments can teach a lot. So it is not as important that a hypothesis for this experiment be right, as that it be plausible, and that you be prepared to learn something from testing it.

When analyzing data from this experiment, keep in mind that the actual relation between n and the number of primes less than or equal to n may vary about some average rather than following some exact function. So constants of proportionality may themselves vary about some average, as long as they don’t decline systematically towards zero or rise systematically towards infinity. Section 9.2.3 of Algorithms and Data Structures: The Science of Computing contains some ideas that are relevant here regarding testing hypotheses about the asymptotic behavior of functions.

Final Details

Finding the List Class

Documentation for the List class is also available on the Web. The main documentation page is an index to documentation for all the Java classes written for use with Algorithms and Data Structures: The Science of Computing. To see the documentation for a specific class, click on that class’s name in the left-hand panel of the page.

Submitting Your Work

Turn in your PrimeList class and a report on your experiment, as directed by your instructor.