Supplemental Material for Baldwin and Scragg, Algorithms and Data Structures: The Science of Computing; Charles River Media, 2004
Understanding of the following sections of Algorithms and Data Structures: The Science of Computing:
This exercise empirically verifies
the theoretical times to search lists and balanced binary trees, and compares
the empirical times for the two structures. Doing these things requires establishing
some theoretical background not presented in Algorithms and Data Structures:
The Science of Computing, and writing subclasses of the book’s List
and OrderedTree
classes
with which to do the main experiment.
Chapters 11 and 13 of Algorithms and Data Structures: The Science of Computing present algorithms for searching lists and trees, respectively. Chapter 13 derives the worst-case execution time for searching a balanced ordered binary tree of n elements: Θ( log n ). Although not explicitly derived in the book, the worst-case time to search a list of n items is Θ(n). These asymptotic times suggest that worst-case (i.e., failing) searches should generally be faster in balanced ordered binary trees than in lists.
Chapter 11 of Algorithms and Data Structures: The Science of Computing thoroughly
describes the List
class. A complete implementation of this class
is available, in a file named List.java. The “Final
Details” section of this document explains where to find this file.
Similarly, chapter 13 describes the OrderedTree
class. It is available in
a file named OrderedTree.java, which can also be found per the
directions in the “Final Details” section.
Any Java source file that uses the List
or OrderedTree
classes
should “import” them
via statements of the form
import geneseo.cs.sc.*;
or
import geneseo.cs.sc.List;
import geneseo.cs.sc.OrderedTree;
This exercise requires defining subclasses of List
and OrderedTree
,
definitions that are subtle in several ways. Section 11.5.1 of Algorithms
and Data Structures: The Science of Computing discusses the subtleties
related to lists, and the same subtleties generalize to trees. For this exercise,
pay particular attention to how to write makeNewList
and makeNewTree
methods.
Every subclass of List
or OrderedTree
needs a parameterless
constructor that initializes a list or tree to be empty — these constructors
are necessary because makeNewList
and makeNewTree
use
them. Generally, these constructors are simple, they just call the corresponding
constructor in the List
or OrderedTree
superclass.
The list and tree subclasses in this exercise will represent lists or trees
of integers. Unfortunately, integers, i.e., values of Java type int
,
are not objects in Java, and so cannot be stored directly in lists or trees.
(As explained in appendix section A.2.1 of Algorithms and Data Structures:
The Science of Computing, Java has two kinds of data: objects and non-objects; List
or
OrderedTree
objects (and most other collections) can hold any kind of object,
but not non-object values.) Fortunately, the designers of Java anticipated
such
problems,
and
provided “wrapper” classes for all the non-object data types. A
wrapper is a very simple object that basically does nothing except contain
one value of a non-object type. For example, instances of the wrapper class Integer
each
hold one int
, instances of Character
hold a char
,
etc. To store a non-object value in a collection, create a wrapper object to
hold the non-object value, then store the wrapper in the collection. For example
Integer intWrapper = new Integer( 3 );
someList.addItem( intWrapper );
To retrieve the non-object value from the collection, retrieve the wrapper and then extract the non-object from the wrapper. For example
Integer wrappedInt = (Integer) someList.getFirst();
int retrievedInt = wrappedInt.intValue();
(The notation (Integer)
before someList.getFirst()
in
this example is a cast that tells the Java compiler that this
particular use of getFirst
will return an Integer
object.
Casts are often used when retrieving data from collections, because the retrieval
message is usually declared to return some very generic class— e.g., Object
— but
client programmers often know much more precisely what classes their application
stores in the collection. See appendix section A.5.2 of Algorithms and
Data Structures: The Science of Computing for more information on casts.)
Here are some common wrapper classes from the Java class library. Every wrapper
class provides a constructor that takes a single value of the corresponding
non-object type as its only parameter. Every wrapper class also handles a parameterless
message for extracting the non-object value from a wrapper, as indicated in
the following table (the examples above of wrapping and unwrapping an int
illustrate
how one might use these constructors and extractors).
Non-Object Type | Wrapper Class | Extraction Message |
---|---|---|
int | Integer | intValue |
long | Long | longValue |
double | Double | doubleValue |
float | Float | floatValue |
char | Character | charValue |
boolean | Boolean | booleanValue |
For more information on these (and other) wrapper classes, see the Documentation for the Standard Java Library at http://java.sun.com/j2se/1.4.2/docs/api/index.html
The heart of this exercise is
an experiment that tests a set of three related hypotheses about the worst-case
execution
times of searches in lists and balanced binary trees. However, before doing
this experiment, subclasses of List
and OrderedTree
that
support the experiment are needed. The lab exercise thus breaks down into three
parts:
List
that allows clients to create lists of client-specified
sizeOrderedTree
that allows clients to create balanced
trees of client-specified sizeOne of the experiment’s goals is to verify that lists have worst-case
search times of Θ(n). Doing this requires a way to create
lists of a variety of lengths, with which to
time
find
messages. Thus the first task in this lab is to define a
subclass of List
that provides a new constructor. This constructor
should take two integers, n and m, as its parameters,
and should initialize a list to contain the integers from n to m,
in ascending order. Nominally, n is the inclusive lower bound on
the range of integers to put in the list, and m the inclusive upper
bound. However, n may be greater than m, in which case
the list should be made empty. This constructor is one of the few algorithms
where a client of List
is likely to need List
’s setFirstItem
and setRest
messages.
The new constructor is the only unique feature that the List
subclass
needs. Don’t forget, however, that every List
subclass needs
a makeNewList
method, and a parameterless constructor that initializes
a list to be empty. Everything else the experiment needs can be inherited from List
.
Another goal of the experiment is to verify that balanced ordered binary trees
have a worst-case search time of Θ( log n ).
This requires a subclass of OrderedTree
that can initialize balanced
trees of client-specified size. Thus, the second task is to define a
subclass of OrderedTree
that provides a constructor with two integer
parameters, n and m. This constructor initializes
an OrderedTree
that
contains the integers between n and m, and that is as
nearly balanced as possible. Nominally, n is the inclusive lower
bound on the range of integers to put in the tree, and m the inclusive
upper bound. However, n may be greater than m, in which
case the tree should be made empty.
For this lab, “nearly balanced” means that for every node in a tree, the sizes of that node’s left and right subtrees differ by at most one.
Initializing the tree so that it is nearly balanced may require a little thought.
Think recursively! In other words, try to find a way to create a nearly balanced,
ordered, binary tree that involves creating two smaller, nearly balanced,
ordered binary trees. You will probably find the setRoot
, setLeft
, and setRight
messages to OrderedTree
objects helpful.
As with the List
subclass, the new constructor is the only really
unique feature of the OrderedTree
subclass. Remember, however,
to also provide a makeNewTree
method and a parameterless constructor
for the subclass.
Finally, the third task in this lab is to do an experiment that tests the following three-part hypothesis:
Data collection for this experiment will mainly involve timing worst-case searches in lists and trees of various sizes. Note that for both data structures, the worst case for searching is searching for an item that is not in the data structure. The subclasses defined in the first and second tasks will be helpful here, because they make it easy to initialize lists and trees of known sizes, and with precisely known contents (so that it is easy to calculate a value that won’t be in the list or tree).
For both lists and trees, actual search times may be so small that they are hard to measure. Use the technique for measuring small times that is described in section 9.3 of Algorithms and Data Structures: The Science of Computing if necessary.The List.java and OrderedTree.java files can be downloaded from the World Wide Web.
Documentation for both classes is also available on the Web. The main documentation page is an index to documentation for all the Java classes written for use with Algorithms and Data Structures: The Science of Computing. To see the documentation for a specific class, click on that class’s name in the left-hand panel of the page.
This lab is due on Monday, April 25. Turn in a printout of the code you wrote for the experiment, and a report on the experiment itself. The report needn’t say much about the hypotheses you tested, since they are given above. It should, however, say anything about the design and execution of the experiment that isn’t evident from your code. It should also show all the data you collected, and explain how you analyzed the data and show the results of the analysis. Finally, it should discuss the conclusions you reached about each of the hypotheses.
Portions copyright © 2004. Charles River Media. All rights reserved.
Revised Apr. 16, 2005 by Doug Baldwin