Laboratory Exercise

Search Times for Lists and Trees

Supplemental Material for Baldwin and Scragg, Algorithms and Data Structures: The Science of Computing; Charles River Media, 2004 (Now published by Cengage Learning)

Site Index


Purpose

This exercise demonstrates the practical significance of the difference in theoretical execution time between list and binary tree searches. In doing this, it also provides practice mathematically analyzing the performance of recursive list algorithms.

Prerequisites

Understanding of the following sections of Algorithms and Data Structures: The Science of Computing:

Background

This exercise empirically verifies the theoretical times to search lists and balanced binary trees, and compares the empirical times for the two structures. Doing these things requires establishing some theoretical background not presented in Algorithms and Data Structures: The Science of Computing, and writing subclasses of the book’s List and OrderedTree classes with which to do the main experiment.

Theoretical Execution Times

Chapters 11 and 13 of Algorithms and Data Structures: The Science of Computing present algorithms for searching lists and trees, respectively. Chapter 13 derives the worst-case execution time for searching a balanced ordered binary tree of n elements: Θ( log n ). The book presents no analogous derivation for the worst-case time to search a list, however.

The List and OrderedTree Classes and Their Subclasses

Chapter 11 of Algorithms and Data Structures: The Science of Computing thoroughly describes the List class. A complete implementation of this class is available, in a file named List.java. The “Final Details” section of this document explains where to find this file.

Similarly, chapter 13 describes the OrderedTree class. It is available in a file named OrderedTree.java, which can also be found per the directions in the “Final Details” section.

Any Java source file that uses the List or OrderedTree classes should “import” them via statements of the form

    import geneseo.cs.sc.*;

or

    import geneseo.cs.sc.List;
    import geneseo.cs.sc.OrderedTree;

This exercise requires defining subclasses of List and OrderedTree, definitions that are subtle in several ways. Section 11.5.1 of Algorithms and Data Structures: The Science of Computing discusses the subtleties related to lists, and the same subtleties generalize to trees. For this exercise, pay particular attention to how to write makeNewList and makeNewTree methods.

Every subclass of List or OrderedTree needs a parameterless constructor that initializes a list or tree to be empty — these constructors are necessary because makeNewList and makeNewTree use them. Generally, these constructors are simple, they just call the corresponding constructor in the List or OrderedTree superclass.

Wrapper Classes

The list and tree subclasses in this exercise will represent lists or trees of integers. Unfortunately, integers, i.e., values of Java type int, are not objects in Java, and so cannot be stored directly in lists or trees. (As explained in appendix section A.2.1 of Algorithms and Data Structures: The Science of Computing, Java has two kinds of data: objects and non-objects; List or OrderedTree objects (and most other collections) can hold any kind of object, but not non-object values.) Fortunately, the designers of Java anticipated such problems, and provided “wrapper” classes for all the non-object data types. A wrapper is a very simple object that basically does nothing except contain one value of a non-object type. For example, instances of the wrapper class Integer each hold one int, instances of Character hold a char, etc. To store a non-object value in a collection, create a wrapper object to hold the non-object value, then store the wrapper in the collection. For example

    Integer intWrapper = new Integer( 3 );
    someList.addItem( intWrapper );

To retrieve the non-object value from the collection, retrieve the wrapper and then extract the non-object from the wrapper. For example

    Integer wrappedInt = (Integer) someList.getFirst();
    int retrievedInt = wrappedInt.intValue();

(The notation (Integer) before someList.getFirst() in this example is a cast that tells the Java compiler that this particular use of getFirst will return an Integer object. Casts are often used when retrieving data from collections, because the retrieval message is usually declared to return some very generic class— e.g., Object — but client programmers often know much more precisely what classes their application stores in the collection. See appendix section A.5.2 of Algorithms and Data Structures: The Science of Computing for more information on casts.)

Here are some common wrapper classes from the Java class library. Every wrapper class provides a constructor that takes a single value of the corresponding non-object type as its only parameter. Every wrapper class also handles a parameterless message for extracting the non-object value from a wrapper, as indicated in the following table (the examples above of wrapping and unwrapping an int illustrate how one might use these constructors and extractors).

Non-Object Type Wrapper Class Extraction Message
int Integer intValue
long Long longValue
double Double doubleValue
float Float floatValue
char Character charValue
boolean Boolean booleanValue

For more information on these (and other) wrapper classes, see the Documentation for the Standard Java Library at http://java.sun.com/j2se/1.4.2/docs/api/index.html

Exercise

The heart of this exercise is an experiment that tests a set of three related hypotheses about the worst-case execution times of searches in lists and balanced binary trees. However, before doing this experiment, a hypothesis about the worst-case execution time of searches in lists is needed, as are subclasses of List and OrderedTree that support the experiment. The lab exercise thus breaks down into four parts:

  1. Derive the worst-case execution time for searching a list
  2. Write a subclass of List that allows clients to create lists of client-specified size
  3. Write a subclass of OrderedTree that allows clients to create balanced trees of client-specified size
  4. Conduct an experiment that verifies the worst-case execution times for lists and balanced trees, and that demonstrates which is generally fastest.

Search Time in Lists

The first task is to derive an asymptotic expression for the worst-case time to search a list. In terms of the List class, this is the worst-case execution time of the find method. Express time as a function of list length. Note that the worst case happens when the item being sought is not in the list at all.

Compare this asymptotic expression to the one for searching an ordered binary tree (Θ( log n )) to determine which search should be faster in general. Remember the result, as it will be part of the hypothesis for the experiment.

A List Subclass

One of the experiment’s goals is to verify that the asymptotic expression for list search time corresponds to the actual behavior of lists. Doing this requires a way to create lists of a variety of lengths, with which to time find messages. Thus the second task in this lab is to define a subclass of List that provides a new constructor. This constructor should take two integers, n and m, as its parameters, and should initialize a list to contain the integers from n to m, in ascending order. Nominally, n is the inclusive lower bound on the range of integers to put in the list, and m the inclusive upper bound. However, n may be greater than m, in which case the list should be made empty.

The new constructor is the only unique feature that the List subclass needs. Don’t forget, however, that every List subclass needs a makeNewList method, and a parameterless constructor that initializes a list to be empty. Everything else the experiment needs can be inherited from List.

A Tree Subclass

Another goal of the experiment is to verify that balanced ordered binary trees have a worst-case search time of Θ( log n ). This requires a subclass of OrderedTree that can initialize balanced trees of client-specified size. Thus, the third task is to define a subclass of OrderedTree that provides a constructor with two integer parameters, n and m. This constructor initializes an OrderedTree that contains the integers between n and m, and that is as nearly balanced as possible. Nominally, n is the inclusive lower bound on the range of integers to put in the tree, and m the inclusive upper bound. However, n may be greater than m, in which case the tree should be made empty.

For this lab, “nearly balanced” means that for every node in a tree, the sizes of that node’s left and right subtrees differ by at most one.

Initializing the tree so that it is nearly balanced may require a little thought. Think recursively! In other words, try to find a way to create a nearly balanced, ordered, binary tree that involves creating two smaller, nearly balanced, ordered binary trees.

As for the List subclass, the new constructor is the only really unique feature of the OrderedTree subclass. Remember, however, to also provide a makeNewTree method and a parameterless constructor for the subclass.

The Experiment

Finally, the fourth task in this lab is to do an experiment that tests the following three-part hypothesis:

  1. The worst-case time to search a list is asymptotically equal to the expression derived in the first task
  2. The worst-case time to search a balanced, ordered, binary tree is Θ( log n )
  3. The comparison made between the asymptotic expressions for list search time and tree search time in the first task really does describe which of the two data structures has the faster searches in reality

Data collection for this experiment will mainly involve timing worst-case searches in lists and trees of various sizes. Note that for both data structures, the worst case for searching is searching for an item that is not in the data structure. The subclasses defined in the second and third tasks will be helpful here, because they make it easy to initialize lists and trees of known sizes, and with precisely known contents (so that it is easy to calculate a value that won’t be in the list or tree).

For both lists and trees, actual search times may be so small that they are hard to measure. Use the technique for measuring small times that is described in section 9.3 of Algorithms and Data Structures: The Science of Computing if necessary.

Final Details

Finding the List and OrderedTree Classes

The List.java and OrderedTree.java files can be downloaded from the World Wide Web.

Documentation for both classes is also available on the Web. The main documentation page is an index to documentation for all the Java classes written for use with Algorithms and Data Structures: The Science of Computing. To see the documentation for a specific class, click on that class’s name in the left-hand panel of the page.

Submitting Your Work

Turn in your derivation for the worst-case time to search a list, and a report on your experiment, as directed by your instructor.


Copyright © 2004 Charles River Media. All rights reserved.

Revised Aug. 8, 2005 by Doug Baldwin

Site Index