Laboratory Exercise

Search Times for Trees

Supplemental Material for Baldwin and Scragg, Algorithms and Data Structures: The Science of Computing; Charles River Media, 2004


Purpose

This exercise demonstrates the practical significance of the theoretical execution time of binary tree searches. In doing this, it also provides practice designing recursive tree algorithms.

Prerequisites

Understanding of the following sections of Algorithms and Data Structures: The Science of Computing:

Background

This exercise empirically verifies the theoretical time to search balanced binary trees. Doing this requires writing a subclass of the OrderedTree class with which to do the main experiment.

Theoretical Execution Time

Chapter 13 of Algorithms and Data Structures: The Science of Computing presents an algorithm for searching ordered binary trees. The book derives the worst-case execution time for searching a balanced ordered binary tree of n elements: Q( log n ).

The OrderedTree Class and Its Subclasses

Chapter 13 describes the OrderedTree class. It is available in a file named OrderedTree.java, which can be found per the directions in the “Final Details” section of this document.

Any Java source file that uses the OrderedTree class should “import” it via statements of the form

    import geneseo.cs.sc.*;

or

    import geneseo.cs.sc.OrderedTree;

This exercise requires defining a subclass of OrderedTree, something that is subtle in several ways. Section 11.5.1 of Algorithms and Data Structures: The Science of Computing discusses the subtleties related to lists, and the same subtleties generalize to trees. For this exercise, pay particular attention to how to write makeNewTree methods.

Every subclass of OrderedTree needs a parameterless constructor that initializes a tree to be empty — this constructor is necessary because makeNewTree uses it. Generally, this constructor is simple, it just calls the corresponding constructor in the OrderedTree superclass.

Wrapper Classes

The tree subclass in this exercise will represent trees of integers. Unfortunately, integers, i.e., values of Java type int, are not objects in Java, and so cannot be stored directly in trees. (As explained in appendix section A.2.1 of Algorithms and Data Structures: The Science of Computing, Java has two kinds of data: objects and non-objects; OrderedTree objects (and most other collections) can hold any kind of object, but not non-object values.) Fortunately, the designers of Java anticipated such problems, and provided “wrapper” classes for all the non-object data types. A wrapper is a very simple object that basically does nothing except contain one value of a non-object type. For example, instances of the wrapper class Integer each hold one int, instances of Character hold a char, etc. To store a non-object value in a collection, create a wrapper object to hold the non-object value, then store the wrapper in the collection. For example

    Integer intWrapper = new Integer( 3 );
    someTree.addNode( intWrapper );

To retrieve the non-object value from the collection, retrieve the wrapper and then extract the non-object from the wrapper. For example

    Integer wrappedInt = (Integer) someTree.getRoot();
    int retrievedInt = wrappedInt.intValue();

(The notation (Integer) before someTree.getRoot() in this example is a cast that tells the Java compiler that this particular use of getRoot will return an Integer object. Casts are often used when retrieving data from collections, because the retrieval message is usually declared to return some very generic class— e.g., Object — but client programmers often know much more precisely what classes their application stores in the collection. See appendix section A.5.2 of Algorithms and Data Structures: The Science of Computing for more information on casts.)

Here are some common wrapper classes from the Java class library. Every wrapper class provides a constructor that takes a single value of the corresponding non-object type as its only parameter. Every wrapper class also handles a parameterless message for extracting the non-object value from a wrapper, as indicated in the following table (the examples above of wrapping and unwrapping an int illustrate how one might use these constructors and extractors).

Non-Object Type Wrapper Class Extraction Message
int Integer intValue
long Long longValue
double Double doubleValue
float Float floatValue
char Character charValue
boolean Boolean booleanValue

For more information on these (and other) wrapper classes, see the Documentation for the Standard Java Library at http://java.sun.com/j2se/1.4.2/docs/api/index.html

Exercise

The heart of this exercise is an experiment that tests a hypothesis about the worst-case execution times of searches in balanced binary trees. However, before doing this experiment, a subclass of OrderedTree that supports the experiment is needed. The lab exercise thus breaks down into two parts:

  1. Write a subclass of OrderedTree that allows clients to create balanced trees of client-specified size
  2. Conduct an experiment that verifies the worst-case search time for balanced trees.

A Tree Subclass

Verifying that balanced ordered binary trees have a worst-case search time of Q( log n ) requires a subclass of OrderedTree that can initialize balanced trees of client-specified size. Thus, the first task is to define a subclass of OrderedTree that provides a constructor with two integer parameters, n and m. This constructor initializes an OrderedTree that contains the integers between n and m, and that is as nearly balanced as possible. Assume as a precondition that n <= m.

For this lab, “nearly balanced” means that for every node in a tree, the sizes of that node’s left and right subtrees differ by at most one.

Initializing the tree so that it is nearly balanced may require a little thought. Think recursively! In other words, try to find a way to create a nearly balanced, ordered, binary tree that involves creating two smaller, nearly balanced, ordered binary trees.

Once the algorithm for initializing a tree has ben designed, prove it correct. This is an important step in this exercise for two reasons: first, the algorithm will probably not be one that is just obviously correct, and second, the validity of the entire experiment relies on having nearly balanced trees — a single mistake in the initialization algorithm can destroy the entire the lab!

The Experiment

The second task in this lab is to do an experiment that tests the hypothesis that the worst-case time to search a balanced, ordered, binary tree of n items is Q( log n ).

Data collection for this experiment will mainly involve timing worst-case searches in trees of various sizes. Note that the worst case for searching is searching for an item that is not in the tree. The subclass defined in the first task will be helpful here, because it makes it easy to initialize trees of known sizes, and with precisely known contents (so that it is easy to calculate a value that won’t be in the tree).

Actual search times may be so small that they are hard to measure. Use the technique for measuring small times that is described in section 9.3 of Algorithms and Data Structures: The Science of Computing if necessary.

As part of the conclusion to this experiment, think back to the experiment on the worst case time to search a list. Discuss the extent to which the difference between the Q( log n ) time for trees and the Qn ) time for lists is significant in practice.

Final Details

Finding the OrderedTree Class

The OrderedTree.java file can be downloaded from the World Wide Web.

Documentation for the OrderedTree class is also available on the Web. The main documentation page is an index to documentation for all the Java classes written for use with Algorithms and Data Structures: The Science of Computing. To see the documentation for a specific class, click on that class’s name in the left-hand panel of the page.

Submitting Your Work

This lab is due at the start of your laboratory session on April 26. Turn in a report on your experiment. This report should include the code and correctness proof for the tree initialization algorithm, and your comments regarding the significance of the difference in worst-case search times between trees and lists, as well as the usual description of experimental procedure, data, data analysis, and conclusion.


Copyright © 2004. Charles River Media. All rights reserved.

Revised Apr. 5, 2004