SUNY Geneseo Department of Computer Science


Lab 11 -- Tree Height

CSci 240, Spring 2007

Prof. Doug Baldwin

Due Wednesday, April 18

Purpose

The goals of this lab are to give you an empirical sense for how well ordered binary trees perform in practice, to give you some experience implementing binary tree algorithms, and to give you experience designing and conducting an experiment that measures a dependent varible other than time.

Background

Computer scientists generally claim that in practical use, "real" ordered binary trees of n nodes have heights that are Θ(logn). More precisely, it is expected that if you create an ordered binary tree by adding n random values to it, in random order, the tree will end up with a height that is Θ(logn). Your job in this lab is to test this belief. To do this, you will need to write code that builds ordered binary trees from random values. You then need to measure the heights of a large enough number of such trees to calculate an accurate average.

Generic Ordered Trees

I want you to define a generic ordered binary tree class, i.e., a class that is parameterized by the class of objects the trees contain. You have worked with generic library classes before, e.g., Vector. However, this time you need to define your own generic class, and the class you need uses generics in a more sophisticated way than you have seen so far. I will talk at the beginning of lab about how to do these things.

Note that in order to work with generic classes, you need to be using Java 1.5.

Random Values

The Java class library provides a very powerful random number generator in class Random. An Introduction to this class is available at http://cs.geneseo.edu/~baldwin/reference/random.html. Full details can be found in the documentation for the Java Class Library (http://java.sun.com/j2se/1.5.0/docs/api/)

Accurate Averages

Given a sufficiently large number of measurements (e.g., several tens of measurements), the best way to determine how much error is in their average is to compute their standard error. Computation of standard error is described in Chapter 4 of our textbook. The relative standard error (relative error is also described in Chapter 4) is then your measure of accuracy -- for example, if the relative standard error is 10%, then the average is probably accurate to plus or minus 10% also; if the relative standard error is 5% the average can be considered accurate to plus or minus 5%; etc.

Exercise

Design and conduct an experiment that tests the hypothesis that ordered binary trees built by inserting n random values into initially empty trees have average height Θ(logn).

In order to get practice designing and coding binary tree algorithms, I want you to write your own ordered binary tree class for this program. The class should be a generic class, i.e., the tree class should be parameterized by the class of objects that can be placed in a tree. Your tree class should at least provide messages for inserting an object into a tree, and for computing the height of a tree. You may write other messages and constructors as you see fit.

Follow-Up

Hand in a written report describing your experiment and the conclusions you reach about the hypothesis. Because the assignment is quite open-ended about the experiment, you should be sure that your report decribes it in enough detail that a reader could replicate it if they wanted to. I expect that these reports will be 2 to 4 pages long (exclusive of any code you include), although the real requirement is that they concisely say everything that needs to be said to convince a reader to believe your conclusion.

Turn in your report so that I receive it before I leave on the due date above.