Instructor Notes

List and Tree Search Times Lab

Supplemental Material for Baldwin and Scragg, Algorithms and Data Structures: The Science of Computing; Charles River Media, 2004 (Now published by Cengage Learning)

This document supplies information beyond what is in the student instructions for the “Search Times for Lists and Trees” laboratory exercise.

Fitting the Lab into a Course

The main goal of the exercise is to demonstrate to students the huge difference between linear and logarithmic times. The exercise does this most effectively when students measure the times for searches in both data structures in a single experiment. However, the resulting lab will probably take considerably longer than our nominal lab session of two hours.

Pre- and post-lab activities can significantly reduce the amount of time needed in the lab. For example, students can derive the theoretical execution time for searching a list, and design the algorithm for initializing a balanced tree, before coming to lab. They can write the lab report after leaving the lab.

The exercise can also be broken into two experiments, one on lists and one on trees, to fit it into two lab sessions. Many students can do the list experiment in well under two hours; the time for doing the tree experiment will be mostly determined by the time it takes students to devise the algorithm for initializing balanced trees. Splitting the lab in this manner can also be helpful if a course covers lists and trees at widely separated times, and an instructor wants to use the list and tree parts of this experiment to reinforce the corresponding sections of the course. In any split experiment, however, try to do something (e.g., class discussion, a required section in the second lab report) to make students reflect on the observable relative performances of lists and trees.

First Execution Times

We frequently notice that it takes a Java program longer to do something for the first time than to do it subsequently. For example, in this lab, the first find message a program sends to a list or tree is likely to take much longer to execute than other find messages to the same object. For best accuracy, students should therefore send one or even two find messages to their data structures before starting to time finds.

An Extension

One natural extension to this lab is to ask students to prove their algorithm for constructing a balanced binary tree correct, and/or to derive its execution time in terms of the number of items in the tree. The first of these exercises is interesting because the algorithm will probably not be obviously correct to students, and because the validity of the experiment depends crucially on its correctness — i.e., students have a personal stake in the algorithm being correct. The second exercise is interesting because there are two slightly different algorithms students are likely to devise, one of which builds the tree from the root down via addNode messages, and the other of which builds the tree from the leaves up by recursively creating new trees. The former of these algorithms runs in Θ( n logn ) time, while the latter runs in Θ( n ) time. Deriving execution times and comparing them across students (perhaps via class discussions after the lab) can thus demonstrate in a distinctly personal way the reality of faster and slower solutions to a single problem.

We did not include these extensions in the lab exercise because it is already very long, but instructors who wish can add them. This may be particularly appropriate if the lab is split over several laboratory sessions.

Other Tips and Comments

The subclasses that students write for this exercise both have a constructor that initializes a data set containing a range of integers. The constructor for the List subclass can be written either iteratively or recursively. We try to encourage students to think recursively, and so favor the recursive approach, but there is no other reason to prefer one approach over the other. The constructor for the OrderedTree subclass would be very difficult to write without recursion.

Searching large lists can lead to stack overflows in the Java interpreter. In the environment most of our students use (the CodeWarrior development environment running under Mac OS X on dual G4 processors with a total of 768 megabytes of main memory), lists with tens of thousands of elements can be searched. However, other environments may encounter stack overflows for smaller lists. In any environment, students should be able to search lists of up to a thousand elements or more, and as long as students measure times carefully and accurately, this should provide an ample range of data for testing the hypothesis.

Graphing the search times for both data structures on the same axes can be an illuminating way to see the difference between linear and logarithmic performance — typically, a graph that shows the linear growth of search time for lists makes the logarithmic time to search trees indistinguishable from 0. Drawing such a graph drives home to students that “log n” is not just an obscure mathematical function when it describes execution time. Beware, however, that because the graph does not clearly show the shapes of both execution time functions, it is not by itself an adequate analysis of the data.