Laboratory Exercise

Recurrence Relations and Execution Time

Supplemental Material for Baldwin and Scragg, Algorithms and Data Structures: The Science of Computing; Charles River Media, 2004 (Now published by Cengage Learning)

Purpose

This exercise demonstrates the relationship between abstract recurrence relations and the concrete running time of programs. The exercise also provides practice setting up and solving recurrence relations, and designing and conducting experiments.

Prerequisites

Understanding of Section 7.2 of Algorithms and Data Structures: The Science of Computing.

Background

This lab revolves around a recursive algorithm for finding the largest value in an array. The algorithm’s best and worst case execution times differ dramatically. Recurrence relations are used to predict both times, and then an experiment determines whether these mathematical derivations describe the real behavior of the algorithm.

Finding the Largest Element in an Array

Consider finding the largest value in the first n positions in an array of integers. This problem might be described pictorially as follows:

One approach is to find the largest value in the first n-1 positions of the array and compare it to the n^th value. If the n^th value is larger, return it, otherwise return the largest value from the first n-1 positions. This idea can be phrased as a recursive algorithm in Java as follows. Note that A is the array in which to find the largest item.

Notice that if the n^th value is the largest, then this algorithm only sends one recursive findLargest message (the one in the comparison to A[n-1]). However, if the n^th value is not the largest, then this algorithm sends two recursive messages — one in the comparison, and a second in order to return the maximum of the first n-1 values. These observations suggest that the algorithm’s best and worst case execution times will be different. This lab involves quantifying and measuring the difference.

The algorithm’s best case will happen when every invocation of findLargest discovers that the last value in its section of the array is the largest — in other words, when the values in the array are exactly in increasing order. Conversely, the worst case occurs when no invocation of findLargest ever discovers that the last value in its section is the largest — in other words, when the values in the array are exactly in decreasing order.

Recurrence Relations and Recursive Algorithms

To abstractly estimate the execution time of a recursive algorithm, set up and solve a recurrence relation that counts the number of times the algorithm executes some “typical” operation — an operation that is somehow representative of the amount of work the algorithm does. In findLargest, the if ( n <= 1 ) comparison on the first line is a good choice for such an operation: it is executed exactly once every time findLargest is invoked, and so “represents” the total number of times findLargest executes.

The Experiment

Recurrence analysis of findLargest will produce mathematical expressions for its best and worst case operation counts in terms of n. The experiment then tests whether these expressions also describe the algorithm’s real-world running time. At best, a real program’s running time will be proportional to the value of a theoretical operation count, so the hypothesis for the experiment will be, roughly, that the best case running time of findLargest is proportional to one function of n, and the worst case running time proportional to another.

Experimental Procedure

The best case running times will probably be small, and may contain outliers. This makes those times hard to measure accurately. Relatively large values of n may be necessary in order to get measurable times. Chapter 4 of Algorithms and Data Structures: The Science of Computing contains other information about measuring running times, averaging multiple measurements to reduce random error, etc. If the experiment is done on a particularly fast computer, or with a low resolution clock function, the technique in Section 9.3 for measuring times less than the clock’s resolution may be helpful.

In at least some Java systems, the first time a program does something often takes significantly longer than subsequent times. For example, the first findLargest below might take more time than the later two:

This suggests that writing a program that calls (and times) findLargest once, and then running that program many times, is not an accurate way to get multiple measurements of how long findLargest takes. Rather, write a program that calls (and times) findLargest many times, and run the program once.

The worst case running times for findLargest will probably be very large. Surprisingly small values of n may be necessary to keep those times low enough that the experiment completes in a reasonable amount of time.

Data Analysis

As mentioned earlier, the hypothesis for this experiment will be that execution times are proportional to certain functions of n. Data analysis will therefore require testing the measured times for proportionality to the functions. Think about the precise meaning of “proportional” in order to devise an appropriate test.

Exercise

This exercise consists of two main steps: mathematically deriving the best and worst case performances of findLargest, and then conducting an experiment to verify that the mathematical results describe real-world performance.

Derivations

Set up recurrence relations for the best and worst (i.e., smallest and largest) number of times findLargest can execute its if ( n <= 1 ) comparison (or some other representative operation) while finding the largest value in an n-item array section. (See “Recurrence Relations and Recursive Algorithms” in the “Background” section of this document for more on the notion of a “representative operation”.) Find, and prove correct, closed forms for both recurrence relations. The recurrence relations and closed forms should give the number of operations executed as functions of n.

Experiment

The closed forms from the recurrence relations are, theoretically, proportional to the actual best and worst case running times of findLargest. The second step in this lab is to design and conduct an experiment that tests this hypothesis. This entails writing a complete program around the findLargest method. This program will call findLargest on various n values for both best and worst case arrays, time those calls, and display the results in a form that can be analyzed for consistency with the closed forms. See “The Experiment” in the “Background” section of this document for more suggestions on designing and conducting this experiment.

Final Details

Turn in a report on your derivations and experiment, as directed by your instructor.