SUNY Geneseo Department of Computer Science


Last Lab! -- Heapsort

Csci 240, Spring 2007

Prof. Doug Baldwin

Due Tuesday, May 1

Purpose

This lab serves as something of a capstone exercise to this course, in that it asks you to apply all of the methods of inquiry we have developed (algorithm design and implementation, mathematical analysis, and experimental analysis) to study a new sorting algorithm. The lab thus reinforces your ability to design and code algorithms, to conduct mathematical analyses of their performance, and to design and conduct experiments that convincingly support (or refute) the mathematical results.

Background

In this lab, you will code a sorting algorithm known as "heapsort." As the name suggests, heapsort is based on a heap. The lab handout gives you some high-level descriptions of the algorithm and hints at how to analyze it, but you are expected to develop many of the details.

Heapsort's input is an array of n values to be sorted. The algorithm proceeds in two phases:

  1. Arrange the values in the array into a max-heap.
  2. Remove the heap's largest element, swapping it with the heap's last element, and then sifting that new root down to a properly ordered place in the heap. The effect of this swap is to shrink the heap by one element, and to move the removed element to a section of the array beyond the heap that is sorted into ascending order. Repeating this step until all elements have been removed from the heap leaves the entire array sorted into ascending order.

Exercise

Your task is three-fold:

  1. Flesh out the sketch of heapsort in the "Background" section into a complete algorithm, and code this algorithm in Java.
  2. Mathematically derive an asymptotic expression for the worst-case execution time of heapsort.
  3. Design and conduct an experiment to test whether your implementation actually has the worst-case execution time your mathematical analysis predicts it should have.

There are several sections of our textbook that are pertinent to the analysis of heapsort's worst-case execution time. The "surprise bonus" described on page 502 and the top of 503 is an outline of the analysis of the first phase. Although the text suggests that the result can be proved by induction, it is even more straightforward as an application of the Master Theorem. The analysis of the total time required to build an ordered binary tree of n nodes on pages 449 and 450 uses an approach that can also be helpful in analysing heapsort's second phase.

Follow-Up

Hand in a written report describing your version of heapsort, your analysis of it, and the experiment you conducted to test that analysis. Note that this will be a much more complete model of a professional computer science paper than many previous lab reports, in that the algorithm design and its analysis are your own, not things given to you prior to the work described in the report. These reports may therefore be somewhat longer than previous lab reports, although not dramatically so (i.e., where I have previously predicted 2 to 4 pages as a typical length, 5 or 6 might be more likely now). As always, however, the real requirement is that your report concisely says everything that needs to be said in order for a reader to understand what you did and to believe your conclusion.

Turn in your report so that I receive it before I leave on the due date above.