SUNY Geneseo Department of Computer Science
CSci 240, Spring 2007
Prof. Doug Baldwin
Due Tuesday, February 27
This exercise asks you to derive asymptotic execution times for four algorithms related to vectors and matrices. After deriving these asymptotic execution times, you need to measure actual execution times and compare them to the asymptotic predictions. These times may be very short, and measuring them exposes a subtle feature of how Java programs execute, which you will have to take into account in your experiment.
All of the algorithms you need to work with in this lab are recursive. You will therefore need to know how to derive asymptotic execution times for a recursive algorithm. The basic tactic is to derive an expression for the number of times the algorithm executes some representative, fixed-time operation, and then to take the execution time as asymptotically equivalent to that expression.
Section 7.2 of our text discusses how to use recurrence relations to derive the number of times an operation is executed. This material was also discussed in the February 7 and February 9 lectures.
Asymptotic notation is discussed in Section 9.2.2 of the text, and the February 12 and February 16 lectures.
How to test measured times for consistency with an asymptotic hypothesis is discussed in section 9.2.3 of the text and the February 16 lecture.
Execution times shorter than the resolution of a computer's clock function (System.nanoTime, in the case of Java) can be measured, although doing so requires a bit of cleverness. The central idea is to measure the time many invocations of the "fast" operation take, not just the time required by one invocation. For details, see Section 9.3 of our textbook.
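As an illustration, the idea can be sketched in Java roughly as follows. The operation being timed (an array assignment) and the repetition count are illustrative stand-ins, not part of the lab's code:

```java
// Sketch of measuring an operation too fast for the clock's resolution:
// time many repetitions in one measurement, then divide by the count.
public class ShortTimer {

    // Estimate the average time, in nanoseconds, of one array
    // assignment, averaged over reps repetitions.
    public static double averageAssignmentNanos(int reps) {
        double[] v = new double[100];
        long start = System.nanoTime();
        for (int i = 0; i < reps; i++) {
            v[i % v.length] = 1.0;      // the "fast" operation being timed
        }
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / reps;
    }

    public static void main(String[] args) {
        System.out.println(averageAssignmentNanos(1000000) + " ns per assignment");
    }
}
```

Dividing one long measurement by the repetition count yields an average far finer than the clock's resolution, at the cost of also averaging in the loop overhead.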
Vectors and matrices are the mathematical entities that inspired programming language arrays. A vector is a sequence of values, essentially a one-dimensional array. A matrix is a table of values, essentially a two-dimensional array. Mathematicians have defined a number of distinctive operations on vectors and matrices which, although they may seem peculiar at first, turn out to be extremely useful in a number of branches of computer science (graphics being one that I know well). This lab asks you to analyze and measure the execution time of algorithms for four operations on vectors and matrices. Although you do not need to write Java code for these operations, you do need to understand them well enough to derive their execution times.
Two of these operations are initialization operations, i.e., filling a vector or matrix with a fixed value. Initializing a vector can be described recursively as follows:
To initialize the first n elements of vector v to value x:
If n > 0
Set the n-th element of v to x
Recursively initialize the first n-1 elements of v to x
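In Java this algorithm might look like the following sketch (the provided MatrixDemo.java contains complete implementations; note also that Java arrays are zero-indexed, so the pseudocode's n-th element is v[n-1]):

```java
// A direct Java rendering of the recursive vector-initialization
// pseudocode above.
public class VectorInit {

    // Initialize the first n elements of v to x.
    public static void initialize(double[] v, int n, double x) {
        if (n > 0) {
            v[n - 1] = x;                // set the n-th element of v to x
            initialize(v, n - 1, x);     // recursively initialize the first n-1 elements
        }
    }

    public static void main(String[] args) {
        double[] v = new double[5];
        initialize(v, v.length, 7.0);
        System.out.println(java.util.Arrays.toString(v));
    }
}
```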
Deriving an expression for the number of times this algorithm assigns a value to an element of the vector is a good way to start deriving its asymptotic execution time.
Initializing a matrix can be described recursively as follows:
To initialize the first r rows of matrix m to value x, where each row contains c elements:
If r > 0
Initialize row r of m to x. The vector initialization algorithm above can be used to do
this, since a row of a matrix can be viewed as a vector containing c elements
Recursively initialize the first r-1 rows of m to x
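A Java sketch of this algorithm follows; it repeats the vector-initialization routine from above so the example is self-contained (MatrixDemo.java has the full implementations):

```java
// A direct Java rendering of the recursive matrix-initialization
// pseudocode, with rows represented as double[] vectors.
public class MatrixInit {

    // The vector-initialization algorithm, reused for each row.
    public static void initializeVector(double[] row, int n, double x) {
        if (n > 0) {
            row[n - 1] = x;
            initializeVector(row, n - 1, x);
        }
    }

    // Initialize the first r rows of m to x.
    public static void initialize(double[][] m, int r, double x) {
        if (r > 0) {
            // Row r of m, viewed as a vector of c elements
            initializeVector(m[r - 1], m[r - 1].length, x);
            initialize(m, r - 1, x);   // recursively initialize the first r-1 rows
        }
    }
}
```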
Note that this algorithm calls the vector initialization algorithm. That call cannot be treated as a fixed-time operation, since its execution time depends on c, the number of elements in the vector. However, you can derive an expression for the number of operations matrix initialization executes in terms of the number that vector initialization executes.
The remaining two operations you will analyze are probably less well-known. They are computing the so-called "dot product" of two vectors, and computing the product of a matrix and a vector.
The dot product of two vectors is defined to be the sum of the products of corresponding elements in the two vectors. The vectors must be of the same length for their dot product to be defined. For example, the dot product of (2, 10, 5) and (3, 9, 6) is 2×3 + 10×9 + 5×6 (i.e., the sum of the product of the first elements of the two vectors, the product of the second elements, and the product of the third elements), which equals 126.
An algorithm for computing dot products can be defined recursively as follows:
To compute the dot product of the first n elements of vectors v1 and v2:
If n > 0
Recursively compute the dot product of the first n-1 elements of v1 and v2
Return that dot product plus the product of the n-th elements of v1 and v2
Else
Return 0
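A literal Java rendering of this pseudocode might look like the sketch below (again, MatrixDemo.java contains the complete implementation):

```java
// A direct Java rendering of the recursive dot-product pseudocode.
public class DotProduct {

    // Compute the dot product of the first n elements of v1 and v2.
    public static double dot(double[] v1, double[] v2, int n) {
        if (n > 0) {
            // The dot product of the first n-1 elements, plus the
            // product of the n-th elements
            return dot(v1, v2, n - 1) + v1[n - 1] * v2[n - 1];
        } else {
            return 0.0;
        }
    }
}
```

For example, DotProduct.dot applied to the vectors (2, 10, 5) and (3, 9, 6) from the text returns 126.0.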
The execution time of this algorithm can be analyzed by deriving an expression for the number of arithmetic operations (i.e., additions and/or multiplications, both of which take fixed time and are "representative" of the work the algorithm does) that the algorithm executes.
The product of a matrix with a vector can be defined in terms of dot products; a matrix-vector product is defined only if the number of columns in the matrix is the same as the number of elements in the vector. Given this condition, the product of matrix M and vector V is a new vector, each element of which is the dot product of one row in M with V. This product vector contains as many elements as there are rows in M. Algorithmically, a matrix-vector product can be computed recursively as follows:
To compute the first r elements of the product of matrix m and vector v:
If r > 0
Set the r-th element of the product to the dot product of the r-th row of m with v
Recursively compute the first r-1 elements of the product of m and v
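In Java, this might be sketched as follows; the dot-product routine is repeated here so the example stands alone (the definitive implementations are in MatrixDemo.java):

```java
// A direct Java rendering of the recursive matrix-vector product
// pseudocode, building on the dot-product algorithm.
public class MatrixVector {

    // Dot product of the first n elements of two vectors.
    public static double dot(double[] v1, double[] v2, int n) {
        if (n > 0) {
            return dot(v1, v2, n - 1) + v1[n - 1] * v2[n - 1];
        }
        return 0.0;
    }

    // Compute the first r elements of the product of m and v,
    // storing them in result.
    public static void product(double[][] m, double[] v, double[] result, int r) {
        if (r > 0) {
            // The r-th element of the product is the dot product of
            // the r-th row of m with v
            result[r - 1] = dot(m[r - 1], v, v.length);
            product(m, v, result, r - 1);   // first r-1 elements
        }
    }
}
```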
As with the matrix initialization algorithm, you cannot count the dot product in this algorithm as a fixed-time operation. Rather, you will have to analyze the matrix-vector product algorithm in terms of truly fixed-time operations performed within the dot product algorithm.
You have probably heard that when you run a Java program, your computer doesn't just read the text you typed into the ".java" files (so-called "source code") and execute it. Rather, your computer can only directly execute programs in what is called "machine language," a notation that encodes very simple actions (on the order of a single arithmetic operation) as binary numbers. There are two ways of bridging the gap between source code and machine language: compilation, in which a program called a compiler translates the entire source program into machine language before it runs, and interpretation, in which a program called an interpreter reads the source code and carries out its statements one at a time as the program runs.
Java programs are executed by a complicated mixture of compilers and interpreters. This mixture is driven by the goal that Java programs should be efficiently executable on many different computer architectures. Because interpreters introduce an extra layer of software between a program and the computer hardware, interpreted programs generally run more slowly than compiled ones -- this is bad for efficiency. On the other hand, once a program is compiled, the compiled program can only run on one computer architecture, because different architectures have different machine languages -- this is bad for portability. To resolve this conflict, Java is usually implemented by a compiler that translates Java source code into a "machine language" for something called the Java Virtual Machine (JVM). The JVM is really a small (and therefore fast) interpreter for a language that operates at about the same level of abstraction as real machine languages, but isn't really the language of any actual machine, and is, in fact, tailored to the needs of Java. JVMs can then be written for many different real machine architectures, and compiled Java programs can be run on those real machines by the JVMs.
Java execution mechanisms don't stop with the JVM, however. Some people feel that even the fast JVM interpreter is too slow, and so some JVMs do what is called "just-in-time compilation": the first time the interpreter executes a JVM instruction, it compiles that instruction into a little fragment of native machine code. Then, if that same JVM instruction ever gets executed again (e.g., because it is part of a loop), the machine code fragment is executed instead of re-interpreting the original JVM instruction.
One of the most widely used JVMs is Sun's "HotSpot" JVM, which takes just-in-time compilation one step further. HotSpot gets its name from the fact that it tries to identify "hot spots" in programs, i.e., blocks of code that are executed over and over. When it identifies a hot spot, the HotSpot JVM compiles that whole block of (JVM) code into native machine code.
Just-in-time compilation of hot spots has very important ramifications for experiments that time Java code. The code being timed is almost inevitably a hot spot, simply because most timing experiments repeatedly measure the same piece of code for accuracy's sake. Such code will run relatively slowly for a while, until the JVM recognizes that it is a hot spot. Then it will suddenly seem to run very slowly for a short time, while the JVM compiles it. After that, the code will run very fast, because it is now native machine code, not JVM code. For example, here is a series of times gathered from a Java program that shows the "signature" of hot spot just-in-time compilation:
1400.0 1200.0 1200.0 28800.0 400.0 300.0 200.0
These times are in nanoseconds, and were measured using System.nanoTime and the technique for measuring short times described earlier.
The moral for people designing timing experiments is to take just-in-time compilation into account in the experiment design. For example, deliberately execute the operation you want to time often enough that the just-in-time compiler recognizes it as a hot spot and compiles it before you actually gather execution time data, so that you time only the (just-in-time-)compiled version of the code. Alternatively, if your experiment is simple and doesn't need to time the operation in question very often, time only a bare minimum number of executions and hope that you get only times for the uncompiled version -- watching your data carefully for any evidence of the just-in-time compiler kicking in, and reconsidering your approach if it does, of course.
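The first strategy, warming up the code before timing it, can be sketched as follows. The operation being timed and the warm-up count here are arbitrary illustrations; watch your own data for the compilation spike to choose a suitable warm-up count:

```java
// Sketch of "warming up" an operation so that the timed measurements
// cover only the just-in-time-compiled code.
public class WarmUpTimer {

    // A stand-in for the operation to be timed.
    public static double work(double[] v) {
        double sum = 0.0;
        for (double e : v) sum += e;
        return sum;
    }

    // Run the operation warmups times untimed (so the JIT compiler can
    // recognize it as a hot spot and compile it), then time reps
    // repetitions and return the average time per repetition.
    public static double timedAverageNanos(double[] v, int warmups, int reps) {
        for (int i = 0; i < warmups; i++) {
            work(v);                       // untimed warm-up executions
        }
        long start = System.nanoTime();
        for (int i = 0; i < reps; i++) {
            work(v);                       // timed executions
        }
        return (double) (System.nanoTime() - start) / reps;
    }
}
```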
Derive asymptotic execution times for each of the algorithms presented in the "Vectors and Matrices" section above (i.e., initializing a vector, initializing a matrix, dot product, and matrix-vector product). Then design and conduct an experiment to confirm (or not) that the times you derived describe the actual execution time of programs based on these algorithms.
To help you start this lab, I have provided an XCode project that defines Java methods for all four algorithms, along with a main method that demonstrates them. This project is in the folder "/ComputerScience/baldwin/CSci240/MatrixDemo," on the "OutBox" file server. For non-XCode users, all the Java source code is in the file named "MatrixDemo.java." This source code contains more detailed comments explaining the vector and matrix algorithms than the explanations above, as well as complete Java implementations of them.
When analyzing matrix initialization and matrix-vector multiplication, assume that the matrix is "square," i.e., has the same number of rows as columns. You can then characterize execution times for all four algorithms in terms of a single parameter, representing the number of elements in a vector and the number of rows/columns in a matrix.
The asymptotic execution times you derive mathematically form the hypothesis for your experiment. In other words, the experiment tests a hypothesis (or hypotheses) of the form "vector initialization has such-and-such an asymptotic execution time, and dot product has such-and-such an asymptotic execution time, and...."
When I tested this lab, I definitely had to design for just-in-time compilation, and treat the execution times as short, as described in the "Executing Java Programs and the Just-in-Time Compiler" and "Measuring Short Times" sections above. I was able to work with vectors of up to at least 320 elements (and 320-by-320 matrices) without difficulty on a Macintosh with 2 GB of memory.
Hand in a written report on your analysis of the four algorithms. This report should follow the recommendations for computer science writing in Lab 2.
Your report must be in my hands before I leave campus on the due date above.