SUNY Geneseo Department of Computer Science


Lab 12 -- A "Dictionary" Data Structure

CSci 240, Spring 2007

Prof. Doug Baldwin

Due Tuesday, April 24

Purpose

This exercise is intended to give you an opportunity to implement and apply a complete tree class. The exercise also gives you an opportunity to work with a program that consists of more classes, with more sophisticated interactions, than in most labs for this course.

Background

A "dictionary" is a data structure that provides search, insertion, and deletion operations. In this lab you will use an ordered binary tree to implement a dictionary data structure that can be used by a larger program. You will also have numerous opportunities to extend the program and dictionary, if you wish.

Dictionaries

A paper dictionary allows someone who knows a word to look up additional information about that word (e.g., its definition). Similarly, a dictionary data structure is any data structure that permits a client with a piece of "key" information (e.g., the word, in the case of paper dictionaries) to retrieve additional information related to that key (e.g., the definition, in the case of paper dictionaries). Many different data structures can be used to implement dictionaries (e.g., ordered binary trees, hash tables, lists) and the definition of "dictionary" doesn't require any particular one. "Dictionary" is thus a very good example of what is called an "abstract data type," i.e., a data type defined by the operations it provides without concern for its implementation details.

Using Java notation, a stereotypical dictionary would be defined somewhat as follows:

    interface Dictionary<K,D> {
        public void insert( K key, D data );
        public void delete( K key );
        public D find( K key );
    }

Dictionary is defined as an interface rather than a class in this example because the notion of "dictionary" implies no particular implementation. Many different classes can implement this interface in different ways. Also note that Dictionary is parameterized by two types: the type of the keys (K), and the type of the data associated with the keys (D).

Class Design

The program you will be working on in this lab contains more classes than most programs in this course, and, as a result, it offers more opportunities for you to see or use certain object oriented class design "tricks." These tricks are all applicable in most multi-class program designs.

Interfaces

You have already seen an Example of an Interface used to define Dictionary, above. An interface is exactly what its name suggests: a specification of (part of) the interface that a class presents to its clients, with no information on how that interface is implemented. This idea is closely related to the idea of an "abstract class," a class that defines interfaces but not bodies for one or more of its methods. Both interfaces and abstract classes can be useful any time you want to describe a set of classes that are similar in how they interact with clients, but may differ in how those interactions are implemented. In the context of this lab, a "dictionary" interface is useful because you will need to work with at least two classes (one that I wrote and one that you will write) that act as dictionaries in a larger program.

A regular class can "implement" an interface, meaning that the class provides methods for handling all the messages defined in the interface. Instances of such a class can then be used anywhere an instance of the interface is expected. Note that a class that implements an interface can handle more messages than the interface calls for, as long as it also handles all the messages described by the interface. Appendix A in our textbook discusses how to declare a class that implements an interface -- see section A.5.3.

Nested Classes

Sometimes the implementation of one class is easiest if it uses one or more simple supporting classes that are of little or no use outside the class that needs them. Java allows you to declare such classes inside the class that uses them, exactly as you would declare other members (in particular, they can be public, private, or protected). Such classes are called "nested classes." Nested classes are regular classes, and can have all the features any other classes do (e.g., constructors, member variables, methods, etc.)

Among other things that I have provided as starting points for this lab is a VectorDictionary class, which is a simple and inefficient implementation of a dictionary interface. This class uses a nested class to package a key and its associated data into a single object. This gives you a simple example of a nested class to study. You may (or may not) find similar nested classes of your own helpful.

Adaptors

Sometimes a programmer will need a class that presents one interface to its clients, but will have ready to hand a class that presents a slightly different interface. For example, when we discussed ordered binary trees in lecture we found that they are easy to implement if given insert and delete methods that return new ordered binary trees; this is not, however, what dictionaries as defined above require their insert and delete messages to return. In other respects, however, the interface for dictionaries is very like the interface to ordered binary trees.

One way of coping with this situation is to define an "adaptor" class. An adaptor class is a very simple class that in effect gives another class a slightly different interface. The adaptor does this by containing an instance of the other class as an instance variable, and then handling messages with the desired interface by "translating" them into similar messages to the instance variable. For example, an adaptor that gave ordered binary trees a dictionary interface would have an ordered binary tree as an instance variable. It would handle an insert message appropriate for dictionaries (in particular, with separate key and data parameters, and no return value) by placing the key and data into a simple wrapper object (a good place to use a nested class), and then sending the ordered binary tree an insert message with that wrapper as its parameter; the result this message returned would be assigned back into the ordered binary tree instance variable, as required by the "easy" ordered binary tree interface. In Java, this could look something like this:

    class TreeAdaptor<K,D> implements Dictionary<K,D> {
        private class Pair implements Comparable<Pair> {
            // "implements Comparable" because you need to be able to compare pairs
            // in order to put them in ordered trees.
            K key;
            D data;
            public compareTo( Pair p ) {
                ... code to compare two pairs ...
            }
        }
        private OrderedBinaryTree<Pair> tree;
        public void insert( K key, D data ) {
            Pair p = new Pair();
            p.key = key; p.data = data;
            tree = tree.insert( p );
        }
        ... other methods ...

Making Objects Printable

When you print an object to a stream (e.g., with the println message), println (or similar printing methods) simply sends the object a toString message, and then writes the resulting string to the stream. Thus, you can make your own objects printable simply by including a toString method in their class. This method takes no parameters, an should return a string that is a human-readable representation of the object.

Exercise

Your task in this lab is to design, code, and test a dictionary data structure that provides a word translation dictionary (in the non-technical sense). I have provided you a program that uses a dictionary to translate words from one language to another. This program uses a simple but inefficient implementation of the dictionary. Your job is to replace that implementation with one based on an ordered binary tree.

My translator is available on the "Outbox" file server, in folder "/ComputerScience/baldwin/CSci240/Translator." This folder contains a complete XCode project for the translator. In addition to the project file proper, the folder contains Java source files for a main program ("Translator.java"), a dictionary interface ("CSci240Dictionary.java"), and a class that provides an inefficient implementation of that interface ("VectorDictionary.java"). You should be able to complete the basic exercise by simply replacing the "VectorDictionary.java" file with a file that defines your own class that implements the CSci240Dictionary interface.

Note that my translator uses a slightly different "dictionary" interface than the Stereotypical One described in the "Background" section. It is, after all, comon that stereotypical models don't appear unchanged in real applications. My dictionary interface is called CSci240Dictionary, and it adds a print method to the methods in the stereotypical interface. The print method prints the entire contents of a dictionary to a stream.

In doing this exercise, the main method's primary translation data structure must remain a CSci240Dictionary, and you may not change the declaration of the CSci240Dictionary interface in any way. The first of these requirements ensures that anyone wishing to use a different dictionary implementation can do so by changing only one word in main (namely, the class name after new in the statement CSci240Dictionary<String,String> translations = new VectorDictionary<String,String>();), and the second simulates the tendency for interfaces in "real" software development to be very hard to change, because they reflect agreements between multiple developers about how certain classes will be used.

You may translate between any two languages you wish in testing your tree-based dictionary. For people who have no particular preferences of languages, here are some English-French translations you can use:

English French
Dog Chien
Cat Chat
House Maison
Think

Penser

Eat Manger
Red Rouge
Room Chambre
Automobile Voiture
Yes Oui
No Non
Read Lire
Green Vert
Hello Bonjour
I Je
Number Nombre
Name Nom
Talk Parler
Big Gros
Little Petit

The basic requirement for this lab is that you replace my inefficient VectorDictionary class with one that implements the CSci240Dictionary interface but uses an ordered binary tree to maintain the dictionary. A vast number of extensions to this lab are possible, however. For up to five points extra credit, see how many of the following you can do:

Follow-Up

I will grade this exercise in a face-to-face meeting with you. Make an appointment to meet with me at some time convenient for you, as long as that time is before the end of the due date above. Meetings only need to be 15 minutes long. You can make an appointment by signing up on the copy of my schedule on the bulletin board outside my office. During the meeting, I will want to read and run your translator, so either bring the translator to my office on a laptop, or hold the meeting in a lab.