- The examination is open book/notes. You are allowed to consult any written materials available under the sun that was published or posted before the exam is made publicly available. Using any materials other than your own created after the exam is posted is a direct violation of the Honor Code. If your code is based on materials that we did not provide you, just give us the appropriate references.
- You are not allowed to discuss the exam
__in any way__with anybody beside the instructors (Cox and Nguyen, but*not*the labbies) of this course. - Do not post your questions on OWLSPACE. E-mail to both of us, alc at rice.edu and dxnguyen at rice.edu, your questions instead. We will reply to you directly, and if deemed necessary, we will forward our reply to OWLSPACE as well. We will check our e-mail as often as we can to catch and respond to your questions
- Write your code in the most object-oriented way possible, that is with the fewest number of control statements and no checking of the states and class types of the objects involved in any way.
- Do not get hung up on fixing your code to compile. You may end up wasting a lot of time that way. We will give major partial credits to code that do not compile but are clear enough to show your ideas on how to solve the given problem.
- There are 5questions for a total of 100 points. You have 5 hours to do the exam.
- Submit the exam in two ways:
- Zip up all your work in one file and submit it to the Exam 03 page of OWLSPACE.
- Submit a hard copy with all answers in type-written form, except perhaps diagrams (if any) which can be hand-drawn. Be sure to print your name and sign the honor pledge. Bring your exam to Dung Nguyen's office (DH 3098) and e-mail him (dxnguyen at rice.edu) to notify your submission. He will reply to confirm your submission.

- The deadlines for submission are:
- for degree candidates: 12 PM (noon), May 02, 2007,
- for the rest: 5 PM, May 07, 2007.

Pledge of Honor |

1.a (10 pts) |
1.b (10 pts) |
2 (20 pts) |
3 (20 pts) |
4 (20 pts) |
5.a (10 pts) |
5.b (10 pts) |
Total (100 pts) |

Consider the following extension to the *
IDictionary* interface:

public interface IRangeDictionary<K, V> extends IDictionary<K, V> { /** * Removes every DictPair<K, V> with a key greater than or equal * to keyLo and less than or equal to keyHi. Returns the number * of DictPair<K, v> removed. Throws an exception if keyLo is * greater than keyHi. */ public int removeRange(K keyLo, K keyHi); }

Write an implementation of removeRange for

- (8 points) hash tables and
- (10 points) binary search trees.

** Note well:** It may NOT be possible to efficiently enumerate
every key between keyLo and keyHi.
For example, enumerating all double between keyLo = 1.0 and keyHi = 2.0 is
impractical. Therefore, your solutions to (1) and (2) must not assume this. You are
free to use all the visitors provided to you in the lectures.

- (2 points) Finally, after writing (1) and (2), tell us which structure is better suited to supporting removeRange()and explain in one or two sentences why.

Ordinarily, the get() method on a priority queue is defined to return either
the maximum or the minimum priority item from the queue. If the maximum priority
item is desired, then a heap where every node's priority is greater than its
children's priorities is used. Conversely, if the minimum priority item is
desired, then a heap where every node's priority is less than its children's
priorities is used.

Describe how to implement a "median priority" queue that returns the item with
the median priority. Specifically, you should (1) explain how put() and get()
work and (2) describe the data structures that they use.

The median priority is the middle priority: If the priority queue contains an
odd number of items, then precisely half of all priorities are smaller than the
median priority and half are larger. If, however, the priority queue contains an
even number of items, then the number of priorities less than the median must
differ by plus or minus one from the number of priorities greater than the
median. For example, consider the following items: 69, 58, 47, 36, 25, 14. If a
get() is performed, you may arbitrarily choose to return either 47 or 36 as the
median priority.

Your solution should perform put() and get() in O(log n) time, where n is the
number of items in the queue.

__ Note__: We are expecting an essay consisting of pseudo Java code
and extensive comments rather than
fully-detailed Java code.

Hint: Use two heaps simultaneously.

Imagine running a simulation of heap sort on an array of int by hand with pencil and paper. After simulating a few split()s, you would observe that the item swapped into the root of the heap, the variable temp from split(), almost always winds up as a leaf. Why? Prior to the split(), it was already a leaf in our heap. Thus, its value is expected to be less than the items near the top of our heap.

Given this observation, we can on average reduce by half the number of comparisons between items performed by siftDown(). Specifically, this can be achieved by eliminating the comparison between A[child] and dat on the way down the heap from the root to a leaf. (See below.)

// for simplicity, we work on an array of int. public void siftDown(int[] A, int lo, int cur, int hi){ int dat = A[cur]; // hold on to data. int child = 2 * cur + 1 - lo; // index of left child of A[cur]. boolean done = hi < child;

while (!done) {

if (child < hi && A[child + 1] > A[child]) { child++; } // child is the index of the larger of the two children.

if (A[child] > dat) {// ELIMINATE THIS COMPARISON.A[cur] = A[child]; cur = child; child = 2 * cur + 1 - lo; done = hi < child; } // A[cur] is greater than its children. else { // A[cur] <= A[child]. done = true; // heap condition is satisfied. } // A[cur] is greater than its children. } // location found for temp. A[cur] = dat; }

Write this optimized version of siftDown().

__ Hint__: Use siftUp() in
Heapifier.java.

The problem of selecting the i^{th} smallest element in an array of N
elements can be done in *expected* O(NlogN) time, by simply sorting the
array using Heap Sort, Merge Sort, or Quick Sort. However, it is possible
to find such an element without completely sorting the array in *expected*
linear time, O(N), by using
the split() method in
QuickSorter
repeatedly.

Design and implement such an algorithm by subclassing the QuickSorter class and adding a method called

int selectIthSmallest(int i, int[] A)

to return the index of the i^{th} smallest element from an array A of int.

One of the most common data transmission formats in use today is the eXtensible Markup Language, “XML”. This modern “lingua franca” is used by everything from web services to music composition programs to almost all Microsoft Office products. This character-based format uses “tags” to define composite data elements.

XML tags have two basic forms: the start tag, <A> and the end tag, </A>. Here, A is the name of the tag. The data associated with the tag is held between matching start and end tags. This data may include multiple nested tags. The following are all valid XML expressions (“tagged elements”):

<A></A>

<A>data</A>

<A><B>data1</B><C>data2</C></A>

<A><B>data1</B><C><D>data3</D>data2</C></A>

The actual XML standard includes many features such as tag qualifiers and comments, but we will only consider a small subset of the official grammar here. In Backus-Naur Form (BNF), a simplified grammar for XML can be written as:

TaggedElt ::= **< Id**
**>** AXML **</** **Id** **>**

AXML ::= NEXML | empty

NEXML ::= AElement AXML

AElement ::=
**Id** | TaggedElt

Here,

- the non-terminal symbols are TaggedElt (the start symbol), AXML, NEXML, AElement, and
- the tokens (terminal symbols) are
**<, >, </**and**Id**. Note that the multiple characters “**</**” is viewed as a single token.

- (10 pts) Draw the UML class diagram representing the object model for the above grammar. Add appropriate comment boxes to clearly show what each class represents.
- (10 pts) Draw the UML class diagram representing the object model for the factories of visitors necessary to parse the above grammar. Add appropriate comment boxes to clearly show what each class represents.

No coding is required.