Exam #3 - Comp 212

Rice University - Spring 2007

Instructions

  1. The examination is open book/notes.  You are allowed to consult any written materials available under the sun that was published or posted before the exam is made publicly available.  Using any materials other than your own created after the exam is posted is a direct violation of the Honor Code.  If your code is based on materials that we did not provide you, just give us the appropriate references.
  2. You are not allowed to discuss the exam in any way with anybody beside the instructors (Cox and Nguyen, but not the labbies) of this course.
  3. Do not post your questions on OWLSPACE.  E-mail to both of us, alc at rice.edu and dxnguyen at rice.edu, your questions instead. We will reply to you directly, and if deemed necessary, we will forward our reply to OWLSPACE as well.  We will check our e-mail as often as we can to catch and respond to your questions
  4. Write your code in the most object-oriented way possible, that is with the fewest number of control statements and no checking of the states and class types of the objects involved in any way.
  5. Do not get hung up on fixing your code to compile.  You may end up wasting a lot of time that way.  We will give major partial credits to code that do not compile but are clear enough to show your ideas on how to solve the given problem.
  6. There are 5questions for a total of  100 points.  You have 5 hours to do the exam.
  7. Submit the exam in two ways:
  8. The deadlines for submission are:
    • for degree candidates:  12 PM (noon), May 02, 2007,
    • for the rest: 5 PM, May 07, 2007.
Pledge of Honor

 

1.a (10 pts)   1.b (10 pts)  2  (20 pts) 3 (20 pts) 4 (20 pts) 5.a (10 pts) 5.b (10 pts) Total (100 pts)
       

 

1. (20 pts) More Dictionary

Consider the following extension to the IDictionary  interface:

public interface IRangeDictionary<K, V> extends IDictionary<K, V> {
    /**
     * Removes every DictPair<K, V> with a key greater than or equal
     * to keyLo and less than or equal to keyHi.  Returns the number
     * of DictPair<K, v> removed.  Throws an exception if keyLo is
     * greater than keyHi.
     */
    public int removeRange(K keyLo, K keyHi);
}

Write an implementation of removeRange for

  1. (8 points) hash tables and
  2. (10 points) binary search trees. 

Note well: It may NOT be possible to efficiently enumerate every key between keyLo and keyHi.  For example, enumerating all double between  keyLo = 1.0 and keyHi = 2.0 is impractical. Therefore, your solutions to (1) and (2) must not assume this.  You are free to use all the visitors provided to you in the lectures.

  1. (2 points) Finally, after writing (1) and (2), tell us which structure is better suited to supporting removeRange()and explain in one or two sentences why.

 

2. (20 pts) Priority Queue and the Heap Structure

Ordinarily, the get() method on a priority queue is defined to return either the maximum or the minimum priority item from the queue. If the maximum priority item is desired, then a heap where every node's priority is greater than its children's priorities is used. Conversely, if the minimum priority item is desired, then a heap where every node's priority is less than its children's priorities is used.

Describe how to implement a "median priority" queue that returns the item with the median priority. Specifically, you should (1) explain how put() and get() work and (2) describe the data structures that they use.

The median priority is the middle priority: If the priority queue contains an odd number of items, then precisely half of all priorities are smaller than the median priority and half are larger. If, however, the priority queue contains an even number of items, then the number of priorities less than the median must differ by plus or minus one from the number of priorities greater than the median. For example, consider the following items: 69, 58, 47, 36, 25, 14. If a get() is performed, you may arbitrarily choose to return either 47 or 36 as the median priority.

Your solution should perform put() and get() in O(log n) time, where n is the number of items in the queue.

Note: We are expecting an essay consisting of pseudo Java code and extensive comments rather than fully-detailed Java code.

Hint: Use two heaps simultaneously.

 

3. (20 pts) Optimized Heap Sort

Imagine running a simulation of heap sort on an array of int by hand with pencil and paper.  After simulating a few split()s, you would observe that the item swapped into the root of the heap, the variable temp from split(), almost always winds up as a leaf.  Why?  Prior to the split(), it was already a leaf in our heap.  Thus, its value is expected to be less than the items near the top of our heap.

Given this observation, we can on average reduce by half the number of comparisons between items performed by siftDown().  Specifically, this can be achieved by eliminating the comparison between A[child] and dat on the way down the heap from the root to a leaf.  (See below.)

// for simplicity, we work on an array of int.
public void siftDown(int[] A, int lo, int cur, int hi){
  int dat = A[cur];             // hold on to data.
  int child = 2 * cur + 1 - lo; // index of left child of A[cur].
  boolean done = hi < child;
  while (!done) {
    if (child < hi && A[child + 1] > A[child]) {
      child++;
    } // child is the index of the larger of the two children.
    if (A[child] > dat) { // ELIMINATE THIS COMPARISON.
      A[cur] = A[child];
      cur = child;
      child = 2 * cur + 1 - lo;
      done = hi < child;
    }                     // A[cur] is greater than its children.
    else {                // A[cur] <= A[child].
      done = true;        // heap condition is satisfied.
    }                     // A[cur] is greater than its children.
  }                       // location found for temp.
  A[cur] = dat;
}

Write this optimized version of siftDown().

Hint: Use siftUp() in Heapifier.java.

 

4. (20 pts) Selecting the ith smallest element

The problem of selecting the ith smallest element in an array of N elements can be done in expected O(NlogN) time, by simply sorting the array using Heap Sort, Merge Sort, or Quick Sort.  However, it is possible to find such an element without completely sorting the array in expected linear time, O(N), by using the split() method in QuickSorter repeatedly.

 Design and implement such an algorithm by subclassing the QuickSorter class and adding a method called

int selectIthSmallest(int i, int[] A)

to return the index of the ith smallest element from an array A of int.

 

5. (20 pts) Parsing XML

One of the most common data transmission formats in use today is the eXtensible Markup Language, “XML”.   This modern “lingua franca” is used by everything from web services to music composition programs to almost all Microsoft Office products.  This character-based format uses “tags” to define composite data elements. 

XML tags have two basic forms:   the start tag, <A> and the end tag, </A>.  Here, A is the name of the tag.  The data associated with the tag is held between matching start and end tags.   This data may include multiple nested tags.  The following are all valid XML expressions (“tagged elements”): 

<A></A>

<A>data</A>

<A><B>data1</B><C>data2</C></A>

<A><B>data1</B><C><D>data3</D>data2</C></A>

 

The actual XML standard includes many features such as tag qualifiers and comments, but we will only consider a small subset of the official grammar here.  In Backus-Naur Form (BNF), a simplified grammar for XML can be written as:

 

   TaggedElt ::= < Id > AXML </ Id >

   AXML  ::=  NEXML | empty  

   NEXML ::=  AElement AXML

   AElement ::=  Id | TaggedElt

 

Here,
  1. (10 pts) Draw the UML class diagram representing the object model for the above grammar.  Add appropriate comment boxes to clearly show what each class represents.
  2. (10 pts) Draw the UML class diagram representing the object model for the factories of visitors necessary to parse the above grammar.  Add appropriate comment boxes to clearly show what each class represents.

No coding is required.