Lecture #14                                             Fri 15 Feb 2002

Administrative notes:
 - NEC Homework: average ~ 8.5
 - CONTRACTS AND DATA DEFINITIONS!
 - Problem 5c: == first; fib, fib-help
 - Solutions posted this afternoon!
 - More reading assignments as well, most likely!

Today:
 0. Tree properties, review
 1. Decision Trees
 2. Comparison of Algorithms: doublerA and doublerB
 3. Hi-Lo: Algorithmic Efficiency
 4. Searching, Linear and Binary

0. Tree properties, review

So, as we recall from last time, with binary trees:

    size <= 2^(height+1) - 1

Equivalently:

    height >= log2(size+1) - 1

The important thing to remember is that the height of a full binary
tree is about log2(size). So, roughly, for any given binary tree:

    size   <=~ 2^height
    height >=~ log2(size)

1. Decision Trees

Trees aren't only useful to us as data structures; we can also use
them to analyze algorithms. Consider the decision tree for max3.
In a decision tree:
 - leaves correspond to answers;
 - a root-to-leaf path corresponds to an individual computation;
 - the length of that path is the length of that computation;
 - the height of the tree is the worst-case length of computation.

              (max3 a b c)
                 a ? b
              > /     \ <
           a ? c       b ? c
          > / \ <     > / \ <
          a    c      b    c

Consider the puzzle: you are given three coins, one of them
counterfeit (either heavier or lighter). Your only way of testing
them is a balance scale, but it costs you a buck per weighing.
...This is also a decision tree. What if there are 7 coins?

Note that different trees correspond to different algorithms. Are
some better than others? Can we look for even better algorithms?
We can argue that *all* trees must be at least some height: the tree
needs at least one leaf per possible answer, and our relations
between the size of a tree and its height then force a minimum
height.

2. Computation Trees

Another way of determining the running time of a program: make a
tree for the execution of doublerA and doublerB. These functions
always produce the same result, right? Recall the functions:

(define (doublerA n)
  (if (zero? n)
      1
      (* 2 (doublerA (sub1 n)))))

(define (doublerB n)
  (if (zero?
       n)
      1
      (+ (doublerB (sub1 n))
         (doublerB (sub1 n)))))

Let's say we're charged a cost for each arithmetic operation we
perform, be it +, -, *, or /, and they all cost the same . . . which
algorithm costs more?

    (doublerA 3)
         | *
    (doublerA 2)
         | *
    (doublerA 1)
         | *
    (doublerA 0)

    Cost = 3 = height = n

                        (dB 3)
                     /    +    \
               (dB 2)           (dB 2)
              /   +  \         /   +  \
         (dB 1)    (dB 1)  (dB 1)    (dB 1)
         / + \     / + \   / + \     / + \
     (dB 0)(dB 0)(dB 0)(dB 0)(dB 0)(dB 0)(dB 0)(dB 0)

    Cost = 1 + 2 + 4 = 7 = 2^height - 1 =~ 2^height = 2^n

These are not decision trees; here the computation is not just a
root-to-leaf path but the entire tree. (We can call them
"computation trees" perhaps; the name isn't important.) Branches
combine with AND, not OR.

doublerB took about 2^n operations! doublerA takes only n. How does
n compare with 2^n?

      n    2^n
    ---    ----------------------------------------
     10    ~ a thousand   (1024)
     20    ~ a million    (1048576)
     30    ~ a billion    (1073741824)
    250    more than the number of particles in the
           universe . . . you don't want to know

So, some algorithms are better than others.

3. Hi-Lo: Algorithmic Efficiency

Let's look at one more: hi-lo. I choose a number in 0..31 (suppose,
10). You ask "is it X?" and I respond "higher" or "lower" . . .
repeat. How many steps? Yes, about log2(n), or nod(n) (the number
of digits of n in binary), right? Look at the binary representation.
What are you *really* asking? "Is the highest bit 0 or 1? Is the
next bit 0 or 1? ..." How many steps? There are 32 possible numbers,
so nod(n) = log2(32) = 5.

Aside: consider an alternate question somebody asked: "Is it even or
odd?" That's really asking about the low bits.

4. Searching

So, as you recall, we had our Database in the previous homework
assignment . . . Let's say it was a huge employee database and I
wanted to search efficiently through it, finding employee numbers
quickly. Let's not worry about the structures; let's just deal with
numbers, because the problem REDUCES to the same one.

Task: Given a list of numbers, find out if a particular number is
in it.
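Aside: the operation counts we read off the two computation trees in
section 2 (n for doublerA, 2^n - 1 for doublerB) can be double-checked
with a short sketch. This is a rough Python translation of the Scheme
above (Python rather than Scheme just for illustration here), with
cost-counting bolted on; following the lecture, we charge one unit per
* or + that combines results, and don't charge for sub1:

```python
def doubler_a(n):
    """Return (value, cost): 2^n and the number of multiplications."""
    if n == 0:
        return 1, 0
    v, c = doubler_a(n - 1)
    return 2 * v, c + 1           # one multiplication per level

def doubler_b(n):
    """Return (value, cost): 2^n and the number of additions."""
    if n == 0:
        return 1, 0
    v1, c1 = doubler_b(n - 1)
    v2, c2 = doubler_b(n - 1)     # the whole subtree is computed twice!
    return v1 + v2, c1 + c2 + 1   # one addition per internal node

print(doubler_a(3))   # (8, 3): cost = n = height of the chain
print(doubler_b(3))   # (8, 7): cost = 1 + 2 + 4 = 2^n - 1
```

Same answer, wildly different cost: doubler_b's cost satisfies
c(n) = 2*c(n-1) + 1, which solves to 2^n - 1, matching the tree.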
We've already done this type of problem, with a function like this:

;; find-number : number list-of-numbers -> boolean
;; Returns true if the number is in the list and false otherwise.
(define (find-number n l)
  (if (empty? l)
      false
      (if (= n (first l))
          true
          (find-number n (rest l)))))

What can we do to help us out? Well, what if we ensured that the
numbers in the list were in order, from least to greatest; would
that help? Yes it would. Why? Well, it wouldn't necessarily help so
much if the number WAS there, but what if it WASN'T there? We could
quit searching sooner and save ourselves some time, right?

(list 2 4 5 6 8 9 10 11 15 18 19 20 21 24 25 26)

Is 11 there? Yes, and we had to call our function 8 times, making
8 comparisons. (Let's say this "costs" us 8 units of computing time,
and that 1 unit of computing time is one complete ITERATION of the
function.)

Is 3 there? No, and, why, that took us 16 calls! We had to go
through the whole list! But what if we modified our function?

;; find-number : number list-of-numbers -> boolean
;; Returns true if the number is in the list and false otherwise.
;; REQUIREMENT: the list of numbers must be in ascending order!!!
(define (find-number n l)
  (if (or (empty? l) (< n (first l)))  ;; remember the "or" evaluation rules!!!
      false
      (if (= n (first l))
          true
          (find-number n (rest l)))))

Now, using our new find-number, is 3 there? No, and it only took us
2 calls to find that out! We were efficient!!!

So, on average, for a list of length N, how many calls will it take
us to find out if a given number is there or not? It takes us about
N/2 calls on average, right? This formula, N/2, describes the
efficiency of our algorithm!

More next time . . . we'll continue this topic . . .
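Postscript: the call counts claimed above (8 calls for 11, only 2
calls for 3 once the list is sorted) can be checked by instrumenting
the sorted-list version to report its own cost. Here is a rough
Python translation of the improved find-number (Python rather than
the course's Scheme, just as a cross-check), returning both the
answer and the number of calls, which is the lecture's unit of cost:

```python
def find_number(n, lst):
    """Search lst, which must be sorted in ascending order.
    Returns (found, calls): whether n is present, and how many
    calls (complete iterations) the search took."""
    if not lst or n < lst[0]:    # empty, or we've passed where n could be
        return False, 1
    if n == lst[0]:
        return True, 1
    found, calls = find_number(n, lst[1:])
    return found, calls + 1

nums = [2, 4, 5, 6, 8, 9, 10, 11, 15, 18, 19, 20, 21, 24, 25, 26]
print(find_number(11, nums))   # (True, 8)  -- 8 calls, as in the lecture
print(find_number(3, nums))    # (False, 2) -- early exit after 2 calls
```

Without the sortedness requirement and the (< n (first l)) test, the
search for 3 would have to walk the entire list before giving up.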