Comp 200 Lecture
What do trees look like?


00.Feb.21 (mon)

[Note to myself: For size, *don't count leaves*.
  (In class we counted leaves, which makes all formulae more complicated.)]



Let's step back and ask, what is the correspondence
between size and height?
 Easy: size >= height
 But can we further constrain the size in the other direction?  
 Make examples, with a table.
     size <= 2^height-1
     Equivalently:  height >= nod(size+1)
[In a formal CS class, we would prove this more formally.]
We'll use this fact relating heights & sizes later.



Consider now a (full) tree with 10-way branching:
(a) If height is 3, there are at most 10^3 nodes.
(b) If there are > 10^3 nodes, height is > 3.

Different question: how many leaves in this 10-way tree?
We can label each leaf with its path from root.

number vs. numeral:
  number is an abstract idea.
  numeral is a representation of that number.
    (Similar to using quotes, cf
         "Love" is a four-letter word.
         My favorite band is Garbage.
     Quotes mean that replacement-by-synonyms would be wrong.
     Scheme has the same idea of quoting:
          (define pi 3.14)
          (define x  (+  1   2.14))
          (equal?  x  y)  ; true
          (equal? 'x 'y)  ; false
     )


    arabic numerals: use digits-in-a-row, with meaning determined
      using 10s place, 100s place, etc.
    Here's a 10-way tree, which represents all the single-digit numbers:

                      ^
                 / / ... \ \
                0  1     8  9

    Repeat three levels, to get a three-bit number:

                      ^
                 / / ... \ \
                0  1     8  9
              //.\ /..\   ...
             01  9 0  9
           /.\...
          0  9
       [too crowded to draw in text!]

  Reading the labels top-to-bottom, we get "000" on the far left,
  through "999" on the far right [not shown].
  Notice that the third label from left is "002",
  and the twelfth is "011".

  binary-numerals:
    Make a (full) tree with 2-way branching, 
    and label the leaves as before.  ("bit" = "BInary digiT")

    The 1-bit numbers (only two of 'em:)
                ^
               / \
              0   1

    Repeat three levels, to get a three-bit number:

                 ^
             /       \        Start at the root "^";
            0        1
           /  \     / \       let "0" mean "go left", and
          0   1    0   1
         / \ / \  / \ / \     let "1" mean "go right". 
        0  1 0 1  0 1 0  1
               ^
               |
               \--------e.g., read this one as 011;

  Read the labels top-to-bottom:
   they are "000" (0) on the far left,
   through "111" (7) on the far right.

  This suggests how to count in binary numerals:
   Instead of having a ones place, tens place, hundreds place etc,
   we'll have a ones place, twos place, fours place, eights place, etc.
   (corresponding to which branch was taking in preceding tree levels:)
   E.g.,   1 1 0 1
           ^ ^ ^ ^
           | | | \----- one's   place
           | | \------- two's   place
           | \--------- four's  place
           \----------- eight's place
      It's like we're going to the store and
      buying 1 eight-pack, 1 four-pack, 0 two-packs, 1 one-pack  (1-1-0-1),
      for a total of thirteen.

      You can imagine there's a leading 0 in the sixteen's place,
      since we also bought 0 sixteen-packs.

      Conversely, how do we express nineteen as a binary numeral?
      We realize that to buy nineteen, we want to buy:
        1 sixteen pack, plus how many eight packs?, plus how many four packs?
      Right --
        1 sixteen-pack, 0 eight-packs, 0 four-packs, 1 two-pack and 1 one-pack.
      meaning that the binary numeral 10011 denotes nineteen.

      This all works for base 10 as well -- consider buying 1234 items
      which come in thousand-packs, hundred-packs, etc.



Thus: We've seen tree-height is related to log_2(size).
      Also, that (binary) tree-height is related to number-of-doublings(size).
      Thus, Actually, number-of-doublings ~ log (+/- 1);
      you have actually programmed log! (we called it "nod")
      Furthermore, number-of-digits is related to tree-height (+/- 1).
      That is, number-of-digits is what "log" really means.
      From now on we'll use "nod" to mean number-of-digits.
      Notice: nod(a*b) = nod(a) + nod(b).




decision-tree for max3
In the decision tree,
leaves correspond to answers,
root-leaf path corresponds to an individual computation,
length of that path is length of that computation,
height of the tree is the worst-case length of computation.

Consider the puzzle: you are given three coins, one of them
counterfeit (either heavier or lighter).  Your only way of testing
them is a balance scale, but it costs you a buck per weighing.
...This is also a decision tree.
What if there are are 7 coins?


---- 00.feb.25 (fri)
00.feb.23 (fri)

consider the puzzle: you are given three coins, one of them
counterfeit (either heavier or lighter).  Your only way of testing
them is a balance scale, but it costs you a buck per weighing.
...This is also a decision tree.
What if there are are 7 coins?

Note that different trees correspond to different algorithms.
Are some better than others?  A tree with small height is worst-case!
Can we look for even better algorithms?  Argue that *all* trees must
be at least yay high!  (Argue via number of leaves needed, and our
relations between the size of the tree and its height.)


Another way of determining running-time of a program:
Make a tree for the execution of doublerA, doublerB.

These are not decision-trees;
here the computation is not just a root-to-leaf path, but the entire tree.
(We can call them "computation trees" perhaps; the name isn't important.)
   (combine branches with AND, not OR)


99.feb.25
doublerB took 2^n steps!
doublerA takes how many?  (Draw tree)
How does n compare with 2^n?
  n   2^n
 10    K   (thousand + ...)
 20    Meg (million + ...)
 30    Gig (billion + ...)
250    #particles in universe.
So, some algs are better than others.


hi-lo:
I choose a number in 0..31. (suppose, 10).
You ask "higher/lower" questions...
How many steps?  Yes, timesHalved( n ).
Look at the binary representation.  What are you *really* asking?
 "Is the highest bit 0 or 1?  Is the next bit 0 or 1? ..."
Consider an alternate question somebody asked:
 "Is it even or odd?"  That's really asking about low bits.

How many steps?  timesHalved(n) = nod_2(31) = 5.


----------------
Hey, we've been wishy-washy, unscientific about how many steps this is.
Why do we not care about a factor of 3?
Because we could just buy a faster computer... :-)
Well, because we are interested in the running time of the alg,
not of the alg-on-a-certain-computer-and-OS.
Besides, even if we're off by a factor of 15, which would
you prefer:  running time 2^n, or running time 15n?



You did insert-sort on the homework.
You may not have thought about how it works when you use the stepper,
but it's how you organize a hand of cards:
[use large cards, ian]

Question: what is the running-time of insert-sort, for a list of length n?

First, a smaller question: what is the running-time of insert, for length k?
(worst-case)

# of compares as a proxy for the time for insert-sort:
  comes out to be (1/2 n(n-1)) compares inside insert;
    (1/2 n(n-1)) compares inside insert, and 3n work inside insert-sort.
    For each compare, there is at most 5 other units of work done,
    plus 3.
    worst-case 5/2 n^2 + ....n + 3  <=  4n^2 + 3
    But we don't care about the constant -- that is subsumed by
    running on a particular machine anyway.  And the +3 is garbage.

big-oh
    So, O(n^2), where O(f) means <= proportional to f, ignoring small n.


By hand: mergesort

   ;; mergesort: list-of-number --> list-of-number
   ;; Return a sorted version of lon.
   ;;  
   (define (mergesort lon)
      (if (<= (length lon) 1)
          lon
          (merge (mergesort (first-half  lon))
                 (mergesort (second-half lon)))))
          
Show this by-hand, using the cards as a visual aid:
   [To the teacher:]
   Place 12 cards on chalk-tray, and above it write "12  = n".
   Then make the recursive call, place both halves on chalk-tray,
   and write on the board "6  = n/2", branched upward-left from
   "n=12";
   as i walk through the example, continue mirroring the tree on the board.

   Then ask "what was the running time?", referring to the tree we built.
       (How many levels of the tree are there in general?
        What is the amount of work done at each level?)