Proofs of programs

Context for examples: Numbers vs. Numerals

Number is an abstract idea; Numeral is an encoding of a number as a string. (Roman numerals, base-10 numerals, …). People often confuse a number with a representation of that number. When necessary add brackets and base around a numeral. I.e., what based is intended for "471"?

[471]10 = 4⋅102 + 7⋅101 + 1⋅100 = 471

[471]8 = 4⋅82 + 7⋅81 + 1⋅80 = 313

(Another) Structural induction example

We'll assume the input list is reversed, just to simplify the code a bit.

; digs->int: (listof int) int → int
; Purpose: Returns the number represented in base b by the digit-list d,
; when reversed.
;
;   (digs->int (list 1 7 4) 10) = 471
;   (digs->int (list 1 7 4) 8)  = 313
;   
(define (digs->int d b)
  (cond [(empty? digs) 0]
        [(cons?  digs) (+ (first digs)
                          (* b (digs->int (rest d) b)))]))

Is this correct? What does "correct" mean? Can we make the "Purpose" statement more mathematically precise?

How to prove?

But what about iterative/imperative programs?

/*
 * digs->int: (listof int) int → int
 * Purpose: Returns the number represented in base b by the digit-list d,
 * when reversed.
 */
int digs->int( (listof int) d, int b ) {
  int sumSoFar ← 0;
  int i ← 0;
  while (i < (length d)) {
    sumSoFar ← sumSoFar + (list-ref d i) * pow(b,i);
    i ← i+1;
  }
  return sumSoFar;
  }

/* 
 *   List A ← new Cons( 1, new Cons( 7, (new Cons( 4, theEmptyList))));
 *
 *   unless (digs->int(A, 10) == 471) { println("Test case 1 failed."); };
 *   unless (digs->int(A, 8) == 313)  { println("Test case 2 failed."); };
 */

Again, we'reassuming the input list is reversed, for consistency with the first example.

Hoare Logic (informally)

Pre- and post-condition of each statement. Examples:

Hoare Logic in practice

Invariants — Hoare Logic for loops

Introduce by example, using the loop in digs->int:

The relevant definitions.

To prove that G holds at the end, need to show four things:

  1. Precondition: I holds initially
  2. Loop invariant maintained: I∧C → I′
  3. Correctness: I∧¬C → G
  4. Termination: ¬C will eventually hold

Prove it…

Lemma: d and b don't change. Thus,

Show:

  1. Precondition: Check.
  2. Loop invariant maintained:

    Suppose I ∧ C: That is, we know sumSoFar = ∑k=0i-1 dk⋅bk i ≤ (length d); Then
    sumSoFar
    = sumSoFar + di⋅bi
    = ∑k=0i-1 dk⋅bk + di⋅bi
    = ∑k=0i dk⋅bk;
    = ∑k=0i′-1 dk⋅bk;
    which is exactly I′, like we want. [Hmm, we didn't actually ever use C at all, oh well.]

  3. Correctness: I∧¬C → G If ¬C, then ¬(i < (length d)), so i = (length d), and so I = G, voila.

    Hmmm, wait a sec — how did we have i = (length d)? ¬C Only gives us i≥(length d) instead of equality, and we don't have any actual previous fact to cite to conclude i = (length d)!

    We want to exclaim "but of course i=(length d); it starts at 0 and we stop as soon as we hit (length d)!" Well, if it's so clearly true, maybe we should deign to mention the fact.

Let's make our loop invariant stronger. Rename our original invariant to be I1. Our new invariant is I = I1∧I2, where I2="i≤(length d)".

A new invariant needs a new proof, so let's start over.

  1. Precondition. We need to know that length() always returns a non-negative value, which presumably you'd verify separately.
  2. Loop invariant maintained. Basically the same as before, but now we need to use the fact that C holds.
  3. Correctness: I2∧¬C gives us i=(length d); that and I give us G.
  4. Termination: Initially i=0, and it increases by a constant at each iteration, thus it will eventually become greater than (length d). [This is a standard math theoremm; provable by induction.]

When writing proofs, it's not unusual to discover snags, and need to go back and patch them up; your first write-up is rarely your last. You're encouraged to write up your proofs in a word processor, to avoid having to re-copy your homework.

The Structure of Loop Proofs

Reflecting a moment, on the style of proof: Programs that respect the intrinsic structural definition of their data (the "natural recursion" of Comp210) are amenable to proofs by induction. Programs which need to change state are harder to reason about (partly because our mathematics doesn't naturally reaon about time and state (change-over-time). Especially when the state is spread out over different functions (and if a function doesn't document what state it's modifying, things are bleak indeed). Such code is not only harder to reason about, it tends to be far more bug-prone.

A good programmer is aware of when a functional (state-free) approach is good, and when keeping state is appropriate; this is an important distinction you probably didn't realize you were learning in Comp210/How to Design Programs.

Just as the purpose of loop constructs is to provide a high-level way of separating "do a task once" from "repeat the task as needed", our proof framework is separating "show that one iteration is correct" from "show that it's repeated as needed". Our proof structure is reflecting the structure of the underlying program.

More examples: The other direction

List int->digs( int n, int b ) {
  List digsSoFar = theEmptyList();

  while (n > 0) {
    digsSoFar ← snoc( remainder(n,b), digsSoFar );
    n ← quotient(n,b)
    }
  return digsSoFar;
  }

What is the statement of correctness?

What is the loop invariant?

Prove correctness:

First, we need the following lemmas:

Proof that invariant is maintained (other parts are fairly straightforward):

On your own: Try proving correctness with I1.