(Weak) (Mathematical) Induction
Introductory example
f(n) =
0, if n=0;
n+f(n), if n>0
- Can recognize that
f(n) =
∑i=0…n i
- Can then look up closed form solution,
n(n+1)/2
- Problem: Relying on other authorities doesn't teach you how
to answer the question yourself. Most problem's answers not
in books.
- Can solve by realizing…
- f(n) =
0 + 1 + 2 + … +
(n-2) + (n-1) + n
- f(n) =
n + (n-1) + (n-2) + … +
2 + 1 + 0
- 2f(n) =
n + n + n + … +
n + n + n =
n(n+1)
- Problem: Pattern identification is implicit. Pattern
correctness is assumed. Both simple in this case, but they
aren't always.
- Problem: Doesn't tell us anything about how to solve other problems.
- Let
P(n) =
"f(n) = n(n+1)/2"
- Start with explicit identification of pattern.
- Proof by induction of
∀n∈ℕ.P(n):
- Base case: P(0)
- Inductive case:
P(i)→P(i+1)
—
∀i∈ℕ
- Inductive case structure corresponds exactly with recursive
definition case structure
More examples
We probably won't have time for all of these.
There are more examples in the book.
- fact(n) ≤ nn
- 00 undefined —
Need to revise statement to only hold for n≥1.
- ∀ n∈ℕ,
n3-n is divisible by 3
- ∀ n≥5 and n∈ℕ,
2n>n2
- For motivation, first compare 2n and
n2 for small values.
- Compare to equivalent:
&forall n∈ℕ, (n≥5
→ 2n>n2)
- Compare to equivalent:
&forall n∈ℕ,
2n+5>(n+5)2
- Suppose R is a partial order on a set A.
Prove that every finite, nonempty set
B⊆A has an R-minimal
element.
- Intuitively, what is this saying?
- Since R is a partial order, not every
element in B is
necessarily related to every other element. But this
claims there is some element x that is
"less-than-or-equal-to" any other element that it is
related to by R. I.e., for each element
y, either
R(x,y) or neither
R(x,y) nor
R(y,x). Since
R is a partial order, there might be
multiple minima, but the claim is that there is at
least one. This is also effectively saying that we
can't have a "loop" in the ordering, such as
R(a,b),
R(b,c),
R(c,a).
- How to reword as ∀ n,
P(n)?
- ∀ n≥1,
∀ B⊆A,
(B has n elements ⇒
B has an R-minimal
element)
- We prove this by induction on the size of set B.
Let n=|B|.
- When n=1, its single element is clearly
R-minimal.
- Now consider when n>1. Let
B={b1, …,
bn}, for some
arbitrary ordering of elements. Let
C={b1, …,
bn-1}. By induction,
C has an R-minimal element, and
let i be its index. In B,
bi is still
"less-than-or-equal-to" or unrelated to all other
elements, except for possibly
bn, so lets consider
how those two elements might be related.
- If they are not related by R, then
bi is
R-minimal in set B.
(bn might also
be R-minimal in this case, but that is
irrelevant.)
- If
R(bi,bn),
then bi is
R-minimal in set B.
(bn is definitely
not R-minimal, but again that is
irrelevant.)
- If
R(bn,bi),
then we claim that
bn is
R-minimal in set B. To show
this, consider an arbitrary
bj, such that
j is neither i nor
n. Since
bi was
R-minimal in C, for any
bj such that
R(bi,bj),
then
R(bn,bj)
by transitivity. And for any
bj such that
bi and
bj are unrelated,
then either bn
and bj are
unrelated, or
R(bn,bj).
In particular,
R(bj,bn)
is impossible, since transitivity would imply that
R(bj,bi),
which would be a contradiction.
- These cases are exhaustive, and in each we have
identified an R-minimal element.
- ∀ n≥3, if n distinct points on a
circle are
connected in consecutive (i.e., consistently clockwise or consistently
counter-clockwise) order with straight lines, then the interior angles
of the resulting polygon add up to (n-2)×180°.
Note that we've generalized induction to the form:
To prove ∀n∈{k,k+1,k+2,…}.P(n)
- Base case: P(k)
- Inductive case:
P(i)→P(i+1) —
∀i∈{k,k+1,k+2,…}
Incorrect uses of induction
Example:
- What's wrong with this inductive proof:
- P(n) = "n > 1000".
- Proof: …
(After all, it's easy to show P(k) → P(k+1).)
Okay, we'll patch that problem for this example:
- What's wrong with this inductive proof:
- P(n) = "n>1000 or n=0".
- Proof: …
Example:
-
False claim: All Rice students are in the same college.
P(n) = "for all sets of n students, those students are all
in the same college".
Prove ∀n≥1.P(n).
We know this conclusion is wrong, so what's wrong with the
following?
-
P(1): Yep, in a singleton set
{s1}, every student is in
s1's college.
-
P(i) → P(i+1):
We can assume that P(i) holds: for all sets
of i students, those
students are all in the same college.
We need to show P(i+1). Start with an
arbitrary set of k+1
students, S = {s1, …,
sk,
sk+1}, and we'll show they all
are in the same college as s1.
By inductive assumption, the set {s1, …,
sk} are all in the same college
as s1.
Similarly, so are {s1, …
sk-1,
sk+1}.
Since "is in the same college as" is transitive,
all elements of S are in the same college as
s1. Q.E.D.
Strong (mathematical) induction
- Recall weak induction —
to prove ∀n∈ℕ, P(n), show
- Base case: P(0)
- Inductive case:
P(i) → P(i+1),
for arbitrary i∈ℕ
- Strong induction allows you to assume more in the inductive case —
- Base case: P(0)
- Inductive case: (∀j≤i,
P(j)) → P(i+1),
for arbitrary i∈ℕ
- Alternate presentation — don't actually need the base case,
because the inductive case subsumes it.
Examples
- Fibonacci number closed form:
- Start with recursive definition:
- F0 = 0
- F1 = 1
- Fn =
Fn-2 +
Fn-1, if
n≥2
- Prove closed form is
Fn =
((an-(bn)/√5, where
- a = (1+√5)/2
— golden ratio
- b = (1-√5)/2
— golden ratio conjugate
or silver ratio
- Note that inductive form of the proof needs to be reworded with
two base cases, like the definition.
- In the proof, the following lemmas are helpful:
- Division:
∀ natural numbers n,m, where
m>0, there are natural numbers
q,r such that
n=mq+r and
r<m.
- Aside: A non-constructive definition of quotient
and remainder. Also, doesn't even claim uniqueness.
- Induction on n or m?
- for arbitrary positive m, prove
∀n.P(n), where
P(n) =
"∃q.∃r.(n=mq+r ∧ r<m)"
Relation to Weak Induction
- As name implies, stronger than weak induction, i.e., subsumes it.
- Because allowed to assume more in inductive case.
- Every weakly inductive proof is also a strongly inductive
proof.
- Thus, given strong induction, don't need weak induction.
(Although weak induction might lead to a clearer presentation.)
- To ponder: Given weak induction, do you really need
strong induction?
- I.e., given a strongly inductive proof, can it be (in
general) reworded into a weakly inductive proof?
Structural induction
Introductory example
Define: a binary tree is either empty, called a leaf, or
(make-node datum t1
t2)
,
where t1,t2 are binary trees.
Prove: for any binary tree t, #leaves(t) =
#nodes(t) + 1.
- First, define #nodes(), #leaves() recursively.
How to prove?
- (Strong or weak) induction on #nodes or #leaves.
- (Weak) induction on height.
- Somehow trying to pair up leaves and nodes, with one leaf
unpaired. How in general, for arbitrary binary tree?
- Structural induction.
Example
Define: for an arbitrary n,
an n-ary tree is either empty, or
(make-node datum ts)
,
where ts is an n-tuple of n-ary trees.
Prove:
For any n-ary tree t,
#nodes(t) ≤ nheight(t)-1
- Again, first define #nodes(), height() recursively.
Semi-aside: other ways to prove same thing?
- As before, strong induction on #nodes or height.
- Note: induction on n not helpful.
- Prove inductively for complete n-ary trees, then argue
arbitrary tree has no more nodes that complete tree.
- Prove as a sum of #nodes per level for either complete or general tree.
Example
Prove: For any propositional WFF f,
#connectives(f) ≥ #propositions(f) - 1.
Proof has same number of base and inductive cases as the data definition.
Example
Define strings:
For a set Σ (the alphabet), the set
Σ* (strings over Σ) contains:
- λ (empty string)
- wa,
where a ∈ Σ, and
w ∈ Σ*.
Define length.
Define concatenation:
For v,w ∈ Σ*,
define vw ∈ Σ* as follows:
- If w=λ, vw = v.
- if w=ua (for a letter
a and a string u),
vw = (vu)a.
Prove: length(vw) = length(v) +
length(w).
- Use induction on what? v or w?
Why?
- Induction on w matches the definition of concatenation.
P(w) = "length(vw)
= …" (for arbitrary string v).
Example
; append: list, list -> list
;
(define (append x y)
(cond [(empty? x) y]
[(cons? x) (cons (first x)
(append (rest x) y))]))
Prove:
For any list a,
(append a (append a a)) =
(append (append a a) a)
Hmm, proof gets stuck.
Ironically, by proving a stronger statement, your life
actually becomes easier!
Prove:
For any lists a,b,c,
(append a (append b c)) =
(append (append a b) c)
.
First, must decide what to do induction on: a,
b, or c?
Induction on a:
let P(a) = "For all lists b,c,
(append a (append b c)) =
(append (append a b) c)
.
Known as generalizing, strengthening, or
loading the inductive hypothesis.
Induction Overview
Why all these different forms of induction?
To ponder: Is weak induction on numbers enough?
Induction can be used on any data defined recursively (or
inductively), i.e., anything tree-like.
Note: Other forms of induction, e.g., co-induction and
transfinite induction, allow you to do different kinds of
things. They are more advanced, but still sometimes useful in CS
— consider streams (e.g., I/O) and infinite trees.
Structuring proofs correctly
What's wrong with the following?
Let our domain be 2-3 trees.
Have some P(t), but for this example, don't care
what it is.
- Base case: P(empty) holds because …
- Inductive case: Inductively assume
P(t1),
P(t2), and
P(t3), for
arbitrary 2-3 trees t1,
t2, t3.
Also assume k1 <
k2, for arbitrary key values
k1, k2.
Then P(
(make-3-node t1
k1 t2
k2 t3)
) holds
because …
Biggest problem: Never reasons about 2-nodes.
Structuring this way obscures the fact that the case split is incomplete.
- Base case: P(empty) holds because …
- Inductive case: Inductively assume
P(t1),
P(t2), and
P(t3), for
arbitrary 2-3 trees t1,
t2, t3.
Also assume k1 <
k2, for arbitrary key values
k1, k2.
Then P(
(make-3-node t1
k1 t2
k2 t3)
) holds
because …
And P((make-2-node t1
k1 t2)
)
holds because …
Fixes the previous problem, although we still can't just look at the
beginning of each case to verify this.
Other problems: It doesn't state much of what we should know about
the ordering relationships among the keys and trees. But, it does
overstate one ordering relationship — for 2-nodes, it doesn't allow
k1 to be the maximal key.
All of these problems are more easily seen with a better structure.
- Base case (t=empty):
P(t) holds because …
- Inductive case (t not empty):
- Case t=
(make-2-node
t1 k1
t2)
:
We know the following about the fields: …
By the inductive hypothesis, we can assume
P(t1) and
P(t2).
Thus we know P(t) because …
- Case t=
(make-3-node
t1 k1
t2 k2
t3)
:
We know the following about the fields: …
By the inductive hypothesis, we can assume
P(t1),
P(t2), and
P(t3).
Thus we know P(t) because …
Or, rather than having two subcases of the inductive case (as most
textbooks suggest), have two inductive cases (following the data
definition, as in COMP 210/212.)