Background

    We've seen lots of problems that can be solved in O(n), O(n log n),
    O(n^2), O(2^n), ... (sequential, deterministic) time.
    For any interesting problem, we need at least linear time just to
    look at the input.

    Although we've concentrated on running times, we could also look at
    space.
        Only count the space needed during the computation, not the
        space also needed to hold the input.
        O(1), O(log n), O(n), O(n^2), ... space is common.

    O() notation abstracts away some details of cost:
        constant factors
        lower-order terms

Complexity classes

    To make some large distinctions, it can be useful to abstract
    further.

        P        = sequential, deterministic, polynomial time
        EXPTIME  = sequential, deterministic, exponential time
        NP       = sequential, nondeterministic, polynomial time
        LOGSPACE = sequential, deterministic, logarithmic space
        PSPACE   = sequential, deterministic, polynomial space
        ...

    Important aside: What is nondeterminism?
        Example: SAT -- is a propositional boolean formula satisfiable?
        Three equivalent ways to describe it:
        1) Algorithm can guess (correctly), but must check correctness.
        2) Algorithm can guess (all possible choices in parallel), but
           must check correctness.
        3) Algorithm is also given a "proof", i.e., the correct
           guesses, and must check them.

    Other complexity classes, e.g.,
        NC = problems that can be solved by a boolean circuit with a
             polynomial number of bounded-fan-in gates and
             polylogarithmic depth
           = problems that can be solved by a parallel computer with
             polynomially many processors in polylogarithmic time
        ...
        Many are based on alternative models of computation.
            E.g., different models of parallelism, e.g., PRAM.
        Many are based on alternative logics.
            E.g., NC -- circuits = propositional logic.
            Membership is equivalent to how hard it is to state the
            problem in the logic.
        Also, classes allowing randomness or other computational
        features.

Computability (an aside)

    Complexity classes essentially address the question of how many
    resources an algorithm needs to solve a problem. It can also be
    useful to ask the even bigger question of whether any algorithm
    exists to solve a problem.

        Decidable/recursive -- can answer yes/no for all inputs in
        finite time.
        Recursively enumerable/recognizable -- can answer yes for all
        "yes" inputs in finite time.

    This is the core topic of COMP 481.

Comparing complexity classes

    Allows us to quantify the usefulness of concepts like
    nondeterminism and parallelism.
    Allows us to quantify the usefulness of "big jumps" in resources,
    e.g., logarithmic vs. polynomial vs. exponential.
    Pragmatically, P is considered "tractable", everything else
    "intractable". I.e., if it's not deterministic polynomial time,
    it's too slow.

    By definition, e.g.,
        P <= EXPTIME
        LOGSPACE <= PSPACE <= EXPSPACE
        P <= PSPACE
        EXPTIME <= EXPSPACE
        P <= NP

    How are other things related? E.g., time vs. space? Which
    inequalities are strict, and which are equalities? Can prove, e.g.,
        LOGSPACE <= NC <= P <= NP <= PSPACE <= EXPTIME <= EXPSPACE

    BIG OPEN QUESTION: P = NP?
        Will indirectly explore this next.
        Pragmatically interesting, since there are useful problems
        known to be in NP but not known to be in P. We focus on such
        problems.

NP

    NP = sequential, nondeterministic, polynomial time

    Example: Finding X in a tree
        At each node, nondeterministically guess the appropriate
        branch.
        Time = depth of the tree.

    Equivalently, NP = sequential, deterministic, polynomial time when
    also given a "proof"
        I.e., given the "guesses" as a "proof"; must check that the
        "proof" is valid.

    Example: Finding X in a tree
        Given a path to X, must check that the path really exists in
        the tree.

    Example: Boolean algebra formula satisfiability (SAT)
        Will soon show NP-complete.
        What is a proof? How can we check it?
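    For SAT, a proof is just a truth assignment, and checking it takes
    time linear in the formula's size. A minimal sketch in Python (the
    CNF encoding and the function name here are mine, for illustration
    only):

        # A CNF formula: a list of clauses; each clause is a list of
        # nonzero ints, where 5 means variable x5 and -5 means its
        # negation (the usual DIMACS-style convention).
        def check_sat_proof(clauses, assignment):
            """The polynomial-time 'check' step: verify that a claimed
            satisfying assignment (dict: variable number -> bool)
            makes every clause true."""
            return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
                       for clause in clauses)

        formula = [[1, -2], [2, 3]]   # (x1 or -x2) and (x2 or x3)
        print(check_sat_proof(formula, {1: True, 2: True, 3: False}))    # True
        print(check_sat_proof(formula, {1: False, 2: False, 3: False}))  # False

    The hard part of SAT is finding such an assignment; checking one is
    easy. That asymmetry is exactly what NP captures.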
    Example: Finding the longest acyclic path in a graph
        Mentioned as NP-complete at the beginning of the course.
        What is a proof? How can we check it?
        Note that the proof can be large, relative to the graph.

P = NP?

    By definition, P <= NP. Are they equal? Most people think not, but
    it is still an open question.

Decision problems

    It is sufficient to limit our attention to decision problems.
    Determining f(x) = y is equivalent to what decision problem?

Reductions

    Simple idea: we can solve one problem A by using a solution to
    another problem B as a subroutine.
    More specifically, we can find f, g such that A(x) = g(B(f(x))).
        "A reduces to B", written A <= B.
        Draw diagram.
        Note: g is typically trivial, and is even omitted in some
        presentations.
    Proving a reduction = defining a suitable f, g.

Polynomial-time reductions

    When talking about NP, we are willing to let f, g use deterministic
    polynomial time.
        I.e., f, g are relatively easy compared to NP.
        Written A <=p B.
    Polynomial-time equivalence: A =p B means A <=p B and B <=p A.

    Examples -- familiar ones, not related to NP:
        sorting <=p convex hull
        find-minimum =p sorting

Using reductions

    Most familiar use: B is easy/known. Showing A <= B shows that A is
    also easy/known.
    Another use: A is hard. Showing A <= B shows that B is also hard.
        (Since if B were easy, then A would be easy, too.)
    Thus the motivation for the <=p notation: A is no harder than B.
        Given B, we can implement A.
        But maybe we can also implement A without using B.

NP-completeness

    Some NP problems are "NP-complete".
    Usefulness & motivation: NP-complete problems "stand together" --
    if one is in P, all are.
    NP-complete is a strict subset of NP. Draw Venn diagram.

    B is NP-hard = for all A in NP, A <=p B.
        Not all NP problems are NP-hard.
            Trivial example: a program that always returns 42.
        NP-hard includes problems not in NP.
            Trivial example: an NP oracle.
        Draw Venn diagram.

    B is NP-complete = B is in NP and B is NP-hard.
        Draw Venn diagram.
        Thus, for any NP-complete A and B, A =p B.

    More generally, it is also interesting to look at PSPACE-complete,
    EXPTIME-complete, etc.

Proving something NP-complete

    Once we show some problem A is NP-complete, it becomes easier to
    show another problem B is NP-complete:
        Show B is in NP.
        Show A <=p B.
        Since all NP problems reduce to A, all NP problems also reduce
        to B.
    In _principle_, when showing B NP-complete, we can use any
    NP-complete A. For a given B, some choices of A are easier than
    others.
    Showing a first problem NP-complete is tougher: we must resort to
    the original definition of NP-hard.

SAT is NP-complete

    SAT = boolean algebra formula satisfiability
        = boolean combinational circuit satisfiability

    SAT is in NP:
        Shown in the introduction.

    SAT is NP-hard:
        Must show: for all A in NP, A <=p SAT.
        Given an arbitrary A, show a reduction. How?
        Intuition: a computer is just a circuit!
            If A is in NP, we have an algorithm for it, and can
            translate that algorithm into a circuit, i.e., a computer
            with some memory cells initialized to hold the code.
        But a computer is a _sequential_ circuit, while SAT is about
        combinational circuits. How to fix the intuition?
            Each given input has a specific size, so build a
            combinational circuit by fully unrolling the computation
            for that size:
            For A, we have a subroutine AC, which builds a circuit for
            A. From A's input size, AC computes a bound on A's running
            time, say n cycles. AC builds the circuit by unrolling the
            sequential circuit n times. The circuit's output is the
            algorithm's one bit of output.
        The SAT instance is this circuit: it is satisfiable exactly
        when some choice of "guess" bits makes A accept.
        Further details of the construction are much too tedious.

Using SAT

    When showing another problem B is NP-hard, it is sufficient to show
    SAT <=p B.
        Show diagram.
    The reduction f must "translate" any circuit into a B-input.
        Often difficult, because circuits can take so many forms.
        Would be easier if we restricted the circuits...

3CNF-SAT is NP-complete

    3CNF form = ... and (a or -b or c) and (-c or -d or e) and ...

    3CNF-SAT is in NP:
        Same idea as with SAT.

    3CNF-SAT is NP-hard:
        Sufficient to show what? SAT <=p 3CNF-SAT.
        Show diagram.
        Intuition: let the 3CNF-SAT problem's variables be the "wires"
        of the SAT circuit. For each gate in the SAT circuit, create
        equivalent 3CNF-SAT clauses.
            E.g., (a NAND b) = c <-> (a | c) & (b | c) & (-a | -b | -c)
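    This claimed equivalence is easy to sanity-check by brute force
    over all eight assignments -- a quick sketch (my own check, not
    part of the construction itself):

        from itertools import product

        # Verify: c <-> not(a and b) holds exactly when the three
        # clauses (a|c), (b|c), (-a|-b|-c) all hold, under every
        # truth assignment.
        for a, b, c in product([False, True], repeat=3):
            gate    = (c == (not (a and b)))
            clauses = (a or c) and (b or c) and (not a or not b or not c)
            assert gate == clauses
        print("NAND gate and its 3 clauses agree on all 8 assignments")

    The same kind of per-gate translation works for AND, OR, and NOT,
    each needing at most three clauses of at most three literals --
    which is why the "3" in 3CNF suffices.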
0-1 Knapsack

    0-1 Knapsack is in NP:
        Fairly obvious. Algorithm?

    0-1 Knapsack is NP-hard:
        Previously mentioned Subset Sum as the special case of 0-1
        Knapsack where values = weights.
        Thus, SubsetSum <=p 0-1Knapsack. (A sketch of this reduction
        appears at the end of these notes.)
        The textbook shows Subset Sum is NP-complete.

Unbounded Knapsack

    Basically the same argument.

Three more examples...

    Independent sets
    Clique
    Graph coloring
    See handout.
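    As promised above, here is what the SubsetSum <=p 0-1Knapsack
    reduction looks like as code -- a sketch using my own encoding of
    the decision versions of the two problems:

        def subset_sum_to_knapsack(nums, target):
            """The polynomial-time f of the reduction: map a Subset Sum
            instance (does some subset of nums sum to exactly target?)
            to a 0-1 Knapsack decision instance (is there a packing
            with total weight <= capacity and total value >= goal?).
            g is the identity: the knapsack answer is the subset-sum
            answer."""
            weights  = list(nums)   # values = weights: the special case
            values   = list(nums)
            capacity = target
            goal     = target
            return weights, values, capacity, goal

        # Why the answers agree: since value = weight for every item,
        # any packing with weight <= target and value >= target has
        # weight exactly target -- i.e., it is precisely a subset
        # summing to target, and conversely.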
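    And a small taste of the handout problems: Clique and Independent
    Set reduce to each other by complementing the graph, since an
    independent set in G is exactly a clique in G's complement. A
    sketch (the graph encoding is mine):

        def complement_graph(vertices, edges):
            """f for IndependentSet <=p Clique (and, symmetrically,
            Clique <=p IndependentSet): G has an independent set of
            size k iff the complement of G has a clique of size k.
            edges is a set of frozenset({u, v}) pairs; the size bound
            k passes through unchanged, and g is the identity."""
            return {frozenset((u, v))
                    for u in vertices for v in vertices
                    if u != v and frozenset((u, v)) not in edges}

        # IndependentSet(V, E, k) = Clique(V, complement_graph(V, E), k)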