Priority Queues for Dijkstra's Algorithm
We previously described Dijkstra's Algorithm for finding the shortest path in a graph when the graph has edge distances. When outlining its implementation, we introduced the need for a priority queue. Today, we will see two implementations of priority queues, and outline the bigger picture.
Dijkstra's algorithm needs a priority queue to find the next node to explore. Nodes are added to the priority queue with their currently-known total distance — their distance acts as their “priority”. A node's priority can later be updated with a new total distance if we find a shorter path to the node. We also want to be able to find-and-remove the node with the lowest total distance, when Dijkstra's algorithm determines that the node's currently-known total distance must be its best-possible total distance.
So, in short, we need to add, update, and remove nodes and priorities. The expected behavior of adding and removing is pretty straightforward. However, there is a design decision to make for updating. I can see three reasonable options:
- The update function takes a node and a new distance that we have discovered. The node should already be in the priority queue. The function checks whether the new distance is lower than the old distance, and if so, updates it.
- The update function takes a node and a new distance that we have discovered. The node should already be in the priority queue. The new distance should be lower than the current distance, which means it is the responsibility of Dijkstra's algorithm to compare the distances, and which also means we need a function to find and return the distance of a node.
- We don't have an update function. Instead, the adding function does the distance comparison and updating if the node is already there.
The second option adds unnecessary complexity. Either the first or third could be argued to be the conceptually simplest. We'll use the third, as it puts the burden of knowing whether a node is already in the priority queue on the code that implements the data structure, rather than on Dijkstra's algorithm.
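To make the third option concrete, here is a sketch of how Dijkstra's algorithm might use such a priority queue. This is only an illustration, not our final code: the helper names (pq_add, pq_pop_lowest) are our own, and the priority queue is represented as a plain dictionary for simplicity.

```python
def pq_add(pq, node, distance):
    # Option three: adding also does the compare-and-update.
    # Only keep the new distance if the node is absent or the new one is lower.
    if node not in pq or distance < pq[node]:
        pq[node] = distance

def pq_pop_lowest(pq):
    # Find and remove the node with the lowest total distance.
    node = min(pq, key=pq.get)
    return node, pq.pop(node)

def dijkstra(graph, start):
    # graph: dict mapping node -> {neighbor: edge distance}
    pq = {start: 0}
    best = {}                       # finalized best-possible distances
    while pq:
        node, dist = pq_pop_lowest(pq)
        best[node] = dist
        for neighbor, edge in graph[node].items():
            if neighbor not in best:
                # No check needed here: pq_add updates only if shorter.
                pq_add(pq, neighbor, dist + edge)
    return best
```

Notice that the main loop never asks whether a neighbor is already queued; that burden sits entirely inside pq_add, which is exactly the point of the third design.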
Implementations
In class, we will create two implementations:
- A priority queue is represented as a dictionary mapping node names to priorities (total distances). A dictionary makes updating simple, as we can easily look up the node. However, to find the node with the lowest distance, we have to loop over all the nodes.
- A list of pairs of nodes and priorities, kept in sorted order by priority. Since the list is sorted, we know that the node with lowest distance is stored at the front. But, adding/updating a node requires looping over the nodes to see if it is already present, plus possibly inserting the node in the appropriate position.
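Here is one way the two implementations outlined above might look. The class and method names are our own choices, not a standard API; in class we may structure the code differently.

```python
class DictPriorityQueue:
    """Priority queue as a dictionary mapping node -> total distance."""

    def __init__(self):
        self.distances = {}

    def add(self, node, distance):
        # Dictionary lookup makes the compare-and-update check simple.
        if node not in self.distances or distance < self.distances[node]:
            self.distances[node] = distance

    def pop_lowest(self):
        # Must scan every node to find the lowest distance: O(n).
        node = min(self.distances, key=self.distances.get)
        return node, self.distances.pop(node)

    def is_empty(self):
        return not self.distances


class SortedListPriorityQueue:
    """Priority queue as a list of (distance, node) pairs, sorted by distance."""

    def __init__(self):
        self.pairs = []  # kept sorted, lowest distance first

    def add(self, node, distance):
        # Loop to see whether the node is already present: O(n).
        for i, (old_distance, old_node) in enumerate(self.pairs):
            if old_node == node:
                if distance >= old_distance:
                    return            # existing entry is already at least as good
                del self.pairs[i]     # remove so we can re-insert in order
                break
        # Insert at the position that keeps the list sorted: O(n).
        i = 0
        while i < len(self.pairs) and self.pairs[i][0] < distance:
            i += 1
        self.pairs.insert(i, (distance, node))

    def pop_lowest(self):
        # The lowest distance is always at the front.
        distance, node = self.pairs.pop(0)
        return node, distance

    def is_empty(self):
        return not self.pairs
```

The two classes have identical interfaces, so Dijkstra's algorithm can use either one interchangeably; they differ only in where the O(n) work happens (popping for the dictionary, adding for the sorted list).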
Python has an implementation of priority queues: Queue.PriorityQueue (named queue.PriorityQueue in Python 3). However, it does not have a function that updates priorities. We could add one, but only with some difficulty.
In general, priority queues are often implemented in terms of various forms of heaps, including binary heaps, pairing heaps, and Fibonacci heaps. These are more efficient than the implementations we are introducing, but are beyond the scope of the course.
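Although heap details are beyond our scope, it is worth noting that Python's standard heapq module provides a binary heap, and a common workaround for the missing update operation (sometimes called "lazy deletion") is to push a duplicate entry with the new distance and skip stale entries when popping. A sketch, using the same hypothetical graph representation as before:

```python
import heapq

def dijkstra_heap(graph, start):
    # graph: dict mapping node -> {neighbor: edge distance}
    heap = [(0, start)]   # (distance, node) pairs; heapq orders by first item
    best = {}
    while heap:
        dist, node = heapq.heappop(heap)
        if node in best:
            continue      # stale duplicate entry: node already finalized
        best[node] = dist
        for neighbor, edge in graph[node].items():
            if neighbor not in best:
                # Push a (possibly duplicate) entry instead of updating.
                heapq.heappush(heap, (dist + edge, neighbor))
    return best
```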
Big-O in a Nutshell
Computers are fast enough these days that for small data sets, it almost doesn't matter what algorithm or implementation one uses to process them — the results are returned so quickly that even the slowest technique gives essentially instantaneous results. But the problem is that in our “Information Age”, data sets are often huge, taking even the fastest of computers significant amounts of time to process. So how does one evaluate how well an algorithm or implementation will fare on such large data sets?
Computer scientists often focus on the notion of “scaling”, where rather than look at the computational time for a data set of a particular size, one asks the question, “How does the computational time change as the data size changes?” That is, if a data set has n elements, how does the computational time relate to n?
For instance, is it independent of n, i.e., O(1)? Is it linear in n, that is, if n doubles, the computational time doubles, i.e., O(n)? Is it quadratic in n, that is, if n doubles, the computational time quadruples, i.e., O(n²)? Another possibility is that the computational time is logarithmic in n, that is, if n doubles, the computational time only goes up by an additive amount proportional to the logarithm of 2, i.e., O(log n).
Examples:

| Operation | Big-O |
| --- | --- |
| Retrieve an indexed value from an array (Python list) | O(1) |
| Retrieve a dictionary entry with a key | O(1) |
| Find an element in a list | O(n) |
| Ordered insert into a list | O(n) |
| DFS or BFS search through a graph | O(n) |
| Insertion or selection sort of a list | O(n²) |
| Merge sort of a list | O(n log n) |
Why the big deal? Consider numbers like a billion data points. For argument's sake, let's say that the processing algorithm takes an hour to process those billion data points. If we up the number of data points to 10 billion, an O(1) algorithm will still take 1 hour; an O(n) algorithm will take 10 times as long, or 10 hours; an O(n²) algorithm will take 100 times as long, or 100 hours; an O(log n) algorithm will only take 1/9th longer, or only about an additional 7 minutes!
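We can check that arithmetic directly. The snippet below scales a 1-hour baseline from 1 billion to 10 billion data points under each growth rate:

```python
import math

n_old, n_new = 1e9, 1e10   # 1 billion -> 10 billion data points
hours = 1.0                # baseline: 1 hour on the smaller data set

print(hours)                                       # O(1): still 1 hour
print(hours * n_new / n_old)                       # O(n): 10 hours
print(hours * (n_new / n_old) ** 2)                # O(n^2): 100 hours
print(hours * math.log(n_new) / math.log(n_old))   # O(log n): ~1.11 hours
```

The O(log n) ratio is log(10 billion) / log(1 billion) = 10/9, i.e., about 6.7 extra minutes on top of the original hour.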
For more information:
- Dr. Wong's previous COMP 202: Big-O notation
- Wikipedia: Big-O notation, Analysis of algorithms