More Dictionary Exercises

As always, collaborate with your neighbor and upload your solutions.

As we progress in the course, we will tend to have fewer, but larger and more complex, exercises each day.

For our upcoming text analysis, one of the simplest yet most useful analyses is to count the number of occurrences of each word in the text.

Define a function word_counts() that takes a list of strings. It returns a dictionary that maps each string in the given list to a count of the number of times that it occurred.

For example, word_counts(["a", "b", "a", "c", "b", "a"]) should return {"a" : 3, "b" : 2, "c" : 1}.

Hint: We need to loop over the words, counting each occurrence. The challenge is that the way we update the dictionary depends on whether this is the first time we've counted the word.
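One possible sketch of the strategy in this hint, offered for comparison once you have tried it yourself. The variable names are illustrative choices, not part of the assignment.

```python
def word_counts(words):
    """Map each string in words to the number of times it occurs."""
    counts = {}
    for word in words:
        if word in counts:
            # We've counted this word before: add one to its count.
            counts[word] += 1
        else:
            # First time we've counted this word: start its count at 1.
            counts[word] = 1
    return counts
```

An empty input list produces an empty dictionary, since the loop body never runs.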

Test your word_counts function using OwlTest.

As a variation on the previous exercise, we would like to map each string to the fraction of the total occurrences it represents.

Define a function word_frequencies() that takes a list of strings. It returns a dictionary that maps each string in the given list to the fraction of all word occurrences that it represents.

For example, word_frequencies(["a", "b", "a", "c", "b", "a"]) should return approximately {"a" : 0.5, "b" : 0.333333, "c" : 0.166667}.

Hint: Here's the basic strategy. First, use word_counts(). Also, compute the total number of words. Then, create a new dictionary by looping over the word count dictionary, computing each word's frequency from that word's count.
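The strategy in this hint might be sketched as follows. This is one possible version, not the only one. It repeats a word_counts() definition so that the block runs on its own, and it converts the count with float() on the assumption that your Python version may otherwise do integer division.

```python
def word_counts(words):
    """Map each string in words to the number of times it occurs."""
    counts = {}
    for word in words:
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts


def word_frequencies(words):
    """Map each string in words to its fraction of all word occurrences."""
    counts = word_counts(words)
    total = len(words)  # total number of word occurrences
    frequencies = {}
    for word in counts:
        # float() guards against integer division truncating to 0.
        frequencies[word] = float(counts[word]) / total
    return frequencies
```

For an empty list, the loop never runs, so no division by zero occurs and the result is an empty dictionary.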

Test your word_frequencies function using OwlTest.

In our text analysis, we will look at not only the individual words that occur, but also the word sequences. In this exercise, we'll consider a simple version of that.

Define a function word_successors() that takes a list of strings. It returns a dictionary that maps each string in the given list to a list of the words that occur immediately after that word.

For example, word_successors(["a", "b", "a", "c", "b", "a"]) should return {"a" : ["b", "c"], "b" : ["a"], "c" : ["b"]}. Note that in this example, "b" maps to a list of just one "a", even though "b" is followed by "a" twice.

Note that word_successors(["a", "b", "a", "c"]) should return {"a" : ["b", "c"], "b" : ["a"], "c" : []}. That is, if the last word in the list is its only occurrence, then it should have an empty list as its entry in the word_successors result dictionary.

Hint: This is very similar to word_counts().
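To show the similarity, here is a minimal sketch that follows the same first-time-or-not pattern as word_counts(). It assumes successors should be listed in the order they first appear, which matches the examples above but is not stated outright. The variable names are illustrative.

```python
def word_successors(words):
    """Map each string in words to a list of the distinct words that follow it."""
    successors = {}
    for i in range(len(words)):
        word = words[i]
        if word not in successors:
            # First time we've seen this word: start with no successors.
            successors[word] = []
        if i + 1 < len(words):
            next_word = words[i + 1]
            # Record each successor only once, even if it follows repeatedly.
            if next_word not in successors[word]:
                successors[word].append(next_word)
    return successors
```

Creating the empty list before checking for a next word is what gives a final, otherwise unseen word its empty entry.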

Test your word_successors function using OwlTest.

COMP 200: Elements of Computer Science Spring 2013
