More Dictionary Exercises
As always, collaborate with your neighbor and upload your solutions.
As we progress in the course, we will tend to have fewer, but larger and more complex, exercises each day.
-
For our upcoming text analysis, one of the simplest yet most useful analyses is to count the number of occurrences of each word in the text.
Define a function word_counts() that takes a list of strings. It returns a dictionary that maps each string in the given list to the number of times that it occurs.
For example,
word_counts(["a", "b", "a", "c", "b", "a"])
should return
{"a" : 3, "b" : 2, "c" : 1}
Hint: We need to loop over the words, counting each occurrence. The challenge is that the way we update the dictionary depends on whether or not this is the first time we've counted the word.
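The hint's "first time or not" distinction can be sketched as follows. This is only one possible approach, not an official solution; the parameter name words and the local name counts are illustrative choices, not part of the exercise statement.

```python
def word_counts(words):
    """Map each string in `words` to the number of times it occurs."""
    counts = {}  # illustrative name, not prescribed by the exercise
    for word in words:
        if word in counts:
            # Seen before: increment the existing count.
            counts[word] += 1
        else:
            # First occurrence: create the entry with a count of 1.
            counts[word] = 1
    return counts


print(word_counts(["a", "b", "a", "c", "b", "a"]))
# prints {'a': 3, 'b': 2, 'c': 1}
```

The explicit if/else mirrors the hint directly. A shorter variant would replace it with counts[word] = counts.get(word, 0) + 1, which treats a missing word as having a count of zero.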
-
As a variation on the previous exercise, we would like to map each string to the fraction of the total occurrences it represents.
Define a function word_frequencies() that takes a list of strings. It returns a dictionary that maps each string in the given list to the fraction of all word occurrences that it represents.
For example,
word_frequencies(["a", "b", "a", "c", "b", "a"])
should return (showing values rounded to six decimal places)
{"a" : 0.5, "b" : 0.333333, "c" : 0.166667}
Hint: Here's the basic strategy. First, use word_counts() to count each word. Also, compute the total number of words. Then, create a new dictionary by looping over the word-count dictionary, computing each word's frequency from that word's count.
-
In our text analysis, we will look at not only the individual words that occur, but also the word sequences. In this exercise, we'll consider a simple version of that.
Define a function word_successors() that takes a list of strings. It returns a dictionary that maps each string in the given list to a list of the words that occur immediately after that word.
For example,
word_successors(["a", "b", "a", "c", "b", "a"])
should return
{"a" : ["b", "c"], "b" : ["a"], "c" : ["b"]}
Note that in this example, "b" maps to a list containing just one "a", even though "b" is followed by "a" twice.
Note also that
word_successors(["a", "b", "a", "c"])
should return
{"a" : ["b", "c"], "b" : ["a"], "c" : []}
That is, if the last word in the list is its only occurrence, then it should have an empty list as its entry in the word_successors() result dictionary.
Hint: This is very similar to word_counts().
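To show how the word_counts() pattern carries over, here is a minimal sketch of one possible approach, not an official solution. It assumes successors should appear in order of first appearance, as in the examples above; the names words, successors, and next_word are illustrative.

```python
def word_successors(words):
    """Map each string in `words` to a list of the distinct words that follow it."""
    successors = {}  # illustrative name, not prescribed by the exercise
    for i, word in enumerate(words):
        if word not in successors:
            # First occurrence: start with an empty list, so a word that
            # only appears last in the input still gets an entry.
            successors[word] = []
        if i + 1 < len(words):
            next_word = words[i + 1]
            # Record each successor only once, in order of first appearance.
            if next_word not in successors[word]:
                successors[word].append(next_word)
    return successors


print(word_successors(["a", "b", "a", "c", "b", "a"]))
# prints {'a': ['b', 'c'], 'b': ['a'], 'c': ['b']}
print(word_successors(["a", "b", "a", "c"]))
# prints {'a': ['b', 'c'], 'b': ['a'], 'c': []}
```

As in word_counts(), the key step is checking whether a word already has an entry before updating it. The difference is that the stored value is a list that grows rather than a number that increments.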