HMM Tagging Problem

Complexity issues have risen their ugly heads again and with the IPO date on your new comp ling startup fast approaching, you have discovered that if your hot new system is going to parse sentences as long as 4 words, you had better limit yourself to a 3-word vocabulary.

Consider the following HMM tagger for tagging texts constructed with a 3 word vocabulary.

  1. station
  2. ground
  3. control
and a tagset of 2 tags:
  1. V [Verb]
  2. N [Noun]

Here is a partial state table for the HMM:

The table is incomplete. The states and are all present, and all the transitions with non-zero probabilities are present, but the transition probabilities for control and station have been left out.

Our tagger assumes all input will start with a start symbol [s]. It also assumes the Start state is incompatible with any input except [s].

You can use this tagger to assign the most likely tag sequence to any sequence of words taken from our small vocabulary. Consider the following input to our tagger:

There are 4 different state sequences that will accept this input:
  1. Start V V
  2. Start V N
  3. Start N N
  4. Start N V
These correspond to 4 different assignments of part-of-speech tags to the two input words (ground ground) Here is each word aligned above its transition and transition probability.
  ground   ground
start   V   N
  (1.0 * .5)   (.3 * .9)   (.4 * .5)
So this corresponds to the path in which the first occurrence of ground is labeled a verb, and the second a noun. Let's review where these transition probabilities come from.

The probability model being used is the following.

That is, the joint probability of a word sequence n words long and a tag sequence n + 1 tags long is equal to the product of the probabilities of each word given its tag times the probability of each following tag given the word tag. For each wi,

For our example, according to this probability model we calculate the joint probability to be:

This is the product of the transition probabilities for this path through the HMM. To find the most likely assignment of tags we need to find the most probable path through the HMM. This is what the Viterbi algorithm is for.

Problem 3 Proper

Part A

The above HMM was given with only a partial probability model. Here is the entire probability model:

The first part of the problem is to use this probability model to complete the transition table for the above HMM tagger by filling in the transition probabilities for control and station.

Part B

The second part of the tagging problem is to tag the following input:

This can be done by computing the products of the transition probabilities (called the path probabilities) for all 16 paths through the HMM and choosing the most probable path.

But the assigned problem is to choose the most probable path by using the Viterbi algorithm.

To help you get started, here is the partial Viterbi matrix for our HMM and the given input:

Note that the Viterbi values for t=0 and t=1 have already been filled in. Continue the matrix and fill in the values for t=2, and t=3. Show your calculations. Use the Viterbi homework assignment and Viterbi lecture as your model of what to show.
Part C

Using the results of your Viterbi calculation, give the most probable state sequence through the HMM.