Complexity issues have risen their ugly heads again and with the IPO date on your new comp ling startup fast approaching, you have discovered that if your hot new system is going to parse sentences as long as 4 words, you had better limit yourself to a 3-word vocabulary.
Consider the following HMM tagger for tagging texts constructed with a 3 word vocabulary.
Here is a partial state table for the HMM:
| States |
Start V = Verb N = Noun |
|||
|---|---|---|---|---|
| State | Observation | Transition |
Transition Probability |
|
| Start | ground | Start => N | Pr(N | Start)   * | Pr(ground | N) |
| .5   * | .4 | |||
| .20 | ||||
| Start => V | Pr(V | Start)   * | Pr(ground | V) | ||
| .5   * | .3 | |||
| .15 | ||||
| control | Start => N | to be filled in | ||
| Start => V | to be filled in | |||
| station | Start => N | to be filled in | ||
| Start => V | to be filled in | |||
| V | ground | V => N | Pr(N | V)   * | Pr(ground | N) |
| .9   * | .4 | |||
| .36 | ||||
| V => V | Pr(V | V)   * | Pr(ground | V) | ||
| .1   * | .3 | |||
| .03 | ||||
| control | V => N | to be filled in | ||
| V => V | to be filled in | |||
| station | V => N | to be filled in | ||
| V => V | to be filled in | |||
| N | ground | N => N | Pr(N | N)   * | Pr(ground | N) |
| .5   * | .4 | |||
| .02 | ||||
| N => V | Pr(V | N)   * | Pr(ground | V) | ||
| .5   * | .3 | |||
| .15 | ||||
| control | V => N | to be filled in | ||
| V => V | to be filled in | |||
| station | V => N | to be filled in | ||
| V => V | to be filled in | |||
Our tagger assumes all input will start with a start symbol [s]. It also assumes the Start state is incompatible with any input except [s].
You can use this tagger to assign the most likely tag sequence to any sequence of words taken from our small vocabulary. Consider the following input to our tagger:
|   | ground |   | ground | ||
| start |   | V |   | N | |
|   | (1.0 * .5) |   | (.3 * .9) |   | (.4 * .5) |
The probability model being used is the following.
For our example, according to this probability model we calculate the joint probability to be:
Part A
The above HMM was given with only a partial probability model. Here is the entire probability model:
| w | Pr(w | N ) | Pr(w | V ) | Pr(w | Start ) |
|---|---|---|---|
| ground | .4 | .3 | 0 |
| control | .3 | .3 | 0 |
| station | .3 | .4 | 0 |
| Pr(X | N) | Pr(X | V) | Pr(X | Start) |
|---|---|---|
|
Pr(V | N) .5 |
Pr(V | V) .1 |
Pr(V | Start) .5 |
|
Pr(N | N) .5 |
Pr(N | V) .9 |
Pr(N | Start) .5 |
Part B
The second part of the tagging problem is to tag the following input:
But the assigned problem is to choose the most probable path by using the Viterbi algorithm.
To help you get started, here is the partial Viterbi matrix for our HMM and the given input:
| V |   | .5 |   |   |
| N |   | .5 |   |   |
| Start | 1.0 |   |   |   |
|   |   | ground | control | station |
| t=0 | t=1 | t=2 | t=3 |
Using the results of your Viterbi calculation, give the most probable state sequence through the HMM.