There are 3 problems below, labeled A, B, and C. Parts A and B are about Earley parsing. Part C is about top down parsing. You need to turn in pencil and paper answers to all 3 problems. There is an extra credit problem labeled D. The problem is really short, and a good answer can be pretty short, but it will take some real thought.
A. Explain what gave rise to S25, S26, and S27 in the parser chart in Figure 13.14. Note that the answer isn't just: The predictor put S25, S26, and S27 there. Why does the predictor put S25, S26, and S27 in the chart? What happens at previously created edges that leads to the creation of S25, S26, and S27? Don't be scared to look at the algorithm in Figure 13.13 to get your answer.
B. For this next part, use the following grammar:

    S -> NP VP
    S -> Aux NP VP
    NP -> Det Nom
    Nom -> Noun Nom | Noun
    NP -> ProperNoun
    NP -> NP PP
    VP -> Verb
    VP -> Verb NP
    PP -> Prep NP
    Prep -> in | on | at | from | to
    Det -> this | that | a
    Noun -> book | flight | meal | money
    Verb -> book | include | prefer | landed
    ProperNoun -> Houston | TWA | Denver
Be an Earley parser. Parse the NP part of:
Show the Earley parsing chart in the same format as is used in Figure 13.14. Assign names like S1, S2, ... to all the edges and show them. Organize the edges, as in Figure 13.13, according to what index they end at. You should try to generate the edges in the actual order that the Earley algorithm in Figure 13.13 creates them, but I won't grade that. I will deduct for missing or incorrect edges. (Incorrect means the algorithm wouldn't actually propose such an edge, not that the edge does not get used in the final parse.)
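If it helps to keep the bookkeeping straight while writing out the chart, an edge can be thought of as a dotted rule plus the span it covers. The following is only a sketch of that idea: the class name State, its field names, and the format_row helper are hypothetical and do not come from the textbook, and the row layout only approximates the one in Figure 13.14.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    """One chart edge: a dotted rule plus the input span it covers."""
    lhs: str                 # left-hand side of the rule
    rhs: Tuple[str, ...]     # right-hand side symbols
    dot: int                 # how many rhs symbols have been recognized
    start: int               # index where the constituent begins
    end: int                 # index the dot has reached

    def is_complete(self) -> bool:
        # The dot is at the far right: nothing is left to recognize.
        return self.dot == len(self.rhs)

    def next_symbol(self) -> Optional[str]:
        # The symbol just after the dot, or None for a complete edge.
        return None if self.is_complete() else self.rhs[self.dot]

    def advance(self, new_end: int) -> "State":
        # Move the dot one symbol to the right and extend the span.
        return State(self.lhs, self.rhs, self.dot + 1, self.start, new_end)

    def dotted(self) -> str:
        symbols = list(self.rhs)
        symbols.insert(self.dot, ".")
        return f"{self.lhs} -> {' '.join(symbols)}"


def format_row(name: str, state: State, source: str) -> str:
    # One chart line: edge name, dotted rule, span, and the procedure
    # (or note) that explains where the edge came from.
    return f"{name:<4}{state.dotted():<24}[{state.start},{state.end}]  {source}"


# The dummy start state that seeds the chart.
dummy = State("gamma", ("S",), 0, 0, 0)
print(format_row("S0", dummy, "Dummy start state"))
```

Writing every edge in this shape makes it easy to check two things by eye: which symbol follows the dot, and which index the edge ends at.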
C. Use the NLTK recursive descent parser:

    from nltk.parse import rd
    rd.demo()

The command rd.demo() should give you a trace printout of an ambiguous sentence. Here is some code used by the demo function to do what it does, to show how to use the rd parser.

    from nltk import parse, parse_cfg

    # Define the demo grammar; terminals are wrapped in single quotes.
    grammar = parse_cfg("""
    S -> NP VP
    NP -> Det N | Det N PP
    VP -> V NP | V NP PP
    PP -> P NP
    NP -> 'I'
    N -> 'man' | 'park' | 'telescope' | 'dog'
    Det -> 'the' | 'a'
    P -> 'in' | 'with'
    V -> 'saw'
    """)

    # Show the rules that were read in.
    for prod in grammar.productions(): print prod

    # Parse a tokenized sentence, tracing each step.
    sent = 'I saw a man in the park'.split()
    parser = parse.RecursiveDescentParser(grammar, trace=2)
    parses = parser.nbest_parse(sent)
    for p in parses: print p

As you can see, it begins by defining a grammar. Create your own grammar file mygrammar.py and, using the parse_cfg function illustrated in the definition of demo(), define your own grammar and assign it to the variable new_grammar. Please don't forget to wrap your terminals in single quotes as in the example above, or your grammar won't parse anything. You can define any grammar you want to experiment with, but you should begin by defining the grammar in part B of this assignment. You should now try to parse a sentence with this grammar and the rd parser, using what you see in the definition of demo shown above to guide you in how to call the rd parser. Try to parse the sentence
Don't put in a period and don't use any upper case letters. Something goes wrong. Keep in mind that the parser works fine with the demo grammar shown above. You should turn in a description of what happens. Hit Control-C to stop what seems to be an infinite loop. Look carefully at your trace output. What is going on? What rule in the grammar is causing the problem? Can you explain why? Can you explain why the demo grammar does not cause any problems?
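A note for anyone working with NLTK 3: parse_cfg and nbest_parse no longer exist there. CFG.fromstring builds the grammar, and parse() returns an iterator over trees. The sketch below reruns the demo under those assumptions (Python 3, NLTK 3 installed). It also takes the demo grammar's second NP rule to be NP -> Det N PP. Tracing is turned off here to keep the output short; pass trace=2 to see each step.

```python
import nltk

# Demo grammar from Part C, built with the NLTK 3 constructor.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N | Det N PP
VP -> V NP | V NP PP
PP -> P NP
NP -> 'I'
N -> 'man' | 'park' | 'telescope' | 'dog'
Det -> 'the' | 'a'
P -> 'in' | 'with'
V -> 'saw'
""")

# Show the rules that were read in.
for prod in grammar.productions():
    print(prod)

# Parse a tokenized sentence; parse() yields one tree per analysis.
sent = 'I saw a man in the park'.split()
parser = nltk.RecursiveDescentParser(grammar, trace=0)
parses = list(parser.parse(sent))
for tree in parses:
    print(tree)
```

To try your own grammar, replace the string passed to CFG.fromstring and the sentence assigned to sent; the calls to the parser stay the same.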
D. Explain why the Earley parser does not have the same problem as the rd parser with the grammar used in Exercise B. The answer will require looking carefully at the algorithm in Figure 13.13. What procedure saves us from the infinite loop? Be as specific as possible. Explain how it saves us, by explaining why the procedure is called, and how it avoids the problem you observed in Exercise C.