Rule-Ordering as Composition of Finite-State Transducers

Linguistics 581

Simplest example
of composition
 

Two rules:

  1. N => m / _ p
  2. N => n
The second is intended as the "elsewhere" rule, the default. It should always apply last in any sequence of rules.

Three transducers of interest:

  1. Transducer for N => m / _ p
  2. Transducer for N => n
  3. Transducer for both rules combined (composed)

Three transducers

Two ordered
Rules
 

    N => m / _ p; elsewhere, n. (the two rules from our last example)
    p => m / m _
Sample
Derivation
 

k   a   N   p   a   n   Lexical form (underlying form)
N => m  
k   a   m   p   a   n   Intermediate form
p => m  
k   a   m   m   a   n   Surface form
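
The derivation above can be replayed with a few lines of Python. This is only a sketch: the function names `rule1`, `rule2`, and `derive` are invented for illustration, and each rule is approximated by a regular-expression substitution rather than by a transducer.

```python
import re

def rule1(form):
    """N => m / _ p; elsewhere N => n."""
    form = re.sub(r"N(?=p)", "m", form)   # N directly before p becomes m
    return form.replace("N", "n")         # every remaining N becomes n

def rule2(form):
    """p => m / m _"""
    return re.sub(r"(?<=m)p", "m", form)  # p directly after m becomes m

def derive(lexical, rules):
    """Apply the rules in the given order and keep every intermediate form."""
    forms = [lexical]
    for rule in rules:
        forms.append(rule(forms[-1]))
    return forms

print(derive("kaNpan", [rule1, rule2]))
# ['kaNpan', 'kampan', 'kamman']
```

The three strings printed are the lexical, intermediate, and surface forms of the table.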

Transducer
for rule 1
 

n=>m

  k   a   N   p   a   n   Lexical form
A | A | A | C | A | A | A  
  k   a   m   p   a   n   Intermediate form

Transducer
for rule 2
 

p=>m

  k   a   m   p   a   n   Intermediate form
0 | 0 | 0 | 1 | 0 | 0 | 0  
  k   a   m   m   a   n   Surface form
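
The two state sequences above can be reproduced with a small simulation. The sketch below makes several assumptions: rule 1 is modeled as a nondeterministic transducer with an extra state B (for an N realized as n), the final states are chosen by hand, and the function names are hypothetical. Only states A, C, 0, and 1 come from the traces above.

```python
def t1_step(state, sym):
    """Rule 1 (N => m / _ p; elsewhere n) as a nondeterministic transducer.
    State C: an N was just output as m, so a p must follow.
    State B (assumed): an N was just output as n, so a p must not follow."""
    if state == "C" and sym != "p":
        return []
    if state == "B" and sym == "p":
        return []
    if sym == "N":
        return [("m", "C"), ("n", "B")]   # guess both outcomes, check later
    return [(sym, "A")]

def t2_step(state, sym):
    """Rule 2 (p => m / m _). State 1: the previous input symbol was m."""
    if sym == "m":
        return [("m", "1")]
    if sym == "p" and state == "1":
        return [("m", "0")]
    return [(sym, "0")]

def run(step, start, finals, word):
    """Return every (output, state trace) pair for paths ending in a final state."""
    paths = [("", [start])]
    for sym in word:
        paths = [(out + o, trace + [nxt])
                 for out, trace in paths
                 for o, nxt in step(trace[-1], sym)]
    return [(out, trace) for out, trace in paths if trace[-1] in finals]

intermediate = run(t1_step, "A", {"A", "B"}, "kaNpan")
print(intermediate)   # one path: kampan, states A A A C A A A
surface = run(t2_step, "0", {"0", "1"}, intermediate[0][0])
print(surface)        # one path: kamman, states 0 0 0 1 0 0 0
```

Running the second transducer on the output of the first is the cascade; composition folds the two passes into one.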

Composed
transducers
 

n=>m; p=>m

  k   a   N   p   a   n   Lexical form
A0 | A0 | A0 | C1 | A0 | A0 | A0  
  k   a   m   m   a   n   Surface form

Implementing
Composition
 

Intuition behind transducer composition

How to combine the transduction tables.

Mathematical
Definition
of Composition
 

Definition of transducer composition
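
For reference, one standard way to write the definition is given below. It is a sketch in notation chosen here, not necessarily the notation of the original slide, and the state-level version assumes transducers without empty-string transitions, each written as (states, start state, final states, transitions).

```latex
% Relational view: R_1 relates lexical to intermediate strings,
% R_2 relates intermediate to surface strings.
R_1 \circ R_2 \;=\; \{\, (x, z) \;\mid\; \exists y \,.\; (x, y) \in R_1 \ \wedge\ (y, z) \in R_2 \,\}

% State view, for T_i = (Q_i, s_i, F_i, \delta_i):
T_1 \circ T_2 \;=\; \bigl(Q_1 \times Q_2,\ (s_1, s_2),\ F_1 \times F_2,\ \delta\bigr)

\bigl((q_1, q_2),\; a{:}c,\; (q_1', q_2')\bigr) \in \delta
\iff
\exists b \,.\; (q_1,\; a{:}b,\; q_1') \in \delta_1 \ \wedge\ (q_2,\; b{:}c,\; q_2') \in \delta_2
```

A composed state such as C1 is simply the pair (C, 1), and the intermediate symbol b disappears from the composed transition.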

No rewrite
rule for
Composed
transducer
 

A rewrite rule conditions a rewrite on the basis of the input environment.

Consider the transition from C1 to A0 when a p rewrites as an m.

  1. Observation: State C1 encodes the fact that an N has rewritten as an m (look at all the transitions leading to it). What licenses p=>m in state C1 is not an input environment but an intermediate environment: the fact that N has rewritten as an m.
  2. Moral: Every rewrite rule corresponds to a transducer but not every transducer corresponds to a rewrite rule.
A simple
alternative
 

A cascade of transducers, passed through in the intended order.

Why bother computing the composition of two functions when you get the same effect just by applying them in sequence?

Advantage of
Single transducer
 

Efficiency.

The mapping is one-to-one in the generation direction, one-to-many in the analysis direction:

    kaNpat ==> kammat    Generation
    {kammat, kaNpat, kampat} <== kammat    Analysis
Not all of these are guaranteed to be words. Ultimately, we have to check the dictionary (lexicon) to know which are right.
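
The asymmetry between the two directions can be checked by brute force. The sketch below is not how a transducer performs analysis: it enumerates candidate lexical forms and keeps those that generate the given surface form. The table `SOURCES` and the function names are assumptions made for illustration.

```python
import re
from itertools import product

def generate(lexical):
    """Rule 1 (N => m / _ p; elsewhere n) followed by rule 2 (p => m / m _)."""
    form = re.sub(r"N(?=p)", "m", lexical).replace("N", "n")
    return re.sub(r"(?<=m)p", "m", form)

# Assumed: which lexical symbols a surface symbol could come from.
# Any symbol not listed can only come from itself.
SOURCES = {"m": "mNp", "n": "nN"}

def analyze(surface):
    """Return every candidate lexical form whose generated form is `surface`."""
    choices = [SOURCES.get(ch, ch) for ch in surface]
    candidates = ("".join(combo) for combo in product(*choices))
    return sorted(c for c in candidates if generate(c) == surface)

print(generate("kaNpat"))   # kammat
print(analyze("kammat"))    # ['kaNpat', 'kammat', 'kampat']
```

Which of the three candidates are actual words is a question for the lexicon, not for the rules.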

With multiple rewrite rules and multiple transducers, we would have to go through all the intermediate steps before we could check the lexicon. The number of intermediate forms multiplies in proportion to the number of rules.

Disadvantage of
Single transducer
 

Impracticality.

The composed transducer gets VERY large.

The composed transducer for Finnish was too large for the machines available in the 80s (Karttunen 91).

Declarativeness  

A large goal in programming, especially for large programs, is to separate algorithm and data. In the case of a complex constraint system this means separating the constraints themselves from the program that applies them.

A subgoal, first articulated in the logic programming world, is that constraints be formulated in such a way as to make the order of their application immaterial. The order of application belongs with the algorithm that implements the constraints.

Conclusion: The rewrite rule formalism is not declarative. The rewrite rules have their intended effect only when implemented under the proper procedure, namely one that applies the rules in the right order.

Suppose our two rules were ordered in the reverse order:

    p => m / m _
    N => m / _ p; elsewhere, n.

We now get different results. kaNpan surfaces not as kamman but as kampan.
k   a   N   p   a   n   Lexical form (underlying form)
p => m  
k   a   N   p   a   n   Intermediate form
N => m  
k   a   m   p   a   n   Surface form

The result of the computation thus depends on the order in which constraints are applied. This is the definition of a non-declarative characterization of what the constraints are.
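
The point can be made concrete by storing the rules as plain data and letting a separate procedure decide the order. In the sketch below the tuple layout (target, replacement, left context, right context) and all names are assumptions made for illustration.

```python
import re

# Data: each rule is a list of (target, replacement, left context, right context).
RULE_1 = [("N", "m", "", "p"), ("N", "n", "", "")]   # N => m / _ p; elsewhere n
RULE_2 = [("p", "m", "m", "")]                        # p => m / m _

def apply_rule(rule, form):
    """Algorithm: turn each subrule into a regular-expression substitution."""
    for target, replacement, left, right in rule:
        pattern = re.escape(target)
        if left:
            pattern = f"(?<={re.escape(left)})" + pattern
        if right:
            pattern = pattern + f"(?={re.escape(right)})"
        form = re.sub(pattern, replacement, form)
    return form

def surface(lexical, ordering):
    """Apply the rules one after another in the given order."""
    form = lexical
    for rule in ordering:
        form = apply_rule(rule, form)
    return form

print(surface("kaNpan", [RULE_1, RULE_2]))   # kamman
print(surface("kaNpan", [RULE_2, RULE_1]))   # kampan
```

The rule data is identical in both calls; only the procedure's ordering differs, and so does the result.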

Parallel
description
 

In at least some cases, the transducer formalism opens up an alternative to rewrite rules: stating constraints on the surface and underlying forms in parallel.

Rule ordering relationships can now be captured a different way.

    Effect of Rule 1 followed by Rule 2
    (a) N:m <=> _ p:
    (b) p:m <=> :m _
Reference is made to whether environments are on the surface or the lexical level, but only two levels are allowed. Rule (a) says N is realized as m if and only if the following segment is an underlying p. Rule (b) says p is realized as an m if and only if the preceding segment is a surface m.

To get the effect of ordering the rules in the reverse order:

    Effect of Rule 2 followed by Rule 1
    (a) N:m <=> _ p:
    (b) p:m <=> m: _
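
Both rule sets can be tried out with a brute-force checker. This is a sketch under assumptions: it enumerates candidate lexical:surface pair sequences and keeps those satisfying every rule, which is not how the fsa tool compiles two-level rules. The table `FEASIBLE`, the one-neighbour context predicates, and all names are invented for illustration.

```python
from itertools import product

# Assumed feasible realizations; any other symbol is realized as itself.
FEASIBLE = {"N": "mn", "p": "pm"}

def satisfies(pairs, target, left, right):
    """Check  target <=> left _ right  on a sequence of (lexical, surface) pairs.
    left/right are predicates on the neighbouring pair, or None for no condition."""
    lex, surf = target
    for i, (l, s) in enumerate(pairs):
        if l != lex:
            continue
        prev = pairs[i - 1] if i > 0 else None
        nxt = pairs[i + 1] if i + 1 < len(pairs) else None
        in_context = ((left is None or (prev is not None and left(prev))) and
                      (right is None or (nxt is not None and right(nxt))))
        if (s == surf) != in_context:     # "if and only if"
            return False
    return True

def realize(lexical, rules):
    """Return all surface strings whose pairing with `lexical` satisfies every rule."""
    options = [FEASIBLE.get(ch, ch) for ch in lexical]
    results = []
    for candidate in product(*options):
        pairs = list(zip(lexical, candidate))
        if all(satisfies(pairs, *rule) for rule in rules):
            results.append("".join(candidate))
    return results

RULE_A = (("N", "m"), None, lambda pair: pair[0] == "p")          # N:m <=> _ p:
RULE_B_SURFACE = (("p", "m"), lambda pair: pair[1] == "m", None)  # p:m <=> :m _
RULE_B_LEXICAL = (("p", "m"), lambda pair: pair[0] == "m", None)  # p:m <=> m: _

print(realize("kaNpan", [RULE_A, RULE_B_SURFACE]))   # ['kamman']
print(realize("kaNpan", [RULE_A, RULE_B_LEXICAL]))   # ['kampan']
```

Neither call imposes an order on the rules: each candidate pairing is checked against all constraints at once, and the ordering effect comes solely from whether rule (b) inspects the surface or the lexical side of its left neighbour.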
Advantages
of parallel
description
 

  1. Rules can operate in parallel: they are declarative constraints that must all be satisfied simultaneously.
  2. Transducers working in parallel make for a smaller system. The sizes of separate transducers are added rather than (in the worst case) multiplied.
The new
transducers
 

n=>m

p=>m

The new
Combined
Transducer
 

n=>m; p=>m

Comparing
Composition and
Parallel Description
 

Composition of two rules.

Parallel description

Both

Moral: The differences between the final transducers are very small. There is a slight difference between the machines. Can you construct examples on which they will give differing outputs? Where there IS an important difference is in the way of representing the input rules, for example, N=>m:

    Two n=>m transducers.
Notice that the bottom (two-level) transducer makes mention of the possibility of "p" being realized as an "m" (the transition from state 2 to state 0).

The assignment explores this difference further.

Assignment:
Problem 1
 

Answer the questions in boldface below. Turn in your answers.

You can load the two level rules into the fsa tool as follows:

  1. Start up an xterm window on bulba or any machine in the lab. I will refer to this below as the host xterm window.
  2. Start up an editor in the background of that xterm window, or use another window if you prefer. (If you use emacs, you can background it by typing "emacs &" at the prompt. If you use "pico", you need to start another window to run the editor.)
  3. Start up fsa with the command fsa -tk. A new window is created with fsa in it. I will refer to this as the fsa window.
  4. The fsa window has two input lines labeled "Regex" and "String" and a large white area with scroll bars for displaying FSAs.
  5. Click on File (upper left hand corner) and "load Aux".
  6. Choose the following file to load on bulba:
      /opt/fsa6/Examples/twolevel/kamman.pl
  7. Using the regex line, you can now bring up 3 different transducers, nm, pm, and kamman. Try this. Type in "nm" and hit carriage return. Now type "pm". Now type "kamman". Three different transducers should be displayed. Using the "zoom-in" button at the bottom can give you a closer look.
    1. The "nm" transducer implements a two-level N => m rule.
    2. The "pm" transducer implements a two-level p => m rule.
    3. The "kamman" transducer implements the simultaneous application of those rules in parallel.
  8. Using your editor, visit the file
      /opt/fsa6/Examples/twolevel/kamman.pl
    Or bring it up in your browser here. You can see the definitions of the three transducers in this file.
  9. Back in the fsa window type "nm" on the Regex line.
  10. Now type "kaNpat". Look back at your host xterm. This shows the surface forms related to this underlying form. Q1: What are they?
  11. Back in the fsa window type "pm" on the Regex line.
  12. Now type "kaNpat". Look back at your host xterm. This shows the surface forms related to this underlying form. Q2: What are the surface forms related to this underlying form?
  13. Back in the fsa window type "kamman" on the Regex line.
  14. Now type "kaNpat". Look back at your host xterm window. This shows the surface forms related to this underlying form. Q3: What are the surface forms related to this underlying form?
  15. The answers to Q1-Q3 should all differ. Remember:
    1. The "nm" transducer implements a two-level N => m rule.
    2. The "pm" transducer implements a two-level p => m rule.
    3. The "kamman" transducer implements the simultaneous application of those rules in parallel.
  16. To understand what is going on, type "pairs*" in the Regex window. (Compare it with the effect of typing "pairs", if you like). Typing "pairs*" has the effect of bringing up a transducer that allows any underlying/surface pairs that are possible anywhere in the language, regardless of the context.
  17. Now type "kaNpat". Q4: What are the surface forms related to "kaNpat" now?
  18. Notice that the output of "nm" given input "kaNpat" is a subset of the output of "pairs*". So is the output of "pm" given input "kaNpat". A two-level rule always has the effect of filtering out some elements of "pairs*". In effect, each two-level rule describes some "language" (set of strings) that is a subset of "pairs*".
  19. Q5: Express the language of the "kamman" transducer in terms of the language of the "nm" and "pm" transducers.
  20. For much more info about the fsa tool than you'd ever want to know, look in the manual (ps, pdf, html).
The rule
Notation
of Problem 1
 

For this explanation, look in:

    /opt/fsa6/Examples/twolevel/kamman.pl
used for the twolevel transducer of problem 1.

We saw that there was a set called "pairs" defined for the transducers. This was the set they filtered to get the set of outputs allowed by the rules. The set of feasible pairs is explicitly defined in kamman.pl. Here's the definition in the file.

    macro(pairs,{a..z,'N':m,'N':n,p:m}).
"a..z" is just an abbreviation for all lower case alphabetic characters. So it's an abbreviation for the following set:
    { a b c d e f g h i j k l m n o p
    q r s t u v w x y z}
In turn in the context of a transducer, each letter's an abbreviation for the pair of the letter with itself. So the above becomes:
    { a:a b:b c:c d:d e:e f:f g:g h:h i:i j:j k:k l:l m:m n:n o:o p:p
    q:q r:r s:s t:t u:u v:v w:w x:x y:y z:z}
So the entire set of feasible input-output pairs for the language is:
    { a:a b:b c:c d:d e:e f:f g:g h:h i:i j:j k:k l:l m:m n:n o:o p:p
    q:q r:r s:s t:t u:u v:v w:w x:x y:y z:z 'N':m 'N':n p:m}
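
The expansion just described can be mirrored in a few lines of Python. This only illustrates the abbreviation convention; it is not how the fsa tool represents the macro, and the name `expand_pairs` is hypothetical.

```python
import string

def expand_pairs(spec):
    """A bare symbol x stands for the identity pair x:x; tuples are kept as given."""
    pairs = set()
    for item in spec:
        if isinstance(item, tuple):
            pairs.add(item)
        else:
            pairs.add((item, item))
    return pairs

# a..z plus the three explicit pairs 'N':m, 'N':n, p:m
PAIRS = expand_pairs(list(string.ascii_lowercase) + [("N", "m"), ("N", "n"), ("p", "m")])

print(len(PAIRS))   # 29
print(sorted(pair for pair in PAIRS if pair[0] != pair[1]))
# [('N', 'm'), ('N', 'n'), ('p', 'm')]
```

Only three of the 29 feasible pairs change a symbol; the other 26 are identity pairs.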

Now for the N=>m rule. It is:

N:m  <==> _ p: 
in the kamman.pl file, this is written
macro(nm, comp(pairs, 'N':m, [], p: ? )).
Here '?' is just used as a wild card character.
    p: ?
means p realized as anything (including itself). And '[]' means any left context. So N is realized as m preceding an underlying p, which is what we want.

p:m  <==>  :m _ 
becomes
macro(pm, comp(pairs, p :m, ? :m, [] )).

In general

    Rulename: TransducedString <==> LeftContext __ RightContext
becomes
    macro(Rulename, comp(FeasiblePairs, TransducedString, LeftContext, RightContext)).
Assignment:
Problem 2 e-insertion
 

Answer the questions in boldface below. Turn in your answers. Also turn in a revised version of the file jurafsky.pl as described below.

e-insertion transducer

You can bring up a two-level transducer version of Jurafsky's e-insertion rule (the einsertion transducer) in fsa by doing the following:

  1. Click on File>Loadaux. Load in the file:
      /opt/fsa6/Examples/twolevel/jurafsky.pl.
  2. Type "einsertion" in the regex window. An fsa should be displayed.
  3. In this implementation, to simplify things, I've used "+" for both morpheme and word boundaries.
  4. To test the e insertion rule, type
      fox+s+
    into the string window. Check the output in the host xterm window. Q1: What is the output?
  5. Try:
      fop+s+
    Q2: What is the output?
  6. Revise the rule to handle the case of words ending in "ch" and "sh". Here are some hints.
    1. The left and right contexts of rules are regular expressions (as notated in the fsa tool). You may want to go back and look at how regular expressions in the fsa tool work, as we practiced them in our lab exercise.
    2. Obviously the left context of the e-insertion rule needs to be generalized. Currently the left context is:
        [{x,s,z}, (+):0]
      This means an "x","s" or "z" concatenated with a "(+):0". "(+):0" is how the rule notation represents a morpheme boundary realized as the empty string, using a "0" to represent the empty string. Note that curly braces {   } represent "or" and square brackets [   ] represent concatenation.
Problem 3
Flap Rule
  Flap Problem