Developed by A.P.M. Witkam (Netherlands).
Relies on Esperanto as an interlingua.
Intended as the basis for translating among European languages.
[ Up to DLT- Distributed Language Translation.]
Esperanto was designed over 100 years ago as a completely regular, hopefully easy to learn language.
The builders argued that it would serve as a good interlingua because it has matured over a long period of time to express a wide range of meanings.
A single natural-ish language at the core of the translation process allows most of the knowledge engineering to be done on the central language, with the interface to other languages consisting mainly of producing the possible parse trees.
The parse trees used in this system are dependency trees, which emphasize the relationships between words rather than their phrase structures.
At this point, words like 'can't' and 'cannot' can be regularized into 'can not', and words like 'had' can be regularized to 'have + [past tense]'.
All possible parses for each sentence are generated, and passed to the next level, on the assumption that pragmatic and semantic constraints can be relied on for disambiguation.
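The regularization step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table names (CONTRACTIONS, IRREGULAR) and the function regularize are hypothetical, and the entries simply mirror the examples given above; nothing here is taken from the DLT implementation itself.

```python
# Hypothetical regularization tables; the entries mirror the examples in the text.
CONTRACTIONS = {"can't": ["can", "not"], "cannot": ["can", "not"]}
# Hypothetical irregular-form table: surface form -> (lemma, feature).
IRREGULAR = {"had": ("have", "past")}


def regularize(sentence):
    """Return a list of (lemma, features) pairs for a whitespace-split sentence."""
    out = []
    for token in sentence.lower().split():
        token = token.strip(".,;:!?")  # drop surrounding punctuation only
        if not token:
            continue
        if token in CONTRACTIONS:
            # One surface word becomes several regular words.
            out.extend((part, ()) for part in CONTRACTIONS[token])
        elif token in IRREGULAR:
            # An irregular form becomes lemma + explicit feature.
            lemma, feature = IRREGULAR[token]
            out.append((lemma, (feature,)))
        else:
            out.append((token, ()))
    return out


print(regularize("She can't say what he had."))
```

A real system would need a full morphological analyzer rather than two lookup tables; the point is only that the parser sees 'can not' and 'have + [past tense]' instead of the surface forms.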
A large knowledge base of 'Metataxis' rules describes how to translate trees in the source language into trees in the target language.
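As a rough illustration of tree-to-tree transfer, the sketch below maps a source dependency tree to every target tree that a small lexical table allows, keeping all alternatives alive as described above. The Node class, the LEXICON table and its entries, and the transfer function are all hypothetical stand-ins; actual Metataxis rules also restructure trees, which this sketch does not attempt.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Node:
    word: str
    label: str            # dependency relation to the parent, e.g. "subj"
    children: tuple = ()


# Hypothetical lexical transfer table (illustrative entries only).
# "bank" is ambiguous: financial institution (banko) or river bank (bordo).
LEXICON = {"see": ["vidi"], "dog": ["hundo"], "bank": ["banko", "bordo"]}


def transfer(node):
    """Return every target tree obtainable from one source tree."""
    child_options = [transfer(child) for child in node.children]
    results = []
    # Unknown words are passed through unchanged.
    for word in LEXICON.get(node.word, [node.word]):
        # Combine each choice for this word with each choice for the children.
        for combo in product(*child_options):
            results.append(Node(word, node.label, tuple(combo)))
    return results


source = Node("see", "root", (Node("dog", "subj"), Node("bank", "obj")))
for tree in transfer(source):
    print(tree)
```

Because "bank" has two candidate translations, the single source tree yields two target trees, and both are handed on for later disambiguation.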
At this point, there are still a number of possible parses for each sentence being entertained concurrently.
So at this point, there are a number of competing parses which we've just received, all encoded with Esperanto values.
Now, world-knowledge and pragmatic rules that apply only to Esperanto can be consulted, and these can be used as the basis for choosing among competing parses.
Where the knowledge base can't make a decision, it can refer to the operator to make clarifications.
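The selection logic, including the fallback to the operator, might look something like the sketch below. It assumes each competing parse is a list of (head, relation, dependent) triples and that the knowledge base is a plain set of plausible triples; KNOWLEDGE_BASE, score, and choose_parse are hypothetical names, and the facts are toy examples rather than anything from the real system.

```python
# Hypothetical toy knowledge base of plausible Esperanto dependency triples.
KNOWLEDGE_BASE = {("sidi", "loc", "bordo")}   # one sits on a river bank


def score(parse):
    """Count how many triples of a parse the knowledge base supports."""
    return sum(1 for triple in parse if triple in KNOWLEDGE_BASE)


def choose_parse(candidates, ask_operator):
    """Pick the best-supported parse; defer to the operator on a tie."""
    scored = [(score(parse), parse) for parse in candidates]
    best = max(s for s, _ in scored)
    top = [parse for s, parse in scored if s == best]
    if len(top) == 1:
        return top[0]
    # The knowledge base cannot decide, so a human is asked to clarify.
    return ask_operator(top)


river = [("sidi", "loc", "bordo")]
money = [("sidi", "loc", "banko")]
print(choose_parse([river, money], ask_operator=lambda options: options[0]))
```

The operator callback is only invoked when two or more parses are equally well supported, which keeps human interaction to the cases the knowledge base cannot settle.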
After all competing parses have been resolved, the single remaining dependency tree can be sequenced.
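Sequencing a dependency tree means turning it back into a linear string of words. A minimal sketch, assuming each node is a (word, relation, children) tuple and assuming a hypothetical ordering table (BEFORE_HEAD) that says which dependents precede their head, could be:

```python
# Hypothetical ordering table: dependents with these relations precede the head.
BEFORE_HEAD = {"det", "subj", "adj"}


def linearize(node):
    """Flatten a (word, relation, children) tree into a list of words."""
    word, _relation, children = node
    before = [w for c in children if c[1] in BEFORE_HEAD for w in linearize(c)]
    after = [w for c in children if c[1] not in BEFORE_HEAD for w in linearize(c)]
    return before + [word] + after


tree = ("vidas", "root", [
    ("hundo", "subj", [("la", "det", [])]),
    ("katon", "obj", [("la", "det", [])]),
])
print(" ".join(linearize(tree)))   # la hundo vidas la katon
```

The words in the example tree are already inflected; in a fuller system inflection and word order would both be driven by rules rather than a single fixed table.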
Translating from Esperanto into the target language involves a process parallel to the generation of the Esperanto text, although it is a lot cleaner: Esperanto is a more regular language, and the text was generated by a machine, so everything is in a canonical format.
Where there is a transfer ambiguity, statistical correlations are used to resolve it.
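One simple way to picture statistical disambiguation at transfer time is to pick the candidate translation that co-occurs most often with a neighbouring word in the target language. In the sketch below the Esperanto verb 'fari' is assumed to have two English candidates, 'make' and 'do'; the COOCCURRENCE table, its counts, and pick_translation are invented for illustration and do not come from any real corpus or from DLT.

```python
# Hypothetical target-language co-occurrence counts (invented numbers).
COOCCURRENCE = {
    ("make", "cake"): 40,
    ("do", "cake"): 1,
    ("do", "homework"): 35,
    ("make", "homework"): 2,
}


def pick_translation(candidates, context_word):
    """Choose the candidate seen most often with the context word."""
    return max(candidates, key=lambda c: COOCCURRENCE.get((c, context_word), 0))


print(pick_translation(["make", "do"], "cake"))       # make
print(pick_translation(["make", "do"], "homework"))   # do
```

Unseen pairs count as zero, so when no candidate has been observed with the context word the first candidate in the list is returned.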
The original version of this was applied between English and French.
While this is an interesting example of an approach to MT, in practice the results were disappointing. Hindsight indicated that better use could be made of frequency information, with deeper analysis of the source language.
Subsequent work focused on using bilingual corpora, alignment, and an example-based approach.