This is Philipp Koehn's aligner, as released with the Europarl corpus. It consists of two scripts and three abbreviation lists used by the tokenizer:

| File | Description |
|---|---|
| Sentence align | Aligns sentences across the two languages of a parallel corpus. |
| Preprocess | Splits sentences and tokenizes. |
| abbreviations.de | Abbreviations list for German tokenizer. |
| abbreviations.en | Abbreviations list for English tokenizer. |
| abbreviations.fr | Abbreviations list for French tokenizer. |
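
To illustrate why the preprocessing step needs a per-language abbreviations list, here is a minimal Python sketch of abbreviation-aware sentence splitting and tokenization. It is not the released script's code. The function names (`load_abbreviations`, `split_sentences`, `tokenize`) are hypothetical, and the file format it assumes (one abbreviation per line, `#` for comments) is a guess rather than something the release documents.

```python
import re


def load_abbreviations(lines):
    """Collect abbreviations without their trailing period.

    Assumed format (not confirmed by the release): one entry per line,
    blank lines and lines starting with '#' are ignored.
    """
    abbrevs = set()
    for line in lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            abbrevs.add(entry.rstrip("."))
    return abbrevs


def split_sentences(text, abbrevs):
    """Split on '.', '!' or '?' unless the period belongs to a known abbreviation."""
    sentences = []
    current = []
    for token in text.split():
        current.append(token)
        if token[-1] in ".!?":
            stem = token.rstrip(".!?")
            is_abbrev = token.endswith(".") and stem in abbrevs
            if not is_abbrev:
                sentences.append(" ".join(current))
                current = []
    if current:
        # Keep trailing text that has no sentence-final punctuation.
        sentences.append(" ".join(current))
    return sentences


def tokenize(sentence, abbrevs):
    """Separate punctuation from words, keeping known abbreviations intact."""
    tokens = []
    for word in sentence.split():
        if word.endswith(".") and word[:-1] in abbrevs:
            tokens.append(word)
        else:
            tokens.extend(re.findall(r"\w+|[^\w\s]", word))
    return tokens


if __name__ == "__main__":
    abbrevs = load_abbreviations(["# titles", "Mr", "Dr.", ""])
    for sentence in split_sentences("Mr. Smith spoke. Dr. Jones agreed!", abbrevs):
        print(tokenize(sentence, abbrevs))
```

Without the list, the period in `Mr.` would be treated as a sentence boundary, which in turn would throw off the one-sentence-per-line input that a sentence aligner typically expects.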