Choosing the optimal matching type for your TM

Matching type

The following settings significantly affect matching accuracy and the number of hits.

Fuzzy & Hits (default)

This is the default setting, appropriate for Latin-based and other languages that separate words with spaces. CafeTran analyzes source segments word by word and also performs a statistical analysis of smaller chunks, called subsegments, to determine their translation.
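To illustrate what comparing segments "on a word basis" means, here is a minimal sketch. It is not CafeTran's actual algorithm, which is not documented on this page; the function name word_fuzzy_score and the use of Python's difflib are assumptions made for illustration only, and the statistical subsegment analysis is not modeled.

```python
from difflib import SequenceMatcher


def word_fuzzy_score(source: str, candidate: str) -> float:
    """Similarity of two segments compared word by word (0.0 to 1.0).

    Hypothetical helper: segments are split on whitespace, so each
    word counts as one unit of comparison.
    """
    source_words = source.lower().split()
    candidate_words = candidate.lower().split()
    return SequenceMatcher(None, source_words, candidate_words).ratio()


# Three of the four words are shared, so the score is 2 * 3 / 8.
print(word_fuzzy_score("Open the file menu", "Open the edit menu"))  # 0.75
```

Because whole words are the unit of comparison, a single changed word lowers the score by the same amount regardless of how many characters it contains.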

Fuzzy

To speed up matching, you can switch off subsegment matching by selecting this option. The program then shows only fuzzy matching results.

Fuzzy without word separator (previously called ‘Detailed matching’)

This option is suitable for languages that do not mark word boundaries with spaces (e.g. Japanese and Chinese). When it is selected, the program analyzes source segments character by character. Select this setting if your source language does not separate words with spaces, and remember to switch back to the default Fuzzy & Hits setting when you translate from a language that does, such as a Latin-based one.
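The following minimal sketch shows why character-based analysis matters for such languages. It is an illustration under stated assumptions, not CafeTran's implementation: the helper char_fuzzy_score and the use of Python's difflib are hypothetical.

```python
from difflib import SequenceMatcher


def char_fuzzy_score(source: str, candidate: str) -> float:
    """Similarity of two segments compared character by character (0.0 to 1.0)."""
    return SequenceMatcher(None, source, candidate).ratio()


new_segment = "これは本です"    # "This is a book."
tm_segment = "これはペンです"   # "This is a pen."

# Splitting on whitespace yields a single token, so a word-based
# comparison could only report "identical" or "completely different".
tokens = new_segment.split()

# Character by character, five characters are shared: 2 * 5 / 13.
score = char_fuzzy_score(new_segment, tm_segment)
print(len(tokens), round(score, 2))  # 1 0.77
```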

Prefix matching

When this option is selected, CafeTran analyzes the beginnings of words (here called prefixes) and discards the endings responsible for inflection. This significantly increases the number of hits for highly inflected languages. The prefix length is set as a percentage: the higher the percentage, the longer the prefix that the program analyzes. The minimal prefix length option (menu Edit | Options | Memory | Minimal prefix length) sets the shortest prefix allowed. When the "fixed" option is selected, the prefix length is fixed instead of percentage-based: every word gets the minimal prefix length, regardless of its actual length.
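The interplay of the percentage, the minimal prefix length and the "fixed" option can be sketched as follows. This is only an illustration: the function word_prefix, its parameter names, the default values and the rounding-up rule are all assumptions, since the page does not say how CafeTran computes the length internally.

```python
def word_prefix(word: str, percent: int = 60, min_len: int = 3,
                fixed: bool = False) -> str:
    """Return the part of the word that would be compared.

    Hypothetical model: the prefix length is `percent` of the word
    length (rounded up, an assumption), but never shorter than
    `min_len`. With `fixed=True` every word gets exactly `min_len`.
    """
    if fixed:
        length = min_len
    else:
        # Integer ceiling of len(word) * percent / 100.
        length = max(min_len, (len(word) * percent + 99) // 100)
    return word[:length]


print(word_prefix("translation"))                        # transla
print(word_prefix("translated"))                         # transl
print(word_prefix("day", percent=50))                    # day
print(word_prefix("translation", min_len=5, fixed=True)) # trans
```

In this model a higher percentage keeps more of each word, which makes matching stricter, while the fixed option trims long and short words to the same length.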

Custom prefixes

If a word is too highly inflected for automatic prefix matching, you can mark its prefix manually when entering the term into the memory. To do so, insert the pipe character | at the end of the prefix. For example, the Polish phrase "piękny dzień" (a beautiful day) contains the highly inflected word "dzień", which occurs in a number of case forms (dnia, dni, dniom). If you insert pipe characters as follows, "pięk|ny d|zień", CafeTran will also recognize other forms of the phrase (pięknego dnia, pięknych dni, etc.). Note that the pipe in the first word, "pięk|ny", is optional, since its inflection is quite regular and CafeTran should recognize its prefix automatically.
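The effect of the pipe character can be sketched like this. The helpers parse_entry and entry_matches are hypothetical and only model the manual marking described above; words without a pipe are compared as whole words here, so the automatic prefix recognition that CafeTran applies to them is not modeled.

```python
def parse_entry(entry: str) -> list[tuple[str, bool]]:
    """Split a memory entry into (text, is_prefix) pairs, one per word.

    For a word containing '|', the text before the pipe is kept as a
    manual prefix; a word without a pipe is kept whole.
    """
    parsed = []
    for word in entry.lower().split():
        if "|" in word:
            parsed.append((word.split("|", 1)[0], True))
        else:
            parsed.append((word, False))
    return parsed


def entry_matches(entry: str, phrase: str) -> bool:
    """Check whether every word of the phrase fits the entry's pattern."""
    parsed = parse_entry(entry)
    words = phrase.lower().split()
    if len(parsed) != len(words):
        return False
    return all(
        word.startswith(text) if is_prefix else word == text
        for (text, is_prefix), word in zip(parsed, words)
    )


print(entry_matches("pięk|ny d|zień", "pięknego dnia"))  # True
print(entry_matches("pięk|ny d|zień", "pięknych dni"))   # True
print(entry_matches("pięk|ny d|zień", "brzydki dzień"))  # False
```

In this sketch a one-letter prefix such as "d" accepts any word starting with that letter, so very short manual prefixes trade precision for recall.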
