CAT tools: what are they, and how do they work?

When most people hear “CAT tools”, they immediately think of something with whiskers and a tail. But CAT tools have nothing to do with cats – even though I have received the odd cat training request during my time working as a CAT expert. CAT stands for computer-assisted translation. Simply put, CAT tools are computer programs that assist translators. People often confuse CAT and machine translation (MT), and although MT is a different beast altogether, the two are based on some of the same principles. But that is a story for another day, and perhaps another blog article …

Translation memories are perhaps the most important part of a CAT tool

Translation memories (or TMs for the initiated) are essentially databases in which each sentence is saved as it is translated or as the translation is revised. The sentences are saved as pairs (e.g. a German sentence and its English translation) so that the translation can be found based on the source text. And, because the context in which a word appears is crucial to translating it correctly, the database stores whole sentences rather than individual words.

Here is a simple example of a sentence that might appear in a TM:

The purpose of a TM is to automatically find sentences that were previously translated into the other language, whether a year, a month or a minute ago. If the sentence being translated is identical to a sentence that was previously saved in the TM (a 100% match), the stored translation is automatically entered into the field containing the translation.
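To make the idea of a 100% match concrete, here is a minimal Python sketch of a TM as a simple lookup table. It is only an illustration and not how any particular CAT tool is built: the class name, the method names and the sample sentences are all invented for this example.

```python
class TranslationMemory:
    """A toy translation memory: one source sentence maps to one translation."""

    def __init__(self):
        # Hypothetical storage: a real TM is a database, not a Python dict.
        self._pairs = {}

    def save(self, source, target):
        # Saving the same source sentence again overwrites the old translation,
        # which mirrors a revised translation replacing the earlier one.
        self._pairs[source] = target

    def exact_match(self, source):
        # Return the stored translation for an identical sentence, or None.
        return self._pairs.get(source)


tm = TranslationMemory()
tm.save("Die Katze schläft.", "The cat is sleeping.")  # invented sample pair

print(tm.exact_match("Die Katze schläft."))  # The cat is sleeping.
print(tm.exact_match("Der Hund schläft."))   # None
```

The second lookup returns nothing because an exact match requires the whole sentence to be identical, not just most of it.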

CAT tools can also find similar sentences

CAT tools can also find stored sentences that differ from the one being translated, for example because one part is missing or worded differently. The extent to which the stored sentence matches the current source text is displayed as a percentage value (89% in the example below). The words that are different from the sentence being translated are underlined and highlighted.
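For readers who like to see the mechanics, the following Python sketch shows one way a similarity percentage could be calculated. It uses the standard library's difflib and compares the sentences word by word. This is an assumption made purely for illustration: every CAT tool has its own scoring method, so the figures produced here will not match those of a real tool. The function names, the 70% threshold and the sample sentences are invented.

```python
from difflib import SequenceMatcher


def differing_words(source, stored_source):
    """Words of the new sentence that have no counterpart in the stored one."""
    a, b = source.split(), stored_source.split()
    changed = []
    for tag, i1, i2, _j1, _j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag != "equal":
            changed.extend(a[i1:i2])
    return changed


def best_fuzzy_match(source, memory, threshold=0.7):
    """Return (percent, stored_source, stored_target) for the closest entry, or None."""
    best = None
    for stored_source, stored_target in memory.items():
        # Compare word lists, so one changed word counts as one difference.
        ratio = SequenceMatcher(None, source.split(), stored_source.split()).ratio()
        if ratio >= threshold and (best is None or ratio > best[0]):
            best = (ratio, stored_source, stored_target)
    if best is None:
        return None
    return round(best[0] * 100), best[1], best[2]


# Invented sample data: one stored sentence pair.
memory = {"The printer is switched on.": "Der Drucker ist eingeschaltet."}
new_sentence = "The printer is switched off."

score, stored_source, stored_target = best_fuzzy_match(new_sentence, memory)
print(f"{score}% match: {stored_target}")            # 80% match: Der Drucker ist eingeschaltet.
print(differing_words(new_sentence, stored_source))  # ['off.']
```

Four of the five words are identical, which gives 80% here. The suggested translation still says the opposite of what the new sentence means, so the translator has to correct it.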

Warning: CAT tools are not a replacement for translators

Even a 100% match can be wrong. In the following text about a black dog, the 100% match is incorrect due to the German noun gender.

In German, the correct pronoun for a dog is er (“he” or “it”), and the TM match was perhaps about a cat, which takes the feminine pronoun sie (“she” or “it”). It is the translator’s job to review the suggested translation and decide whether it contains any “mistakes”. So no, we can’t just feed our CAT tools with texts and wait for them to spit out a perfect translation.

CAT tools are also used for terminology management

Terminology management is the other main task that CAT tools are used for. This involves creating glossaries that, due to the amount of data they contain, are referred to as termbases. These termbases are used to record specific terms and any additional information. Unlike a TM, a termbase is used to store important terms (names, brand names, product information etc.) rather than full sentences. And while the TM is filled with information automatically as texts are translated, terms need to be manually entered into the termbases. This can be done beforehand, for example when the text is being prepared for translation, or while the text is being translated or edited. In addition to the terms themselves, further information relating to each term – such as a short explanation of what the term means or examples of how the term is used in different contexts – is often added to the termbase. This additional information can also extend to the validity of the term, for example whether the client has approved it or perhaps even rejected it.

The CAT tool automatically searches the termbase for terms that appear in the sentence currently being translated and displays them for the translator. In the example below the term is displayed in the top right of the screen.

This allows the translator to see how specific terms should be translated. Additional information for individual terms can also be added to ensure that all translators and editors use the correct terms.
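The term lookup described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the way a real CAT tool works: the TermEntry fields, the status values "approved" and "rejected", and the sample terms are all invented, and the search is a plain case-insensitive substring check, whereas real tools also have to cope with inflected forms.

```python
from dataclasses import dataclass


@dataclass
class TermEntry:
    source: str                # term in the source language
    target: str                # translation of the term
    note: str = ""             # optional explanation or usage example
    status: str = "approved"   # hypothetical values: "approved" or "rejected"


def find_terms(sentence, termbase):
    """Return every termbase entry whose source term occurs in the sentence."""
    lowered = sentence.lower()
    return [entry for entry in termbase if entry.source.lower() in lowered]


# Invented sample termbase, entered by hand as described in the article.
termbase = [
    TermEntry("Termdatenbank", "termbase", note="Preferred by the client"),
    TermEntry("Termdatenbank", "term database", status="rejected"),
    TermEntry("Übersetzungsspeicher", "translation memory"),
]

for entry in find_terms("Die Termdatenbank wird manuell gepflegt.", termbase):
    print(entry.source, "->", entry.target, f"({entry.status})")
# Termdatenbank -> termbase (approved)
# Termdatenbank -> term database (rejected)
```

Showing the rejected entry alongside the approved one is a deliberate choice in this sketch: it tells the translator not only which term to use, but also which one to avoid.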

TMs and termbases are two different types of language databases that perfectly complement one another. The termbase can be very useful for new texts containing sentences that do not appear in the TM, as it allows individual terms from these sentences to be translated consistently and correctly. Furthermore, the additional term-specific information saved in the termbase – such as which product the term refers to or whether the term is valid or not – can also be crucial in ensuring that the terms are translated correctly.

Saving time and resources

CAT tools help to ensure that texts are consistent and contain the correct client terminology. Every time a client sends us feedback regarding a sentence or term, we update the TM or termbase entries where necessary to make sure that the client’s preferences are implemented in future translations. This saves our clients both time and resources, as they don’t need to answer as many questions or make as many changes to their texts.

Bruno Ciola, Head of Terminology