Marco Rendina: Can you explain what a CAT Tool is?
Mª Ángeles García: Sure! A CAT tool (short for Computer-Assisted Translation tool) is used to store and retrieve translations provided by a translation service. It facilitates the translation process by dividing the text into smaller, translatable segments and organising them so that the translator can manage them more easily, which saves time.
The segments can be retrieved later, ensuring that the translator adheres to the original terminology and writing style. This tool provides matches for material similar to previously translated content, further reducing errors by saving the translated segments alongside the source phrases. The translator can easily access any translated segment at any time to ensure accurate translation.
Computer-assisted translation tools were developed to allow translators to quickly search and modify text segments as needed. These tools also assist with timely revision.
For example, the sentences ‘the cat is fat’ and ‘the cat is black’ share three of their four words, so they are 75% similar. If our database knows how to translate ‘the cat is fat’, it will suggest this translation, requiring the human translator to change only ‘fat’ to ‘black’.
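The 75% figure can be reproduced with a short sketch. This is only an illustration of word-level fuzzy matching, not a description of how PECAT or any other CAT tool computes similarity; the function names, the 0.7 threshold and the Spanish translation stored in the memory are assumptions made for the example.

```python
from difflib import SequenceMatcher


def word_similarity(a: str, b: str) -> float:
    """Word-level similarity between two segments, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def best_match(source: str, memory: dict, threshold: float = 0.7):
    """Return (stored_source, stored_translation, score) for the closest
    stored segment at or above the threshold, or None if there is none."""
    best = None
    for stored_source, stored_translation in memory.items():
        score = word_similarity(source, stored_source)
        if score >= threshold and (best is None or score > best[2]):
            best = (stored_source, stored_translation, score)
    return best


# Hypothetical translation memory with a single stored segment.
memory = {"the cat is fat": "el gato está gordo"}

match = best_match("the cat is black", memory)
if match is not None:
    stored_source, suggestion, score = match
    print(f"{score:.0%} match with '{stored_source}': suggest '{suggestion}'")
```

The translator would then only need to adjust the word that differs in the suggested translation.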
MR: Why should a cultural heritage professional use a CAT Tool?
MAG: When cultural heritage professionals need to upload textual data to Europeana.eu, they can use a CAT tool to revise the content quickly and efficiently. They can build their own translation memories and ‘recycle’ previously translated material. When publishing similar content, a CAT tool can help to obtain reliable and consistent translations. Pangeanic’s CAT tool, called PECAT, provides a range of advantages: time-saving, error reduction, consistency and the ability to handle complex terminology.
MR: Can you explain how PECAT achieves that?
MAG: Of course! PECAT breaks down the text to be translated into segments, which can be sentences, paragraphs, or single words. These segments are presented in a convenient way, making translation easier and faster. Each segment's translation is saved alongside the original text, and both are stored in the database system as a Translation Unit (TU).
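As a rough illustration of the idea of segments and Translation Units, the sketch below splits a text into sentence-level segments and pairs each one with its translation. The class name, the field names and the simple punctuation-based splitting rule are assumptions for the example; PECAT's actual segmentation and storage format are not described in this interview.

```python
import re
from dataclasses import dataclass


@dataclass
class TranslationUnit:
    """A source segment stored together with its translation."""
    source: str
    target: str = ""  # empty until a translation is supplied


def segment(text: str) -> list[str]:
    """Split text into sentences after '.', '!' or '?' (a naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def to_units(text: str) -> list[TranslationUnit]:
    """Create one untranslated unit per segment."""
    return [TranslationUnit(source=s) for s in segment(text)]


units = to_units("The cat is fat. The cat is black.")
units[0].target = "El gato está gordo."  # hypothetical translation

for unit in units:
    print(repr(unit.source), "->", repr(unit.target))
```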
PECAT offers quality control features to ensure translation consistency and accuracy. When editing machine translations, estimating the quality of the translation is crucial. PECAT provides a confidence score for each translation unit, allowing users to prioritise editing lower-scoring units or focus on higher-scoring ones while rejecting the rest.
Additionally, PECAT enables users to accept or reject translation units. This is particularly useful when non-semantic text, such as code or symbols, is mistakenly included in the text to be edited. Users can simply reject these segments to remove them.
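The workflow described above, prioritising by confidence score and accepting or rejecting units, might look roughly like this. It is a sketch under stated assumptions: the 0.0 to 1.0 score scale, the status labels, the function names and the sample records are invented for illustration and do not reflect PECAT's internal data model.

```python
from dataclasses import dataclass


@dataclass
class ScoredUnit:
    source: str
    target: str
    confidence: float        # assumed scale: 0.0 (poor) to 1.0 (good)
    status: str = "pending"  # "pending", "accepted" or "rejected"


def review_queue(units):
    """Pending units ordered from lowest to highest confidence."""
    pending = (u for u in units if u.status == "pending")
    return sorted(pending, key=lambda u: u.confidence)


def accept_above(units, threshold):
    """Accept pending units at or above the threshold and reject the rest."""
    for u in units:
        if u.status == "pending":
            u.status = "accepted" if u.confidence >= threshold else "rejected"


# Hypothetical machine-translated metadata segments.
units = [
    ScoredUnit("Pintura al óleo sobre lienzo", "Oil painting on canvas", 0.92),
    ScoredUnit("Retrato de una mujer", "Portrait of a woman", 0.55),
    ScoredUnit("<div class='x'>", "<div class='x'>", 0.10),
]

# Non-semantic markup that slipped into the text is rejected by hand.
units[2].status = "rejected"

# Option 1: edit the lowest-scoring remaining units first.
queue = [u.target for u in review_queue(units)]
print(queue)
```

The second option mentioned in the interview, keeping only the higher-scoring units and rejecting the rest, would correspond to a call such as `accept_above(units, 0.8)`.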
MR: What makes PECAT useful for cultural heritage institutions, and how has it been tailored to serve their needs?
MAG: In the course of the AI4Culture project, Pangeanic’s PECAT tool has been adjusted to serve the needs of the cultural heritage community. The resulting version of PECAT will be released as an open-source tool and free online service, enabling cultural heritage institutions to take advantage of several unique functionalities. It is a versatile tool that leverages both machine and human expertise to ensure high-quality translations, making it ideal for managing complex translation tasks in cultural heritage settings. It enhances both individual and collaborative translation projects by offering efficient workflow control, adaptability to user needs, and ease of use without requiring technical expertise. PECAT also provides features such as side-by-side text display, a secure work environment, and advanced filtering of translation segments according to their validation status or confidence score.
Moreover, the tool has been extended with specialised functionalities that support the efficient handling of cultural heritage metadata records following the Europeana Data Model. To this end, PECAT is connected with the EuropeanaTranslate machine translation engines, which have been fine-tuned on metadata from Europeana.eu and support translation from all European languages to English. PECAT has also been connected with the MINT metadata aggregation tool, used by many cultural heritage institutions and aggregators. This supports smooth data exchange between the tools and the accurate processing and transmission of translations before they are published on Europeana.eu.
MR: Thanks so much for the insight! We cannot wait to test it - when will it be available?
MAG: As I mentioned before, within the framework of the AI4Culture project, we are releasing an open-source version of PECAT. Cultural heritage institutions can use it freely for their metadata translation tasks, and it will be available on the AI4Culture platform from September 2024.
Find out more
As Mª Ángeles García mentions, in September 2024 the AI4Culture project will launch a platform where open tools will be made available online, together with related documentation and training materials. Keep an eye on the project page on Europeana Pro for more details, and stay tuned on the project's LinkedIn and X accounts!