How To Create and Manage a Translation Memory Properly?

Translation Memories, or TMs, are a key asset in translation and localization. They are often described as essential because they help maintain consistency across translated content and can significantly speed up the translation process. But why, and how?

How and why can translation memories be used for automatic translation? What are their benefits, and do they apply everywhere? How do you create and maintain them? In this article, we offer some practical perspectives on what translation memories can do to make a translator’s work more efficient.

What is a Translation Memory?

A Translation Memory, or TM, is a database, usually stored as a file, that holds “segments”: sentences, paragraphs, or sentence-like units (headings, titles, and so on) that have been previously translated. Each entry pairs a source text with its translation, so translators can reuse it in future projects or use it as a reference. In short, a TM helps translators “remember” how they translated a text in the past.
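
To make the idea concrete, here is a minimal sketch of a translation memory as a data structure. The `Segment` and `TranslationMemory` names are our own illustration, not the internals of any particular CAT tool, and real memories store more metadata than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One translation unit: a source text and its stored translation."""
    source: str
    target: str
    source_lang: str = "en"
    target_lang: str = "fr"


class TranslationMemory:
    """A tiny in-memory TM keyed on the exact source text (illustrative only)."""

    def __init__(self):
        self._segments = {}

    def add(self, segment):
        # A later translation of the same source replaces the earlier one.
        self._segments[segment.source] = segment

    def lookup(self, source):
        """Return the stored translation, or None if the source is unknown."""
        segment = self._segments.get(source)
        return segment.target if segment else None


tm = TranslationMemory()
tm.add(Segment("Save your changes", "Enregistrez vos modifications"))
print(tm.lookup("Save your changes"))
print(tm.lookup("Quit"))
```

The first lookup returns the stored French translation; the second returns `None` because that segment was never translated before.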

Translation Memory vs Machine Translation

Translation Memories and Machine Translation are not the same thing. We will come back to this in more detail below, but for now, just remember the distinction.

Machine Translation is an automatic translation engine or service, such as Google Translate or DeepL. A Translation Memory is a user-curated database of past translations. They are completely different things, but they are often confused because they sound similar.

How to use Translation Memory

There are two popular formats for Translation Memories: TMX and XLIFF. Modern CAT (computer-assisted translation) tools may use alternative, in-house formats.
Translation Memories are therefore files that can be loaded into specialized software. The software uses matching algorithms to determine how relevant each stored segment is, and suggests matches to human translators so they can maintain consistency or even directly reuse pieces of past translations. When two segments, or strings, are identical, this is called a “perfect match” (also known as an exact or 100% match). Algorithms vary from one tool to another, but the logic remains the same.
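
As an illustration of that logic, here is a rough sketch of segment matching using Python's standard `difflib` module. Real CAT tools use their own, more refined algorithms; the `match_score` and `best_matches` helpers and the 70% threshold are assumptions made for this example.

```python
from difflib import SequenceMatcher


def match_score(a, b):
    """Similarity between two segments as a percentage (100 = perfect match)."""
    return round(SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100)


def best_matches(query, memory, threshold=70):
    """Return (score, source, target) tuples at or above the threshold, best first."""
    scored = [(match_score(query, src), src, tgt) for src, tgt in memory.items()]
    return sorted((m for m in scored if m[0] >= threshold), reverse=True)


memory = {
    "Open the file": "Ouvrir le fichier",
    "Close the file": "Fermer le fichier",
    "Delete the account": "Supprimer le compte",
}

for score, source, target in best_matches("Open the file", memory):
    print(f"{score}% | {source} -> {target}")
```

The identical segment comes back first as a perfect match, the similar one follows as a partial ("fuzzy") match, and the unrelated one falls below the threshold and is not suggested.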

Benefits of Using Translation Memories

By building on the foundation of stored translations, translation memories not only save time and reduce costs but also ensure consistency across large or repetitive projects. This is especially valuable when dealing with specialized terminology or maintaining a unified tone throughout the text.

But they are not always relevant. Depending on context, two similar segments can mean entirely different things, and therefore require different translations, a different style, or even different capitalization.

Also, highly creative source content, such as news articles or novels, may not benefit from Translation Memories at all, since the content is unlikely to repeat itself. So although Translation Memories are very powerful tools, whether and how to use them remains the choice of the translator.

Additionally, memories may require maintenance if used over a long period, because a source or target text can go through many revisions over the lifespan of a project.

How to create a Translation Memory

Most modern CAT tools allow you to create and manage Translation Memories from the content you have, with varying degrees of complexity and user-friendliness. They also allow you to export the content you translate with them into various supported formats.

Raiverb also supports exporting to the most popular Translation Memory formats. It can also import content from a wide range of sources and convert it into compatible Translation Memories.

The new version 1.2 will also include a powerful Translation Memory editor, which will allow you to maintain your Translation Memories directly.

Can Machine Translation also take advantage of Translation Memories?

Let’s go back to this topic. Traditionally, it can’t. Machine Translation engines typically do not accept third-party content to customize a translation. This is a serious limitation, since it prevents translations from truly adapting to a context or to an existing style.
AI can lift this limitation in theory. In practice, since Translation Memories can be tens of thousands of strings long, they are impractical to include in regular LLM prompts.

This is why Raiverb integrates the use of Translation Memories directly into its workflow. With Raiverb, you can use as many Translation Memories (along with glossaries and other contextual content) as you want to influence the machine translation.
If your project already has some translated material, or has already been translated into other languages, you’ll be able to take advantage of this.
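
To illustrate the general idea of feeding only the relevant part of a large memory to an AI model, here is a simplified sketch. It is not Raiverb's actual implementation: the `select_context` and `build_prompt` helpers, the similarity threshold, and the prompt wording are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def select_context(segment, memory, limit=3, threshold=0.6):
    """Pick the few stored pairs most similar to the segment being translated."""
    scored = []
    for source, target in memory:
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold:
            scored.append((score, source, target))
    scored.sort(reverse=True)
    return [(source, target) for _, source, target in scored[:limit]]


def build_prompt(segment, examples):
    """Assemble a short prompt that carries only the selected examples."""
    lines = ["Translate from English to French, following these past translations:"]
    lines += [f"- {source} => {target}" for source, target in examples]
    lines.append(f"Text to translate: {segment}")
    return "\n".join(lines)


memory = [
    ("Open the inventory", "Ouvrir l'inventaire"),
    ("Close the inventory", "Fermer l'inventaire"),
    ("Your subscription has expired", "Votre abonnement a expiré"),
]

segment = "Open the inventory now"
examples = select_context(segment, memory)
print(build_prompt(segment, examples))
```

Instead of pasting tens of thousands of strings into a prompt, only the handful of segments that resemble the current one are included.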

Create a Translation Memory with Raiverb

But what if you don’t have a Translation Memory ready?
As mentioned, Translation Memories are mainly seen in the form of TMX and XLIFF files.

Raiverb integrates a converter for this reason. With Raiverb, you can create a Translation Memory from any imported bilingual content, including:

– Excel (.xlsx) tables.
– Copied and pasted (CTRL+V) content.
– The Translation Center, which allows you to use any supported file format.

Here are the steps to build a Translation Memory:

Go to the Utilities tab, then find the “TM Converter” tab.
It should look like this:

[Screenshot: Translation Memory Converter]

    1. Import your bilingual content
      Just like in the Translation Center, you can click “Import File” and select the bilingual .xlsx file you wish to convert. Your source and target texts need to be aligned.
      Alternatively, drag-and-drop works as well.

    2. Set up the content
      The Importer will appear. Tell Raiverb which column is the source and which column is the target.

    3. Set up the export options
      Choose where you want to export your file and what to name it.

    4. Click on “Convert”
      That completes the translation memory creation process!
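
For the curious, here is roughly what such a conversion involves under the hood. This is a minimal sketch using Python's standard library, not Raiverb's converter: the `rows_to_tmx` function, the `example-converter` tool name, and the sample rows are hypothetical, and the rows are given as a plain list so the example stays self-contained (reading an actual .xlsx file would need an extra library).

```python
import xml.etree.ElementTree as ET

# ElementTree writes this namespaced attribute out as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def rows_to_tmx(rows, source_lang, target_lang):
    """Turn aligned (source, target) rows into a TMX 1.4 document string."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-converter",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain",
        "adminlang": "en",
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in rows:
        # Skip rows where either side is missing: they cannot form a pair.
        if not source.strip() or not target.strip():
            continue
        tu = ET.SubElement(body, "tu")
        for lang, text in ((source_lang, source), (target_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


rows = [
    ("New game", "Nouvelle partie"),
    ("Load game", "Charger une partie"),
    ("Untranslated line", ""),
]
tmx_text = rows_to_tmx(rows, "en", "fr")
print(tmx_text)
```

Each aligned row becomes one translation unit (`tu`) holding a source and a target variant (`tuv`), which is why the source and target need to be aligned before converting.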

Wait a minute, what if I don’t have an Excel table?

No worries! There’s a trick!

If you have a way to copy and paste data from a spreadsheet, such as Google Sheets or Lark, you can directly use Import Clipboard.
The only thing you’ll need to do is copy (CTRL+C) your content first, then pick up from step 2 of the step-by-step tutorial above.
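
This works because cells copied from a spreadsheet usually land on the clipboard as tab-separated text. Here is a small sketch of how such pasted content can be turned into source/target pairs. The `parse_pasted_table` function is our own illustration, not Raiverb's importer.

```python
import csv
import io


def parse_pasted_table(text, source_col=0, target_col=1):
    """Read tab-separated clipboard text into (source, target) pairs."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    pairs = []
    for row in reader:
        # Ignore blank lines and rows that do not contain both columns.
        if len(row) <= max(source_col, target_col):
            continue
        source, target = row[source_col].strip(), row[target_col].strip()
        if source and target:
            pairs.append((source, target))
    return pairs


pasted = "Start game\tCommencer la partie\nOptions\tOptions\n\nQuit\t\n"
print(parse_pasted_table(pasted))
```

Rows with an empty source or target are dropped, since they cannot form a usable translation pair.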

Sure, but what if I don’t have a table I can copy?

You can directly import content from the Translation Center. Simply import your content, just as you would for any other file format.

For more detailed step-by-step guides on how to create a translation memory, please refer to the Raiverb manual here.

Best Practices for Managing Translation Memory

We have already mentioned that Translation Memories need to be maintained. There are two main elements to keep in mind: segmentation and revisions.

Segmentation refers to how your text is split into segments, both in the CAT software and in the Translation Memory. CAT tools use matching algorithms that compare two segments to determine how relevant a stored translation is. For instance, if you are translating a document written in Word, you will most likely work with paragraphs rather than individual sentences. But if you work with update notes, subtitles, or any type of list, you will most likely want to work with single sentences.

If the segmentation of the Translation Memory does not match that of the source text you are working on, the CAT tool will struggle to find relevant matches.
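
Here is a small sketch of why this matters. The splitting rules below are deliberately naive (real CAT tools use far more careful segmentation rules), and the sample text and memory are made up for the example.

```python
import re


def split_paragraphs(text):
    """Split on blank lines: one segment per paragraph."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def split_sentences(text):
    """Naive sentence splitting: break after ., ! or ? followed by whitespace."""
    sentences = []
    for paragraph in split_paragraphs(text):
        sentences += [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
    return sentences


def exact_hits(segments, memory):
    """Segments that have a perfect match in the memory."""
    return [s for s in segments if s in memory]


# This memory was built from update notes, one sentence per entry.
memory = {
    "Fixed a crash on startup.": "Correction d'un plantage au démarrage.",
    "Improved loading times.": "Temps de chargement améliorés.",
    "Added two new maps.": "Ajout de deux nouvelles cartes.",
}

document = "Fixed a crash on startup. Improved loading times.\n\nAdded two new maps."

print(len(exact_hits(split_sentences(document), memory)))
print(len(exact_hits(split_paragraphs(document), memory)))
```

With sentence segmentation, all three sentences find a perfect match. With paragraph segmentation, only the single-sentence paragraph does, even though the memory contains every sentence of the document.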

Tips for Maintaining Consistency Across Projects

Over time, translation memories can become outdated or accumulate strings inherited from old content that are no longer relevant to a project. Regular cleanup is necessary to maintain the quality and relevance of your TM. Modern CAT tools include features for this, and standalone software is also available.
From version 1.2, Raiverb will include an editor for your Translation Memories.

Updating and optimizing your Translation Memory is an ongoing process that ensures it remains accurate and relevant. Begin by regularly importing translation memory files and removing duplicate or outdated entries. This not only keeps your TM clean but also improves its efficiency by reducing clutter.
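
As a rough sketch of what such a cleanup can look like, the example below removes duplicates and keeps only the most recent translation of each source text. The `clean_memory` function and the rule "the newest entry wins" are assumptions for illustration; your own criteria for what counts as outdated may differ.

```python
from datetime import date


def clean_memory(entries):
    """Keep one entry per source text: the most recently changed one."""
    latest = {}
    for source, target, changed in entries:
        # Normalize whitespace only; capitalization can be meaningful.
        key = " ".join(source.split())
        current = latest.get(key)
        if current is None or changed > current[2]:
            latest[key] = (key, target, changed)
    return sorted(latest.values(), key=lambda entry: entry[2])


entries = [
    ("Play", "Jouer", date(2022, 1, 1)),
    ("Play", "Jouer", date(2022, 1, 1)),    # exact duplicate
    ("Play ", "Lancer", date(2024, 5, 1)),  # newer revision, stray space
    ("Quit", "Quitter", date(2023, 3, 3)),
]

for source, target, changed in clean_memory(entries):
    print(changed, source, "->", target)
```

Four entries go in and two come out: the duplicate is dropped, and the older translation of "Play" is superseded by the newer one.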

Additionally, consider merging smaller TMs into a larger, more comprehensive one to create a centralized resource that can be used across multiple projects. This remains a useful trick for optimizing translation memory usage.
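
Merging is simple in principle, but the same source text may have been translated differently in two memories. The sketch below (with a hypothetical `merge_memories` function, where the earlier memory wins) merges several memories and reports those conflicts so a human can review them.

```python
def merge_memories(*memories):
    """Merge several {source: target} memories; earlier memories take priority."""
    merged = {}
    conflicts = {}
    for memory in memories:
        for source, target in memory.items():
            if source in merged and merged[source] != target:
                # Same source, different translation: record it for review.
                conflicts.setdefault(source, {merged[source]}).add(target)
                continue
            merged[source] = target
    return merged, conflicts


ui_memory = {"Settings": "Paramètres", "Back": "Retour"}
dialog_memory = {"Back": "Précédent", "Yes": "Oui"}

merged, conflicts = merge_memories(ui_memory, dialog_memory)
print(merged)
print(conflicts)
```

Here "Back" exists in both memories with different translations, so it is kept as "Retour" in the merged memory and listed in `conflicts` for a manual decision.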

Common Mistakes using Translation Memory

Do not trust a translation memory blindly: verify its integrity before importing it into a project. Translation memories are a powerful tool, but they remain an aid in the process, not a replacement for proper attention.

A poorly maintained or outdated TM can lead to the repetition of errors, such as bad translations or typos, across hundreds of segments. This not only compromises the quality of your translations but also undermines the credibility of your work. Always review and validate your TM to ensure it meets the required standards before use.
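
Part of this review can be automated. Below is a minimal sketch of a few sanity checks one might run on each entry before using a memory. The `check_entry` function and the specific checks are our own suggestions, and a flagged entry is only a candidate for review, not necessarily an error (some words are legitimately identical in both languages).

```python
import re

# Matches {name}-style and %s / %d placeholders (an assumption about your content).
PLACEHOLDER = re.compile(r"\{[^{}]*\}|%[sd]")


def check_entry(source, target):
    """Return a list of potential problems found in one source/target pair."""
    problems = []
    if not target.strip():
        problems.append("empty target")
    elif target.strip() == source.strip():
        problems.append("target identical to source")
    if sorted(PLACEHOLDER.findall(source)) != sorted(PLACEHOLDER.findall(target)):
        problems.append("placeholder mismatch")
    if "  " in target:
        problems.append("double space")
    return problems


print(check_entry("Hello {name}", "Bonjour {name}"))
print(check_entry("Hello {name}", "Bonjour"))
print(check_entry("Score: %d", ""))
```

The first entry passes, the second has lost its placeholder, and the third has no translation at all.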

FAQs and TL;DR about Translation Memories

If you still have questions, they will hopefully be answered below!

Visit here to read more general FAQs on Raiverb!

Where are Translation Memories most useful?

If you're working on large projects with lots of repetition, Translation Memories will certainly be a lifesaver! This is most often seen in game or software localization, with many UI elements and/or dialogues that need to remain consistent across different sections of the product. They are also a great help for legal documentation, medical translations, technical manuals, marketing materials, and e-learning content, where maintaining consistency in terminology and style is critical.

Is there a difference between XLIFF and TMX as Translation Memories?

Yes, but the differences are rather minimal. Both are based on the XML standard, so both use the same "language". One of the great advantages of XML is that formats are human-readable. This means anyone can open and edit a TMX or XLIFF Translation Memory with a text editor and change what they need to change (which does not mean it is easy or convenient, but it's possible). TMX can hold translations in several languages for the same segment in a single file, whereas an XLIFF file pairs one source language with one target language, which is theoretically an advantage for TMX. However, in practice, not all CAT tools support multilingual Translation Memories. Multilingual files also tend to over-complicate projects and bloat the files, making them hard to maintain.
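
Because these files are plain XML, any standard XML parser can read them too. Here is a minimal sketch that reads a small, made-up TMX sample with Python's standard library. The `read_tmx` helper and the sample content are our own illustration.

```python
import xml.etree.ElementTree as ET

# The xml:lang attribute lives in the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE_TMX = """<?xml version="1.0"?>
<tmx version="1.4">
  <header creationtool="example" creationtoolversion="0.1" segtype="sentence"
          o-tmf="plain" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>New game</seg></tuv>
      <tuv xml:lang="fr"><seg>Nouvelle partie</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_tmx(xml_text):
    """Return one {language code: text} dictionary per translation unit."""
    root = ET.fromstring(xml_text)
    units = []
    for tu in root.iter("tu"):
        units.append({tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.iter("tuv")})
    return units


print(read_tmx(SAMPLE_TMX))
```

The same few tags (`tu`, `tuv`, `seg`) are what you would see, and could edit by hand, when opening the file in a text editor.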

Why do CAT tools tend to use their own formats, rather than standard Translation Memory formats, to save content?

Some, if not most, CAT tools use their own formats to store translation information. For instance, Trados uses SDLTM. Raiverb 1.2 will introduce its own open format as well. But why? Aren't TMX or XLIFF files enough, if they can store translations? Sadly, no. These formats were primarily designed as exchange formats: an intermediate standard that guarantees interoperability between different software. By design, they are limited in the data they can hold: a source text, a target text, an author, a date, some comments and other metadata, but little else. Modern CAT tools, however, need to store more data in order to deliver modern, advanced features. For instance, professional teams will want some sort of tracking and revision history alongside their translation data. TMX and XLIFF cannot do that.

Why not make a super Translation Memory of Everything Ever and be done for eternity?

If only things were so simple. The reality is that the primary use of a Translation Memory is to maintain consistency, and consistency is driven by the style of the project you are working on. It is impossible to have a super memory of how to translate anything, because the result is not only driven by the source text, but also by the context.

Is Raiverb easy to use? What about for beginners?

Setting up a translation memory can sometimes be difficult for beginners. That is why Raiverb offers flexible input methods, such as copy-paste, Excel integration, and direct Translation Center support, to remove the complexity often associated with traditional CAT tools and let beginners focus on the translation itself rather than on technical hurdles. Additionally, Raiverb's AI translation can automatically follow relevant segments for you when they exceed a certain reliability threshold. This allows you to fully leverage TMs in a way that maximizes your productivity without compromising quality.

Can I edit Translation Memories in Raiverb?

If you are looking not only at how to create a translation memory but also at how to edit one, this is coming very soon. We are building the TM Manager, which will be a revamp of the current "Archives". In the future, the TM Manager will support editing imported TMs as well. In the meantime, the TM Converter allows you to import any content and turn it into a Translation Memory. It is slightly more cumbersome than using a proper editor, but it is still possible.

What languages are supported in Translation Memories? How many entries are supported in Raiverb?

There is no limitation on which languages can be used in a Translation Memory. Users can usually define their own language pair (see localization country codes here) for their TMs. Even with AI-translated and OCR entries, users can still modify and correct the language codes as needed. There is also no limit on the supported size. Be careful, though, because bigger isn't always better when it comes to Translation Memories. Memories that are too large can overload the matching algorithm and surface irrelevant segments.
