Introduction
A UI string in a software application has a unique identifier. Ensure that your localization environment supports an ID-based update process.
This is the first article in a series about various aspects of software localization, such as visual context during localization, software documentation, and language acceptance testing. It describes what is, in my opinion, the most important feature of software localization tools: ID-based localization.
As a consultant and solutions architect with a specialty in localization, I help companies achieve their release goals by launching localized products on time and with the right quality in various markets. A localized product includes, but is not limited to, online help, training materials, marketing materials, instructions for use (IFUs), labels, and UI (user interface) texts.
Software UI texts are a small part of the total translation volume, typically around 5%. Because of this small volume, companies tend to translate UI texts with the same translation management tools that they use for flowing texts in manuals. Unfortunately, these tools often lack concepts that are needed to localize UI texts reliably. One of them is ID-based localization.
Software localization tools help to prepare translation jobs faster and more reliably when developers send updates of application source texts. This article compares a text-based approach using a TM (translation memory) and an ID-based approach.
Translation memory
Assume that we have a training manual translated into five languages and that all segments are stored in a TM. The author adds a new chapter and makes some changes to the source texts. The modified document needs to be translated. Before it is sent to the translators, it is pre-translated using the TM. This process segments the document and looks up existing translations.
If the current, preceding, and following segments are the same, then a so-called in-context exact match (a.k.a. ICE or 101% match) is found, and the translation can be used without any problem. Translation candidates receive penalties if the source, preceding, and/or following segment differs. This results in a so-called fuzzy match, for example, indicating that the string is 87% the same.
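The matching logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: real TM tools use their own scoring algorithms, and the similarity measure (Python's difflib), the penalty of one point per differing neighbor, the 70% fuzzy threshold, and the names score_candidate and entry are all hypothetical.

```python
from difflib import SequenceMatcher

CONTEXT_PENALTY = 1   # hypothetical: points deducted per differing neighbor segment
FUZZY_THRESHOLD = 70  # hypothetical: below this score a candidate is not offered


def similarity(a, b):
    """Rough 0-100 text similarity; a stand-in for a real TM scoring algorithm."""
    return round(SequenceMatcher(None, a, b).ratio() * 100)


def score_candidate(current, previous, following, entry):
    """Score one TM entry against a segment and its two neighbors.

    entry is a dict with the keys 'source', 'previous', 'following', 'target'.
    Returns a (score, label) tuple; label is 'ICE', 'fuzzy', or 'none'.
    """
    if current == entry["source"]:
        text_score = 100
    else:
        # Never report 100 for texts that are not identical.
        text_score = min(99, similarity(current, entry["source"]))

    # Count how many of the two neighbor segments differ from the stored context.
    mismatches = (previous != entry["previous"]) + (following != entry["following"])

    if text_score == 100 and mismatches == 0:
        return 101, "ICE"  # in-context exact match

    score = text_score - mismatches * CONTEXT_PENALTY
    label = "fuzzy" if score >= FUZZY_THRESHOLD else "none"
    return score, label
```

Note that the sketch only looks at text: nothing in it knows which string in the application a segment belongs to.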
Translation companies usually provide quotes to translate the strings with fuzzy or no matches. This way, the customer does not pay for the already translated content. Larger companies usually manage translation jobs themselves and have TMs in place.
This approach may work well if source texts do not change a lot, which is common for flowing texts. However, translators cannot see the changes that authors made in the source document or the changes that other translators made in the past. Consistency between terms in the documentation and the UI must be ensured, so a terminology system is often used in addition to the TM. A main challenge of (source) text-based pre-translation is that there is no guarantee that the right translation candidate is selected. Quality assurance steps may be required before sending materials to the translators and when receiving them back.
Translating UI texts using a TM
Pre-translation using an ID-agnostic TM results in many fuzzy matches. Some TMs can store a string ID with a translation. However, if there are multiple translations for the same string ID, then the TM may still pick an old one. This is simply unreliable, especially for larger projects.
Another problem occurs when the same (short) source text occurs multiple times in the application with different meanings. One of my customers used the term ‘application’ in dozens of places, each with its own string identifier and with different meanings, such as ‘software application’, ‘job application’, ‘apply settings’, etc. With each update of the application, the TM often picked wrong translations based on the source text, causing days of correction work before the translation jobs could be sent out.
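A small Python sketch shows why a lookup keyed on source text cannot tell these strings apart. It is a deliberate simplification: a real TM can store several candidates per source text and then has to choose one, whereas this dictionary simply keeps the last one. The string IDs and the German translations are hypothetical examples.

```python
# Hypothetical UI strings: (string ID, source text, German translation).
ENTRIES = [
    ("IDS_ABOUT_APPLICATION", "Application", "Anwendung"),  # software application
    ("IDS_JOB_APPLICATION", "Application", "Bewerbung"),    # job application
]


def build_text_lookup(entries):
    """Key translations on source text, as a text-based pre-translation does."""
    lookup = {}
    for _string_id, source, target in entries:
        lookup[source] = target  # identical source texts overwrite each other
    return lookup


def build_id_lookup(entries):
    """Key translations on string ID, as an ID-based tool does."""
    return {string_id: target for string_id, _source, target in entries}


text_lookup = build_text_lookup(ENTRIES)
id_lookup = build_id_lookup(ENTRIES)

# The text-based lookup has only one answer for both strings.
print(text_lookup["Application"])
# The ID-based lookup keeps both meanings apart.
print(id_lookup["IDS_ABOUT_APPLICATION"], id_lookup["IDS_JOB_APPLICATION"])
```

Whichever single translation the text-based lookup returns, it is wrong for at least one of the two strings.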
Using a text-based TM tool to translate ID-based UI texts impacts each person involved in localization, directly or indirectly. The stakeholders include developers, testers, project managers, language testers, marketing, leadership, and, most importantly, the organization’s customers. The organization of the end customer has the main burden in terms of costs, quality, and time-to-market.
Most of the internal costs are caused by the amount of preparation that must be done before translation jobs can be sent out. Wrong pre-translations made by the TM need to be corrected manually. This is laborious and boring work that takes a lot of effort and can easily result in errors. I have seen translators quit their jobs because of this.
The right tool to translate UI texts
The right tool must at least support proper ID-based localization. A software text has a string ID, a source text, and translations in various languages. Each translation has a status, such as untranslated, translated, reviewed, or final.
When developers change a source text that is already translated and validated for many languages, the status of those translations should be downgraded to translated. Translators see the original source text, the modified source text, and the current translation for that string ID. They will be able to quickly adapt the translation.
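The data model behind this can be sketched as follows. This is a minimal illustration, not the schema of any particular localization tool: the class names, field names, and the update_source method are assumptions, while the four status values come from the description above.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, Optional


class Status(IntEnum):
    UNTRANSLATED = 0
    TRANSLATED = 1
    REVIEWED = 2
    FINAL = 3


@dataclass
class Translation:
    text: str = ""
    status: Status = Status.UNTRANSLATED


@dataclass
class UIString:
    string_id: str
    source: str
    previous_source: Optional[str] = None  # kept so translators can compare
    translations: Dict[str, Translation] = field(default_factory=dict)

    def update_source(self, new_source: str) -> None:
        """Apply a changed source text and downgrade validated translations."""
        if new_source == self.source:
            return  # nothing changed, translations stay untouched
        self.previous_source = self.source
        self.source = new_source
        for translation in self.translations.values():
            # Reviewed and final translations fall back to 'translated';
            # untranslated entries stay untranslated.
            if translation.status > Status.TRANSLATED:
                translation.status = Status.TRANSLATED
```

Keeping the previous source text next to the new one is what lets a translator see exactly what changed for a given string ID.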
When developers send new source files to be translated, the project manager updates the (ID-based) project with the new files. Modified and new strings are immediately recognized. Translations for strings where the source text did not change remain untouched. The project manager instructs translators to translate that subset of new and modified strings. Note that translators can still use the TM to get suggestions during their translation job.
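The update step can be illustrated with a short Python sketch that compares an incoming source file with the current project by string ID. The function name diff_by_id and the dictionary layout (string ID mapped to source text) are assumptions made for this example; a real tool would read resource files and store much more per string.

```python
def diff_by_id(project, incoming):
    """Classify incoming source strings by comparing them per string ID.

    Both arguments map string ID -> source text.
    Returns a dict with sorted lists of IDs under 'new', 'modified', 'unchanged'.
    """
    result = {"new": [], "modified": [], "unchanged": []}
    for string_id in sorted(incoming):
        if string_id not in project:
            result["new"].append(string_id)
        elif incoming[string_id] != project[string_id]:
            result["modified"].append(string_id)
        else:
            result["unchanged"].append(string_id)
    return result


# Hypothetical project content and a new delivery from the developers.
project = {"IDS_OPEN": "Open", "IDS_SAVE": "Save file", "IDS_EXIT": "Exit"}
incoming = {"IDS_OPEN": "Open", "IDS_SAVE": "Save document",
            "IDS_EXIT": "Exit", "IDS_PRINT": "Print"}

changes = diff_by_id(project, incoming)
# Only the new and modified strings go to the translators.
to_translate = changes["new"] + changes["modified"]
print(to_translate)
```

Because the comparison is made per ID, two strings with identical source text are never confused with each other, and unchanged strings need no pre-translation at all.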
The ID-based concept keeps a project manager in control of the translation project. It ensures that existing translations will stay the same if the source text and/or string identifier do not change. Translators have historical information about the source texts and translations. This concept saves time and reduces the number of errors. It helps to focus on the new and modified strings.
Return on investment
Using the wrong tools and concepts will eventually result in translated materials, but it takes much more time, results in subpar quality, and is frustrating. Translating software with a TM tool for pre-translation makes an organization's internal costs skyrocket and leaves its employees with uncreative tasks such as fixing problems and creating patches.
A localization tool prevents pre-translation errors and eliminates correction work before translation jobs can be sent out. The cost and time savings depend on the volume, expressed in the total number of source strings, the number of target languages, the average number of new and modified strings per release, and the number of releases per year. The number of errors that are made using a TM tool is a percentage of this volume. Each error that could have been prevented using an ID-based tool has an average cost of two internal hours.
For smaller volumes, the impact may not justify the costs of a software localization tool. However, the investment will pay for itself quickly if the number of releases, texts, and/or languages grows.
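As a rough worked example, the estimate can be written as a small Python function. Only the two hours per error come from the text above. How the volume factors are combined, and whether the error percentage applies to all strings or only to new and modified ones, are assumptions of this sketch, and the input numbers are invented purely for illustration.

```python
HOURS_PER_ERROR = 2.0  # average internal hours per preventable error (from the text)


def yearly_correction_hours(strings_pretranslated, target_languages,
                            releases_per_year, error_rate):
    """Estimate internal hours per year spent correcting pre-translation errors.

    strings_pretranslated: strings run through TM pre-translation per release
        (assumption: the caller decides whether this is the total number of
        strings or only the new and modified ones).
    error_rate: fraction of pre-translated strings that end up wrong, e.g. 0.005.
    """
    errors_per_year = (strings_pretranslated * target_languages
                       * releases_per_year * error_rate)
    return errors_per_year * HOURS_PER_ERROR


# Invented example: 5,000 strings, 10 languages, 4 releases, 0.5% error rate.
hours = yearly_correction_hours(5000, 10, 4, 0.005)
print(round(hours))
```

With these invented inputs the estimate comes to about 2,000 internal hours per year, which shows how quickly a seemingly small error percentage adds up across languages and releases.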