How XTM Cloud enables linguists to work more efficiently by leveraging the power of AI
Melissa Favre-Lorraine
Reading time 3 minutes

A key requirement for a translation management system (TMS) is to retrieve Translation Memory (TM) matches as efficiently as possible. The more these are leveraged, the greater the direct savings, enabling linguists to work more quickly while creating high-quality, on-brand content. Think of it as the engine in your car: with a V8, you can accelerate much faster than with a two-cylinder engine, for example.

From Q-gram to Weighted Token Levenshtein

At XTM, we used the market-standard approach, Q-gram, which analyzes sentences at the character level and looks for matching character patterns. The issue? It looks only for matching patterns and cannot take syntax into consideration. This limited our leveraging power, and that wasn’t good enough for us.

Therefore, we decided to develop our own proprietary algorithm: Weighted Token Levenshtein (WTL). This unique algorithm calculates fuzzy matches by recognizing and taking into account the syntax of the segments in a localization project. This enables us to retrieve more matches than any other TMS.
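The exact weighting scheme behind WTL is proprietary, but the core idea of moving from characters to tokens can be sketched with a plain (unweighted) Levenshtein distance computed over words instead of characters; all names below are illustrative:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance over any two sequences."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))  # distances against the empty prefix of `a`
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[j] = min(dp[j] + 1,      # delete a[i-1]
                        dp[j - 1] + 1,  # insert b[j-1]
                        prev + cost)    # substitute (or keep)
            prev = cur
    return dp[n]

def token_similarity(s1, s2):
    """Similarity in [0, 1] computed on word tokens, not characters."""
    t1, t2 = s1.lower().split(), s2.lower().split()
    return 1 - levenshtein(t1, t2) / max(len(t1), len(t2))

print(token_similarity("Enjoy your stay in Chicago",
                       "Enjoy your stay in Las Vegas"))  # ≈ 0.67
```

Even without weights, a token-level distance treats “Chicago” → “Las Vegas” as just two word edits; WTL goes further by weighting the tokens, which is how it can score such pairs far higher.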

Continuous improvements

Developing our own solution not only enables us to deliver a more powerful alternative to Q-gram; it also allows us to improve it continually. The latest version of WTL, introduced in XTM Cloud 13.0, leverages up to 25% more matches from your TM than before. How, you may ask? By addressing discrepancies between the translated sentence and a potential TM match: for instance, when the translated segment is composed of the same words, but in a different order.

“Linguists should not have to manually translate ‘For more information, check our features page’ when ‘Check our features page for more information’ has already been translated. WTL retrieves the previously translated segment and provides a 75% match where previously no match was leveraged at all. This is a huge improvement that will save linguists considerable time and effort.”

Dr. Rafał Jaworski, Linguistic AI Expert
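XTM has not published how WTL scores reordered segments (the 75% above is its own figure), but a minimal, order-insensitive token comparison is enough to show why word-level matching can see what character patterns miss: the two segments share exactly the same words.

```python
import re
from collections import Counter

def token_overlap(s1, s2):
    """Multiset Jaccard similarity over lowercase word tokens (punctuation ignored)."""
    t1 = Counter(re.findall(r"\w+", s1.lower()))
    t2 = Counter(re.findall(r"\w+", s2.lower()))
    return sum((t1 & t2).values()) / sum((t1 | t2).values())

a = "For more information, check our features page"
b = "Check our features page for more information"
print(token_overlap(a, b))  # → 1.0: identical word multisets despite the reordering
```

A real scoring function would still penalize the reordering (hence 75% rather than 100%), but the word-level signal that a usable match exists is clearly there.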

Another situation we tackled relates to segments that differ only by a proper name, such as a geographical place. For instance, a project manager at a travel company has a 200,000-word document containing many country and city names. They need all of it translated without paying again for segments that differ only in those names, such as “Visit Paris” and “Visit Hanoi”. With WTL, they won’t have to: the algorithm recognizes that these segments are almost identical and provides a 92% match.
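A rough sketch of how such weighting might work: give substitutions between recognized proper names a reduced edit cost. The place-name list and the 0.1 cost below are invented for illustration; the real WTL weights and entity recognition are proprietary, and this sketch makes no attempt to reproduce the exact 92% figure.

```python
PLACE_NAMES = {"paris", "hanoi"}  # toy stand-in for real named-entity recognition

def sub_cost(a, b):
    """Substitution cost: swapping one known place name for another is cheap."""
    if a == b:
        return 0.0
    if a in PLACE_NAMES and b in PLACE_NAMES:
        return 0.1  # hypothetical reduced weight for proper names
    return 1.0

def weighted_distance(t1, t2):
    """Levenshtein over tokens, with the weighted substitution cost above."""
    m, n = len(t1), len(t2)
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = float(i)
    for j in range(n + 1):
        dp[0][j] = float(j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = min(dp[i - 1][j] + 1,  # delete
                           dp[i][j - 1] + 1,  # insert
                           dp[i - 1][j - 1] + sub_cost(t1[i - 1], t2[j - 1]))
    return dp[m][n]

def similarity(s1, s2):
    t1, t2 = s1.lower().split(), s2.lower().split()
    return 1 - weighted_distance(t1, t2) / max(len(t1), len(t2))

print(similarity("Visit Paris", "Visit Hanoi"))  # ≈ 0.95 (vs 0.5 with uniform weights)
```

With uniform costs the two-word segments would score only 50%; down-weighting the place-name substitution is what lifts the score into fuzzy-match territory.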

In both these situations, Weighted Token Levenshtein retrieves those matches, ensuring that the TM is fully leveraged. Standard fuzzy matching algorithms, however, would discard those potential matches completely, and linguists would have to translate the sentences manually, from scratch, and charge for newly translated words. Rafał gives an example of how this impacts project costs:

“For simplicity, let’s say that a no-match word is charged at $0.10. A new translation of ‘Enjoy your stay in Chicago’ would therefore cost $0.50. With our algorithm delivering a 97% match, linguists would only have to add in ‘Las Vegas’, which would cost only $0.10. In this case, the translation would be 5 times cheaper! Here, the savings come from the fact that our fuzzy match scoring algorithm was able to retrieve the TM match, whereas standard algorithms would assign too low a score to this example, or possibly fail to retrieve it at all.”

Dr. Rafał Jaworski, Linguistic AI Expert
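Using the illustrative $0.10 per-word rate from the quote, the arithmetic works out as follows:

```python
RATE = 0.10  # $ per no-match word, the illustrative rate from the quote

# Translating the whole 5-word segment from scratch:
from_scratch = RATE * len("Enjoy your stay in Chicago".split())  # $0.50

# With the fuzzy match retrieved, only the substituted city name is billed:
with_match = RATE  # $0.10

print(round(from_scratch / with_match))  # 5x cheaper
```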

Let AI place inline tags where they need to be

AI can also save linguists time by reducing the need to place inline tags. Manually moving tags into the correct location is a time-sapping activity, and linguists should be able to focus solely on their translation. Following user feedback, our AI-powered auto-inline feature has been improved to deliver 73% accuracy (up from 62%). This enhanced functionality leverages the latest XTM NLP advancements in deep translation analysis, delivering improved results so linguists can spend more time on their localization projects and less time manually correcting inline tag placement.

Get more from Machine Translation with AI-enhanced TM technology

XTM recently partnered with SYSTRAN to strengthen the alliance between TM and Machine Translation, creating an opportunity to leverage existing resources in an innovative way. Fuzzy matches from the Translation Memory are now sent directly to SYSTRAN, enabling TM and Machine Translation to work together to deliver the best possible match.

This functionality, powered by SYSTRAN Neural Fuzzy Augmented (NFA), enables linguists to save time by leveraging existing assets. Artificial Intelligence and Machine Translation work together to increase clarity and unlock value in your language assets. 

Find out more about AI-enhanced TM

AI development continues apace at XTM. In the near future, we’ll deliver AI-enhanced detection of non-translatable words, along with AI-powered automatic translation reviews. As Rafał concluded:

“AI techniques have now become pervasive in most of the modules of XTM Cloud. We are on a mission to do more, so our clients see direct results and linguists can focus on what they do best.”

Dr. Rafał Jaworski, Linguistic AI Expert

Join Rafał and other linguist experts at XTM LIVE, the translation technology event of 2022, in San Francisco on April 27th and 28th for in-depth sessions on AI, and engage with the people and innovations that shape the localization industry.

Register for XTM LIVE now