Word of the Week – Natural Language Processing (NLP)

XTM International Word of the week

In the context of this planet’s history, a thousand years is the blink of an eye, but when we compare human communication in the early Middle Ages with the communication technology we have at our fingertips today, the difference is staggering.

The practice of medicine a millennium ago may appear grimly comical to a modern audience. Across Europe, people rarely saw a doctor and were more likely to visit a wise elder or even a witch who would offer herbs or incantations. The importance of the Church and the power of superstition meant that people often combined spells with prayers. While treatment of illness in Greece, India and the Islamic world was more advanced, the lack of communication and of authoritative information was, quite literally, a killer.

When we have no reliable information to guide us, we’re forced to rely on guesswork. Flash forward a thousand years. We now have a world of data to draw on, so how can we make the best use of it?

Natural Language Processing is a term for the technological process that aids computers in understanding human language. It teaches machines to understand the nuance of our communication. That isn’t easy. When we apply NLP to translation between languages, it becomes harder still. The rules that govern information sharing can be abstract and confusing. How many times have you seen a social media post that caused anger and conflict because people failed to recognise an attempt at humour? If a human audience can’t be relied on to decipher tone and intention in their native language, what are the chances of a machine capturing that level of nuance in multilingual communication?

With a world of data to draw on, the chances are improving all the time.

NLP applies algorithms to identify language rules and “teach” these rules to computers. Over the past year, XTM International have taken this process to a new level with the development of Inter-language Vector Space (ILVS).

ILVS indicates the approximate closeness between source and target words within a language segment. It offers a quantitative estimate of the accuracy of a machine translation, which can be used to calculate how much work needs to be done in human editing and how much this should cost.
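To make the idea of "closeness" in a shared vector space more concrete, here is a minimal sketch. It is not XTM's ILVS implementation, whose details are not described here. It assumes that words from different languages are represented as vectors in one space and that closeness is measured with cosine similarity, a common choice for this kind of comparison. The toy vectors, the `segment_closeness` helper and its averaging rule are all hypothetical and exist only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors; real systems learn vectors with hundreds
# of dimensions from large text collections.
toy_vectors = {
    ("en", "doctor"): [0.90, 0.10, 0.30],
    ("pl", "lekarz"): [0.85, 0.15, 0.35],      # Polish for "doctor"
    ("pl", "czarownica"): [0.10, 0.90, 0.20],  # Polish for "witch"
}


def segment_closeness(source_words, target_words, vectors):
    """Hypothetical segment score: for each source word take its best
    match among the target words, then average those best scores."""
    if not source_words or not target_words:
        return 0.0
    best_scores = [
        max(cosine_similarity(vectors[s], vectors[t]) for t in target_words)
        for s in source_words
    ]
    return sum(best_scores) / len(best_scores)


good = segment_closeness([("en", "doctor")], [("pl", "lekarz")], toy_vectors)
poor = segment_closeness([("en", "doctor")], [("pl", "czarownica")], toy_vectors)
print(good > poor)
```

In this toy setup, a segment whose target words sit close to the source words scores higher than one whose words point elsewhere. A score of this kind could in principle be turned into an estimate of editing effort, although how ILVS actually does this is not specified here.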

ILVS mirrors the linguistic problem-solving of the human brain on a macro-level, analysing similarities between words across 250 languages and 31,125 language pairs, drawing on vast amounts of online textual data. Unlike our ancestors in the early Middle Ages, we do have reliable information to guide us, and the decisions made in the ILVS process are informed by 200 terabytes of data. That’s roughly the equivalent of the text of the Bible multiplied 40 million times (taking a plain-text Bible to be about 5 megabytes), and the alignments are completed in less than one second.
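The figure of 31,125 language pairs follows directly from the 250 languages: it is the number of ways to choose two different languages when direction is ignored. A short check, with a hypothetical helper name chosen for illustration:

```python
from math import comb


def unordered_language_pairs(n_languages: int) -> int:
    """Number of distinct two-language combinations, ignoring direction."""
    return comb(n_languages, 2)  # n * (n - 1) / 2


pairs = unordered_language_pairs(250)
print(pairs)  # 31125
```

If translation direction were counted separately (for example, English to Polish and Polish to English as two pairs), the total would double to 62,250.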

Tomorrow, XTM International Content and Partnership Manager Dave Ruane and AI specialist Dr. Rafał Jaworski will discuss these advancements and their benefits as part of a Global Saké ParlamINT discussion on Multilingual AI, NLP and Applied Machine Learning.

Rafał Jaworski is happy to report that he and his fellow language doctors are far better placed to make an accurate diagnosis than the witches and apothecaries of a thousand years ago.

“Modern NLP is based on Artificial Intelligence, which performs analysis of vast amounts of data. Each automatic linguistic decision made by a machine is driven by the expert knowledge of numerous human beings. It’s a bit like getting your diagnosis from one doctor and then seeking a second opinion in consultation with another. Except that when it comes to AI, you can consult millions of doctors at once.”

Natural Language Processing harvests and distils intelligence, enabling us to share information and benefit from the work and wisdom of others in ways that would have been inconceivable a few decades ago, let alone a millennium. If you need a language doctor, XTM International have the technology and the expertise, and we’re on call 24/7/365.