MT = magic translation? A journey through all that is possible and impossible with this tool

“Machine Translation” (MT): all of us have most likely used solutions like Google Translate at least once, be it at work or in our private lives.

Nowadays, MT solutions seem to be becoming more and more established. Many, however, are skeptical about this technological development – some rightly so.

Our aim is to clarify what works, what doesn’t and what has to be taken into account when employing “artificial intelligence”. For this reason, we invited our colleague and in-house Language Technology expert Madelein to a short interview, hoping to clear up some misconceptions and catch a glimpse of what goes on under the “bonnet” of machine translation.

Good morning Madelein,

I’m glad you found the time to introduce us to the world of machine translation. As a Language Technology expert, you’re confronted with this topic on a daily basis at tsd.

tsd: When does it generally make sense for the customer to use MT?

Madelein: MT is suited to a great variety of use cases. Nowadays, an enormous amount of content is created every day that would not be translated without MT.
But, generally, it makes sense to use MT if you have a large amount of translation work, if you need translations to be processed quickly and/or you have texts that occur frequently.

tsd: What kinds of MT are there?

Madelein: A distinction is generally made between “Custom MT” and “Out-of-the-Box MT”. Here at tsd, we employ KantanMT as our solution for Custom MT, and the commercial versions of DeepL and Google Translate as our Out-of-the-Box solution.
The great advantage of “Custom MT” is that we can train the engine ourselves on subject-specific content, which we call “MT Engines Training”. By doing so, we can optimise the MT output, which leads to qualitatively better results than Out-of-the-Box MT.
Furthermore, with “Custom MT”, data security for both content and output can be managed through tsd – a major benefit if you are not keen on sharing content on the World Wide Web.

tsd: Custom MT or Out of the Box – which would you recommend?

Madelein: Generally, the more customer-specific data I can add to an engine, the higher the chances that its output will be of better quality than that of an Out-of-the-Box MT. This answer doesn’t apply to every case, though. It always depends on what kind of use case we are dealing with as well as on the customer’s requirements and expectations.

tsd: What advantages do I have as a customer if a machine translation solution is included in the translation process?

Madelein: MT is suited to a great variety of use cases. Nowadays, an enormous amount of content is created every day that would not be translated without MT.
But, generally, it makes sense to use MT if you have a large amount of translation work, if you need translations to be processed quickly and/or you have texts that occur frequently.

tsd: Real-time translation is a good key point. Could you briefly explain again what real-time translation means and how it is used?

Madelein: Machine translation in real time corresponds more or less to what we see in Google Translate, i.e. boxes into which you can insert pieces of text that are then translated within a few seconds. In some cases, the function is embedded directly, so that the reader cannot recognise that this is an MT output.
If customers prefer using their own customized engines, we can activate them in our intranet for real-time translation applications.

tsd: We are often asked by customers whether we can translate certain languages with MT. Which language combinations can be translated with MT? Are there languages that suit MT more than others?

Madelein: Theoretically, all languages are suitable for machine translation. In practice, it depends on the quantity of data for the given language pair, i.e. whether translation memories, term banks and stock data are available or not. European languages and languages with many speakers are generally easier to work with. For “lower-resource languages”, i.e. languages for which only a small amount of data is available, MT results can be expected to be worse.

tsd: Many customers are also interested in the amount of text that can be translated with MT on a daily basis. Would you be able to set a benchmark?

Madelein: It would be an exciting experiment but it hasn’t been tested here yet, as far as I know! But to give you an idea, a text of 35,000 words could be translated in 20 to 30 minutes, not counting post-editing, i.e. having the machine-created content checked by a professional translator.
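To put that figure in perspective, here is a back-of-the-envelope calculation. It uses only the numbers quoted above (35,000 words in 20 to 30 minutes); the function name `words_per_minute` is our own, and real throughput will vary with the engine, the language pair and the file format. Post-editing time is not included.

```python
def words_per_minute(words: int, minutes: float) -> float:
    """Raw MT throughput in words per minute, ignoring post-editing."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return words / minutes


WORDS = 35_000  # text size quoted in the interview

slow = words_per_minute(WORDS, 30)  # upper end of the quoted time range
fast = words_per_minute(WORDS, 20)  # lower end of the quoted time range

print(f"{slow:.0f} to {fast:.0f} words per minute")
# prints "1167 to 1750 words per minute"
```

This is a rough estimate for raw output only, not a measured benchmark.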

tsd: What has been the oddest request so far?

Madelein: That’s a difficult question… I can tell you what I reckon to be a request that could only be met as if by magic.
Computers are not very clever. AIs may be a little cleverer, but they’re not much smarter than a normal computer. This means that if you run texts through a program, you can’t expect them to be correct in every part. This applies especially to cases involving optical character recognition (OCR) and raw machine translations. A request that can only be met as if by magic would be an OCR with subsequent machine translation, expecting it to be fully correct without having it proofread by a human.

tsd: The following question sounds fundamental: Human or machine – which translates better?

Madelein: Machine translations are generally not bad and humans do not always provide error-free translations. Yet humans are still better translators.
Very good translations can be achieved by means of machine translation. Technology should be regarded as a supporting tool for a human translator rather than as a replacement. Once you are aware of that, you can lower both processing time and costs significantly without getting upset about a mediocre raw machine-driven translation because of misconceptions.

tsd: Translation mistakes have become pretty notorious. What are the typical mistakes one should be aware of with MT?

Madelein: Oh. Well, the beloved bloomers, especially popular on the Web. There are, in fact, typical MT mistakes that do not fall into the category of typing errors commonly made by humans, but rather other kinds of mistakes:

MT sometimes duplicates or triplicates words, as in the following examples:
“Dokumentation zur Dokumentation der Manure Sensing-Dokumentation” (“Documentation for the documentation of the Manure Sensing documentation”), and when “bidirectional” became “Bibibi”.

In addition to that, the relationships between words are often incorrectly recognised, so that “Supplier was inconsistently applying grease […].” became “Der Lieferant wurde ständig mit Schmierfett versehen.” (“The supplier was constantly being coated with grease.”)

Very often, however, content is simply “forgotten”; the target text looks perfect and error-free, whereas the source text contains something completely different…
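The three error types above can be flagged automatically, at least roughly, before a post-editor looks at the text. The following is a minimal sketch of such checks, not a description of how tsd or any MT product works. All function names and thresholds (`min_repeats`, `min_length`, `min_ratio`) are assumptions chosen for illustration, and mistranslations like the grease example cannot be caught this way at all.

```python
import re
from collections import Counter

# A chunk of two or more characters repeated three or more times in a row,
# as in "Bibibi" ("bi" + "bi" + "bi").
STUTTER = re.compile(r"(\w{2,}?)\1{2,}")


def repeated_words(text: str, min_repeats: int = 3, min_length: int = 4) -> list[str]:
    """Return longer words that occur at least `min_repeats` times in a segment."""
    words = [w for w in re.findall(r"\w+", text.lower()) if len(w) >= min_length]
    return [w for w, n in Counter(words).items() if n >= min_repeats]


def has_stutter(text: str) -> bool:
    """True if some chunk is repeated three or more times back to back."""
    return STUTTER.search(text.lower()) is not None


def possible_omission(source: str, target: str, min_ratio: float = 0.5) -> bool:
    """True if the target has suspiciously few words compared with the source."""
    src_words = len(source.split())
    tgt_words = len(target.split())
    if src_words == 0:
        return False
    return tgt_words / src_words < min_ratio


print(repeated_words("Dokumentation zur Dokumentation der Manure Sensing-Dokumentation"))
print(has_stutter("Bibibi"), has_stutter("bidirectional"))
print(possible_omission("one two three four five six", "eins zwei"))
```

Checks like these only raise a flag for a human to review. The word-count ratio in particular needs tuning per language pair, since German compounds, for example, often make a correct translation shorter in words than its English source.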

tsd: Last question: Is it possible to use MT as a single tool or would you rather recommend a combined use with other tools in order to compare the results?

Madelein: When it comes to processes where high quality standards are a priority, I would always recommend a combined use of MT and a TMS (Translation Management System). The same applies when liability policies are involved.
Raw machine-driven translations are not perfect and our post-editors (language experts) work best in a TMS interface. Within the TMS, the process can also be mapped in a better and more elegant way (for example, by pre- and post-processing the data). Through the symbiosis of both these tools, an optimal pre-translation can be achieved.
But generally, every specific application should ultimately be discussed with the customer and the process adapted to the concrete requirements. To put it in Kölsch, our dialect here in Cologne, “Jeder Jeck ist anders” (“everyone’s different”).

Thank you, Madelein! I think this gives us a good insight into the exciting world of machine translation.