Curiosity got the better of me: I have given AI interpreting a go. 

Here’s my honest review as a professional conference interpreter.

I was contacted via e-mail by a SaaS company offering their newest API for “live voice translation”. It claimed it could “open more revenue-generating opportunities” for my business. I would essentially pay them to re-sell their software to my clients (I do love a good long, bloated, middleman-riddled supply chain). 

On their website they offered a demo of the software, so what did I have to lose?

I put it to the test by feeding in practice speeches taken from the European Commission Speech Repository. These are the speeches used by the EU in its interpreting accreditation test, which is considered extremely stringent and has a low pass rate.

I chose two from my language pair: Italian into English, and English into Italian. Read on for the results!

Italian into English: oh boy

The speech was dense, very abstract, and the speaker was…well, very Italian: not very structured, with digressions, lost trains of thought, flowery language and redundancy. This was, essentially, a typical example of what we deal with day in and day out in the booth.

AI just couldn’t cope – the live captions picked up only 60-70% of the original speech. With missing links and no clear train of thought to work from, the “interpretation” was made up of disjointed sentences that had very little to do with the source. 

Verdict: unusable.

English into Italian: what you’d expect from AI

The speech was “beginner” level, with a clear structure, a straightforward topic and a slow-paced delivery. 

AI did its best, but all its inherent flaws shone bright: mistakes in the speech recognition led to nonsensical interpretation and gaps, some English was left in (and hilariously mispronounced by the synthetic voice), and sentences were split in the wrong places. Moreover, the delay between the speaker and the interpretation was at least two sentences (that’s an eternity!). Finally, some content was picked up by the same-language transcription but was never interpreted out loud.

Verdict: a trainee interpreter would do better.

 

Human vs the machine in the interpreting sector 

It just goes to show that AI can’t do it all: it isn’t up to dealing with real-life speakers – who are a far cry from the slow-paced, clearly-enunciating, reading-off-a-pre-prepared-script speakers you often see in demo videos. To put it simply: it doesn’t deliver even on speeches that interpreting students can tackle.

That doesn’t mean that, as an interpreter, I won’t be using AI. There are many interesting ways I can put it to good use before, during and after assignments. It just can’t do my job, because it doesn’t understand what is being said: it merely processes tokens, swapping them from one language to the other based on probability.

Fancy a demo of how human interpreters work and create added value for your business or event? Let’s talk!