Human-driven transcription services appear best equipped, with a 99% accuracy rate
The BBC’s run of errors started in February this year, when its live subtitling service mistakenly produced a subtitle reading “Nigel Owens is a gay” instead of “Nigel Owens is saying penalty and yellow card”. The line referred to the Welsh international referee awarding Scotland a penalty during a Scotland v England rugby match.
Another NSFW subtitle gaffe happened during the recent royal wedding, when automated subtitles read “beautiful breasts” instead of “beautiful dress.”
Human ear is better adapted
The BBC maintains that the errors in the subtitles do not lie with its automated speech recognition (ASR) subtitling service, which it says “produces accuracy levels in excess of 98%”. Yet a survey from the World Economic Forum shows that whilst AI has a word accuracy rate of 95%, it is not yet able to fully substitute for human translation or transcription services.
Recent years have undoubtedly seen major strides in AI transcription, with AI-driven transcription services producing error rates of just 5.9%. However, language experts believe that the human ear is better adapted to recognising a broader vocabulary, different accents, and interlocked speech.
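For readers wondering how figures such as a 95% word accuracy rate or a 5.9% error rate are typically arrived at, the usual yardstick is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the machine's output into the correct transcript, divided by the number of words in the correct transcript. The Python sketch below is illustrative only. It assumes simple lowercase, whitespace-based word splitting, the function name `word_error_rate` is made up for this example, and it is not necessarily how the BBC, the World Economic Forum, or any vendor produced the figures quoted here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (all but the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from output
                curr[j - 1] + 1,     # extra word in output
                prev[j - 1] + cost,  # match, or one word swapped for another
            ))
        prev = curr
    return prev[-1] / len(ref)


# The two subtitle gaffes described above, scored against what was said.
print(word_error_rate("beautiful dress", "beautiful breasts"))  # 0.5
print(word_error_rate(
    "Nigel Owens is saying penalty and yellow card",
    "Nigel Owens is a gay",
))  # 0.625
```

Under this convention, word accuracy is simply one minus the error rate. The measure also treats every wrong word alike: a harmless slip and an embarrassing one each count as a single error, which is why a service can post a high accuracy figure and still produce the occasional headline-making mistake.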
One of the biggest reasons for inaccuracies in ASR-driven subtitles is that the technology is built around the narrow set of simple, short, command-based vocabularies used in interactions with bots, which rely on a dictionary-based vocabulary. As a result, these systems are generally unable to recognise slang terms, colloquialisms, and interlocked speech.
Human-powered transcriptions still preferred
“The inaccuracies seen in the BBC’s ASR-driven subtitles are understandable,” explained Peter Trebek, CEO of transcription service provider GoTranscript. “Although AI-driven tools are able to complement features such as ASR-driven subtitles, they are not yet able to recognise interlocked speech as found in sporting and social events, and neither are they yet able to recognise colloquial language or varied accents. This is why for the immediate future, the human ear will still serve as the standard bearer in terms of transcription and translation services.”
AI-powered transcription certainly has its perks when it comes to speed, which is likely the main reason it is used for live event transcriptions. However, human-powered transcriptions are still the preferred choice for those looking to avoid awkward mistakes.