Speech recognition technology is one of those developments that simply won’t stop speaking to the imagination. Seamless interaction between users and objects has featured in many science-fiction movies, and its possible applications are truly exciting. Whether it is giving commands to an electronic device or dictating long speeches that get transcribed instantly, speech recognition has the potential to make life a lot easier.
While the technology has been around for over two decades, its development has plateaued since the late 90s. Google Search by Voice and Siri have certainly made the technology more popular, but accuracy levels have not changed much over the past few years. Research indicates that speech recognition accuracy rose sharply, from 48% in 1995 to 81% in 1999-2001, and has stagnated since.
Of course, an accuracy level of 81% is still impressive and will suffice in many applications, but what happens when you start replacing transcription services with speech recognition technology?
Unfortunately, the accuracy levels are simply not up to par with those of current audio transcription services. While speech recognition can go a long way in dictating instructions and emails, providing academic, conference call, or even market research transcriptions is simply out of its league. Traditional transcription providers put transcripts through a series of proofing and quality control processes to ensure a level of accuracy that speech recognition can’t match. When it comes to captioning and typing out transcripts, speech recognition is not a stand-alone solution: its output still needs re-editing by a trained transcriptionist.
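To put those accuracy figures in perspective, here is a rough sketch of how many words a transcriptionist would still have to correct. It assumes the quoted percentages are measured per word, which the research cited above may or may not do, and the 1,000-word transcript length and the function name `expected_word_errors` are purely illustrative.

```python
def expected_word_errors(word_count: int, accuracy: float) -> int:
    """Estimate how many words need correction at a given accuracy.

    Assumption: accuracy is a per-word rate between 0 and 1.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return round(word_count * (1.0 - accuracy))


# Hypothetical transcript length, chosen only for illustration.
transcript_words = 1000

# The two accuracy levels quoted in this article.
for accuracy in (0.48, 0.81):
    errors = expected_word_errors(transcript_words, accuracy)
    print(f"{accuracy:.0%} accuracy: about {errors} of {transcript_words} words to correct")
```

Under that assumption, even at 81% roughly one word in five would need fixing, which is why the output still ends up on a transcriptionist’s desk.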
The voice-related technology does have its proponents in the transcription industry, and many consumers have already adopted it. One often-cited benefit for transcription users is the potential cost reduction. Unfortunately, many people underestimate the implementation costs and overlook the fact that many of the transcribed documents still need extensive re-editing and proofing to ensure the quality of the transcript. It is true that in theory you may reduce labor costs, but in practice you will find that you’re nowhere near your target productivity level and cost savings.
Speech recognition presents some great potential applications and has certainly drawn plenty of attention in the past few years. It does look, however, as if its accuracy development has hit a wall. It is currently far from sophisticated enough to completely replace human operators, and unless this changes, the technology is not a viable substitute for transcription services.