Guest blog by Alan Morris. Alan is the co-founder and brand ambassador of Retail Assist.
It’s somewhat ironic that I’m writing a blog about voice recognition technology, at the beginning of 2019, using a computer keyboard that’s an adaptation of the original typewriter built in 1873. The QWERTY keyboard has been the input device of choice for millions of computer users across the world for nearly 50 years and it remains central to human-computer interaction today. But, whilst it has a past, does the keyboard have a future in how we use technology in our everyday lives?
If you consider that the average person can speak 150 words in a minute but can type only 40 words in the same time, you realise that speech is our fastest form of communication. How many times have you thought that some things are “easier to say, than to write”? Given this, you’d be forgiven for questioning why computers weren’t designed to respond to the spoken word from the get-go: surely, that would have been more intuitive?
Well, technologists have been trying to get computers to recognise and respond to the human voice since 1952. One of the very first examples was ‘Audrey’, which could distinguish the ten digits from 0 to 9. At the time this was acknowledged as a significant advance, although compared to the human brain it was somewhat lacking. The problem for the early pioneers was that the technology was very hungry for computing resources, and therefore costly, so widespread adoption was unlikely.
As time moved on, so did the technology. In 1962 IBM launched ‘Shoebox’, which could recognise a vocabulary of 16 English words, and by 1976 ‘Harpy’ had increased the word count to 1,011. Continued advancement saw the introduction of faster microprocessors, which meant the opportunities for voice recognition grew.
By 1997, ‘Dragon NaturallySpeaking’ allowed users to dictate at 100 words per minute, two thirds of the normal speaking rate. Impressive, yes, but it took 45 minutes to train the program and it cost about $695. In 2010, Google launched personalised recognition on Android devices, which would record different users’ voice queries to develop an enhanced speech model. That model drew on 230 billion English words. One year later, Apple introduced the world to its voice-activated digital assistant, Siri. Not only intelligent, Siri was funny too, if asked the right questions or given the correct commands.
So, over a 66-year period, voice recognition technology moved from being able to distinguish between the ten digits to providing us with a voice-activated digital assistant that listens to our speech and takes specific actions based upon our commands. Some predict that by the end of 2021, more than 1.6 million people will use voice-activated digital assistants on a regular basis. But if it is ever to become a ‘can’t live without’ technology, we are going to have to accept that it has to offer more than just timing the boiling of an egg, playing our favourite songs or reminding us of the time our train to work leaves in the morning.
Read the second part of Alan’s blog on voice recognition technology on Wednesday 13th February.