
See No Evil, Hear No Evil: Introducing Amazon Alexa Voice Service (AVS) Security Program

Written by Andrew Jamieson

“Computer …. Oh, computer …”

Forgive my nerdiness for a second, but that scene from Star Trek IV: The Voyage Home – where Scotty is trying to dictate to the computer the method for constructing ‘transparent aluminum’ – has always stuck in my mind. At the time it was funny because of course you can’t talk to a computer, and of course transparent aluminum is not a real thing. However, like many things from science fiction, the vision of a future where our voice is the main interface to our computing systems may be coming true.

Star Trek has brought us many predictions of the future – from tablets to warp drives (at least, theoretically). Even the mobile phone could be said to have been presaged by the Star Trek personal communicators, which rapidly morphed into a ubiquitous ‘always on’ voice command system that the rise of the ‘smart speaker’ is now mirroring.

Far from a flash-in-the-pan gimmick, ‘smart speakers’ – such as the Amazon Echo, Google Home, and Apple HomePod – are expected to experience continued growth in market share over the next few years. Estimates of the exact size of the market vary, of course, but some peg it at up to $30B in 2024, which is no fad in anyone’s language.

But for this market to grow, the use of voice as a primary interface method has to become as ubiquitous as the keyboard and mouse are for our PCs, or the touch screen is for our phones. To do this, the large vendors of ‘voice assistant’ technology are looking to third-party developers to drive this interface into new verticals and distribution channels. Along these lines, and the primary topic of this post, Amazon have their ‘Alexa Voice Service’ (AVS) program, which is designed to provide quality assurance over such third-party devices.

As part of this program, Amazon have security requirements that must be tested by one of the AVS security labs – such as UL – to validate that minimum security measures are in place. This is a ‘black box’ evaluation of the product, essentially a penetration test, that aims to validate that common issues have been remediated in the end product. What kinds of issues may exist? Well, our IoT Top 20 provides a good general overview of security best practice, but for these voice-activated smart home devices specifically, common issues may be:

  • Insecure setup. It’s common for such devices to expose a wireless access point during setup so that the user can connect their mobile device to the product. This access point must be correctly secured, using strong cipher suites and authentication, and must be removed as soon as possible once the user setup process is complete.

If this is not done, anyone else within range may be able to connect to the product through this access point, potentially gaining further access into the user’s home Wi-Fi network.

  • Unauthenticated firmware updates. Most computing systems need regular patching to remain secure, and smart speakers are no exception. However, it’s just as important to make sure that the update process itself is secure, to prevent it becoming a vector for attacks. Software updates should be authenticated using strong cryptography across the entire image, and should implement anti-rollback features wherever possible.

If this is not done, it may be possible for someone to deploy malicious firmware to the device, or to ‘roll back’ the current firmware to an older version with known vulnerabilities. This sort of issue is particularly concerning when the product has a centralized location for updates, which may be deployed automatically. A compromise there may allow a malicious party to deploy their own software across the entire population of installed devices!

  • Vulnerable local services. Smart speakers will often expose additional services over Bluetooth or other wireless connections. These may be a management / configuration interface, user authentication for specific services, or simply data / control channels for the system. Such services must be correctly configured, using secure and authenticated connections.

If this is not done, a malicious party may be able to ‘hack’ the system once they are on the same network. This may seem like a lesser problem – if the party is already on the network, then surely they can be trusted? But this is not necessarily the case; smart speakers are already being deployed into hotels and other commercial locations, where other users of the network may have hostile intent.

  • Exposed cloud services. Smart devices may often come with some features exposed in the ‘cloud’. This may be a database, a remote control server, or any of a number of other options. It is vital that any cloud interface is properly secured, as it is exposed to everyone on the Internet and, as a concentration point, may provide a single point of attack through which a large population of devices can be compromised.

  • Vulnerable software. It’s common these days for the firmware of a device to contain a large portion of open source or commercial software that is obtained from third parties. This software must be kept up to date to ensure that it is not exposing publicly disclosed vulnerabilities, and processes should be in place to validate the authenticity of any software obtained from third parties before it is integrated and used in a formal release of software.
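
To make the firmware update item above a little more concrete, here is a minimal Python sketch of a check that authenticates the entire image and refuses rollbacks. It is an illustration under stated assumptions, not a description of what the AVS requirements mandate or how any particular product works. The names `DEVICE_KEY`, `sign_image`, and `accept_update` are hypothetical, and an HMAC is used only to keep the sketch within the standard library; a real device would more likely verify an asymmetric signature, so that no signing secret lives on the device.

```python
import hashlib
import hmac

# Hypothetical demo key. A real device would hold a verification key and its
# installed-version counter in protected storage, not in source code.
DEVICE_KEY = b"demo-key-not-for-production"


def sign_image(image: bytes, version: int, key: bytes) -> bytes:
    """Compute an authentication tag over the version and the entire image."""
    # Binding the version into the tag stops an attacker from relabelling
    # an old, vulnerable image with a newer version number.
    message = version.to_bytes(4, "big") + image
    return hmac.new(key, message, hashlib.sha256).digest()


def accept_update(image: bytes, version: int, tag: bytes,
                  installed_version: int, key: bytes) -> bool:
    """Return True only if the update is authentic and not a rollback."""
    expected = sign_image(image, version, key)
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, tag):
        return False
    # Anti-rollback: refuse anything that is not newer than what is installed.
    return version > installed_version


if __name__ == "__main__":
    image = b"firmware image contents"
    tag = sign_image(image, 5, DEVICE_KEY)
    # Authentic and newer than the installed version 4.
    print(accept_update(image, 5, tag, installed_version=4, key=DEVICE_KEY))
    # Image modified after signing.
    print(accept_update(image + b"x", 5, tag, installed_version=4, key=DEVICE_KEY))
    # Authentic, but older than the installed version 6.
    print(accept_update(image, 5, tag, installed_version=6, key=DEVICE_KEY))
```

Only the first call is accepted; the tampered image fails authentication and the older image is rejected as a rollback.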

There are many other things device vendors should be aware of, and of course much more detail to be considered even in the items I have described above. However, the fundamental point is to have security processes and testing embedded into the development of your product. UL can assist with understanding the requirements, and help formulate processes and procedures to ensure that your solutions are developed to meet the requirements of the AVS program in the most time- and cost-efficient way possible.

I firmly believe that voice is poised to become a regular and integral part of our interaction with the devices around us. At a time when we find that even transparent aluminum is actually a thing now, perhaps the craziest idea from that scene in Star Trek is that Scotty would have known how to use the archaic interface that is the computer keyboard. However, as we place more intelligence and trust in our user interfaces, we also need to work to understand how these systems can go wrong, and put in place controls to ensure they don’t.

One to beam up.
