Voice tech is probably the closest we’ll ever get to wizardry. Want to cancel all alarms on a lazy Sunday but you’re already in bed and away from your phone? Say the words and your weekend schedule is reset. Like casting a spell.
It’s true – we’ve advanced enough to control entire environments by muttering under our breath. But is it here to stay? What happens when the novelty wears off? Here we examine the rise of voice commands and a trend that may have reached its tipping point:
Last week, 85-year-old grandmother Maria Actis took the internet by storm, declaring that the Google Home is “a mystery” and that she’s “scared.” It’s one of the funnier viral videos floating around the web at the moment, right up there with spitting goats and toddlers talking like adults. Actis challenged the smart device to tell her about the weather, then leapt up from the table in surprise when it responded with an actual answer. She eventually gave up, exasperated that it wouldn’t play her favorite Italian song despite her very clear instructions: “OK Goo Goo, do it!”
Actis’ reaction to her new voice assistant mirrors our own reaction to new user interfaces. We’re still learning how to interact with computers in the most intuitive way possible. We’ve come a long way from MS-DOS and the iPod click wheel to the touchscreen, but a touchscreen is still a screen. Commanding a computer with words alone removes the need for a physical user interface entirely; it may even make the touchscreen obsolete. The new user interface, then, involves something far more complex and abstract: natural language.
Natural language almost exclusively dictated the escape-room experience Amazon created at New York Comic Con to promote its 2018 “Jack Ryan” series. Only Amazon’s smart assistant Alexa could help solve each puzzle room: participants had to verbally instruct her to crack riddles, make crucial calls and even turn off the lights to reveal clues.
Smart assistants also have use cases that are less about convenience than about entertainment. An app like Pikachu Talk lets you converse with the cute yellow Pokémon in its native language – yes, the one consisting entirely of three words: Pi, Pika and Pikachu. But intelligent personal assistants like Alexa are now expected to carry out coherent, multi-threaded conversations: Amazon is seeking out corporate and university researchers to build bots that can hold a conversation for longer than 20 minutes.
The internet of things has made it possible for our connected devices not only to talk to one another but also to hold two-way conversations with us. Perhaps the best way to describe a Google Assistant or an Alexa is as a “digital butler”: these devices remember our preferences, anticipate our needs and do as they’re told. With Google Home Minis and Amazon Echo Dots costing less than $50, it’s no wonder they’ve been dominating our shopping carts and holiday wish lists.
Our new relationship with sound is also reflected in an upcoming partnership between Sonos and Ikea, the third step in Ikea’s Home Smart initiative after wireless charging and smart lighting furniture. Sonos speakers will soon deliver the dings in our doorbells, the rings in our intercom systems, the chimes in our washing machines and the sirens in our smoke alarms. Our vision of a smart home calibrated by connected devices is not a new one, but Ikea’s new sound-focused series seals the deal.
Voice technology changes the way we search. We tend to be more specific and detailed when we speak than when we type. Pair these long-tail search queries with our location data and the companies running the virtual assistants suddenly have far more context to work with: untold information on our daily habits, purchase behaviors and interests that can be fed back to our phones and browsers in the form of advertising.
Voice technology changes the way we talk. If you’re a non-white, non-native English speaker, you know what I’m talking about. Have you ever caught yourself adjusting your accent for Siri just so she wouldn’t struggle to understand you? That slight change in voice and inflection represents a big shift in the canons of natural language. It’s comparable to the recent decline in regional American accents, as mass media and globalization favor a more homogeneous accent devoid of unique character or identity.
Voice technology changes how we view safety and privacy. Alexa can call the cops on you. Last year, during a domestic dispute, a man threatened his girlfriend with a gun and asked whether she had called the authorities – at which point the device, apparently interpreting the question as a command, actually called them, and the man was taken into custody for illegal possession of a firearm. The incident serves as an Orwellian reminder that smart devices are always listening. But that may well be a small price to pay to ensure our safety.
Now that we finally have the capability to play music with our eyes shut, we face a more pressing question: What next?
Using our fingers on screens is already starting to feel outmoded, and voice-enabled devices are just the beginning. Our user interfaces may eventually shed their physical footprint entirely; instead, we’ll get things done by feeding our natural language, gestures, gaze and even brain activity into these vessels of data. Will this truly hands-free world bring us technological salvation? Or does Actis have good reason to fear her Google Assistant?
Brands need to participate in this new-found liberation in creative, non-intrusive ways. If Alexa can be quirky and Siri can be sassy, brands too can turn their tone of voice into a real voice. These developments show that people care not only about how a brand looks but also about how it sounds. Companies have an opportunity to identify the pain points of this new experience and find their voice in a constantly connected world.