Are you talking to me?
Phones, cars, speakers and appliances: Google outlined ambitious plans to take voice assistance everywhere at its Google I/O developer conference in Mountain View, Calif., this week. But as the company brings voice input to potentially dozens of devices around your home and beyond, it also faces some interesting challenges unique to this new interaction model.
For instance: If all of your devices listen, how do they know which one you are talking to?
That problem isn’t a mere hypothetical anymore. Google announced this week that its assistant is already installed on over 100 million Android phones and Google Home speakers. The company also announced partnerships with 19 hardware manufacturers and chipset vendors to take its digital assistant beyond its Google Home speaker and mobile phones and launch it on stereo systems, cars and other consumer electronics devices, some of which will be available before the end of the year.
Google also launched an app for the assistant on iPhones, where it now directly competes with Apple’s Siri, and gave developers tools to bring the Google Assistant to their own devices and hardware hacks. “Speakers, toys, drink-mixing robots, whatever crazy devices all of you think up,” joked Assistant VP of Engineering Scott Huffman to his audience of developers at the conference Wednesday.
All of these devices will by design answer to the same so-called hot word — “Okay Google” — to jump into action and answer your questions and demands. So how will they know which one should respond?
The solution to this problem is to treat voice assistance like a conversation, explained Google Assistant Product Lead Gummi Hafsteinsson during an interview with Variety. As technology prompts users to be more conversational, it also has to get better at participating in these conversations like humans would.
When people talk to each other in a group setting, they don’t address each other by name all the time. Instead, they use more subtle cues like eye contact, body language and even personal history to figure out who is talking to whom.
Google’s vision for its digital assistant also comes without first names. Instead, you just say “Okay Google,” and the assistant comes to life on phones, Google Homes and other devices. And increasingly, those devices also use cues to figure out whose turn it is to respond to your commands.
Google already uses the sensors of your phone for that very purpose, said Hafsteinsson. If the phone is in your hand, chances are you want it to respond. If it has been resting on your desk, you may be talking to your Google Home instead.
On the other hand, if you ask Google’s Assistant to send a text message, you are likely addressing your phone, and not the smart speaker in your kitchen. And a request for a Bloody Mary is probably best answered by that (still hypothetical) drink-mixing robot. “It uses context,” Hafsteinsson said about the Assistant’s attempts to answer your requests with the right device.
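To make the idea concrete, here is a minimal sketch of how such context-based arbitration could work. It is purely illustrative and not Google's implementation: the `Device` class, the `pick_responder` function, the intent names and the scoring order are all assumptions made up for this example. It first filters devices by whether they can handle the request, then prefers a phone that is in the user's hand over a stationary device, and a stationary device over a phone resting on a desk.

```python
from dataclasses import dataclass


@dataclass
class Device:
    # All fields are hypothetical; they stand in for whatever signals a real system has.
    name: str
    capabilities: set          # intents this device can fulfill
    handheld: bool = False     # True for phones, False for speakers and appliances
    in_hand: bool = False      # sensor cue: is the user currently holding it?


def context_score(device):
    """Rank devices by how likely the user is addressing them (assumed ordering)."""
    if device.handheld:
        return 2 if device.in_hand else 0
    return 1  # stationary devices sit between a held phone and a resting one


def pick_responder(devices, intent):
    """Return the device that should answer, or None if no device can."""
    capable = [d for d in devices if intent in d.capabilities]
    if not capable:
        return None
    return max(capable, key=context_score)


phone = Device("phone", {"send_text", "answer_question"}, handheld=True)
speaker = Device("speaker", {"answer_question"})
robot = Device("drink robot", {"mix_drink"})
devices = [phone, speaker, robot]

print(pick_responder(devices, "answer_question").name)  # phone is on the desk
print(pick_responder(devices, "send_text").name)        # only the phone can text
```

In this sketch the request itself does most of the work: a text message can only go to the phone, and a drink order only to the robot. The sensor cue matters only when more than one device could answer.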
The ultimate goal is to let users switch seamlessly between input devices, with the Assistant analyzing their intent and the context of their requests, he said. “As you move through different contexts throughout the day, you will be able to continue the same conversation.”