How Google Found Its Voice

Google executives talk gender, names and other key design decisions for the Google Assistant

A few years back, Google was actively exploring whether it should launch a male counterpart to Amazon’s female Alexa voice assistant.

“When we first launched the Google Assistant, we intended to use a male voice, just to be different,” Google Assistant product manager Brant Ward recently recalled.

However, at the time, text-to-speech technology was still struggling to make male voices sound natural. “You’d get these warble effects, oftentimes, with male voices,” said Ward. This forced the company to ultimately go with a female voice for the Assistant’s launch in 2016.

Fast forward three years, and the Assistant lets U.S. consumers choose among 11 male and female voices, with Google announcing this week that it is bringing additional voice choices to nine more languages.

Much of that has been made possible by rapid advancements in cutting-edge artificial intelligence. But the way Google presents itself through voice interfaces has just as much to do with early design decisions related to the Assistant’s name, personality and more. Ward and Google Assistant personality character lead Emma Coats recently talked to Variety to explain how Google found its voice.

Does software have a gender?

Early on, the team working on the Google Assistant had to figure out a key question: What is Google’s persona? Looking at existing Google products like Gmail and Chrome offered few clues. “You don’t see a lot of personality in these products,” recalled Coats.

Without clear guidance from existing products, the Assistant team decided to map out some potential choices. “There were two schools of thinking,” Coats said. One was to turn the Assistant into a kind of audible version of Google.com, a matter-of-fact oracle that spits out knowledge whenever you ask for it. The other was to create more of a character. Make it helpful, but also a bit playful.

Coats’ team developed 20 hypothetical questions consumers might ask the Assistant, and the answers these two types of personalities might respond with. All of the questions and answers were hung up in a conference room, and key stakeholders were asked to mark the ones they preferred with little dot stickers.

The result of this dot-voting process was that the character won over the oracle — the right choice, if you ask Coats. After all, if you search for “Hello” on Google.com, the first result is a video of the Adele song. Said Coats: “Is that really what you want when you are speaking to something?”

But while the Assistant was supposed to have personality, it was also clear early on that consumers shouldn’t confuse it with an actual person. “It should speak like a human, but it should never pretend to be one,” said Coats.

In addition to personality traits, the Assistant team also had to settle on a range of other issues, recalled Coats: “What are the implications of giving the Assistant a human name? Does software have a gender?”

To make the Assistant more approachable across a variety of cultures, the team ultimately decided against giving it a human name, and instead settled on the “Google Assistant” moniker. “We did want to make it feel like a conversation with Google,” she said. This also helped keep consumers from associating the Assistant with a single gender, ultimately paving the way for Google to roll out additional voices.

When the Assistant sounds like a ransom note

When Google initially developed the Assistant, it was still relying on traditional text-to-speech technology. This required recording a lot of source material with a voice actor, which was then chopped up and reassembled by Google’s algorithms to create words and sentences. “It’s kind of like a ransom note,” said Ward.

That system worked reasonably well for common words and phrases, but would trip up a lot on edge cases. “It would sound really choppy,” he recalled. “Aberrations are hard.”

Google achieved a breakthrough when it replaced its traditional text-to-speech model with a deep learning-based approach called Wavenet in 2017. Curious minds will find more on the way Wavenet works on Google’s Deepmind blog, but in essence, the algorithm generates sounds from scratch after having received enough training from voice samples.

This not only resulted in a much more natural-sounding Google Assistant, it also made it significantly easier for Google to develop and deploy new voices for the Assistant. “We can build more voices in less time,” Ward explained.

With the help of Wavenet, Google has been able to launch 11 voices total in the U.S. market, and many more internationally. The company was even able to work with John Legend to bring his voice to the Assistant.

Ward didn’t want to reveal exactly how many voice samples Legend had to record for this collaboration, but he said that it didn’t require too much of the singer’s time. Now, when users ask Google to “talk like a legend,” they get to hear their local weather forecast and other tidbits in his voice. This is generated by Wavenet on the fly, with a little bit of Legend ad-libbing and singing thrown in for good measure.

With Google offering more voices to choose from, the Assistant team found itself confronted with a new challenge: They didn’t want to resort to old gender and personality stereotypes to present all those choices. The solution has been a color picker within the Google app, which lets users quickly switch back and forth between different voices without gender or similar labels.

Google Assistant voice choices

Google has also been randomizing the voice it starts with in select countries, starting users with either the “red” or the “orange” voice. Users in Italy and South Korea got to hear a male Assistant voice by default at launch.

In the future, consumers may be able to change more than just the Google Assistant’s voice, suggested Coats. “We think a lot about personalizing the assistant.” For instance, consumers may one day be able to name the Assistant, or decide that it should be more professional during business hours, and more playful after-hours.

No matter what form the Assistant takes in the future, getting that fine line between professionalism and personality right, while also keeping it approachable across demographics and cultures, will likely keep the Assistant team busy for some time to come. Said Coats: “We are still finding the balance.”
