How Google Found Its Voice

Google executives talk gender, names and other key design decisions for the Google Assistant

A few years back, Google was actively exploring whether it should launch a male counterpart to Amazon’s female Alexa voice assistant.

“When we first launched the Google Assistant, we intended to use a male voice, just to be different,” Google Assistant product manager Brant Ward recently recalled.

However, at the time, text-to-speech technology was still struggling to make male voices sound natural. “You’d get these warble effects, oftentimes, with male voices,” said Ward. This forced the company to ultimately go with a female voice for the Assistant’s launch in 2016.

Fast forward three years, and the Assistant lets U.S. consumers choose among 11 male and female voices, with Google announcing this week that it is bringing additional voice choices to nine more languages.

Much of that has been made possible by rapid advancements in cutting-edge artificial intelligence. But the way Google presents itself through voice interfaces has just as much to do with early design decisions related to the Assistant’s name, personality and more. Ward and Google Assistant personality character lead Emma Coats recently talked to Variety to explain how Google found its voice.

Does software have a gender?

Early on, the team working on the Google Assistant had to figure out a key question: What is Google’s persona? Looking at existing Google products like Gmail and Chrome offered few clues. “You don’t see a lot of personality in these products,” recalled Coats.

Without clear guidance from existing products, the Assistant team decided to map out some potential choices. “There were two schools of thinking,” Coats said. One was to turn the Assistant into a kind of audible version of Google.com, a matter-of-fact oracle that spits out knowledge whenever you ask for it. The other was to create more of a character. Make it helpful, but also a bit playful.

Coats’ team developed 20 hypothetical questions consumers might ask the Assistant, and the answers these two types of personalities might respond with. All of the questions and answers were hung up in a conference room, and key stakeholders were asked to mark the ones they preferred with little dot stickers.

The result of this dot-voting process was that the character won over the oracle — the right choice, if you ask Coats. After all, if you search for “Hello” on Google.com, the first result is a video of the Adele song. Said Coats: “Is that really what you want when you are speaking to something?”

But while the Assistant was supposed to have personality, it was also clear early on that consumers shouldn’t confuse it with an actual person. “It should speak like a human, but it should never pretend to be one,” said Coats.

In addition to personality traits, the Assistant team also had to settle on a range of other issues, recalled Coats: “What are the implications of giving the Assistant a human name? Does software have a gender?”

In order to help the Assistant be more approachable in a variety of cultures, the team ultimately decided against giving it a human name, and instead settled on the “Google Assistant” moniker. “We did want to make it feel like a conversation with Google,” she said. This also helped keep consumers from associating the Assistant with a single gender, ultimately paving the way for Google to roll out additional voices.

When the Assistant sounds like a ransom note

When Google initially developed the Assistant, it was still relying on traditional text-to-speech technology. This required recording a lot of source material with a voice actor, which was then chopped up and reassembled by Google’s algorithms to create words and sentences. “It’s kind of like a ransom note,” said Ward.

That system worked reasonably well for common words and phrases, but would trip up a lot on edge cases. “It would sound really choppy,” he recalled. “Aberrations are hard.”

Google achieved a breakthrough when it replaced its traditional text-to-speech model with a deep learning-based approach called WaveNet in 2017. Curious minds will find more on the way WaveNet works on Google’s DeepMind blog, but in essence, the algorithm generates sounds from scratch after having received enough training from voice samples.

This not only resulted in a much more natural-sounding Google Assistant, it also made it significantly easier for Google to develop and deploy new voices for the Assistant. “We can build more voices in less time,” Ward explained.

With the help of WaveNet, Google has been able to launch 11 voices total in the U.S. market, and many more internationally. The company was even able to work with John Legend to bring his voice to the Assistant.

Ward didn’t want to reveal exactly how many voice samples Legend had to record for this collaboration, but he said that it didn’t require too much of the singer’s time. Now, when users ask Google to “talk like a legend,” they get to hear their local weather forecast and other tidbits in his voice. This is being generated by WaveNet on the fly, with a little bit of Legend ad-libbing and singing thrown in for good measure.

With Google offering more voices to choose from, the Assistant team found itself confronted with a new challenge: They didn’t want to resort to old gender and personality stereotypes to present all those choices. The solution has been a color picker within the Google app, which lets users quickly switch back and forth between different voices without gender and similar labels.

Google Assistant voice choices

Google has also been randomizing the default voice in select countries, starting users with either the “red” or the “orange” voice, and users in Italy and South Korea got to hear a male Assistant voice by default at launch.

In the future, consumers may be able to change more than just the Assistant’s voice, suggested Coats. “We think a lot about personalizing the assistant.” For instance, consumers may one day be able to name the Assistant, or decide that it should be more professional during business hours, and more playful after-hours.

No matter what form the Assistant takes in the future, getting that fine line between professionalism and personality right, while also keeping it approachable across demographics and cultures, will likely keep the Assistant team busy for some time to come. Said Coats: “We are still finding the balance.”
