A paper set to be delivered at next week’s SIGGRAPH 2017 conference has garnered a lot of pre-confab attention because the technology could possibly be used to produce fake news videos. But the technology described in the paper, “Synthesizing Obama: Learning Lip Sync From Audio,” could have many more beneficial uses, especially in the entertainment and gaming industries.
Researchers from the University of Washington have developed the technology to photorealistically put different words into former President Barack Obama’s mouth, based on several hours of video footage from his weekly addresses. They used a recurrent neural network to study how Obama’s mouth moves, then manipulated his mouth and head motions to sync them to rearranged words and sentences, creating new sentences.
It’s easy to see how this could potentially be used for nefarious purposes, but the technology is a long way away from becoming widely available and it would be fairly easy to detect in fake videos, according to Supasorn Suwajanakorn, the lead author of the study. “It would be relatively easy to develop a software to detect fake video,” he says. “Producing a truly realistic, hard-to-verify video may take much longer than that due to technical limitations.”
SIGGRAPH’s conference chair Jerome Solomon, dean of Cogswell Polytechnical College, notes that any new technology can be used for good or bad. “This is new technology in computer graphics,” he explains. “We’re making things that might not be believable believable and worlds that don’t exist exist. And I think people potentially using any technology out of our industry could use it for bad purposes or good.”
Plus, Solomon says, echoing Suwajanakorn, “I think it’s a ways away from being available to everybody. Our conference is really a place where new technology comes in through our technical papers program, but it takes a while for the technology to appear in the tools. Developers have to go and create the software to actually take this research and get it into the tools.”
And there is a wide variety of uses for this particular technology.
“Automatically editing video to allow accurate lip-sync to a new audio track is a novel advance on a very hot topic with many practical applications,” says Marie-Paule Cani, SIGGRAPH’s technical papers chair. “It could be used, for instance, to seamlessly dub a movie in a foreign language, or to correct what people said in video footage at no cost.”
A number of papers and exhibits of new technology will be on display at SIGGRAPH 2017, to be held July 30 through Aug. 3 at the Los Angeles Convention Center.
Among the many new technologies will be a presentation by brain-computer interface company Neurable. “They make a cap that you put on your head and it reads your brainwaves so you can use it instead of a mouse or a keyboard to do different things,” says Solomon. “They’re coming to SIGGRAPH with that technology to show how you can use it to play a game. Imagine playing a game without having a controller in your hand.”
A new addition to SIGGRAPH this year is a VR theater with ongoing programming. “We’re going to show VR films,” Solomon explains. “We’ll have high-end VR headsets and can actually demonstrate VR storytelling. With the sound and the high-end digital, it’s a really different experience.”