Recycle-GAN: The Next Step in Deepfakes

By Hunter Gallant

Image source: https://www.cs.cmu.edu/~aayushb/Recycle-GAN/
The approach to translating from John Oliver to Stephen Colbert (top row) and from a hibiscus to a synthetic daffodil (bottom row)

As highlighted in a recent report, Carnegie Mellon University researchers have created a method that uses artificial intelligence to transform the content of one video into the style of another. They’ve created false videos of Donald Trump speaking like Barack Obama, daffodils blooming like hibiscuses, and Stephen Colbert mimicking John Oliver. The end result resembles the output of another A.I. technique known as “deepfakes”.

Deepfakes are videos where one person’s face (Person A) is merged onto another person’s (Person B). The intended effect is to convince the viewer that Person A performed the actions of Person B. For instance, the actor Nicolas Cage has been a popular target of deepfake creators, who have placed his face into many movie roles he never played. The same kind of video-transformation technology can also be used to convert black-and-white films to color and to help train self-driving cars to drive at night.

Link to Nicolas Cage Deepfake Example: https://www.youtube.com/watch?v=UwiagqaX4fA

The main method for creating deepfakes uses massive “facesets”, collections of thousands of images of a celebrity’s face, to train a network to replace the face in each frame of the input video with the celebrity’s face. This technique is both resource- and time-intensive. Using a technique called “Recycle-GAN”, researchers at CMU can transform one video into the style of a second video. The result looks like a deepfake video, but without requiring any facesets.

This technology is based on algorithms called generative adversarial networks (GANs). In a GAN, two computer models compete: a generator that creates videos and a discriminator that scores how convincing the generator’s output is. This competition enables the overall system to learn how to transform content into a different style. Cycle-GAN is a variant of GAN that translates in both directions, forwards and backwards: content converted into the target style and then back again should match the original, which improves the quality of the generated content.
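To make the forwards-and-backwards idea concrete, here is a minimal toy sketch in Python. It is not the researchers’ code: the “frames” are short lists of numbers rather than images, and the functions `g_xy`, `g_yx`, `g_yx_bad`, and `cycle_loss` are hypothetical names invented for this illustration. The point is only that a good pair of translators should return a frame to where it started.

```python
# Toy illustration of cycle consistency. Frames are lists of floats,
# not real images, and every function here is hypothetical.

def g_xy(frame):
    """Stand-in generator from style X to style Y."""
    return [2.0 * p + 1.0 for p in frame]

def g_yx(frame):
    """Stand-in generator from style Y back to style X (a true inverse)."""
    return [(p - 1.0) / 2.0 for p in frame]

def g_yx_bad(frame):
    """A flawed backward generator that forgets to undo the offset."""
    return [p / 2.0 for p in frame]

def cycle_loss(frames, forward, backward):
    """Mean absolute difference between each frame and its round trip."""
    total, count = 0.0, 0
    for frame in frames:
        round_trip = backward(forward(frame))
        for original, recovered in zip(frame, round_trip):
            total += abs(original - recovered)
            count += 1
    return total / count

frames = [[0.0, 0.5, 1.0], [0.25, 0.75, 0.5]]
good = cycle_loss(frames, g_xy, g_yx)      # round trip recovers the input
bad = cycle_loss(frames, g_xy, g_yx_bad)   # round trip drifts from the input
print(good, bad)
```

In a real system the two generators are neural networks, and a penalty of this kind is added to the adversarial score during training so that translations keep the original content recoverable.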

Until now, these GAN methods dealt purely with the spatial information in each video frame, leaving undesired artifacts scattered throughout the final product. The CMU researchers developed Recycle-GAN to resolve this issue by taking temporal information (how frames change over time) into account as well as spatial information, leading to much more polished results.
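The temporal idea can be sketched in the same toy fashion. The example below is loosely modeled on the “recycle” loss described in the Recycle-GAN paper, under heavy simplifying assumptions: each frame is a single number, the translators `to_y` and `to_x` and the predictors `predict_linear` and `predict_static` are hypothetical stand-ins, and the error is averaged over time steps for readability. The past frames are translated into the other style, the next frame is predicted there, and the prediction is translated back and compared with the real next frame.

```python
# Toy illustration of a temporal "recycle" check. Each frame is a single
# float, and all functions are hypothetical stand-ins for neural networks.

def to_y(frame):
    """Stand-in generator from style X to style Y."""
    return 2.0 * frame

def to_x(frame):
    """Stand-in generator from style Y back to style X."""
    return frame / 2.0

def predict_linear(history):
    """Predict the next frame by continuing the most recent motion."""
    return history[-1] + (history[-1] - history[-2])

def predict_static(history):
    """Predict the next frame by ignoring motion and repeating the last one."""
    return history[-1]

def recycle_loss(video_x, g_y, g_x, predictor_y):
    """Mean squared error between each real next frame in X and the frame
    obtained by translating the past to Y, predicting there, and
    translating the prediction back to X."""
    errors = []
    for t in range(2, len(video_x)):
        translated_past = [g_y(f) for f in video_x[:t]]
        predicted_y = predictor_y(translated_past)
        reconstructed_x = g_x(predicted_y)
        errors.append((video_x[t] - reconstructed_x) ** 2)
    return sum(errors) / len(errors)

video_x = [0.0, 0.25, 0.5, 0.75, 1.0]  # an object moving at constant speed
loss_linear = recycle_loss(video_x, to_y, to_x, predict_linear)
loss_static = recycle_loss(video_x, to_y, to_x, predict_static)
print(loss_linear, loss_static)
```

The predictor that respects motion scores a lower error than the one that treats every frame in isolation, which is the intuition behind adding temporal information to a purely spatial method.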

What is the future of this technology? A powerful application could be in self-driving cars, generating images of locations in stormy weather or at night. On the other hand, this tool could be dangerous, creating false videos of people doing things they never did.

Sources

Carnegie Mellon University. (2018, September 11). Beyond deep fakes: Transforming video content into another video’s style, automatically: Applications include movie production, self-driving cars, VR content. ScienceDaily. Retrieved September 13, 2018, from www.sciencedaily.com/releases/2018/09/180911083145.htm

Bansal, A., Ma, S., Ramanan, D., & Sheikh, Y. (2018). Recycle-GAN: Unsupervised Video Retargeting. arXiv preprint arXiv:1808.05174.
https://www.cs.cmu.edu/~aayushb/Recycle-GAN/

Dartmouth Undergraduate Journal of Science
