We propose an algorithm for “fully automatic neural face swapping in images and videos.”
So begins a startling revelation by Disney researchers Jacek Naruniec, Leonhard Helminger, Christopher Schroers and Romann M. Weber in a paper delivered virtually at the 31st Eurographics Symposium on Rendering in London recently.
Here’s the abstract:
“In this paper, we propose an algorithm for fully automatic neural face swapping in images and videos. To the best of our knowledge, this is the first method capable of rendering photo-realistic and temporally coherent results at megapixel resolution. To this end, we introduce a progressively trained multi-way (comb network) and a light- and contrast-preserving blending method. We also show that while progressive training enables generation of high-resolution images, extending the architecture and training data beyond two people allows us to achieve higher fidelity in generated expressions. When compositing the generated expression onto the target face, we show how to adapt the blending strategy to preserve contrast and low-frequency lighting. Finally, we incorporate a refinement strategy into the face landmark stabilization algorithm to achieve temporal stability, which is crucial for working with high-resolution videos. We conduct an extensive ablation study to show the influence of our design choices on the quality of the swap and compare our work with popular state-of-the-art methods.”
Got that?
My advice: just watch the video and be prepared to be wowed.
My take: Deepfakes were concerning enough. However, this technology actually has production value. I envision a (very near) future where “substitute actors” (sub-actors?) are the ones who give the performances on set, and then this Disney technology replaces their faces with those of the “stars” they represent. In fact, if I were an agent, I’d be looking for those sub-actors now so I could package the pair. A star who didn’t want to mingle with potential COVID-19 carriers could send their doubles to any number of projects at the same time. All that would be left is to do a high-resolution 3D scan and some ADR work. Of course, Jimmy Fallon already perfected this technique five years ago: