Tero Karras, Samuli Laine, and Timo Aila of Nvidia have just published breakthrough work on Generative Adversarial Networks (GANs) and image generation:
“We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature. The new architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images (e.g., freckles, hair), and it enables intuitive, scale-specific control of the synthesis. The new generator improves the state-of-the-art in terms of traditional distribution quality metrics, leads to demonstrably better interpolation properties, and also better disentangles the latent factors of variation.”
I’ve blogged about Nvidia’s GANs and image generation before, but this improvement in quality is remarkable.
If I understand it correctly, the breakthrough is “style mixing”: the learned style of one generated picture can be applied to another at chosen scales, a bit like a filter. In the paper’s demonstration grid, applying the styles of the faces in the left column to the faces across the top yields the AI-generated faces in the middle.
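For the technically curious, the mechanism the paper borrows from the style transfer literature is adaptive instance normalization (AdaIN): each layer’s feature maps are normalized, then re-scaled and re-shifted per channel by a “style” computed from a latent vector. Below is a minimal sketch in PyTorch of how that could look; the names and dimensions (`AdaIN`, `style_dim`, `w_a`, `w_b`, the layer sizes) are my own illustration, not Nvidia’s actual code.

```python
# A minimal sketch of adaptive instance normalization (AdaIN), the style
# transfer mechanism StyleGAN's generator uses to inject "styles" at each
# scale. Names and dimensions are illustrative, not from Nvidia's code.
import torch
import torch.nn as nn

class AdaIN(nn.Module):
    """Normalize each feature map, then re-scale and re-shift it per channel
    using a (scale, bias) pair computed from a latent style vector w."""
    def __init__(self, style_dim: int, channels: int):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels)
        self.to_scale = nn.Linear(style_dim, channels)
        self.to_bias = nn.Linear(style_dim, channels)

    def forward(self, x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, H, W) feature maps; w: (batch, style_dim)
        scale = self.to_scale(w)[:, :, None, None]
        bias = self.to_bias(w)[:, :, None, None]
        return scale * self.norm(x) + bias

# "Style mixing": feed one latent to the coarse (low-resolution) layers,
# which control pose and identity, and another to the fine layers, which
# control details like hair and colour.
adain_coarse = AdaIN(style_dim=512, channels=64)
adain_fine = AdaIN(style_dim=512, channels=64)
w_a, w_b = torch.randn(1, 512), torch.randn(1, 512)  # two style latents
x = torch.randn(1, 64, 8, 8)  # stand-in for an intermediate feature map
x = adain_coarse(x, w_a)      # coarse style from source A
x = adain_fine(x, w_b)        # fine style from source B
```

In the real generator this happens inside a single network at many resolutions; the sketch just puts the two normalization calls side by side to show where the two sources’ styles enter.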
Read the scientific paper, “A Style-Based Generator Architecture for Generative Adversarial Networks”, for full details.
Of course, we’ve seen something similar before. Way back in 1985, Godley & Creme released a music video for their song “Cry”; the evocative black-and-white video used analogue wipes and fades to blend a myriad of faces together, predating digital morphing. Here’s a cover version and video remake by Gayngs, including a cameo by Kevin Godley:
My take: Definitely scary. But if this is the current state of the art, I think it means we are _not_ living in the Simulation just yet, even though Elon Musk says otherwise.