MAGIC: Mask-Guided Image Synthesis by Inverting a Quasi-Robust Classifier

¹University of Southern California (USC), ²University of California Los Angeles (UCLA), ³Sapienza, University of Rome
The Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023

Abstract

We offer a method for one-shot image synthesis that allows controlled manipulation of a single image by inverting a quasi-robust classifier equipped with strong regularizers. Our proposed method, entitled MAGIC, samples structured gradients from a pre-trained quasi-robust classifier to better preserve the input semantics while retaining the classifier's accuracy, thereby guaranteeing credibility in the synthesis. Unlike current methods that use complex primitives to supervise the process or use attention maps as a weak supervisory signal, MAGIC aggregates gradients over the input, driven by a binary guide mask that enforces a strong spatial prior. MAGIC implements a series of manipulations within a single framework: shape and location control, intense non-rigid shape deformations, and copy/move operations in the presence of repeating objects. It gives users firm control over the synthesis while requiring them only to specify binary guide masks. Our study and findings are supported by qualitative comparisons with the state of the art on the same images sampled from ImageNet, by quantitative analysis using machine perception, and by a user survey of 100+ participants that endorses our synthesis quality.

BibTeX

@article{rouhsedaghat2023magic,
  author  = {Rouhsedaghat, Mozhdeh and Monajatipoor, Masoud and Kuo, C.-C. Jay and Masi, Iacopo},
  title   = {MAGIC: Mask-Guided Image Synthesis by Inverting a Quasi-Robust Classifier},
  journal = {The Thirty-Seventh Conference on Artificial Intelligence - AAAI},
  year    = {2023},
}

For each input, we fix the mask and start the synthesis from three different starting points x'. MAGIC respects the boundaries specified by the target mask y' and generates realistic images, while keeping specificity and producing diverse results.
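
The idea of fixing a guide mask and optimizing the input from several starting points can be illustrated with a toy sketch. This is not the authors' implementation: the per-location logistic scorer, its parameters `w` and `b`, the learning rate, the step count, and the 1-D "image" are all hypothetical stand-ins for the quasi-robust classifier, and the regularizers mentioned in the abstract are omitted. The sketch only shows gradients of a mask-driven loss being applied to the input x' until the scorer's output agrees with the target mask y'.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def invert_with_mask(x_start, mask, w=2.0, b=-1.0, lr=0.5, steps=200):
    """Gradient descent on the input (not on the model weights).

    Toy assumption: a fixed per-location logistic scorer
    p_i = sigmoid(w * x_i + b) stands in for the classifier, and the
    binary guide mask is the per-location target of a cross-entropy loss.
    """
    x = list(x_start)
    for _ in range(steps):
        for i, (xi, mi) in enumerate(zip(x, mask)):
            p = sigmoid(w * xi + b)
            grad = (p - mi) * w  # d(BCE)/d(x_i) for the logistic scorer
            x[i] = xi - lr * grad
    return x


def predict_mask(x, w=2.0, b=-1.0):
    """Threshold the scorer's output to get a binary map."""
    return [1 if sigmoid(w * xi + b) > 0.5 else 0 for xi in x]


# Fixed guide mask y' and three random starting points x' (seeded).
mask = [0, 0, 1, 1, 1, 0, 0, 0]
rng = random.Random(0)
starts = [[rng.uniform(-1.0, 1.0) for _ in mask] for _ in range(3)]
results = [invert_with_mask(s, mask) for s in starts]

for r in results:
    print(predict_mask(r))
```

In this toy setting, each of the three runs ends with a thresholded prediction equal to the guide mask, regardless of the starting point.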

AAAI23 Poster
