Generative Deformable Radiance Fields for Disentangled Image Synthesis of Topology-Varying Objects
The 30th Pacific Graphics Conference (PG 2022) (& Computer Graphics Forum, CGF)

Abstract

3D-aware generative models have demonstrated their superb performance to generate 3D neural radiance fields (NeRF) from a collection of monocular 2D images even for topology-varying object categories. However, these methods still lack the capability to separately control the shape and appearance of the objects in the generated radiance fields. In this paper, we propose a generative model for synthesizing radiance fields of topology-varying objects with disentangled shape and appearance variations. Our method generates deformable radiance fields, which builds the dense correspondence between the density fields of the objects and encodes their appearances in a shared template field. Our disentanglement is achieved in an unsupervised manner without introducing extra labels to previous 3D-aware GAN training. We also develop an effective image inversion scheme for reconstructing the radiance field of an object in a real monocular image and manipulating its shape and appearance. Experiments show that our method can successfully learn the generative model from unstructured monocular images and well disentangle the shape and appearance for objects (e.g., chairs) with large topological variance. The model trained on synthetic data can faithfully reconstruct the real object in a given single image and achieve high-quality texture and shape editing results.

Overview

Overview of our generative deformable radiance field. It consists of a template radiance field conditioned on a appearance code , and a deformation field and a correction field conditioned on a shape code. A discriminator is introduced for adversarial learning between generated images and real ones. All networks are train end-to-end on a collection of color images.

Generation Results

Shape and Appearance Code Swapping

Comparison

The images on the diagonal are the reference instances and others are generated by combining the row-wise shape code and column-wise appearance code. Our method achieves much better shape and appearance disentanglement than EditNeRF and CodeNeRF (e.g., see the color of car roof in the column-wise texture swapping results)