LEGO-3D: Learning Generative 3D Scene Models for Training and Validating Intelligent Systems

ERC Starting Grant #850533

The field of computer vision has witnessed a major transformation away from expert-designed shallow models towards more generic deep representation learning. However, collecting labeled data for training deep models is costly, and existing simulators with artist-designed scenes do not provide the required variety and fidelity. The ERC Starting Grant project LEGO-3D (#850533) tackles this problem by developing probabilistic models capable of synthesizing 3D scenes jointly with photo-realistic 2D projections from arbitrary viewpoints and with full control over the scene elements. Our key insight is that data augmentation, while hard in 2D, becomes easier in 3D, as physical properties such as viewpoint invariances and occlusion relationships are captured by construction. Thus, our goal is to learn the entire 3D-to-2D simulation process. In particular, we focus on the following problems:

  • We devise algorithms for automatically decomposing real and synthetic scenes into latent 3D primitive representations capturing geometry, materials, lighting, and motion.
  • We develop probabilistic generative models to synthesize 3D environments. In particular, we develop unconditional, conditional, and spatio-temporal scene generation networks.
  • We combine differentiable and neural rendering techniques with image synthesis, yielding high-fidelity 2D renderings of the generated 3D representations while capturing ambiguities and uncertainties.
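At the core of the rendering techniques above (as used in radiance-field approaches such as GRAF, GIRAFFE, and KiloNeRF) is differentiable volume rendering: per-sample densities and colors predicted along camera rays are alpha-composited into pixel colors. The NumPy sketch below illustrates that compositing step only; the arrays stand in for the outputs of a learned radiance field, and all shapes and names are illustrative, not from the project's code.

```python
import numpy as np

def volume_render(sigmas, colors, deltas):
    """Alpha-composite per-sample densities and colors along each ray.

    sigmas: (R, S)    non-negative densities at S samples on R rays
    colors: (R, S, 3) RGB values at each sample
    deltas: (R, S)    distances between adjacent samples
    Returns (R, 3) rendered pixel colors.
    """
    # opacity contributed by each sample
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # transmittance: probability the ray reaches sample i unoccluded
    trans = np.cumprod(1.0 - alphas + 1e-10, axis=1)
    trans = np.concatenate([np.ones_like(trans[:, :1]), trans[:, :-1]], axis=1)
    weights = alphas * trans                          # (R, S)
    return (weights[..., None] * colors).sum(axis=1)  # (R, 3)

# Toy usage: one ray with two samples; a dense red sample in front
# occludes a green sample behind it, so the pixel comes out red.
sigmas = np.array([[10.0, 10.0]])
colors = np.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]])
deltas = np.array([[0.5, 0.5]])
pixel = volume_render(sigmas, colors, deltas)
```

Because every operation is differentiable, gradients flow from a photometric loss on the rendered pixels back into the radiance field, which is what enables training 3D-aware generators from 2D images alone.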

We believe that LEGO-3D will significantly impact a large number of application areas. Examples include vision systems that require access to large amounts of annotated data, safety-critical applications such as autonomous driving that rely on efficient means of training and validation, as well as the entertainment industry, which seeks to automate the creation and manipulation of 3D content.

Related Publications

GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields (oral, best paper award)
M. Niemeyer and A. Geiger
Conference on Computer Vision and Pattern Recognition (CVPR), 2021

Counterfactual Generative Networks
A. Sauer and A. Geiger
International Conference on Learning Representations (ICLR), 2021

GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis
K. Schwarz, Y. Liao, M. Niemeyer and A. Geiger
Advances in Neural Information Processing Systems (NeurIPS), 2020

VoxGRAF: Fast 3D-Aware Image Synthesis with Sparse Voxel Grids
K. Schwarz, A. Sauer, M. Niemeyer, Y. Liao and A. Geiger
Advances in Neural Information Processing Systems (NeurIPS), 2022

KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients (oral)
N. Hanselmann, K. Renz, K. Chitta, A. Bhattacharyya and A. Geiger
European Conference on Computer Vision (ECCV), 2022

StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets
A. Sauer, K. Schwarz and A. Geiger
International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), 2022

gDNA: Towards Generative Detailed Neural Avatars
X. Chen, T. Jiang, J. Song, J. Yang, M. Black, A. Geiger and O. Hilliges
Conference on Computer Vision and Pattern Recognition (CVPR), 2022

KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D
Y. Liao, J. Xie and A. Geiger
Pattern Analysis and Machine Intelligence (PAMI), 2022

ATISS: Autoregressive Transformers for Indoor Scene Synthesis
D. Paschalidou, A. Kar, M. Shugrina, K. Kreis, A. Geiger and S. Fidler
Advances in Neural Information Processing Systems (NeurIPS), 2021

KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs
C. Reiser, S. Peng, Y. Liao and A. Geiger
International Conference on Computer Vision (ICCV), 2021

Benchmarking Unsupervised Object Representations for Video Sequences
M. Weis, K. Chitta, Y. Sharma, W. Brendel, M. Bethge, A. Geiger and A. Ecker
Journal of Machine Learning Research (JMLR), 2021

Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks
D. Paschalidou, A. Katharopoulos, A. Geiger and S. Fidler
Conference on Computer Vision and Pattern Recognition (CVPR), 2021

Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition
H. Alhaija, S. Mustikovela, V. Jampani, J. Thies, M. Niessner, A. Geiger and C. Rother
International Conference on 3D Vision (3DV), 2020

Differentiable Volumetric Rendering: Learning Implicit 3D Representations without 3D Supervision
M. Niemeyer, L. Mescheder, M. Oechsle and A. Geiger
Conference on Computer Vision and Pattern Recognition (CVPR), 2020

Towards Unsupervised Learning of Generative Models for 3D Controllable Image Synthesis
Y. Liao, K. Schwarz, L. Mescheder and A. Geiger
Conference on Computer Vision and Pattern Recognition (CVPR), 2020