What is the research focus and the team’s vision?

The team’s research aims at photorealism: acquiring high-resolution, highly accurate 3D models that capture both the geometry and the appearance of the real world, with the goal of enabling photorealistic relighting and, ultimately, photorealistic rendering. To achieve this goal, specific acquisition methods and technologies are developed by combining optics, camera design, and image processing in the field of computational photography. Reconstructing a photorealistic model from the captured data, including the reflection properties, is the next challenge. Robust reconstruction calls for intelligent and highly efficient approaches, which can only be achieved by combining GPGPU programming with advanced machine learning. Similarly, reconstruction is tightly coupled to efficient rendering, where we particularly address challenging light-transport scenarios. A newer trend combines text and visual information in order to explore how the two modalities can support each other, e.g. in language grounding or in creating graphics and scenes from textual descriptions.

Realistic Object Acquisition and Rendering

Team: Raphael Braun, Andreas Engelhardt, Lukas Ruppert, Arjun Majumdar, Faezeh Zakeri

The core competence of the team is the acquisition, rendering, and display of photorealistic 3D models of real-world objects and scenes. In order to fully capture the appearance, both the 3D geometry and the light transport characteristics, i.e. the view- and illumination-dependent reflection properties, have to be measured precisely. The current focus is on acquisition outside of a lab environment and on the acquisition of multispectral reflectance. In addition, capturing the reflectance of moving, dynamic objects remains an open problem.

Besides acquiring reflection properties, we also work on rendering them efficiently in complex global-illumination settings, e.g. using path guiding.
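To illustrate the core idea behind path guiding mentioned above, here is a hedged, minimal sketch (all class and function names are hypothetical, and real guiding methods use spatio-directional trees or mixture models rather than a flat histogram): a distribution over directions is fitted to the radiance carried by previously traced paths and is then importance-sampled for new paths.

```python
import numpy as np

class HistogramGuide:
    """Minimal 1D directional guiding distribution (hypothetical sketch).

    The directional domain is parameterized by a scalar in [0, 1).
    Radiance observed along earlier paths is accumulated into bins;
    new directions are then importance-sampled from the fitted histogram.
    """

    def __init__(self, n_bins=16):
        self.n_bins = n_bins
        # small prior avoids zero-probability bins before any observations
        self.weights = np.full(n_bins, 1e-3)

    def record(self, direction, radiance):
        # accumulate observed radiance into the bin covering this direction
        b = min(int(direction * self.n_bins), self.n_bins - 1)
        self.weights[b] += radiance

    def pdf(self, direction):
        # density = bin probability / bin width
        b = min(int(direction * self.n_bins), self.n_bins - 1)
        p = self.weights / self.weights.sum()
        return p[b] * self.n_bins

    def sample(self, rng):
        # pick a bin proportional to its weight, then a uniform offset inside it
        p = self.weights / self.weights.sum()
        b = rng.choice(self.n_bins, p=p)
        return (b + rng.random()) / self.n_bins

rng = np.random.default_rng(0)
guide = HistogramGuide()
for _ in range(1000):
    d = rng.random()
    # synthetic training signal: a bright directional region in [0.25, 0.5)
    guide.record(d, radiance=4.0 if 0.25 < d < 0.5 else 0.1)

samples = np.array([guide.sample(rng) for _ in range(1000)])
# after fitting, most samples concentrate in the bright region
```

The benefit in rendering is variance reduction: directions that carry most of the light are sampled more often, with the `pdf` used to keep the Monte Carlo estimator unbiased.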

Computational Photography

Team: Andreas Engelhardt, Arjun Majumdar, Raphael Braun, Alexander Oberdörster

Beyond traditional ways of taking pictures, the group researches novel cameras and algorithms. By modifying the optics and the sensor, and by combining multiple different sensors with powerful algorithms, it is possible to acquire better pictures, or pictures that capture modalities other than color. By employing programmable diffractive optics we can shape the point spread function (PSF) of a camera; in this way, images can be refocused after capture, or the contrast can be increased while reducing noise in HDR images. One can capture 3D geometry from image collections and videos, or capture aspects that are otherwise invisible, e.g. temporal, polarization, or wavelength effects. By coupling cameras with projectors and compute power, the additional information can also be visualized directly on real-world surfaces.
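As a hedged sketch of how a shaped PSF enters the picture (the Gaussian PSF below is only a stand-in; a programmable diffractive element could realize very different shapes), image formation can be modeled as convolution of the scene with the camera's PSF. When the PSF is known and well-conditioned, it can be approximately inverted, e.g. by Wiener deconvolution:

```python
import numpy as np

def gaussian_psf(size, sigma):
    """A stand-in PSF; diffractive optics could realize other, engineered shapes."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return psf / psf.sum()  # normalize so overall brightness is preserved

def _pad_psf(psf, shape):
    """Embed the PSF in a full-size array, centered at the origin for the FFT."""
    pad = np.zeros(shape)
    s = psf.shape[0]
    pad[:s, :s] = psf
    return np.roll(pad, (-(s // 2), -(s // 2)), axis=(0, 1))

def capture(scene, psf):
    """Image formation as circular convolution with the PSF (via FFT)."""
    H = np.fft.fft2(_pad_psf(psf, scene.shape))
    return np.real(np.fft.ifft2(np.fft.fft2(scene) * H))

def wiener_deconvolve(image, psf, eps=1e-3):
    """Invert a known PSF; eps regularizes frequencies the PSF barely transmits."""
    H = np.fft.fft2(_pad_psf(psf, image.shape))
    W = np.conj(H) / (np.abs(H)**2 + eps)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * W))

scene = np.zeros((64, 64))
scene[20:40, 20:40] = 1.0  # simple bright square as a test scene
psf = gaussian_psf(9, sigma=1.5)
blurred = capture(scene, psf)
restored = wiener_deconvolve(blurred, psf)
```

Post-capture refocusing follows the same pattern: if the PSF for each depth is known, deconvolving with the PSF of the desired focal plane sharpens content at that depth.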

Massively Parallel Programming (GPGPU)

Team: Lukas Ruppert, Raphael Braun

Besides the traditional topic of real-time rendering on graphics cards, our research focuses on using the massive parallelism of modern GPUs with thousands of cores to support general-purpose computing. All aspects of designing algorithms for massively parallel platforms are of interest. For example, we employ GPUs to simulate particular visual effects such as diffraction and fluorescence, and to perform advanced image processing in real time. In addition, the group works on general-purpose GPU (GPGPU) applications, for example, highly efficient nearest neighbor search in high-dimensional spaces based on product quantization or hierarchical kNN graphs.
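To make the product quantization idea concrete, here is a hedged, CPU-only sketch (the group's actual GPU implementation will differ; all helper names are hypothetical): each vector is split into subvectors, each subvector is quantized against a small per-subspace codebook, and query distances are then approximated from per-subspace lookup tables instead of full-dimensional comparisons.

```python
import numpy as np

def kmeans(data, k, iters=20, seed=0):
    """Tiny k-means for building one per-subspace codebook."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((data[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(assign == c):
                centers[c] = data[assign == c].mean(axis=0)
    return centers

def pq_encode(x, codebooks):
    """Encode a vector as one codeword index per subspace."""
    subs = np.split(x, len(codebooks))
    return [int(np.argmin(((cb - s) ** 2).sum(-1))) for cb, s in zip(codebooks, subs)]

def pq_distances(query, codes, codebooks):
    """Approximate squared distances via per-subspace lookup tables."""
    qsubs = np.split(query, len(codebooks))
    tables = [((cb - q) ** 2).sum(-1) for cb, q in zip(codebooks, qsubs)]
    return np.array([sum(t[c] for t, c in zip(tables, code)) for code in codes])

rng = np.random.default_rng(1)
db = rng.normal(size=(200, 8))          # toy database of 8-dimensional vectors
m, k = 2, 16                            # 2 subspaces, 16 centroids each
codebooks = [kmeans(np.stack([np.split(v, m)[j] for v in db]), k) for j in range(m)]
codes = [pq_encode(v, codebooks) for v in db]

query = db[0] + 0.01 * rng.normal(size=8)   # query near a known database vector
dists = pq_distances(query, codes, codebooks)
nearest = int(np.argmin(dists))
```

The lookup-table structure is what makes the scheme attractive on GPUs: after computing m small tables per query, scoring each database code is a handful of independent table reads and additions, which parallelizes trivially across thousands of cores.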

Combining Text and Visual Information

Team: Hassan Shahmohammadi, Zohreh Ghaderi, Leonard Salewski

Linking images and natural language opens up new application scenarios. For one, providing a natural language interface to the visual world enables localization of and reasoning about everyday objects in our environment. This includes visual grounding of word embeddings, video captioning, and visual question answering including explanation generation. Furthermore, by closely analyzing the semantics of objects and by interpreting scenes, we work on automatically translating text into visual representations. A key ingredient is harvesting multiple knowledge sources, both in the form of texts and of image and video databases, with the aim of strengthening graphics/vision understanding as well as addressing psycho-linguistic questions.

Machine Learning for Image and Video Processing including Medical Imaging

Team: Zohreh Ghaderi, Raphael Braun, Simon Holdenried-Krafft, Simon Doll, Sarah Müller

Many image and video processing algorithms have to deal with constantly changing input data, as each frame might show different content under different illumination. We apply machine learning architectures such as deep convolutional neural networks, recurrent networks, and Transformers to enable novel information extraction and enhancement applications. These include image and video deblurring, road lane prediction, appearance-robust similarity estimation, measuring aesthetics, and analyzing retinal fundus images and whole-slide images in pathology.