Robust Vision Projects

The CRC 1233 "Robust Vision" is organized in fourteen research projects (TP1-TP14), a mercator project and a junior research group, which are supported by a central coordination project (Z-Project).

Project 1: Physics-based scene understanding

Project description

The goal of this project is to develop models that recover a physical interpretation of visual scenes. We aim to infer representations of images and videos that include semantic object properties and physical quantities such as weight, stiffness, shape, and motion. This physics-based understanding goes beyond what is currently possible with scene understanding models. We will develop new computational models, based on generative models and deep learning, that include meaningful and interpretable representations. These models will be empirically validated on novel, challenging datasets that we will create.
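
As a toy illustration of the analysis-by-synthesis idea behind such models (a sketch of ours, not the project's actual method), the following snippet inverts a simple physics forward model: a bouncing-ball simulation whose latent parameters, drop height and elasticity, are recovered by fitting the simulated trajectory to noisy observations. All quantities are made up for illustration.

```python
# Minimal analysis-by-synthesis sketch (hypothetical): a physics forward model
# simulates a bouncing ball, and least-squares fitting recovers latent physical
# parameters (drop height, elasticity) from an observed noisy trajectory.
import numpy as np
from scipy.optimize import least_squares

DT, G, T = 0.01, 9.81, 300  # time step [s], gravity [m/s^2], number of frames

def simulate(h0, elasticity):
    """Forward model: height trajectory of a ball dropped from h0 [m]."""
    h, v, traj = h0, 0.0, []
    for _ in range(T):
        v -= G * DT
        h += v * DT
        if h < 0.0:                      # bounce: lose energy at each impact
            h, v = 0.0, -v * elasticity
        traj.append(h)
    return np.array(traj)

rng = np.random.default_rng(0)
observed = simulate(2.0, 0.75) + rng.normal(0, 0.01, T)  # noisy "video" input

# Invert the generative model: find parameters whose simulation matches the data.
fit = least_squares(lambda p: simulate(*p) - observed,
                    x0=[1.0, 0.5], bounds=([0.1, 0.1], [5.0, 0.99]))
print(f"recovered drop height {fit.x[0]:.2f} m, elasticity {fit.x[1]:.2f}")
```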

Principal Investigators

Project 2: Robust material inference

Project description

This project aims at robust material inference, exploiting deep learning and generative modeling, such that the material type and reflectance parameters are recovered independently of the current view or illumination, and novel views can be predicted. One central question of the project is to identify which visual features are exploited to robustly distinguish materials. Using causal inference, the answer will be derived from a comprehensive dataset featuring a systematic variation of input modalities, from images and videos to reflectance fields, under controlled and uncontrolled illumination.

Principal Investigators

Project 3: Comparing humans and machines on robust visual inference

Project description

Biological visual systems facilitate navigation through complex environments and can recognise objects despite highly variable contextual conditions. This robust visual inference relies crucially on a suitably general feature space: the abstraction of meaningful environmental properties from a light field. In this project we will develop a psychophysical paradigm to directly compare human and machine vision systems (primarily, the feature spaces encoded by deep convolutional neural networks) on two robust visual inference tasks (scene understanding and style/content dissociation). We will develop two deep convolutional neural network models: one optimised to perform the task correctly, and one optimised to mimic human decisions. We will then compare the representational similarity of these two models to gain insight into which features are shared.
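
One standard tool for such a comparison is representational similarity analysis. The sketch below (our illustration; random matrices stand in for the two networks' activations on a shared stimulus set) compares two feature spaces by rank-correlating their representational dissimilarity matrices.

```python
# Sketch of a representational similarity analysis (RSA) between two models.
# The feature matrices are stand-ins for layer activations of the two networks
# (task-optimised vs. human-mimicking) on the same stimuli.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_stimuli = 50
feats_task = rng.normal(size=(n_stimuli, 512))           # task-optimised net
feats_human = (feats_task @ rng.normal(size=(512, 256))  # partially shared
               + rng.normal(size=(n_stimuli, 256)))      # representation

def rdm(features):
    """Representational dissimilarity matrix: 1 - Pearson r per stimulus pair."""
    return pdist(features, metric="correlation")  # condensed upper triangle

# Second-order comparison: rank-correlate the two RDMs.
rho, _ = spearmanr(rdm(feats_task), rdm(feats_human))
print(f"representational similarity (Spearman rho): {rho:.2f}")
```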

Principal Investigators

Project 4: Causal inference strategies in human vision

Project description

Recently there has been considerable progress in understanding causal inference by viewing it as a machine learning problem. In our approach we combine recent advances in causal inference in machine learning with a classical psychophysics approach. We concentrate on causal processes taking place in time. Time is “asymmetric,” with the cause preceding the effect: the arrow of time. Our aim is to determine to what extent the human visual system makes use of causal inference strategies when determining the arrow of time. We will answer this question on multiple levels, ranging from the low-level motion of an autoregressive (AR) process up to intentionally caused motion and collective biological motion.
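
As a minimal illustration of arrow-of-time inference on an AR process (a simplified variant of time-series causal inference, not necessarily the strategy the visual system uses): when the innovations are non-Gaussian, regressing each sample on its predecessor leaves maximally non-Gaussian residuals only in the true temporal direction, so comparing residual kurtosis forward vs. backward reveals the arrow of time.

```python
# Toy arrow-of-time test on an AR(1) process with non-Gaussian innovations:
# time reversal mixes the innovations, pulling the regression residuals toward
# Gaussianity, so the direction with more non-Gaussian residuals is "forward".
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(1)
n, a = 20_000, 0.8
noise = rng.uniform(-1, 1, n)        # non-Gaussian innovations (essential)
x = np.zeros(n)
for t in range(1, n):
    x[t] = a * x[t - 1] + noise[t]   # AR(1): the cause precedes the effect

def residual_kurtosis(series):
    """Fit x[t] = a*x[t-1] by OLS; return |excess kurtosis| of the residuals."""
    past, present = series[:-1], series[1:]
    a_hat = past @ present / (past @ past)
    return abs(kurtosis(present - a_hat * past))

forward, backward = residual_kurtosis(x), residual_kurtosis(x[::-1])
print("inferred direction:", "forward" if forward > backward else "backward")
```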

Principal Investigators

Project 5: Task-dependent top-down modulation of visual processing

Project description

We investigate under which conditions a stimulus is processed better when it is task-irrelevant than when it is task-relevant (an indirect task advantage, ITA). We will develop a general statistical toolkit and apply it to data from experiments that systematically vary task complexity. We hypothesize that an ITA occurs mainly in simple, but perhaps not in complex, tasks. Based on these results, we will provide simple cognitive models of the ITA.

Principal Investigators

Project 6: Probabilistic inference in early visual cortex

Project description

Computational accounts of vision and perceptual decision-making have been dominated by feed-forward processing, but feedback connections in the brain are ubiquitous. Here we will explore the role of feedback in primate early visual processing, guided by a generative framework of perceptual inference. The key hypothesis is that one role of feedback is to provide prior, context-dependent information (‘beliefs’) and to modulate the sensory representation accordingly. Focusing on the influence of context-dependent changes in belief on short (trial-by-trial) timescales, we will combine human psychophysics, macaque population recordings in V1 and V2, and computational modeling.
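
The key hypothesis can be condensed into a toy Bayesian observer (a sketch of ours, with made-up numbers): feedback supplies a context-dependent prior, so one and the same noisy measurement is interpreted differently in different contexts.

```python
# Toy Bayesian observer: a context-dependent prior ('belief') modulates the
# interpretation of the identical noisy sensory measurement.
# Gaussian prior x Gaussian likelihood -> Gaussian posterior (conjugate update).
import numpy as np

def posterior(measurement, sigma_sensory, prior_mean, prior_sigma):
    """Precision-weighted combination of prior belief and sensory evidence."""
    w_prior, w_data = 1 / prior_sigma**2, 1 / sigma_sensory**2
    mean = (w_prior * prior_mean + w_data * measurement) / (w_prior + w_data)
    return mean, np.sqrt(1 / (w_prior + w_data))

m = 10.0                                       # identical sensory evidence
for context, prior_mean in [("context A", 0.0), ("context B", 20.0)]:
    mu, sd = posterior(m, sigma_sensory=4.0, prior_mean=prior_mean, prior_sigma=4.0)
    print(f"{context}: posterior estimate = {mu:.1f} +/- {sd:.1f}")
```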

Principal Investigators

Project 7: Large-scale neuronal interactions during natural vision

Project description

Local features of natural visual stimuli are highly predictable from their spatiotemporal context. This project will test the hypothesis that, during natural vision, the brain exploits this predictability and continuously conveys stimulus predictions through feedback projections. To this end, we will perform directly comparable experiments in humans (MEG) and monkeys (spiking, LFP, EEG). Using the same stimuli and tasks in both species, we will investigate the processing of natural and degraded stimuli across neuronal scales.
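
The premise of this predictability can be illustrated with a toy computation (our sketch, on a synthetic image with a natural-like 1/f amplitude spectrum): a linear prediction of each pixel from its four spatial neighbours already explains most of its variance.

```python
# Sketch: how predictable is a local feature from its spatial context?
# Linearly predict each pixel of a 1/f ("natural-statistics-like") image from
# its four neighbours and report the explained variance.
import numpy as np

rng = np.random.default_rng(0)
n = 128
# Synthesize an image with an approximately natural 1/f amplitude spectrum.
fx, fy = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n))
amp = 1.0 / np.maximum(np.hypot(fx, fy), 1.0 / n)
img = np.real(np.fft.ifft2(np.fft.fft2(rng.normal(size=(n, n))) * amp))

# Targets: interior pixels; predictors: their 4-neighbourhoods.
y = img[1:-1, 1:-1].ravel()
X = np.stack([img[:-2, 1:-1].ravel(), img[2:, 1:-1].ravel(),
              img[1:-1, :-2].ravel(), img[1:-1, 2:].ravel()], axis=1)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
r2 = 1 - np.sum((y - X @ coef) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"pixel variance explained by its neighbours: {r2:.1%}")
```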

Principal Investigator

Project 8: Integration of bottom-up and top-down processing in perceptual learning during sleep

Project description

Robust visual inference requires that visual information processing adapts to changing contexts on different timescales. An archetypal example of adaptiveness on slow timescales is the behavioural improvement resulting from repeated performance of low-level perceptual tasks. Such changes are stable across extended time intervals, occur even in the adult brain, and are collectively described as perceptual learning.

Although extensively studied, basic questions regarding the neural underpinnings of perceptual learning remain unanswered. An ongoing controversy concerns the extent of generalization of perceptual learning over different stimulus dimensions. For example, it is still unclear whether highly demanding texture discrimination in the periphery can lead to performance improvements at untrained locations or untrained stimulus orientations. Importantly, traditional findings of highly specific learning effects in such tasks contradict the idea that the visual system performs flexibly and reliably in changing environments.

Our project starts from the hypothesis that generalized visual perceptual learning involves top-down influences in early visual cortices. Based on this hypothesis, we pursue four objectives:

  • we will test whether robust perceptual learning critically involves top-down information flow;
  • we will investigate whether the formation of such top-down feedback loops during perceptual learning critically relies on offline re-processing during sleep;
  • we will establish a comparative approach, in order to test our hypotheses in healthy human participants and non-human primates using the same experimental protocols;
  • we will develop and refine our experimental approach based on current generative models of visual inference, thereby ensuring interpretability of our results at an algorithmic level.

In order to achieve these goals, we will conduct behavioral as well as fMRI studies in healthy humans, accompanied by high-density sleep EEG recordings. In parallel, we will establish methods for obtaining sleep recordings in non-human primates, in order to address the previously unexplored question of whether perceptual learning in these animals depends on sleep.

Our overall goal is to achieve a better understanding of how the long-lasting changes in the adult primate visual system that contribute to robust visual inference are implemented. This aligns with Aim 1 of the Collaborative Research Center, in that we specifically address the contribution of top-down feedback in balancing the specificity and generalization of perceptual learning.

Principal Investigators

Project 9: Natural dynamic scene processing in the human brain

Project description

Our visual system has evolved in and for processing natural stimuli. We hypothesize that several functions of, and interactions between, the dorsal and ventral visual streams rely on the interplay between scene content and visual motion as they occur in natural stimuli. We will combine advanced computer vision, computer graphics, and human fMRI neuroimaging to characterize neural responses and regional connectivity using more natural, dynamic stimuli, and compare brain responses to those of deep neural networks.

Principal Investigators

Project 10: Natural stimuli for mice: environment statistics and neural representations in the early visual system

Project description

Matching visual encoding resources to natural scene statistics may be a fundamental mechanism for achieving optimal neural coding and robustness of visual processing. Mice have become an important model system for vision research, yet little is known about the statistics of their natural visual environment. The goal of this project is to record the natural visual input from the perspective of the mouse with a custom-built miniature camera, to characterize basic environmental statistics contained in these movies, and to ask whether biases in the statistics of the visual input are reflected in the representations of the early visual system. Finally, the project will use the newly created natural movies for probing neural responses in retinal ganglion cells and primary visual cortex.
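
As a hypothetical example of one such basic statistic, the sketch below computes the rotationally averaged spatial power spectrum of a single frame; for natural scenes this typically falls off roughly as 1/f^2 with spatial frequency f (the noise frame here merely stands in for the recorded movies).

```python
# Rotationally averaged spatial power spectrum of one (grayscale) movie frame.
import numpy as np

def radial_power_spectrum(frame):
    """Mean spectral power in integer spatial-frequency bins."""
    n = frame.shape[0]
    power = np.abs(np.fft.fftshift(np.fft.fft2(frame - frame.mean()))) ** 2
    yy, xx = np.indices(power.shape)
    r = np.hypot(yy - n // 2, xx - n // 2).astype(int)
    return np.bincount(r.ravel(), weights=power.ravel()) / np.bincount(r.ravel())

# Stand-in frame (real input would come from the head-mounted camera).
rng = np.random.default_rng(0)
frame = rng.normal(size=(256, 256))
spectrum = radial_power_spectrum(frame)
slope = np.polyfit(np.log(np.arange(1, 100)), np.log(spectrum[1:100]), 1)[0]
print(f"log-log spectral slope: {slope:.2f} (natural movies: close to -2)")
```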

Principal Investigators

Project 11: Stable vision in the presence of fixational eye movements: where and how is the retinal image perceptually stabilized?

Project description

Three aims will be addressed: (1) the hypothesis that local subtractive processing, perhaps already in the retina, is used to stabilize a fixated object against the background, which moves with the eye. Spatial features, weighting, and the size of integrated areas will be determined by partial gaze-contingent stimulation with psychophysical and eye-tracking methods; (2) the hypothesis that retinal image jitter due to fixational eye movements may improve visual performance. Visual acuity and contrast sensitivity with natural, stabilized, and artificially jittered presentations will be analyzed. Adaptation to jitter and interocular transfer of adaptation will also be studied to better localize the retinal or brain area where these operations take place; (3) the hypothesis that fixational eye movements specifically shape the responses of visual neurons in V1 and SC, as recorded with linear electrode arrays in monkeys from two different retinal input areas, and using gaze-contingent display manipulations.

Principal Investigators

Project 12: Image processing within a locally complete retinal ganglion cell population

Project description

Retinal circuits extract visual features from the incoming photon stream, giving rise to more than 30 parallel image representations sent to the brain via the axons of as many types of retinal ganglion cells (RGCs). The goal of this project is to develop a computational model and theory of nonlinear spatio-temporal image processing in the retina that reveals its contribution to robust visual inference. In particular, we seek to make significant advances in understanding the diversity of RGC computations by combining two-photon population imaging with anatomical data and convolutional neural network (CNN) modelling. To this end, we will build a data-driven model that can accurately predict how spatio-temporal image information is transformed into RGC responses to natural stimuli by the complete network in a local patch of retina. In addition, we will develop a theory-driven model of retinal processing, by which we seek to derive from first principles the empirically observed RGC response properties that are captured by the data-driven model. Ultimately, we envision finding abstract signal-processing principles that characterize the usefulness of the diversity of RGC types for a large variety of natural vision tasks.
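
A minimal sketch of such a data-driven model, written in PyTorch under our assumptions (all layer sizes made up): a convolutional core shared across cells feeds per-cell readouts and is trained with a Poisson loss, a standard recipe for predicting neural responses.

```python
# Sketch of a data-driven RGC response model: shared convolutional core,
# per-cell linear readout, Poisson negative log-likelihood loss.
import torch
import torch.nn as nn

class RGCModel(nn.Module):
    def __init__(self, n_cells: int):
        super().__init__()
        self.core = nn.Sequential(                       # shared feature space
            nn.Conv2d(1, 16, kernel_size=9), nn.Softplus(),
            nn.Conv2d(16, 16, kernel_size=9), nn.Softplus(),
        )
        self.readout = nn.Linear(16 * 48 * 48, n_cells)  # per-cell weighting

    def forward(self, stimulus):                         # (batch, 1, 64, 64)
        feats = self.core(stimulus).flatten(1)
        return nn.functional.softplus(self.readout(feats))  # rates >= 0

model = RGCModel(n_cells=30)
stim = torch.randn(8, 1, 64, 64)              # fake stimulus patches
spikes = torch.poisson(torch.ones(8, 30))     # fake recorded spike counts
rate = model(stim)
loss = (rate - spikes * torch.log(rate + 1e-8)).mean()   # Poisson NLL
loss.backward()
```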

Principal Investigators

Project 13: Visual processing of feedforward and feedback signals in the dLGN

Project description

How is the representation of visual information transformed between neuronal populations in the retina and the dorsolateral geniculate nucleus (dLGN)? In this transformation, what is the role of cortico-thalamic feedback? This project will use two-photon calcium imaging to functionally characterize the population of retinal ganglion cells projecting to the dLGN, and develop a computational model based on a weighted combination of these retinal outputs to predict dLGN responses. Finally, the project will use optogenetics to causally test how this transformation is shaped by cortico-thalamic feedback. This project will therefore contribute an important building block for understanding the cascade of pre-cortical information transformations and its effect on the robustness of the visual system.
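
The proposed model class, a weighted combination of retinal outputs, can be sketched as a non-negative regression from RGC responses onto a dLGN response (our illustration on synthetic data; with real data, the regressors would be the functionally characterized RGC signals):

```python
# Sketch: model a dLGN neuron's response as a weighted combination of the RGC
# population's responses, using non-negative least squares (excitatory pooling).
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
n_time, n_rgc = 2000, 40
rgc = rng.poisson(2.0, size=(n_time, n_rgc)).astype(float)  # RGC responses
true_w = np.zeros(n_rgc)
true_w[[3, 7, 12]] = [0.8, 0.5, 0.3]     # a dLGN cell pools few retinal inputs
dlgn = rgc @ true_w + rng.normal(0, 0.5, n_time)

w_hat, _ = nnls(rgc, dlgn)               # recover the pooling weights
print("strongest inferred inputs:", np.argsort(w_hat)[::-1][:3])
```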

Principal Investigators

Project 14: Image-processing computations in artificial vision

Project description

The healthy retina robustly converts spatial visual information into temporal patterns of retinal ganglion cell spikes. Although this functionality is lost in blind patients with retinal degenerations, initial clinical evidence indicates that it can be partially replaced by artificial electrical stimulation using retinal implants.

Here we aim to understand how objects embedded in natural scenes can be detected by decoding the retinal output, and how CNN-based image-processing algorithms can be optimized to facilitate robust object identification, first in ex vivo blind retinas and then in blind patients with light-sensitive retinal prostheses, using specific electrical stimulation patterns.
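
The decoding step can be sketched as a linear read-out of object presence from retinal spike counts (our illustration on simulated data; the use of scikit-learn's logistic regression is our choice, not necessarily the project's):

```python
# Sketch: decode object presence from (simulated) retinal spike counts with a
# cross-validated linear classifier. With real data, each row would hold the
# recorded ex vivo RGC responses to one scene presentation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_cells = 400, 60
labels = rng.integers(0, 2, n_trials)             # 1 = object present
base_rate = rng.uniform(1, 5, n_cells)
gain = 1.0 + 0.3 * labels[:, None] * rng.uniform(0, 1, n_cells)
counts = rng.poisson(base_rate * gain)            # spike counts per trial

acc = cross_val_score(LogisticRegression(max_iter=1000), counts, labels, cv=5)
print(f"decoding accuracy: {acc.mean():.2f} +/- {acc.std():.2f}")
```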

Principal Investigators