Lecture: Computer Vision

The goal of computer vision is to compute geometric and semantic properties of the three-dimensional world from digital images. Problems in this field include reconstructing the 3D shape of an object, determining how things are moving, and recognizing objects or scenes. This course provides an introduction to computer vision, with topics including image formation, camera models, camera calibration, feature detection and matching, motion estimation, geometry reconstruction, object detection and tracking, and scene understanding. Applications include building 3D maps, creating virtual avatars, image search, organizing photo collections, human-computer interaction, video surveillance, self-driving cars, robotics, virtual and augmented reality, simulation, medical imaging, and mobile computer vision. Modern computer vision relies heavily on machine learning, in particular deep learning and graphical models. This course therefore assumes prior knowledge of deep learning (e.g., from the Deep Learning lecture) and introduces the basic concepts of graphical models and structured prediction where needed. The tutorials will deepen the understanding of deep neural networks by implementing and applying them in Python and PyTorch. A strong emphasis of this course is on 3D vision.

Qualification Goals

Students gain an understanding of the theoretical and practical concepts of computer vision including image formation, camera models, feature detection, multiple view geometry, 3D reconstruction, motion estimation, object recognition, scene understanding and structured prediction using deep neural networks and graphical models. A strong emphasis of this course is on 3D vision. After this course, students should be able to understand and apply the basic concepts of computer vision in practice, develop and train computer vision models, reproduce research results and conduct original research in this area.

Overview

Course number: ML-4360
Credits: 9 ECTS
Recommended for: Master, 2nd semester
Total Workload: 270h
This lecture is taught as a flipped classroom: lectures are provided via YouTube and must be watched before the respective interactive live session.
Each week, we host an interactive live session where questions regarding the lecture and exercises are posed and discussed together. It is essential for students to attend the live sessions.
We also offer additional weekly helpdesks where students may ask questions to obtain individual feedback and support for solving the exercises.
In addition, we provide regular quizzes via our quiz server with questions on the lectures and exercises for self-assessment and self-motivation.
Finally, we provide continuous and timely support via our chat.
See 'Important Links' in the sidebar to access the videos, slides, exercises, chat, Zoom room and quiz.

Prerequisites

Basic Computer Science skills: Variables, functions, loops, classes, algorithms
Basic Python and PyTorch coding skills
Basic Math skills: Linear algebra, probability and information theory (e.g., Math for ML lecture https://www.tml.cs.uni-tuebingen.de/teaching/2020_maths_for_ml/index.php). As a refresher, we recommend reading Chapters 1-4 of http://www.deeplearningbook.org or watching our new micro tutorials Mathematics for Deep Learning.
Experience with Deep Learning (e.g., through participation in our Deep Learning lecture)

Registration

To participate, you must register via our Quiz Server (see sidebar).
Information about exam registration can be found here

Exercises

The exercises play an essential role in understanding the content of the course. There will be 6 assignments in total. The assignments contain pen-and-paper questions as well as programming problems. For some of the exercises, students will use PyTorch, a state-of-the-art deep learning framework which features GPU support and automatic differentiation. If you have questions regarding the exercises or the lecture, please ask them during the interactive sessions, at the Zoom helpdesk or in our chat.
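As a rough illustration of the two PyTorch features mentioned above (GPU support and automatic differentiation), the following minimal sketch differentiates a toy function. It is not taken from the course exercises; the function `y = x^2 + 2x` and the variable names are assumptions chosen only for illustration.

```python
import torch

# Use a GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A scalar input for which PyTorch should track gradients.
x = torch.tensor(3.0, device=device, requires_grad=True)

# Toy function (illustrative assumption): y = x^2 + 2x,
# whose analytic derivative is dy/dx = 2x + 2.
y = x ** 2 + 2 * x

# Reverse-mode automatic differentiation stores dy/dx at x = 3 in x.grad.
y.backward()

grad_value = x.grad.item()
print(grad_value)  # 8.0, matching 2 * 3 + 2
```

The same pattern scales from this scalar toy case to the parameters of a neural network, where `backward()` is called on a loss value instead.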

Schedule

Date	Lecture Slides and Videos	Live Sessions	TA Support
	Recap: Math for Deep Learning
16.04.	L01 - Introduction \| Slides 1.1 Organization \| Video 1.2 Introduction \| Video 1.3 History of Computer Vision \| Video	L01 - Lecture Organization E01 - Exercise Introduction \| Problems	Gege Gao
23.04.	L02 - Image Formation \| Slides 2.1 Primitives and Transformations \| Video 2.2 Geometric Image Formation \| Video 2.3 Photometric Image Formation \| Video 2.4 Image Sensing Pipeline \| Video	L02 - Lecture Q&A E01 - Exercise Q&A	Gege Gao
30.04.	L03 - Structure-from-Motion \| Slides 3.1 - Preliminaries \| Video 3.2 - Two-frame Structure-from-Motion \| Video 3.3 - Factorization \| Video 3.4 - Bundle Adjustment \| Video	L03 - Lecture Q&A E01 - Exercise Q&A E02 - Exercise Introduction \| Problems	Gege Gao Patricia Gschoßmann
07.05.	L04 - Stereo Reconstruction \| Slides 4.1 - Preliminaries \| Video 4.2 - Block Matching \| Video 4.3 - Siamese Networks \| Video 4.4 - Spatial Regularization \| Video 4.5 - End-to-End Learning \| Video	L04 - Lecture Q&A E02 - Exercise Q&A	Patricia Gschoßmann
14.05.	L05 - Probabilistic Graphical Models \| Slides 5.1 - Structured Prediction \| Video 5.2 - Markov Random Fields \| Video 5.3 - Factor Graphs \| Video 5.4 - Belief Propagation \| Video 5.5 - Examples \| Video	L05 - Lecture Q&A E02 - Exercise Q&A E03 - Exercise Introduction \| Problems	Patricia Gschoßmann
21.05.	L06 - Applications of Graphical Models \| Slides 6.1 - Stereo Reconstruction \| Video 6.2 - Multi-View Reconstruction \| Video 6.3 - Optical Flow \| Video	L06 - Lecture Q&A E03 - Exercise Q&A	Patricia Gschoßmann
28.05.	L07 - Learning in Graphical Models \| Slides 7.1 - Conditional Random Fields \| Video 7.2 - Parameter Estimation \| Video 7.3 - Deep Structured Models \| Video	L07 - Lecture Q&A E03 - Exercise Q&A E04 - Exercise Introduction \| Problems	Patricia Gschoßmann Christina Tze
04.06.	L08 - Shape-from-X \| Slides 8.1 - Shape-from-Shading \| Video 8.2 - Photometric Stereo \| Video 8.3 - Shape-from-X \| Video 8.4 - Volumetric Fusion \| Video	L08 - Lecture Q&A E04 - Exercise Q&A	Christina Tze
	Break	Break
18.06.	L09 - Coordinate-based Networks \| Slides 9.1 - Implicit Neural Representations \| Video 9.2 - Differentiable Volumetric Rendering \| Video 9.3 - Neural Radiance Fields \| Video 9.4 - Generative Radiance Fields \| Video	L09 - Lecture Q&A E04 - Exercise Q&A E05 - Exercise Introduction \| Problems	Christina Tze
25.06.	L10 - Recognition \| Slides 10.1 - Image Classification \| Video 10.2 - Semantic Segmentation \| Video 10.3 - Object Detection and Segmentation \| Video	L10 - Lecture Q&A E05 - Exercise Q&A	Christina Tze
	Break	Break
09.07.	L11 - Self-Supervised Learning \| Slides 11.1 - Preliminaries \| Video 11.2 - Task-specific Models \| Video 11.3 - Pretext Tasks \| Video 11.4 - Contrastive Learning \| Video	L11 - Lecture Q&A E05 - Exercise Q&A E06 - Exercise Introduction \| Problems	Gege Gao Christina Tze
16.07.	L12 - Diverse Topics in Computer Vision \| Slides 12.1 - Input Optimization \| Video 12.2 - Compositional Models \| Video 12.3 - Human Body Models \| Video 12.4 - Deepfakes \| Video	L12 - Lecture Q&A E06 - Exercise Q&A	Gege Gao