Lecture: Computer Vision

The goal of computer vision is to compute geometric and semantic properties of the three-dimensional world from digital images. Problems in this field include reconstructing the 3D shape of an object, determining how things are moving, and recognizing objects or scenes. This course provides an introduction to computer vision, with topics including image formation, camera models, camera calibration, feature detection and matching, motion estimation, geometry reconstruction, object detection and tracking, and scene understanding. Applications include building 3D maps, creating virtual avatars, image search, organizing photo collections, human-computer interaction, video surveillance, self-driving cars, robotics, virtual and augmented reality, simulation, medical imaging, and mobile computer vision. Modern computer vision relies heavily on machine learning, in particular deep learning and graphical models. This course therefore assumes prior knowledge of deep learning (e.g., from the deep learning lecture) and introduces the basic concepts of graphical models and structured prediction where needed. The tutorials will deepen the understanding of deep neural networks by implementing and applying them in Python and PyTorch. A strong emphasis of this course is on 3D vision.

This class received the CS teaching award in summer 2021.

Qualification Goals

Students gain an understanding of the theoretical and practical concepts of computer vision including image formation, camera models, feature detection, multiple view geometry, 3D reconstruction, motion estimation, object recognition, scene understanding and structured prediction using deep neural networks and graphical models. A strong emphasis of this course is on 3D vision. After this course, students should be able to understand and apply the basic concepts of computer vision in practice, develop and train computer vision models, reproduce research results and conduct original research in this area.

Overview

  • Course number: ML-4360
  • Credits: 6 ECTS (9 ECTS from 2023)
  • Recommended for: Master, 2nd semester
  • Total Workload: 270h
  • This lecture is taught as a flipped classroom. Lectures are held asynchronously via YouTube (see sidebar for link). We will provide all lecture videos for self-study before the respective interactive live sessions. Please watch the relevant videos before participating in the live sessions.
  • Each week, we host an interactive live session where questions regarding the lecture and exercises are discussed together (see sidebar for details).
  • We also offer a weekly Zoom helpdesk where students may ask questions or share their screen to obtain individual feedback and support for solving the exercises (see sidebar for details).
  • Exercises will not be graded. Instead, we will discuss the solution together.
  • Students may obtain bonus points for the exam by answering questions about the lectures and exercises in weekly quizzes. The questions also serve as a measure for self-assessment and self-motivation. All quizzes are provided via our Lecture Quiz Server (see sidebar for details).

Prerequisites

Registration

  • To participate in this lecture, you must enroll via ILIAS (see sidebar for link)
  • Registration via ILIAS will open on 30.03. at 12:00
  • Information about exam registration can be found here

Exercises

The exercises play an essential role in understanding the content of the course. There will be 6 assignments in total. The assignments contain pen-and-paper questions as well as programming problems. For some of the exercises, students will use PyTorch, a state-of-the-art deep learning framework that features GPU support and automatic differentiation. If you have questions regarding the exercises or the lecture, please ask them during the interactive sessions, at the Zoom helpdesk, or in our ILIAS forum.
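As a rough illustration of the two PyTorch features mentioned above, the following sketch fits a line to toy data with plain gradient descent. It is not taken from the course assignments; the data, variable names, learning rate, and step count are all assumptions chosen for illustration.

```python
import torch

# Pick a GPU when one is available; otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy data for the line y = 2x + 1 (values chosen only for illustration).
x = torch.linspace(-1.0, 1.0, steps=20, device=device)
y = 2.0 * x + 1.0

# Parameters to fit; requires_grad=True asks autograd to track them.
w = torch.zeros(1, device=device, requires_grad=True)
b = torch.zeros(1, device=device, requires_grad=True)

learning_rate = 0.1  # hypothetical setting, not a course recommendation
for _ in range(500):
    # Mean squared error between the prediction w * x + b and the target y.
    loss = ((w * x + b - y) ** 2).mean()
    # Automatic differentiation: fills w.grad and b.grad with d(loss)/d(param).
    loss.backward()
    # Plain gradient descent step, excluded from gradient tracking.
    with torch.no_grad():
        w -= learning_rate * w.grad
        b -= learning_rate * b.grad
    # Gradients accumulate across backward() calls, so reset them each step.
    w.grad.zero_()
    b.grad.zero_()

print(f"w = {w.item():.3f}, b = {b.item():.3f}")
```

The printed values should end up close to w = 2 and b = 1. In practice one would usually use torch.optim and torch.nn instead of manual updates; the manual loop is shown here only to make the role of backward() visible.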

Further Readings

Schedule

Date

Lecture Slides and Videos

Live Sessions
(Zoom | MvL6+Zoom)

TA Support

22.04.

L01 - Introduction | Slides

1.1 - Organization | Video

1.2 - Introduction | Video

1.3 - History of Computer Vision | Video

L01 - Lecture Organization
E01 - Exercise Introduction | Problems

Michael Niemeyer

29.04.

L02 - Image Formation | Slides
2.1 - Primitives and Transformations | Video

2.2 - Geometric Image Formation | Video
2.3 - Photometric Image Formation | Video
2.4 - Image Sensing Pipeline | Video

L02 - Lecture Q&A
E01 - Exercise Q&A

Michael Niemeyer

06.05.

L03 - Structure-from-Motion | Slides

3.1 - Preliminaries | Video

3.2 - Two-frame Structure-from-Motion | Video

3.3 - Factorization | Video

3.4 - Bundle Adjustment | Video

L03 - Lecture Q&A

E01 - Exercise Q&A

E02 - Exercise Introduction | Problems

Michael Niemeyer

13.05.

L04 - Stereo Reconstruction | Slides

4.1 - Preliminaries | Video

4.2 - Block Matching | Video

4.3 - Siamese Networks | Video

4.4 - Spatial Regularization | Video

4.5 - End-to-End Learning | Video

L04 - Lecture Q&A

E02 - Exercise Q&A

Michael Niemeyer

20.05.

L05 - Probabilistic Graphical Models | Slides

5.1 - Structured Prediction | Video

5.2 - Markov Random Fields | Video

5.3 - Factor Graphs | Video

5.4 - Belief Propagation | Video

5.5 - Examples | Video

L05 - Lecture Q&A

E02 - Exercise Q&A
E03 - Exercise Introduction | Problems

Michael Niemeyer
Zehao Yu

27.05.

L06 - Applications of Graphical Models | Slides

6.1 - Stereo Reconstruction | Video

6.2 - Multi-View Reconstruction | Video

6.3 - Optical Flow | Video

No Lecture Q&A This Week

E03 - Exercise Q&A

Zehao Yu

03.06.

L07 - Learning in Graphical Models | Slides

7.1 - Conditional Random Fields | Video

7.2 - Parameter Estimation | Video

7.3 - Deep Structured Models | Video

L07 - Lecture Q&A

E03 - Exercise Q&A
E04 - Exercise Introduction | Problems

Zehao Yu

10.06.

No Lecture (Pfingstpause)

No Exercise (Pfingstpause)

17.06.

L08 - Shape-from-X | Slides

8.1 - Shape-from-Shading | Video

8.2 - Photometric Stereo | Video

8.3 - Shape-from-X | Video

8.4 - Volumetric Fusion | Video

L08 - Lecture Q&A

E04 - Exercise Q&A

Zehao Yu

24.06.

No Lecture

No Exercise

01.07.

L09 - Coordinate-based Networks | Slides

9.1 - Implicit Neural Representations | Video

9.2 - Differentiable Volumetric Rendering | Video

9.3 - Neural Radiance Fields | Video

9.4 - Generative Radiance Fields | Video

L09 - Lecture Q&A

E04 - Exercise Q&A
E05 - Exercise Introduction | Problems

Zehao Yu

Markus Flicke

08.07.

L10 - Recognition | Slides

10.1 - Image Classification | Video

10.2 - Semantic Segmentation | Video

10.3 - Object Detection and Segmentation | Video

L10 - Lecture Q&A

E05 - Exercise Q&A

Markus Flicke

15.07.

L11 - Self-Supervised Learning | Slides

11.1 - Preliminaries | Video

11.2 - Task-specific Models | Video

11.3 - Pretext Tasks | Video

11.4 - Contrastive Learning | Video

L11 - Lecture Q&A

E05 - Exercise Q&A
E06 - Exercise Introduction | Problems

Markus Flicke

22.07.

No Lecture

No Exercise

29.07.

L12 - Diverse Topics in Computer Vision | Slides

12.1 - Input Optimization | Video

12.2 - Compositional Models | Video

12.3 - Human Body Models | Video

12.4 - Deepfakes | Video

L12 - Lecture Q&A

E06 - Exercise Q&A

Markus Flicke