Lecture: Computer Vision

The goal of computer vision is to compute geometric and semantic properties of the three-dimensional world from digital images. Problems in this field include reconstructing the 3D shape of an object, determining how things are moving, and recognizing objects or scenes. This course will provide an introduction to computer vision, with topics including image formation, camera models, camera calibration, feature detection and matching, motion estimation, geometry reconstruction, object detection and tracking, and scene understanding. Applications include building 3D maps, creating virtual avatars, image search, organizing photo collections, human-computer interaction, video surveillance, self-driving cars, robotics, virtual and augmented reality, simulation, medical imaging, and mobile computer vision. Modern computer vision relies heavily on machine learning, in particular deep learning and graphical models. This course therefore assumes prior knowledge of deep learning (e.g., from the deep learning lecture) and introduces the basic concepts of graphical models and structured prediction where needed. The tutorials will deepen the understanding of deep neural networks by implementing and applying them in Python and PyTorch. A strong emphasis of this course is on 3D vision.

News: This class received the CS teaching award in summer 2021!

Qualification Goals

Students gain an understanding of the theoretical and practical concepts of computer vision including image formation, camera models, feature detection, multiple view geometry, 3D reconstruction, motion estimation, object recognition, scene understanding and structured prediction using deep neural networks and graphical models. A strong emphasis of this course is on 3D vision. After this course, students should be able to understand and apply the basic concepts of computer vision in practice, develop and train computer vision models, reproduce research results and conduct original research in this area.


  • Course number: ML-4360
  • Credits: 6 ECTS (2h lecture + 2h exercise)
  • Total Workload: 180h
  • Lectures and exercises will be held asynchronously through YouTube (see sidebar for link). We will provide all lectures and exercise introductions several days before the respective interactive live sessions for self-study. You should watch these videos before participating in the interactive live sessions.
  • Each Thursday, we will host an interactive plenary lecture Q&A session starting at 16:00 via Zoom (see sidebar for link) where questions regarding the lecture and exercises are answered. Every second Thursday (when a new exercise starts), this session is followed by a plenary exercise Q&A session where remaining questions for the current exercise can be answered. Make sure that you have the latest Zoom client installed.
  • Every other week, we will hold individual live exercise Q&A sessions in smaller groups throughout the entire Thursday, where you can obtain personalized feedback and help on the currently running exercise or get answers to general questions. Assignment to these individual sessions is done through ILIAS.
  • Exercises will not be graded. We will provide solutions before the final plenary Q&A session.
  • Students should watch both the lecture and exercise videos before the Q&A session and take note of questions that they would like to have answered during the Q&A sessions.



  • To participate in this lecture, you must enroll via ILIAS (see sidebar for link) by 29.4.2021.
  • Information about exam registration can be found here


  • To qualify for the final exam, students must have registered for the lecture on ILIAS.
  • To participate in the exam, students must register through ILIAS towards the end of the semester.
  • To obtain a 0.3 bonus in the final exam, students must have submitted lecture notes for one lecture.

All topics discussed in lectures, Q&A sessions and exercises are relevant for the final exam.


The exercises play an essential role in understanding the content of the course. There will be 6 assignments in total (see content table below). The assignments contain pen-and-paper questions as well as programming problems. For some of the exercises, students will use PyTorch, a state-of-the-art deep learning framework that features GPU support and automatic differentiation. If you have questions regarding the exercises or the lecture, please ask them during the interactive Zoom sessions or in our ILIAS forum.

Lecture Notes

Interested students collectively write LaTeX lecture notes to complement the slides, summarizing the content discussed in the lecture videos. Submitting lecture notes is voluntary (it earns a bonus of 0.3) and is not a requirement for participating in the exam. At the beginning of the course, every registered student will be assigned one lecture. Students can find their assignments in ILIAS. The lecture notes must be submitted via ILIAS no later than 7 days after the respective official lecture date (see content table below). Lecture notes must be written individually (not in groups). We will continuously merge and consolidate the lecture notes into a single document. You can edit the lecture notes in Overleaf or a local LaTeX editor. To get started, copy the Computer Vision Lecture Notes Latex Template.

Further Readings




Exercises (4pm)

TA Support


L01 - Introduction | Slides

1.1 Organization | Video

1.2 Introduction | Video

1.3 History of Computer Vision | Video

L01 - Plenary Q&A
E01 - Plenary Introduction | Problems


Michael Niemeyer


L02 - Image Formation | Slides
2.1 Primitives and Transformations | Video

2.2 Geometric Image Formation | Video
2.3 Photometric Image Formation | Video
2.4 Image Sensing Pipeline | Video

L02 - Plenary Q&A
E01 - Individual Q&A

Michael Niemeyer


L03 - Structure-from-Motion | Slides

3.1 - Preliminaries | Video

3.2 - Two-frame Structure-from-Motion | Video

3.3 - Factorization | Video

3.4 - Bundle Adjustment | Video

L03 - Plenary Q&A

E01 - Plenary Q&A

E02 - Plenary Introduction | Problems

Michael Niemeyer

Niklas Hanselmann


No Lecture (Ascension)

No Exercise


L04 - Stereo Reconstruction | Slides

4.1 - Preliminaries | Video

4.2 - Block Matching | Video

4.3 - Siamese Networks | Video

4.4 - Spatial Regularization | Video

4.5 - End-to-End Learning | Video

L04 - Plenary Q&A

E02 - Individual Q&A

Niklas Hanselmann


No Lecture (Pentecost break)

No Exercise


No Lecture (Corpus Christi)

No Exercise


L05 - Probabilistic Graphical Models | Slides

5.1 - Structured Prediction | Video

5.2 - Markov Random Fields | Video

5.3 - Factor Graphs | Video

5.4 - Belief Propagation | Video

5.5 - Examples | Video

L05 - Plenary Q&A

E02 - Plenary Q&A
E03 - Plenary Introduction | Problems

Niklas Hanselmann
Carolin Schmitt


L06 - Applications of Graphical Models | Slides

6.1 - Stereo Reconstruction | Video

6.2 - Multi-View Reconstruction | Video

6.3 - Optical Flow | Video

L06 - Plenary Q&A

E03 - Individual Q&A

Carolin Schmitt


L07 - Learning in Graphical Models | Slides

7.1 - Conditional Random Fields | Video

7.2 - Parameter Estimation | Video

7.3 - Deep Structured Models | Video

L07 - Plenary Q&A

E03 - Plenary Q&A
E04 - Plenary Introduction | Problems

Carolin Schmitt
Stefan Baur


L08 - Shape-from-X | Slides

8.1 - Shape-from-Shading | Video

8.2 - Photometric Stereo | Video

8.3 - Shape-from-X | Video

8.4 - Volumetric Fusion | Video

L08 - Plenary Q&A

E04 - Individual Q&A

Stefan Baur


L09 - Coordinate-based Networks | Slides

9.1 - Implicit Neural Representations | Video

9.2 - Differentiable Volumetric Rendering | Video

9.3 - Neural Radiance Fields | Video

9.4 - Generative Radiance Fields | Video

L09 - Plenary Q&A

E04 - Plenary Q&A
E05 - Plenary Introduction | Problems

Stefan Baur

Katja Schwarz


L10 - Recognition | Slides

10.1 - Image Classification | Video

10.2 - Semantic Segmentation | Video

10.3 - Object Detection and Segmentation | Video

L10 - Plenary Q&A

E05 - Individual Q&A

Katja Schwarz


L11 - Self-Supervised Learning | Slides

11.1 - Preliminaries | Video

11.2 - Task-specific Models | Video

11.3 - Pretext Tasks | Video

11.4 - Contrastive Learning | Video

L11 - Plenary Q&A

E05 - Plenary Q&A
E06 - Plenary Introduction | Problems

Katja Schwarz
Kashyap Chitta


L12 - Diverse Topics in Computer Vision | Slides

12.1 - Input Optimization | Video

12.2 - Compositional Models | Video

12.3 - Human Body Models | Video

12.4 - Deepfakes | Video

L12 - Plenary Q&A

E06 - Plenary Q&A

Kashyap Chitta