The goal of computer vision is to compute geometric and semantic properties of the three-dimensional world from digital images. Problems in this field include reconstructing the 3D shape of an object, determining how things are moving, and recognizing objects or scenes. This course provides an introduction to computer vision, with topics including image formation, camera models, camera calibration, feature detection and matching, motion estimation, geometry reconstruction, object detection and tracking, and scene understanding. Applications include building 3D maps, creating virtual avatars, image search, organizing photo collections, human-computer interaction, video surveillance, self-driving cars, robotics, virtual and augmented reality, simulation, medical imaging, and mobile computer vision. Modern computer vision relies heavily on machine learning, in particular deep learning and graphical models. This course therefore assumes prior knowledge of deep learning (e.g., from the deep learning lecture) and introduces the basic concepts of graphical models and structured prediction where needed. The tutorials will deepen the understanding of deep neural networks by implementing and applying them in Python and PyTorch. A strong emphasis of this course is on 3D vision.
Students gain an understanding of the theoretical and practical concepts of computer vision including image formation, camera models, feature detection, multiple view geometry, 3D reconstruction, motion estimation, object recognition, scene understanding and structured prediction using deep neural networks and graphical models. A strong emphasis of this course is on 3D vision. After this course, students should be able to understand and apply the basic concepts of computer vision in practice, develop and train computer vision models, reproduce research results and conduct original research in this area.
The exercises play an essential role in understanding the content of the course. There will be 6 assignments in total. The assignments contain pen-and-paper questions as well as programming problems. For some of the exercises, students will use PyTorch, a state-of-the-art deep learning framework that features GPU support and automatic differentiation. If you have questions regarding the exercises or the lecture, please ask them during the interactive sessions, at the Zoom helpdesk, or in our chat.
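As a rough illustration of the two PyTorch features mentioned above, the following minimal sketch differentiates a small function automatically and places a tensor on a GPU when one is present. The function `f(x) = x^2 + 3x` and all variable names are assumptions chosen for illustration; they are not taken from the course assignments.

```python
import torch

# Hypothetical example function (not from the course): f(x) = x^2 + 3x.
x = torch.tensor(2.0, requires_grad=True)  # track operations on x
f = x ** 2 + 3 * x

# Reverse-mode automatic differentiation: populates x.grad with df/dx.
f.backward()
grad = x.grad.item()  # analytically, df/dx = 2x + 3

# GPU support: use a CUDA device if one is available, otherwise the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
y = torch.ones(3, device=device) * 2.0
total = y.sum().item()
```

The same code runs unchanged on CPU and GPU because the device is selected at runtime rather than hard-coded.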
| Date | Lecture Slides and Videos | Live Sessions | TA Support |
| --- | --- | --- | --- |
| | Recap: Math for Deep Learning | | |
| 28.04. | L01 - Introduction (Slides); 1.1 - Organization (Video); 1.2 - Introduction (Video); 1.3 - History of Computer Vision (Video) | L01 - Lecture Organization | Markus Flicke |
| 05.05. | L02 - Image Formation (Slides); 2.2 - Geometric Image Formation (Video) | L02 - Lecture Q&A | Markus Flicke |
| 12.05. | L03 - Structure-from-Motion (Slides); 3.1 - Preliminaries (Video); 3.2 - Two-frame Structure-from-Motion (Video); 3.3 - Factorization (Video); 3.4 - Bundle Adjustment (Video) | L03 - Lecture Q&A; E01 - Exercise Q&A; E02 - Exercise Introduction (Problems) | Markus Flicke |
| 19.05. | L04 - Stereo Reconstruction (Slides); 4.1 - Preliminaries (Video); 4.2 - Block Matching (Video); 4.3 - Siamese Networks (Video); 4.4 - Spatial Regularization (Video); 4.5 - End-to-End Learning (Video) | L04 - Lecture Q&A; E02 - Exercise Q&A | Markus Flicke |
| 26.05. | L05 - Probabilistic Graphical Models (Slides); 5.1 - Structured Prediction (Video); 5.2 - Markov Random Fields (Video); 5.3 - Factor Graphs (Video); 5.4 - Belief Propagation (Video); 5.5 - Examples (Video) | L05 - Lecture Q&A; E02 - Exercise Q&A | Markus Flicke |
| | No Lecture (Pentecost break) | No Exercise (Pentecost break) | |
| 09.06. | L06 - Applications of Graphical Models (Slides); 6.1 - Stereo Reconstruction (Video); 6.2 - Multi-View Reconstruction (Video); 6.3 - Optical Flow (Video) | L06 - Lecture Q&A; E03 - Exercise Q&A | Markus Flicke |
| 16.06. | L07 - Learning in Graphical Models (Slides); 7.1 - Conditional Random Fields (Video); 7.2 - Parameter Estimation (Video); 7.3 - Deep Structured Models (Video) | L07 - Lecture Q&A; E03 - Exercise Q&A | Markus Flicke, Haiwen Huang |
| 23.06. | No Lecture | No Exercise | |
| 30.06. | L08 - Shape-from-X (Slides); 8.1 - Shape-from-Shading (Video); 8.2 - Photometric Stereo (Video); 8.3 - Shape-from-X (Video); 8.4 - Volumetric Fusion (Video) | L08 - Lecture Q&A; E04 - Exercise Q&A | Haiwen Huang |
| 07.07. | L09 - Coordinate-based Networks (Slides); 9.1 - Implicit Neural Representations (Video); 9.2 - Differentiable Volumetric Rendering (Video); 9.3 - Neural Radiance Fields (Video); 9.4 - Generative Radiance Fields (Video) | L09 - Lecture Q&A; E04 - Exercise Q&A | Haiwen Huang |
| 14.07. | L10 - Recognition (Slides); 10.1 - Image Classification (Video); 10.2 - Semantic Segmentation (Video); 10.3 - Object Detection and Segmentation (Video) | L10 - Lecture Q&A; E05 - Exercise Q&A | Haiwen Huang |
| 21.07. | L11 - Self-Supervised Learning (Slides); 11.1 - Preliminaries (Video); 11.2 - Task-specific Models (Video); 11.3 - Pretext Tasks (Video); 11.4 - Contrastive Learning (Video) | L11 - Lecture Q&A; E05 - Exercise Q&A | Haiwen Huang |
| 28.07. | L12 - Diverse Topics in Computer Vision (Slides); 12.1 - Input Optimization (Video); 12.2 - Compositional Models (Video); 12.3 - Human Body Models (Video); 12.4 - Deepfakes (Video) | L12 - Lecture Q&A; E06 - Exercise Q&A | Haiwen Huang |