Bachelor Theses at the Chair of Cognitive Systems (Prof. Dr. Andreas Zell)

Students who want to write a bachelor thesis should have attended at least one lecture of Prof. Zell and passed it with a good or at least satisfactory grade. They may also have obtained the relevant background knowledge for the thesis from other, similar lectures.

Open Topics

Threshold calibration tool for event cameras

Mentor: Thomas Gossard

Email: thomas.gossard@uni-tuebingen.de

Event cameras are bio-inspired sensors that asynchronously report timestamped changes in pixel intensity and offer advantages over conventional frame-based cameras in terms of low-latency, low redundancy sensing and high dynamic range. Hence, event cameras have a large potential for robotics and computer vision. However, it is difficult to find the right settings for event cameras, as we cannot directly judge the quality of the generated events. A method to automatically tune these settings would therefore be very advantageous.

In this thesis, the student will focus on the ON/OFF thresholds of the event camera. In principle, the same contrast should generate the same number of ON and OFF events; in practice, this is often not the case. The objective of this thesis is to develop a tool capable of finding ON/OFF thresholds at which the numbers of ON and OFF events match. The student will have to generate a printable pattern whose observation with the event camera enables the tuning of the thresholds.
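
As a rough illustration of the intended calibration loop, the sketch below repeatedly measures the ON/OFF balance and nudges the ON threshold. The camera accessors (grab_events, get_on_threshold, set_on_threshold) and the event record layout are hypothetical placeholders for the actual camera SDK.

# Sketch of a threshold calibration loop; grab_events, get_on_threshold
# and set_on_threshold are hypothetical stand-ins for the camera SDK.
import numpy as np

def polarity_balance(events):
    """Return the ratio of ON to OFF events; 1.0 means balanced."""
    n_on = np.count_nonzero(events["polarity"] == 1)
    n_off = np.count_nonzero(events["polarity"] == 0)
    return n_on / max(n_off, 1)

def calibrate(camera, steps=20, tolerance=0.05):
    """Adjust the ON threshold until ON/OFF event counts roughly match."""
    for _ in range(steps):
        events = camera.grab_events(duration_s=1.0)  # record while viewing the pattern
        ratio = polarity_balance(events)
        if abs(ratio - 1.0) < tolerance:
            break
        # Too many ON events -> raise the ON threshold, and vice versa.
        step = 5 if ratio > 1.0 else -5
        camera.set_on_threshold(camera.get_on_threshold() + step)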

Requirements: Python

Real-Time Video Stabilization for Marine Applications

Mentor: Benjamin Kiefer
Email: benjamin.kiefer@uni-tuebingen.de

Immerse yourself in this Bachelor's thesis project focused on the development of real-time video stabilization techniques for boats. The research will tackle the creation of effective algorithms designed to provide stable video output amidst unpredictable marine conditions. We're looking for candidates with a background in computer vision and a strong interest in practical, real-time applications. This project provides an opportunity to contribute significantly to a largely uncharted field. Set sail with us on this exciting endeavor to improve the reliability of maritime video technology in real time.
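
To give a flavor of the classical approach, here is a minimal frame-to-frame stabilization sketch using OpenCV: track features, estimate the inter-frame rigid motion, and warp the current frame to cancel it. The feature parameters are illustrative choices, not project requirements.

# Minimal digital stabilization sketch: track features between frames,
# estimate a rigid transform, and warp the frame to cancel the motion.
import cv2
import numpy as np

def stabilize(prev_gray, curr_gray, curr_frame):
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=30)
    if pts is None:
        return curr_frame
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    good_old = pts[status.flatten() == 1]
    good_new = new_pts[status.flatten() == 1]
    if len(good_old) < 4:
        return curr_frame
    # Map current points back onto the previous frame's geometry.
    m, _ = cv2.estimateAffinePartial2D(good_new, good_old)
    if m is None:
        return curr_frame
    h, w = curr_frame.shape[:2]
    return cv2.warpAffine(curr_frame, m, (w, h))

A full pipeline would smooth the estimated camera trajectory over a window rather than locking each frame to its predecessor; the sketch only shows the per-frame core.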

Requirements: Python programming; basic knowledge of robotics and computer vision is a plus

SAM for Multispectral Image Data

Mentor: Hannah Frank
Email: hannah.frank@uni-tuebingen.de

With Segment Anything [0], Meta AI recently introduced a powerful new approach to instance segmentation and mask generation. However, it is restricted to RGB data. The goal of this thesis is to adapt the method for direct usage on multispectral recordings. For this, the student needs to extend the source code of the Segment Anything Model (SAM) to accept multiple input channels (e.g., RGB + 2 NIR channels) and potentially fine-tune the model on this special kind of data. The adapted model should be evaluated on an existing multispectral data set or, ideally, on multispectral recordings from our current research project. The impact and advantage of using additional spectral channels should also be analyzed.
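
One plausible starting point, assuming the official segment-anything code base (where the ViT image encoder's patch embedding is a single Conv2d exposed as image_encoder.patch_embed.proj; the checkpoint filename below is a placeholder), is to widen that layer and reuse the pretrained RGB weights:

# Sketch: widen SAM's patch embedding from 3 to 5 input channels
# (RGB + 2 NIR), reusing the pretrained RGB kernel weights.
import torch
import torch.nn as nn
from segment_anything import sam_model_registry

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")  # placeholder path
old = sam.image_encoder.patch_embed.proj  # Conv2d(3, embed_dim, 16, 16)

new = nn.Conv2d(5, old.out_channels, kernel_size=old.kernel_size,
                stride=old.stride, padding=old.padding,
                bias=old.bias is not None)
with torch.no_grad():
    new.weight[:, :3] = old.weight                             # keep RGB filters
    new.weight[:, 3:] = old.weight.mean(dim=1, keepdim=True)   # init NIR channels
    if old.bias is not None:
        new.bias.copy_(old.bias)
sam.image_encoder.patch_embed.proj = new

Note that SAM's fixed per-channel normalization (the pixel_mean and pixel_std buffers) would also have to be extended to five channels.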

[0] Kirillov et al., "Segment Anything", arXiv preprint, 2023. (https://arxiv.org/abs/2304.02643)

Requirements: Deep learning basics, Python programming

Towards a more realistic event simulator

Mentor: Andreas Ziegler

Email: andreas.ziegler@uni-tuebingen.de

Event cameras are bio-inspired sensors that asynchronously report timestamped changes in pixel intensity and offer advantages over conventional frame-based cameras in terms of low-latency, low redundancy sensing and high dynamic range. Hence, event cameras have a large potential for robotics and computer vision.

Currently, a practical obstacle to the adoption of event camera technology is the high cost of several thousand dollars per camera, similar to the situation with early time-of-flight cameras. In a recent project [1], we developed an event simulator which takes frames from a conventional frame-based camera as input and outputs events in real time. In its current state, this event simulator does not model the noise and other artifacts that real event cameras exhibit.

In this thesis, the goal is to add some of these noise sources to the event simulator while maintaining a real-time runtime. In a first step, the different sources of noise and artifacts of event cameras described in the existing literature [2], [3] should be analyzed. In a second step, the student should evaluate which noise models can be implemented in a real-time fashion. By comparing the simulator's output to that of real event cameras, we will quantify the improvement of the simulation.
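
To give a first idea of what such a noise stage might look like, the sketch below implements two effects of the kind described in [2] and [3]: per-pixel contrast-threshold mismatch (fixed-pattern noise) and uniformly distributed background-activity events. The rate and spread constants are illustrative assumptions.

# Sketch: two simple noise effects for an event simulator,
# vectorized with NumPy to keep the per-frame cost low.
import numpy as np

rng = np.random.default_rng()

def per_pixel_thresholds(height, width, c_nominal=0.2, sigma=0.03):
    """Fixed-pattern noise: each pixel gets its own contrast threshold."""
    return rng.normal(c_nominal, sigma, size=(height, width)).clip(min=0.01)

def background_activity(height, width, dt, rate_hz=0.1):
    """Shot-noise events, uniform over the sensor, 'rate_hz' per pixel."""
    n = rng.poisson(rate_hz * dt * height * width)
    xs = rng.integers(0, width, n)
    ys = rng.integers(0, height, n)
    ts = rng.uniform(0.0, dt, n)      # timestamps within the frame interval
    pols = rng.integers(0, 2, n)      # random ON/OFF polarity
    return np.stack([xs, ys, ts, pols], axis=1)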

The student should be familiar with "traditional" Computer Vision. A good command of C++ or Python from previous projects would be beneficial.

[1] A. Ziegler, D. Teigland, J. Tebbe, T. Gossard, and A. Zell, “Real-time event simulation with frame-based cameras.” arXiv, Sep. 10, 2022, [Online]. Available: http://arxiv.org/abs/2209.04634

[2] Y. Hu, S.-C. Liu, and T. Delbruck, "v2e: From Video Frames to Realistic DVS Events," in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2021. [Online]. Available: https://arxiv.org/abs/2006.07722

[3] D. Joubert, A. Marcireau, N. Ralph, A. Jolley, A. van Schaik, and G. Cohen, “Event Camera Simulator Improvements via Characterized Parameters,” Front. Neurosci., vol. 15, p. 702765, Jul. 2021, doi: 10.3389/fnins.2021.702765.

Pushing an event simulator towards its limits

Mentor: Andreas Ziegler

Email: andreas.ziegler@uni-tuebingen.de

Event cameras are bio-inspired sensors that asynchronously report timestamped changes in pixel intensity and offer advantages over conventional frame-based cameras in terms of low-latency, low redundancy sensing and high dynamic range. Hence, event cameras have a large potential for robotics and computer vision.

Currently, a practical obstacle to the adoption of event camera technology is the high cost of several thousand dollars per camera, similar to the situation with early time-of-flight cameras. In a recent project [1], we developed an event simulator which takes frames from a conventional frame-based camera as input and outputs events in real time.

The goal of this thesis is to evaluate the limits of this event simulator by applying it in different real-time use cases. Two such scenarios are real-time tracking of a fast-moving object and balancing a ball on a 2D plane with a robot arm.
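
Since real-time behavior is the central question, a natural first experiment is to measure the simulator's per-frame latency under increasing load. A minimal timing harness might look like the sketch below, where simulate_events stands in for the simulator's actual entry point from [1]:

# Sketch of a latency benchmark; simulate_events is a placeholder
# for the actual simulator call.
import time
import numpy as np

def benchmark(simulate_events, frames, budget_ms=5.0):
    """Report per-frame latency and how often the real-time budget is missed."""
    latencies = []
    for frame in frames:
        t0 = time.perf_counter()
        simulate_events(frame)
        latencies.append((time.perf_counter() - t0) * 1e3)
    latencies = np.array(latencies)
    print(f"mean {latencies.mean():.2f} ms, "
          f"p99 {np.percentile(latencies, 99):.2f} ms, "
          f"missed budget: {(latencies > budget_ms).mean():.1%}")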

The student should be familiar with "traditional" Computer Vision and Robotics. A good command of C++ or Python from previous projects would be beneficial.

[1] A. Ziegler, D. Teigland, J. Tebbe, T. Gossard, and A. Zell, “Real-time event simulation with frame-based cameras.” arXiv, Sep. 10, 2022. Accessed: Dec. 09, 2022. [Online]. Available: arxiv.org/abs/2209.04634

Asynchronous Graph-based Neural Networks for Ball Detection with Event Cameras

Mentor: Andreas Ziegler

Email: andreas.ziegler@uni-tuebingen.de

Event cameras are bio-inspired sensors that asynchronously report timestamped changes in pixel intensity and offer advantages over conventional frame-based cameras in terms of low-latency, low redundancy sensing and high dynamic range. Hence, event cameras have a large potential for robotics and computer vision.

State-of-the-art machine-learning methods for event cameras convert events into dense representations and process them with CNNs. Thus, they fail to maintain the sparsity and asynchronous nature of event data, thereby incurring significant computation and latency costs. A recent line of work [1]–[5] tackles this issue by modeling events as spatio-temporally evolving graphs that can be efficiently and asynchronously processed using graph neural networks. These works showed impressive reductions in computation.

The goal of this thesis is to apply these graph-based networks to ball detection with event cameras. Existing graph-based networks were designed for more general object detection tasks [4], [5]. Since we only want to detect balls, the student will first investigate whether a network architecture tailored to our use case could further improve the inference time.
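
To make the data representation concrete: in these works, each event becomes a graph node with (normalized) x, y, t coordinates and its polarity as node feature, connected to its spatio-temporal neighbors. A minimal sketch with PyTorch Geometric, with illustrative radius and feature choices, could look as follows:

# Sketch: turn a chunk of events into a spatio-temporal graph,
# roughly following the representation used in [4], [5].
import torch
from torch_geometric.data import Data
from torch_geometric.nn import radius_graph

def events_to_graph(x, y, t, p, width, height, radius=0.05):
    # Normalize coordinates so one radius works in all three dimensions.
    t = (t - t.min()) / max(float(t.max() - t.min()), 1e-9)
    pos = torch.stack([x / width, y / height, t], dim=1).float()
    edge_index = radius_graph(pos, r=radius, max_num_neighbors=16)
    feat = p.float().unsqueeze(1)  # polarity as node feature
    return Data(x=feat, pos=pos, edge_index=edge_index)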

The student should be familiar with "traditional" Computer Vision and Deep Learning. Experience with Python and PyTorch from previous projects would be beneficial.

[1] Y. Li et al., “Graph-based Asynchronous Event Processing for Rapid Object Recognition,” in 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, Oct. 2021, pp. 914–923. doi: 10.1109/ICCV48922.2021.00097.

[2] Y. Deng, H. Chen, H. Liu, and Y. Li, “A Voxel Graph CNN for Object Classification with Event Cameras,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, Jun. 2022, pp. 1162–1171. doi: 10.1109/CVPR52688.2022.00124.

[3] A. Mitrokhin, Z. Hua, C. Fermuller, and Y. Aloimonos, “Learning Visual Motion Segmentation Using Event Surfaces,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, Jun. 2020, pp. 14402–14411. doi: 10.1109/CVPR42600.2020.01442.

[4] S. Schaefer, D. Gehrig, and D. Scaramuzza, “AEGNN: Asynchronous Event-based Graph Neural Networks,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, Jun. 2022, pp. 12361–12371. doi: 10.1109/CVPR52688.2022.01205.

[5] D. Gehrig and D. Scaramuzza, “Pushing the Limits of Asynchronous Graph-based Object Detection with Event Cameras.” arXiv, Nov. 22, 2022. Accessed: Dec. 16, 2022. [Online]. Available: arxiv.org/abs/2211.12324

Ball detection with event-based asynchronous sparse convolutional networks

Mentor: Andreas Ziegler

Email: andreas.ziegler@uni-tuebingen.de

Event cameras are bio-inspired sensors that asynchronously report timestamped changes in pixel intensity and offer advantages over conventional frame-based cameras in terms of low-latency, low redundancy sensing and high dynamic range. Hence, event cameras have a large potential for robotics and computer vision.

In comparison to image frames from conventional cameras, data from event cameras is much sparser in most cases. If this sparsity is taken into account, a deep-learning-based detector can achieve a reduced inference time. The goal of this thesis is to use asynchronous sparse convolutional layers [1] and apply them in a neural network to detect fast-moving table tennis balls in real time.
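
As an illustration of the data flow (note that [1] ships its own asynchronous layers; MinkowskiEngine is assumed here only as a stand-in for a sparse convolution library), deduplicated event coordinates can be fed through a 2D sparse convolution like this:

# Sketch: feed accumulated (deduplicated) event pixels through a
# 2D sparse convolution; only active sites are computed.
import torch
import MinkowskiEngine as ME

def sparse_forward(xs, ys, ps):
    # One coordinate row per active pixel: (batch_index, x, y).
    coords = torch.stack([torch.zeros_like(xs), xs, ys], dim=1).int()
    # Two-channel feature: ON count indicator, OFF count indicator.
    feats = torch.stack([(ps == 1).float(), (ps == 0).float()], dim=1)
    x = ME.SparseTensor(features=feats, coordinates=coords)
    conv = ME.MinkowskiConvolution(in_channels=2, out_channels=16,
                                   kernel_size=3, dimension=2)
    return conv(x)  # output remains a sparse tensor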

The student should be familiar with "traditional" Computer Vision, Machine Learning/Deep Learning, and Python. Prior experience with PyTorch would be beneficial.

[1] N. Messikommer, D. Gehrig, A. Loquercio, and D. Scaramuzza, "Event-based Asynchronous Sparse Convolutional Networks," in European Conference on Computer Vision (ECCV), 2020. [Online]. Available: https://arxiv.org/abs/2003.09148

Multi-object tracking via event-based motion segmentation with event cameras

Mentor: Andreas Ziegler

Email: andreas.ziegler@uni-tuebingen.de

Event cameras are bio-inspired sensors that asynchronously report timestamped changes in pixel intensity and offer advantages over conventional frame-based cameras in terms of low-latency, low redundancy sensing and high dynamic range. Hence, event cameras have a large potential for robotics and computer vision.

Since event cameras report changes of intensity per pixel, their output resembles an image gradient where mainly edges and corners are present. The contrast maximization framework (CMax) [1] exploits this fact by optimizing the sharpness of an image of accumulated events to solve computer vision tasks like the estimation of motion, depth, or optical flow. Most recent works on event-based (multi-)object segmentation [2]–[4] apply this CMax framework. The common scheme is to jointly assign events to objects and fit a motion model that best explains the data.
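
The core of CMax fits in a few lines: warp events along a candidate motion, accumulate them into an image, and score the image's sharpness, e.g., by its variance. A minimal sketch for a 2D linear-velocity model (a derivative-free optimizer is used because the rounding makes the objective non-smooth):

# Minimal contrast-maximization sketch: find the 2D velocity that
# produces the sharpest image of warped events (cf. [1]).
import numpy as np
from scipy.optimize import minimize

def contrast(v, x, y, t, height, width):
    # Warp events to a common reference time along velocity v.
    xw = np.clip(np.round(x - v[0] * t), 0, width - 1).astype(int)
    yw = np.clip(np.round(y - v[1] * t), 0, height - 1).astype(int)
    img = np.zeros((height, width))
    np.add.at(img, (yw, xw), 1.0)   # accumulate events into an image
    return -img.var()               # minimize -> maximize sharpness

def estimate_velocity(x, y, t, height, width):
    res = minimize(contrast, x0=np.zeros(2),
                   args=(x, y, t - t.min(), height, width),
                   method="Nelder-Mead")
    return res.x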

The goal of this thesis is to develop a real-time capable (multi-)object tracking pipeline based on multi-object segmentation. After getting familiar with the recent literature, the student should choose a suitable multi-object segmentation approach and adapt it to our use case, namely a table tennis setup. Afterwards, different object tracking approaches should be developed, evaluated, and compared against each other.

The student should be familiar with "traditional" Computer Vision. Experience with C++ and/or optimization from previous projects or coursework would be beneficial.

[1] G. Gallego, M. Gehrig, and D. Scaramuzza, “Focus Is All You Need: Loss Functions for Event-Based Vision,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, Jun. 2019, pp. 12272–12281. doi: 10.1109/CVPR.2019.01256.

[2] X. Lu, Y. Zhou, and S. Shen, “Event-based Motion Segmentation by Cascaded Two-Level Multi-Model Fitting.” arXiv, Nov. 05, 2021. Accessed: Jan. 05, 2023. [Online]. Available: http://arxiv.org/abs/2111.03483

[3] T. Stoffregen, G. Gallego, T. Drummond, L. Kleeman, and D. Scaramuzza, "Event-Based Motion Segmentation by Motion Compensation," arXiv:1904.01293 [cs], Aug. 2019. [Online]. Available: http://arxiv.org/abs/1904.01293

[4] Y. Zhou, G. Gallego, X. Lu, S. Liu, and S. Shen, “Event-based Motion Segmentation with Spatio-Temporal Graph Cuts,” IEEE Trans. Neural Netw. Learn. Syst., pp. 1–13, 2021, doi: 10.1109/TNNLS.2021.3124580.

Dynamic model for planning table tennis strokes with an industrial robot arm

Mentor: Thomas Gossard

Email: thomas.gossard@uni-tuebingen.de

In order for a robot arm to play table tennis, it needs to be able to reach high speeds. This requires an accurate dynamic model of the robot, either to predict the torques required to perform a certain motion (inverse dynamics) or to predict how the robot behaves when specific torques are applied to its joints (forward dynamics). Unfortunately, we do not have access to the dynamic model of the robot. The objective of this thesis is to identify such a model for our industrial KUKA robot arm and to use it with a path planner to achieve the required stroke.
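
If no analytic model is available from the manufacturer, one assumed route is a physics engine driven by a URDF of the arm. As a sketch, PyBullet exposes inverse dynamics directly; the bundled KUKA iiwa model is used here only as a stand-in for our actual robot:

# Sketch: inverse dynamics (joint torques for a desired motion) via
# PyBullet, using the bundled KUKA iiwa URDF as a stand-in.
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)
p.setAdditionalSearchPath(pybullet_data.getDataPath())
robot = p.loadURDF("kuka_iiwa/model.urdf", useFixedBase=True)
n = p.getNumJoints(robot)

q = [0.0] * n     # joint positions along the planned stroke
qd = [0.5] * n    # joint velocities
qdd = [2.0] * n   # desired joint accelerations
tau = p.calculateInverseDynamics(robot, q, qd, qdd)
print("required torques [Nm]:", tau)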

Requirements: Python, Mechanics basics (mostly dynamics)