Bachelor Theses at the Chair of Cognitive Systems (Prof. Dr. Andreas Zell)

Students who want to write a bachelor thesis should have attended at least one of Prof. Zell's lectures and passed it with a good or at least satisfactory grade. Alternatively, they may have obtained the relevant background knowledge for the thesis from other, similar lectures.

Open Topics

Robust Monte Carlo Localization in Dynamic Environments

Mentor: Cornelia Schulz

Email: cornelia.schulz@uni-tuebingen.de

Description: In mobile robotics, particle filters can be applied for localization in a known environment. They compare sensor measurements with a given map, using a so-called 'sensor model'. In this work, a recently proposed sensor model is to be implemented that implicitly classifies sensor measurements into 'mapped' and 'unmapped' observations. This makes the pose estimation process more robust against disturbances such as moving people.

Requirements: Linux, C++, ROS (helpful, but not mandatory)
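
As a rough illustration of what a sensor model does inside a particle filter, the following minimal sketch reweights particles by comparing a measured range scan with the scan each particle would expect from the map. It is not the sensor model to be implemented in this thesis: the Gaussian-plus-uniform mixture, the function names and all parameter values are assumptions chosen for illustration only.

```python
import math
import random


def beam_likelihood(measured, expected, sigma=0.2, max_range=10.0, w_hit=0.8):
    """Mixture likelihood for a single range beam (illustrative assumption).

    The Gaussian term explains beams that hit mapped structure. The uniform
    term keeps the likelihood above zero when an unmapped object, such as a
    person, shortens the beam.
    """
    gauss = math.exp(-0.5 * ((measured - expected) / sigma) ** 2)
    gauss /= sigma * math.sqrt(2.0 * math.pi)
    uniform = 1.0 / max_range
    return w_hit * gauss + (1.0 - w_hit) * uniform


def update_weights(weights, expected_scans, scan):
    """Multiply each particle weight by its scan likelihood, then normalize."""
    new_weights = []
    for weight, expected in zip(weights, expected_scans):
        likelihood = 1.0
        for z, z_hat in zip(scan, expected):
            likelihood *= beam_likelihood(z, z_hat)
        new_weights.append(weight * likelihood)
    total = sum(new_weights)
    return [w / total for w in new_weights]


def resample(particles, weights, rng=random):
    """Draw a new particle set with probability proportional to the weights."""
    return rng.choices(particles, weights=weights, k=len(particles))
```

In a real system, the expected scans would come from ray casting in the map at each particle pose; here they are simply passed in as lists of ranges.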

Tree Detection with different Sparse and Dense 3D Sensors

Mentor: Cornelia Schulz

Email: cornelia.schulz@uni-tuebingen.de

Description: For localization in outdoor environments, it would be interesting to detect trees and use them as landmarks. In this work, a robust and fast tree detection system shall be implemented, based on one or more of the 3D sensors mounted on our Summit XL outdoor robots (3D laser scanner, stereo system, RGB-D camera). To this end, a recently published approach based on dense 3D sensors needs to be implemented as a baseline and may be extended if needed.

Requirements: Linux, C++, ROS (helpful, but not mandatory)

Measuring and predicting the latency of PyTorch modules

Mentor: Kevin Laube

Email: kevin.laube@uni-tuebingen.de

Description: In many applications, e.g. object detection, the latency of a Neural Network (NN) and its components is crucial and differs between hardware platforms. Being able to accurately predict the latency enables improvements in the context of Neural Architecture Search (NAS), where the topology of NNs is optimized, e.g. to maximize accuracy while minimizing FLOPs/latency. The goal of this thesis is the development of a measurement and prediction pipeline to accurately predict the latency of PyTorch modules on target hardware in our currently developed NAS framework.

Requirements: good grade in a Neural Network course, experience with PyTorch
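
To give an idea of the measurement side of such a pipeline, the sketch below times an arbitrary callable with warm-up runs and reports the median wall-clock latency. It uses only the Python standard library; the function names and the stand-in workload are assumptions for illustration. For a PyTorch module, the callable would wrap the forward pass, and on a GPU one would additionally need to call torch.cuda.synchronize() before reading the clock, because CUDA kernels are launched asynchronously.

```python
import statistics
import time


def measure_latency(fn, warmup=10, repeats=50):
    """Return the median wall-clock time of fn() in milliseconds."""
    # Warm-up runs let caches and lazy initialization settle first.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    # The median is less sensitive to occasional scheduling spikes than the mean.
    return statistics.median(samples)


def workload():
    """Stand-in for the forward pass of a module (hypothetical)."""
    return sum(i * i for i in range(10_000))


latency_ms = measure_latency(workload)
print(f"median latency: {latency_ms:.3f} ms")
```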

Real-Time Ball Tracking via CNN for Table Tennis Robots

Mentor: Yapeng Gao

Email: yapeng.gao@uni-tuebingen.de

Description: The table tennis ball can be detected from cameras using a variety of traditional computer vision algorithms, like background subtraction, color thresholding, and blob detection. Recently, CNN-based methods have achieved impressive performance in many tasks. In this topic, the student should develop a CNN and train it with a manually labeled dataset. It should run in real time and be compared with the traditional algorithms.

Requirements: Python, English speaking

6D Racket Pose Tracking via Vision and IMU in Robotic Table Tennis

Mentor: Yapeng Gao

Email: yapeng.gao@uni-tuebingen.de

Description: By fusing a camera and an IMU, the student can directly detect and track the racket pose instead of the human pose, in order to avoid being misled by deceptive actions of a human player. First, a pre-trained 2D object detector (YOLOv4) can be used to extract the 2D bounding box of the racket. Then, the 6D racket pose can be estimated by fusing the camera and the IMU. To smooth and track the racket movements, an extended Kalman filter (EKF) should be used.

Requirements: Python, English speaking
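
The predict/update cycle behind such a filter can be sketched for a single position axis. With linear motion and measurement models, the EKF reduces to the ordinary Kalman filter shown here; the constant-velocity model, the class name and the noise values are assumptions for illustration, not the filter design of the thesis.

```python
class ConstantVelocityKF:
    """Kalman filter with state (position p, velocity v) for one axis."""

    def __init__(self, dt, q=1e-4, r=1e-2):
        self.dt = dt          # time step between measurements
        self.q = q            # process noise added to the covariance diagonal
        self.r = r            # variance of the position measurement
        self.p, self.v = 0.0, 0.0
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance

    def predict(self):
        """Propagate state and covariance: x = F x, P = F P F^T + Q."""
        dt = self.dt
        (p00, p01), (p10, p11) = self.P
        self.p += dt * self.v
        self.P = [
            [p00 + dt * (p01 + p10) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]

    def update(self, z):
        """Correct the state with a position measurement z (H = [1, 0])."""
        (p00, p01), (p10, p11) = self.P
        s = p00 + self.r                 # innovation variance
        k0, k1 = p00 / s, p10 / s        # Kalman gain
        y = z - self.p                   # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [
            [(1.0 - k0) * p00, (1.0 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]
```

A full racket tracker would carry a 6D pose in the state and linearize the nonlinear camera and IMU models at each step, which is where the "extended" part comes in.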

Measuring the ripeness level of fruits by using Hyperspectral Imaging and Deep Learning - Simpler camera

Mentor: Leon Varga

Email: leon.varga.hdh@gmail.com

Description: Hyperspectral Imaging uses cameras that can additionally record wavelengths outside the visible range. It can be regarded as spatial spectroscopy.

In previous research, we have shown that one can predict the ripeness level of avocados and kiwis with hyperspectral imaging and deep learning. The cameras used had many bands (number of wavelengths). It could be shown that only some of the used wavelengths are important for the prediction. In this thesis, cameras with fewer bands should be used to reproduce the results and to confirm the idea that only some wavelengths are important. The thesis consists of experiment planning, measurements, data preprocessing, DNN training, and data analysis.

Requirements: Image processing, neural networks, Python, C/C++, Linux

Measuring the ripeness level of fruits by using Hyperspectral Imaging and Deep Learning - Further Fruit

Mentor: Leon Varga

Email: leon.varga.hdh@gmail.com

Description: Hyperspectral Imaging uses cameras that can additionally record wavelengths outside the visible range. It can be regarded as spatial spectroscopy.

In previous research, we have shown that one can predict the ripeness level of avocados and kiwis with hyperspectral imaging and deep learning. In this thesis, the process should be adapted for another fruit (for example apples). The thesis consists of experiment planning, measurements, data preprocessing, DNN training, and data analysis.

Requirements: Image processing, neural networks, Python, C/C++, Linux

Actor-Critic performance for a robotic table tennis simulation

Mentor: Jonas Tebbe

Email: jonas.tebbe@uni-tuebingen.de

Description: Our table tennis robot learns to adapt its stroke using an actor-critic reinforcement learning model. The student is to adapt the algorithm to the OpenAI Gym. On a simplified physical simulation, the performance of our custom model and the OpenAI baseline models shall be compared. In contrast to comparisons in the literature, the focus should be on learning accuracy using only a limited amount of data. In our case, that means a dataset of fewer than 1000 table tennis strokes.

Requirements: Lecture on machine learning / neural networks.

Visualizing the responsiveness of a table tennis robot

Mentor: Jonas Tebbe

Email: jonas.tebbe@uni-tuebingen.de

Description: In order for our table tennis robot to hit the ball, the future trajectory of the ball is predicted. With more measurements of the flying ball, the prediction becomes more and more accurate. In order to analyze the bottlenecks for the hitting accuracy, the motion of the robot should be considered in a simplified fashion. The student should analyze the following questions: How does the accuracy of the prediction, which improves over time, influence the final hitting result? How responsive is the final hitting result to sudden changes in the prediction accuracy, for better or for worse?

Requirements: Lecture on robotics helpful.

Drawing People with a Robot Arm

Mentor: Mario Laux

Email: mario.laux@uni-tuebingen.de

Description: In his recently completed bachelor thesis, Adrian Müller developed a system that can take an image of a person, apply line detection operators to the image, convert the binarized image to vector line segments and finally draw the line segments with a robot arm. While this works well for features with high contrast, features with low contrast, like the nose in frontal images, are not well detected. The aim of this thesis is to train a deep neural network to recognize typical features of a human face in an image and to convert it into a line drawing sketch, improving the existing system. The sketch should then be drawn on a whiteboard using a Franka Emika Panda robot arm.

Requirements: C++, DNN, ROS

Playing 9 Men's Morris (Mühle) with a Robot Arm with Suction Gripper

Mentor: Mario Laux

Email: mario.laux@uni-tuebingen.de

Description: The aim of this thesis is to create a working demo of a Franka Emika Panda robot arm playing "9 Men's Morris". One or several neural networks have to be trained to detect the stones and the board with a stationary video camera or a camera on the arm. Moves of the human opponent should be checked for correctness. An AI algorithm should determine a good move, which is then carried out using the robot arm's suction gripper. Finally, the camera must be used to check whether the move was successful.

Requirements: C++, DNN, ROS

3D Object Reconstruction

Mentor: Daniel Weber

Description: The context of this thesis is the 3D reconstruction of objects. Various objects shall be recorded with a Kinect v2 from different angles. The goal of this work is then to construct a 3D model by combining the obtained point clouds / RGBD images. The implementation shall be tested with different objects as well as different amounts of input data, e.g. only frontal recordings. The code shall be implemented in Python or C++ and run on Linux. The created 3D model should be suitable for training a neural network.

Requirements: Good programming skills

Salient Object Detection with Boolean Maps

Mentor: Daniel Weber

Email: daniel.weber@uni-tuebingen.de

Description: Computing saliency maps has been shown to be beneficial for applications like image segmentation. A simple and efficient bottom-up approach is the Boolean Map based Saliency model (BMS) proposed in [Zhang, ICCV2013]. The goal of this thesis is to implement BMS in Python or C++ on Linux in order to detect unknown objects for which no data set is available and a neural network approach is therefore not possible. Additionally, it shall be tested whether the approach presented in the paper can be extended from RGB images to RGBD images.

Requirements: Good programming skills

Hand-Gesture-Following Robot using Deep Learning

Mentor: Hamd ul Moqeet Riaz

Email: hamd.riaz@uni-tuebingen.de

Description: The aim of this thesis is to investigate the application of deep learning techniques for recognizing human hand gestures. A TurtleBot should follow the hand instructions predicted by the trained neural network. The robot must respond to at least five hand gestures. Either hand gesture data can be collected or a pre-trained network can be employed. The trained network must be tested on an NVIDIA Jetson TX2 module for real-time operation.

Requirements: Knowledge in Deep learning and computer vision, Programming in Python (Tensorflow/PyTorch), Basic understanding of ROS and mobile robots

Benchmarking FourierNet on limited hardware

Mentor: Hamd ul Moqeet Riaz

Email: hamd.riaz@uni-tuebingen.de

Description: FourierNet is an instance segmentation network that utilizes a Fourier series to decode the shape (mask) of an object from a compressed feature map. This thesis aims at training various configurations of FourierNet on different datasets and testing them on limited hardware for real-time operation. FourierNet should be trained on PASCAL VOC and MS COCO with various input scales. The mean average precision (mAP) should be evaluated on both datasets. The speed of all these networks should be tested on an NVIDIA Jetson TX2 or Xavier board, and a suitable candidate must be recommended.

Requirements: Knowledge in Deep learning and computer vision, Python (PyTorch)

Machine Learning for 6D Pose Estimation

Mentor: Timon Höfer

Email: timon.hoefer@uni-tuebingen.de

Description: One of the most important components of modern computer vision systems for applications such as mobile robotic manipulation and augmented reality is a reliable and fast 6D object detection module. 6D pose estimation is the task of detecting the 6D pose of an object, which includes its location and orientation; the D stands for degrees of freedom.

In this project, you will have a look at state-of-the-art pose estimators and test them on different publicly available datasets. Optionally, you could do tests with our own cameras and obtain qualitative results for the different methods.

Requirements: Basic Python, experience with GitHub is useful

Bridging the gap between simulation and reality

Mentor: Timon Höfer

Email: timon.hoefer@uni-tuebingen.de

Description: Compared to 2D bounding box annotation, the effort of labeling real images with full 6D object poses is orders of magnitude higher and requires expert knowledge and a complex setup. One way out is to train on synthetic images rendered from a 3D model, for which pose labels come free of charge. However, naive training on synthetic data typically does not generalize well to real test images. Therefore, a main challenge is to bridge the domain gap that separates simulated views from real camera images. In this project, you will create different sets of synthetic data and compare the results of state-of-the-art pose estimators trained on them.

Requirements: Basic Python

Camera Absolute Pose Estimation in Urban Environment for UAVs

Mentor: Chenhao Yang

Email: chenhao.yang@uni-tuebingen.de

Description: We have collected a UAV image database covering a variety of typical outdoor environments in urban areas, which can be used to train deep learning models for UAV outdoor localization. The database contains images collected by drones at different heights. Using structure from motion, the 6D camera pose (3D position, 3D orientation) corresponding to each image was obtained. Based on this database, we are planning to combine camera relative pose estimation with a nearest neighbor image retrieval system to achieve a complete camera absolute pose estimation.

Requirements: deep learning experience, Python or C++ experience

On the Necessity of Anchors in Object Detection

Mentor: Martin Meßmer

Email: martin.messmer@uni-tuebingen.de

Description: For a long time, anchor-based object detection has been the non plus ultra in the research community. Many well-known one-shot detectors, like YOLOv2 and v3, SSD, and most recently EfficientDet, have employed anchors with great success. Since FCOS (Zhi Tian et al., 2019), doubts have formed about the necessity of anchors in object detection. In this thesis, the student should take a theoretical and a practical look at anchor-based and anchor-free detection and compare the two approaches.

Requirements: basic deep learning knowledge, Python, good English or German

An evaluation of multiple regression methods

Mentor: Valentin Bolz

Email: valentin.bolz@uni-tuebingen.de

Description: There are multiple frameworks and libraries that offer a variety of different regression tools. The goal of this thesis is to apply the most common techniques, such as linear and polynomial regression, random forests or neural networks, to datasets in which the target variables are continuous. Furthermore, a mathematical formulation of these techniques and an evaluation of their advantages and disadvantages relative to each other should be given.

Requirements: good mathematical skills, Python programming skills, basics in neural networks
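
The simplest of the techniques named above, linear regression with one input variable, has a closed-form least-squares solution. The sketch below is only meant to show the kind of mathematical formulation and evaluation the thesis asks for; the function names and the toy data are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared residual of the fitted line on the given data."""
    residuals = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return sum(residuals) / len(residuals)


# Toy data (hypothetical), roughly following y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, mean_squared_error(xs, ys, slope, intercept))
```

The same error measure can be reused to compare this baseline against polynomial regression, random forests or neural networks on a held-out test set.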

Convolutions on graph-structured data

Mentor: Valentin Bolz

Email: valentin.bolz@uni-tuebingen.de

Description: The application of convolution to image data is already well investigated. However, most of these techniques make explicit use of the grid-structured data. Great efforts have been made to transfer the basic ideas to arbitrarily graph-structured data, such as social networks. The goal of this thesis is to provide a comprehensive summary of the most important graph neural network approaches including their mathematical background as well as an application to publicly available datasets.

Requirements: strong mathematical foundation, Python programming skills, basics in neural networks
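
To illustrate how the convolution idea carries over to graphs, the sketch below implements one propagation step of a widely used formulation, the graph convolutional network (GCN) layer of Kipf and Welling: each node averages the features of itself and its neighbors using the symmetrically normalized adjacency matrix, followed by a linear map and a ReLU. This is only one of many approaches the thesis would survey; the pure-Python matrix helpers and the function names are assumptions for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weight):
    """Compute ReLU(D^-1/2 (A + I) D^-1/2 H W) for a small dense graph."""
    n = len(adj)
    # Add self-loops so that each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization by the node degrees.
    norm = [
        [a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]
    out = matmul(matmul(norm, features), weight)
    return [[max(0.0, value) for value in row] for row in out]
```

On a regular pixel grid, this neighborhood averaging corresponds to a fixed local filter, which is the link to convolutions on images.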

Accelerating DNN training Using Weight Extrapolation

Mentor: Benjamin Kiefer

Email: benjamin.kiefer@uni-tuebingen.de

Description: The backpropagation algorithm (BP) has proven to be a robust tool for training deep neural networks (DNNs). However, training times of DNNs keep growing due to increasing data set sizes. In this thesis, the student should explore an acceleration technique based on weight extrapolation. In particular, different ways in which extrapolation can be applied, and their effects on various modern DNN architectures, shall be studied.

Requirements: Basic knowledge in Deep Neural Networks. Basic knowledge in PyTorch or Tensorflow is beneficial.

Domain Adaptation for Object Detection on UAVs - TAKEN

Thesis already taken. If you'd like to work on something similar, contact me.

Mentor: Benjamin Kiefer

Email: benjamin.kiefer@uni-tuebingen.de

Description: Collecting labelled images and videos from UAVs is time-consuming and expensive. Synthetic footage, in contrast, can be acquired much more easily, either from 3D video games (e.g. GTA V, RDR 2) or from professional simulations (e.g. DJI Flight simulator, Unreal Engine 4). However, training on synthetic images often fails to generalize to real images. In particular, images taken from UAVs cover wide areas with great variety and many features that may not be present in simulation software. Domain adaptation is a field in Machine Learning that tries to narrow the performance gap when training and testing occur in different, but related domains (here synthetic vs. real data). In this thesis, the student should study certain domain adaptation techniques in the context of UAV imagery. He/she should explore to what extent different domain adaptation techniques are promising for object detection from UAVs by comparing existing methods.

Requirements: Basic knowledge in Deep Neural Networks and PyTorch or Tensorflow.

Lightweight Object Detection App for Mobile Devices

Mentor: Ya Wang

Email: francis.wang@uni-tuebingen.de

Description: A lightweight real-time object detection application that uses a mobile phone's embedded camera is fun, easy to use, and useful in daily life. The reference is the app iDetection, which currently uses YOLOv4 and YOLOv4-tiny for fast object detection. The bachelor student first needs to test the object detection results of the current app version and should then try to improve them, either by using a different object detection model or by combining YOLOv4 with the phone's IMU or GPS.

Requirements: Java or Python, Android OS or iOS

Sensor Fusion with RGB-D Camera and LiDAR on Object Detection

Mentor: Ya Wang

Email: francis.wang@uni-tuebingen.de

Description: Today, combining the advantages of different sensors is a smart way to get better performance in many fields. The RGB-D camera has a limited range (5-10 m) but can see nearby objects clearly, while the LiDAR sensor has a long range (30-35 m) but loses the details of nearby objects. The fusion of these sensors is expected to bring higher accuracy to object detection. The student can first use the KITTI object detection benchmark for training and testing. Later, he/she can run comparison experiments on benchmarks to show, quantitatively and qualitatively, that sensor fusion improves object detection results.

Requirements: Python, Tensorflow or PyTorch

Empirical upper bound in 3D object detection

Mentor: Nuri Benbarka

Email: nuri.benbakra@uni-tuebingen.de

Description: Object detection in 3D space is an essential task for mobile robots. In 3D object detection, there are usually three steps: generating 3D proposals, classification and bounding box regression of those proposals. In this thesis, we will try to focus on the second part only, where we will classify the ground truth bounding boxes using state-of-the-art networks for point clouds, like PointNet and others.

Requirements: Experience with Python and interest in learning PyTorch.

Implementation of Quantization Algorithms for Model Compression

Mentor: Rafia Rahim

Email: rafia.rahim@uni-tuebingen.de

Description: Algorithms based on deep neural networks have brought huge accuracy improvements for stereo vision. However, they result in large models with long inference times. Our goal here is to implement algorithms for quantization and training of deep stereo vision algorithms for model compression. To this end, one part will involve writing algorithms for quantization and compression of existing state-of-the-art deep stereo algorithms during training. The second part will focus on how to exploit model quantization and compression at inference time.

Requirements: good programming skills, deep learning knowledge.
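
As a minimal sketch of what quantization means for a set of weights, the following example maps floating-point values to 8-bit integers with a scale and a zero point, and maps them back again. It shows plain uniform affine quantization applied after the fact, not the training-time algorithms that are the subject of the thesis; the function names and the example values are assumptions for illustration.

```python
def quantize(values, num_bits=8):
    """Uniform affine quantization of a list of floats to unsigned integers."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Extend the range to include zero so that 0.0 is represented exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale for all-zero input
    zero_point = round(qmin - lo / scale)
    quantized = [
        min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Map the integers back to approximate floating-point values."""
    return [scale * (q - zero_point) for q in quantized]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]  # hypothetical weight values
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
print(q, restored)
```

The difference between the original and the restored values is the quantization error, which a quantization-aware training procedure would try to make the network robust against.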

Knowledge Distillation for the training of Lean Student Stereo Network

Mentor: Rafia Rahim

Email: rafia.rahim@uni-tuebingen.de

Description: Knowledge distillation is a way of transferring model capabilities from a deep, computationally expensive teacher network to a lean, compact and computationally efficient student network. The goal here is to explore knowledge distillation methods for training a lean student stereo network by distilling the knowledge of a state-of-the-art 3D teacher network. To this end, the student will experiment with different knowledge distillation methods for training student networks.

Requirements: good programming skills, deep learning knowledge.
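
One classic form of knowledge distillation, shown here only to make the idea concrete, trains the student to match the temperature-softened output distribution of the teacher. The sketch computes that loss for a single sample from raw logits. It is a classification-style formulation; how best to distill a stereo network is exactly what the thesis would explore, and the function names and default temperature are assumptions for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature parameter."""
    scaled = [z / temperature for z in logits]
    largest = max(scaled)
    exps = [math.exp(z - largest) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between softened teacher and student distributions.

    The factor temperature**2 keeps the gradient magnitude comparable
    across different temperatures.
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q) for p, q in zip(teacher, student) if p > 0.0
    )
    return temperature ** 2 * kl
```

In practice, this term is usually combined with the ordinary task loss on the ground-truth labels.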

Exploring Deep Semi-Supervised Learning Methods on non-typical datasets

Mentor: Axel Fehrenbach

Email: axel.fehrenbach@uni-tuebingen.de

Description: In recent years Deep Semi-Supervised Learning has been very successful in reducing the amount of labeled data needed while still retaining good results on image classification tasks. In this thesis, the student would compare Deep Semi-Supervised methods on a task and dataset outside the usual scope of image classification and datasets such as CIFAR-10.

Requirements: basic deep learning knowledge, basic Python knowledge

Feature selection in IR Spectral data for Quality Control

Mentor: Axel Fehrenbach

Email: axel.fehrenbach@uni-tuebingen.de

Description: Infrared spectral data analysis is a successful approach for quality control, from plastics to plants to pharmaceuticals. Traditionally, a small number of wavelength bands are selected and features are computed from them, which are then fed into a classifier (like SVM, ANN, Random Forest, kNN etc.). Deep convolutional neural networks (DCNNs), in contrast, usually work well on the raw input data. This thesis should investigate at least three different methods of wavelength feature selection on plastics data and compare them with DCNNs in terms of training time, recall time and accuracy.

Requirements: basic deep learning knowledge, basic Python knowledge

Fusion of point clouds obtained from two depth sensors

Mentor: Dr. Faranak Shamsafar

Email: faranak.shamsafar@uni-tuebingen.de

Description: In the last decade, different types of depth cameras have been introduced for various applications. The goal of this thesis is to develop an efficient algorithm to fuse the point clouds obtained by two cameras from two different viewpoints in order to obtain better depth perception. Both depth cameras are of the type "Azure Kinect DK", a recent active time-of-flight (ToF) ranging device. For this, at least two existing point cloud registration strategies should be tested on small plastic parts, and the better one should be further improved. The improvement of the combined point clouds over the individual ones should be evaluated.

Requirements: Good programming skills in Python, C++ (helpful), Experience in DNN (helpful)

Investigating depth perception with three cameras

Mentor: Dr. Faranak Shamsafar

Email: faranak.shamsafar@uni-tuebingen.de

Description: Depth perception of a scene is commonly performed using two images, which is called stereo vision. A large body of research exists on inferring depth from two images, using both traditional methods and deep learning pipelines. This thesis aims to investigate the influence of an extra image as a third source of information on the accuracy of the estimated depth. This addition has the potential to alleviate the problem of missing points due to occlusion. In this thesis, a collection of images should first be captured with three cameras. Following that, one of the most popular stereo vision methods (SGM) should be applied to the captured images to infer the depth. Finally, the three depth images obtained (from three pairs of cameras) should be fused to obtain a better understanding of the scene.

Requirements: Good programming skills in Python, C++ (helpful)
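
The final fusion step could, in its simplest form, look like the sketch below: for every pixel, the valid depth values of the three maps are combined by taking their median, so that a pixel occluded in one camera pair can still be filled from another. This assumes that the three depth images have already been registered to a common reference view and that invalid pixels are marked with 0.0; both are assumptions for illustration, as are the function name and the choice of the median.

```python
import statistics


def fuse_depth_maps(depth_maps, invalid=0.0):
    """Per-pixel median over the valid values of several aligned depth maps."""
    rows = len(depth_maps[0])
    cols = len(depth_maps[0][0])
    fused = [[invalid] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect only those measurements that are not marked invalid.
            valid = [d[r][c] for d in depth_maps if d[r][c] != invalid]
            if valid:
                fused[r][c] = statistics.median(valid)
    return fused
```

The median also suppresses a single outlier when all three maps provide a value for the same pixel.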