Bachelor of Science (B.Sc.) & Master of Science (M.Sc.):

Available Thesis Topics

Both Bachelor and Master theses are extensive pieces of work that require full-time commitment over several months. They are thus also deeply personal decisions by each student. This page lists a few topics we are currently seeking to address in the research group. If you find one of them appealing, please contact the person listed for each project. If you have an idea of your own, one that you are passionate about and which fits into the remit of the Chair for the Methods of Machine Learning, please feel invited to pitch your idea. To do so, first contact Philipp Hennig to make an appointment for a first meeting.

AlgoPerf Submissions (M.Sc. Project)

Supervisor: Frank Schneider

At the moment, training a contemporary deep neural network is a time-consuming and resource-intensive process, largely due to the many crucial decisions practitioners must make regarding the underlying training algorithms (e.g., should I use SGD, Adam, or Shampoo? What learning rate (schedule)? How much weight decay?). AlgoPerf, a benchmark competition measuring training speed, provides a platform to compare such algorithmic choices and thus guide practitioners meaningfully. This project aims to prepare a submission for the AlgoPerf benchmark, either by implementing and tweaking existing methods or by combining them in novel ways.
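
To make the scope of these decisions concrete, here is a minimal PyTorch sketch of the knobs a submission effectively has to set; the model, hyperparameter values, and schedule are illustrative placeholders, not part of the AlgoPerf API.

    import torch

    model = torch.nn.Linear(784, 10)  # stand-in for a benchmark workload

    # Decision 1: the update rule (SGD vs. AdamW vs. Shampoo, ...)
    optimizer = torch.optim.AdamW(
        model.parameters(),
        lr=1e-3,            # Decision 2: base learning rate
        weight_decay=1e-2,  # Decision 3: regularization strength
    )
    # Decision 4: the learning-rate schedule (here, a plain cosine decay)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1_000)

    for step in range(1_000):
        loss = model(torch.randn(32, 784)).square().mean()  # dummy loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.step()

A submission has to fix all of these choices up front (or tune them automatically) and is then scored by how quickly training reaches the target validation performance.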

Possible approaches are:

  • Developing a self-tuning submission based on the winning external tuning submission, Shampoo.
  • Improving the winning self-tuning submission, Schedule-Free AdamW, e.g. by sequentially scheduling a second hyperparameter configuration for training the ImageNet ResNet workload.
  • Benchmarking new algorithms that do not yet have a submission, e.g. AdEMAMix.

Prerequisites:

  • (basic) experience in PyTorch or JAX
  • (basic) knowledge of deep learning

Uncertainty Disentanglement (Project)

Supervisor: Bálint Mucsányi

The field of uncertainty disentanglement (UD) has been steadily gaining traction in recent years. UD aims to construct multiple uncertainty estimators that are each tailored to one and only one source of uncertainty (e.g., epistemic and aleatoric uncertainty). Recently, it has been found that the most widely used uncertainty decomposition formulas fail to disentangle uncertainties. One possible reason is that the different uncertainties are all captured in the output space, which leads to high correlations. This project explores breaking these correlations by calculating the estimates at different points of the computation graph. The primary objective is to equip weight-space uncertainty quantification (UQ) methods, such as the Laplace approximation, with disentanglement capabilities by pushing the uncertainty forward to an intermediate representation for epistemic uncertainty and measuring aleatoric uncertainty in the output space. The new approaches can be evaluated on an existing UD benchmark and compared to latent density methods.
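
As a rough illustration of this split, the following sketch scores epistemic uncertainty by the spread of an intermediate representation under sampled weight perturbations (a crude stand-in for pushing a Laplace-style weight posterior forward) and aleatoric uncertainty by predictive entropy in the output space. The architecture, noise scale, and number of samples are illustrative assumptions.

    import copy
    import torch

    encoder = torch.nn.Sequential(torch.nn.Linear(20, 64), torch.nn.ReLU())
    head = torch.nn.Linear(64, 10)
    x = torch.randn(8, 20)

    # Epistemic: variance of the intermediate representation under samples
    # from an (assumed, isotropic) Gaussian posterior around the MAP encoder.
    zs = []
    with torch.no_grad():
        for _ in range(16):
            sample = copy.deepcopy(encoder)
            for p in sample.parameters():
                p.add_(0.01 * torch.randn_like(p))  # toy posterior sample
            zs.append(sample(x))
    epistemic = torch.stack(zs).var(dim=0).mean(dim=-1)  # one score per input

    # Aleatoric: predictive entropy of the MAP model in the output space.
    with torch.no_grad():
        probs = torch.softmax(head(encoder(x)), dim=-1)
    aleatoric = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)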

Prerequisites:

  • experience with PyTorch

Creating a custom metric for CV tasks (B.Sc. / M.Sc. Thesis)

Supervisor: Carla Sagebiel

The Fréchet inception distance (FID) is one of the most important tools for assessing the quality of generative models in computer vision today. However, this metric is not without flaws [1] and is also strongly biased by the dataset the Inception model was trained on. The goal of this project is to train a custom Inception network on a specialized dataset and then evaluate the resulting FID, for example by measuring its ability to differentiate between original data and different test datasets (e.g., artificially distorted data or out-of-distribution examples), and by creating heat maps using Grad-CAM as in [1]. Special care should be taken to keep the training of the model and the evaluation of the FID separate.
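
For reference, the FID is the Fréchet distance between Gaussians fitted to the two feature sets, FID = ||mu_1 - mu_2||^2 + Tr(C_1 + C_2 - 2 (C_1 C_2)^{1/2}). Below is a minimal sketch of this computation, assuming the features have already been extracted by an Inception-style network (custom-trained or standard); the toy data at the end is purely illustrative.

    import numpy as np
    from scipy.linalg import sqrtm

    def fid(feats_real: np.ndarray, feats_fake: np.ndarray) -> float:
        """Fréchet distance between Gaussians fit to two (N, D) feature sets."""
        mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
        cov1 = np.cov(feats_real, rowvar=False)
        cov2 = np.cov(feats_fake, rowvar=False)
        covmean = sqrtm(cov1 @ cov2)
        if np.iscomplexobj(covmean):  # numerical noise can add tiny imaginary parts
            covmean = covmean.real
        diff = mu1 - mu2
        return float(diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean))

    rng = np.random.default_rng(0)
    feats_a = rng.normal(size=(500, 64))       # e.g. features of original data
    feats_b = rng.normal(0.5, 1.0, (500, 64))  # e.g. features of distorted data
    print(fid(feats_a, feats_b))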


Prerequisites:

  • experience in Python, basic experience with PyTorch
  • enough knowledge in ML to read the scientific literature mentioned above

[1] http://arxiv.org/abs/2203.06026

Efficient exploration of synthetic datasets (Project + M.Sc.)

Supervisor: Tobias Weber

Synthetic datasets are increasingly recognized as valuable for training deep architectures. The emergence of potentially infinite data spaces introduces novel opportunities, particularly in dynamically sampling new data based on the current model's performance. In this thesis, we will explore uncertainty-aware data sampling strategies for synthetic data generation and selection, focusing on replacing the traditional data loader in machine learning pipelines with a data simulator. In doing so, we aim to connect this approach to the recent discussion on whether, and in which cases, synthetic data leads to model collapse. The work ideally starts with a literature review and iteration over small-scale experiments; doing this first as a research project, before moving on to the M.Sc. thesis, can be beneficial but is not strictly necessary.
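
A minimal sketch of the core loop change, with a simulator queried for candidates and the current model's predictive entropy deciding which of them to train on next; the simulator, the uncertainty score, and the top-k selection rule are all illustrative assumptions.

    import torch

    model = torch.nn.Sequential(
        torch.nn.Linear(2, 32), torch.nn.ReLU(), torch.nn.Linear(32, 3)
    )

    def simulate_candidates(n: int) -> torch.Tensor:
        """Stand-in for a data simulator with a potentially infinite data space."""
        return torch.rand(n, 2) * 4 - 2

    def predictive_entropy(x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            p = torch.softmax(model(x), dim=-1)
        return -(p * p.clamp_min(1e-12).log()).sum(dim=-1)

    def next_batch(batch_size: int, pool_factor: int = 4) -> torch.Tensor:
        """Sample a candidate pool, keep the inputs the model is least sure about."""
        pool = simulate_candidates(pool_factor * batch_size)
        return pool[predictive_entropy(pool).topk(batch_size).indices]

    batch = next_batch(32)  # replaces the usual data-loader call in the training loop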


Prerequisites:

  • Excitement for reading research papers.
  • Experience with setting up deep learning experiments in JAX or PyTorch.

Sampling-free Bayesian deep learning for classification [Project]

Supervisor: Nathaël da Costa

Standard neural architectures for classification typically comprise a neural network mapping to a Euclidean space (the logits), followed by a softmax activation function. Such architectures are then trained using the cross-entropy loss. Bayesian deep learning then attempts to build a Gaussian distribution over the parameters of the neural network. For each input to the network, this results in a Gaussian distribution over the logit space, which must be pushed forward through the softmax and integrated to obtain predictive probabilities. This last step cannot be done tractably in closed form, and is thus approximated through Monte-Carlo sampling, which is computationally costly and noisy. This is an exploratory project to test different choices of last-layer activations (to replace the softmax) and losses (to replace the cross-entropy loss) that allow for closed-form approximate distributions over the predictive space, bypassing Monte-Carlo sampling.
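
One known closed-form candidate is the mean-field probit-style approximation softmax(mu / sqrt(1 + (pi / 8) * sigma^2)), which replaces the Monte-Carlo integral with a single deterministic pass. The sketch below contrasts the two, assuming a diagonal Gaussian over the logits is already given (e.g., from a last-layer Laplace approximation); shapes and values are illustrative.

    import math
    import torch

    mu = torch.randn(5, 10)        # logit means: 5 inputs, 10 classes
    var = torch.rand(5, 10) + 0.1  # diagonal logit variances

    # Monte-Carlo estimate: costly and noisy.
    samples = mu + var.sqrt() * torch.randn(1000, 5, 10)
    p_mc = torch.softmax(samples, dim=-1).mean(dim=0)

    # Probit-style mean-field approximation: one closed-form pass.
    p_probit = torch.softmax(mu / torch.sqrt(1.0 + (math.pi / 8.0) * var), dim=-1)

    print((p_mc - p_probit).abs().max())  # the two typically agree closely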

Prerequisites:

  • experience with PyTorch or JAX
  • experience with deep learning

A JAX Library for Practical Hessian Approximations (B.Sc. Project, starting 1 January 2025 at the earliest)

Supervisor: Joanna Sliwa

The Hessian of a model contains the second-order partial derivatives of the loss function with respect to the model's weights. In its exact form, it is computationally infeasible for modern networks. This project aims to create a library focused on efficiently approximating Hessian matrices in JAX. The library will implement key approximation techniques, such as the generalized Gauss-Newton (GGN) matrix, Kronecker-Factored Approximate Curvature (KFAC), and the diagonal of the Fisher information matrix. The primary goals include achieving computational and memory efficiency while maintaining a user-friendly design. By providing practical and accessible implementations within the JAX framework, this project seeks to offer a valuable tool for researchers and practitioners in various applications, including continual learning and model training.
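
As a flavor of the matrix-free primitives such a library could build on, here is a sketch of a generalized Gauss-Newton vector product composed from JAX's jvp/vjp transforms, without ever materializing the matrix; the toy model and squared loss are stand-ins for real workloads.

    import jax
    import jax.numpy as jnp

    def model(params, x):
        return jnp.tanh(x @ params["w"] + params["b"])

    def loss(logits, y):
        return 0.5 * jnp.sum((logits - y) ** 2)

    def ggn_vector_product(params, x, y, v):
        """Compute (J^T H J) v, with J the model Jacobian and H the loss Hessian."""
        f = lambda p: model(p, x)
        logits, jv = jax.jvp(f, (params,), (v,))  # forward mode: J v
        # Hessian-vector product of the loss w.r.t. the logits: H (J v).
        h_jv = jax.grad(lambda z: jnp.vdot(jax.grad(loss)(z, y), jv))(logits)
        _, vjp_fn = jax.vjp(f, params)            # reverse mode: J^T (H J v)
        return vjp_fn(h_jv)[0]

    params = {"w": jnp.ones((3, 2)), "b": jnp.zeros(2)}
    x, y = jnp.ones((4, 3)), jnp.zeros((4, 2))
    v = jax.tree_util.tree_map(jnp.ones_like, params)
    print(ggn_vector_product(params, x, y, v))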

Prerequisites:

  • experience in JAX
  • knowledge of deep learning
  • knowledge of linear algebra