Bachelor of Science (B.Sc.) & Master of Science (M.Sc.):

Available Thesis Topics

Both Bachelor and Master theses are extensive pieces of work that require full-time commitment over several months. They are thus also deeply personal decisions for each student. This page lists a few topics we are currently seeking to address in the research group. If one of them appeals to you, please contact the person listed for that project. If you have an idea of your own, one that you are passionate about and which fits into the remit of the Chair for the Methods of Machine Learning, please feel invited to pitch your idea. To do so, first contact Philipp Hennig to make an appointment for a first meeting.

Solving PDEs on the Sphere with Gaussian Processes (M.Sc. Project)

Supervisors: Tim Weiland & Nathaël da Costa

Partial differential equations (PDEs) are mechanistic models describing many physical processes. For instance, in climate applications, PDEs describe the evolution of wind and temperature over time on the surface of the Earth. Many PDEs of interest cannot be solved analytically and instead require numerical methods. Gaussian process regression has been shown to be a powerful and versatile approach to such problems, generalising classical numerical methods. The aim of this project is to adapt existing Euclidean Gaussian process linear PDE solvers to spherical geometry (a sketch of the underlying idea follows below). The resulting method will be tested with, and compared across, several different Gaussian process priors on the sphere. This project offers an opportunity to learn about probabilistic inference and basic differential geometry, and will serve as a stepping stone to the design of probabilistic solvers for climate applications.
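
As a rough sketch of the underlying idea (the notation here is illustrative): given a GP prior u ~ GP(m, k) and a linear differential operator D (on the sphere, for instance the Laplace–Beltrami operator appearing in the PDE), conditioning on observations (Du)(x_i) = f_i at collocation points X = {x_1, ..., x_n} again yields a GP, with posterior mean

    \mathbb{E}[u(x) \mid f] = m(x) + k_{u\mathcal{D}}(x, X) \, k_{\mathcal{D}\mathcal{D}}(X, X)^{-1} \big( f - (\mathcal{D}m)(X) \big),

where k_{uD} applies D to the second argument of the kernel and k_{DD} to both arguments. The challenge of this project is to carry this construction over from Euclidean space to the sphere, where both the operators and the priors change.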

Prerequisites:

  •  experience with Julia or Python
  •  basic knowledge of Gaussian processes

Curvature-based step sizes for SGD (B.Sc. Project)

Supervisor: Lukas Tatzel

Finding appropriate learning rates remains an open problem for deep learning optimizers like SGD and Adam. Their updates have two components: a direction and a step size. The direction is given by a negative stochastic estimate of the loss function's gradient. The step size is a hyperparameter that typically requires tuning. In this project, we investigate the use of curvature information to adaptively set the step size used by SGD. We do so based on a (stochastic) quadratic approximation of the loss landscape. Under this approximation, the optimal step size for a given direction can be computed cheaply in closed form (see the sketch below). We measure the performance of these "optimal" step sizes and investigate an approach where the direction and the step size are computed from two different mini-batches, which eliminates a certain bias.
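
As a sketch of the closed-form solution (notation ours): let g be the mini-batch gradient at the current parameters θ, B a curvature estimate (for example a stochastic Hessian or generalized Gauss-Newton matrix), and d a descent direction such as d = -g. The quadratic model of the loss along d is

    q(\eta) = L(\theta) + \eta \, g^\top d + \tfrac{1}{2} \eta^2 \, d^\top B d,

which, assuming d^\top B d > 0, is minimized by

    \eta^\star = -\frac{g^\top d}{d^\top B d}, \qquad \text{i.e.} \quad \eta^\star = \frac{g^\top g}{g^\top B g} \;\; \text{for } d = -g.

Evaluating this step size requires only a single curvature-vector product Bd, which is cheap compared to forming B itself.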

Prerequisites:

  •  (basic) experience in PyTorch
  •  (basic) knowledge of deep learning

Leveraging Physical Structures in Probabilistic Linear Solvers (B.Sc. or M.Sc. Project)

Supervisor: Tim Weiland

Partial differential equations (PDEs) are a powerful tool to express mechanistic knowledge about the real world. Gaussian processes (GPs) can be used to solve PDEs numerically by conditioning on linear operator observations involving the PDE. The textbook method of computing a GP posterior involves a Cholesky factorization, which naively scales cubically in the number of data points and quickly becomes prohibitively expensive for real-world applications. An alternative is to solve the associated linear system iteratively using a probabilistic linear solver. In each iteration, such solvers update their belief over the solution of the linear system using a projection of the data called an action. Prior research suggests that leveraging the structure of PDE systems when choosing the actions may save computation and memory, or yield better uncertainty estimates. The goal of this project is to design action policies specifically for physics simulations (a sketch of the basic update follows below). For you as a student, it is a great opportunity to dive into the exciting intersection of probabilistic inference, linear algebra, and physics.
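
As a rough sketch of one common (solution-based) formulation, in our own notation: to solve Ax = b, the solver maintains a Gaussian belief x ~ N(μ_i, Σ_i). Choosing an action s and observing the projected residual s^T(b - Aμ_i) yields the Gaussian conditioning update

    \mu_{i+1} = \mu_i + \frac{\Sigma_i A^\top s \; s^\top (b - A \mu_i)}{s^\top A \Sigma_i A^\top s}, \qquad \Sigma_{i+1} = \Sigma_i - \frac{\Sigma_i A^\top s \; s^\top A \Sigma_i}{s^\top A \Sigma_i A^\top s}.

Different action policies yield different solvers (residual-based actions, for instance, are closely related to the method of conjugate gradients); this project asks what good actions look like when A comes from a discretized PDE.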


Prerequisites:

  • Intuition for and appreciation of linear algebra
  • (basic) knowledge of GP regression

AlgoPerf Submissions (B.Sc. or M.Sc. Project)

Supervisor: Frank Schneider

At the moment, training a contemporary deep neural network is a time-consuming and resource-intensive process, largely due to the many crucial decisions practitioners must make regarding the underlying training algorithms (e.g., should I use SGD, Adam, or Shampoo? What learning rate (schedule)? How much weight decay? etc.).
AlgoPerf, a benchmark (competition) measuring training speed, provides a platform to meaningfully compare various algorithmic choices and thus guide practitioners.
This project aims to prepare submissions for the AlgoPerf benchmark by either implementing and tweaking existing methods or by innovatively combining them.

Prerequisites:

  • (basic) experience in PyTorch or JAX
  • (basic) knowledge of deep learning

Efficient exploration of synthetic datasets (M.Sc. Project)

Supervisor: Tobias Weber

Synthetic datasets are increasingly recognized as invaluable for training deep architectures. The emergence of potentially infinite data spaces introduces novel opportunities, particularly in strategically sampling new data based on the current model's performance. This thesis investigates uncertainty-aware strategies for sampling data batches, replacing the traditional data loader in the machine learning pipeline with a data simulator (a sketch follows below). The primary objective is to compare diverse sampling strategies against the baseline of uniform sampling on an existing toy example: simple sequence prediction using a transformer architecture. The relevant literature covers aspects of active learning and automated curriculum learning. An additional goal could be to design larger experiments, to gain intuition about whether the identified strategies scale and generalize.
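
A minimal sketch of the kind of sampler the project would study, assuming a classification-style setup; the simulator interface, the entropy criterion, and all names below are illustrative rather than a prescribed design:

    import torch

    class UncertaintySampler:
        """Draw a candidate pool from a data simulator and keep the
        examples on which the current model is most uncertain."""

        def __init__(self, simulator, model, pool_size=256, batch_size=32):
            self.simulator = simulator  # callable: n -> (inputs, targets)
            self.model = model
            self.pool_size = pool_size
            self.batch_size = batch_size

        @torch.no_grad()
        def next_batch(self):
            x, y = self.simulator(self.pool_size)  # fresh synthetic candidates
            probs = self.model(x).softmax(dim=-1)
            # predictive entropy of each candidate as a simple uncertainty score
            entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
            idx = entropy.topk(self.batch_size).indices  # keep the most uncertain
            return x[idx], y[idx]

The uniform-sampling baseline corresponds to returning simulator(batch_size) directly; the question is when, and by how much, uncertainty-weighted selection improves over it.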

Prerequisites:

  •  experience with PyTorch and/or JAX

Structured noise in Diffusion Models (M.Sc. Thesis)

Supervisor: Jonathan Schmidt

So far, in score-based generative modeling, the (forward) diffusion process of choice is a rather simple Gauss–Markov process resulting in a standard normal, or at most isotropic Gaussian, distribution in the "latent" noise space. This simplicity essentially comes from the assumption that the diffusion matrix is the identity. With some knowledge of what Gauss–Markov processes have to offer, this framework could be extended such that the forward process results in a Gaussian distribution with interesting covariance structure (e.g. a spatial Matérn process); a sketch follows after the list below. This comes with a multitude of challenging tasks:

  •  The more complex backward process is no longer time-invariant, so an efficient sampling procedure has to be worked out (possible, but difficult)
  •  In high-dimensional spaces (e.g. large-ish RGB images), we have to deal with computational complexity issues (also solvable, e.g. with the RRKF, but likewise difficult)
  •  It has to be figured out whether this actually has benefits and, if so, which ones, and how to leverage them
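
As a rough sketch of the extension (our assumptions, not a worked-out design): the standard variance-preserving forward SDE

    \mathrm{d}x_t = -\tfrac{1}{2} \beta(t) \, x_t \, \mathrm{d}t + \sqrt{\beta(t)} \, \mathrm{d}w_t

converges to a standard normal distribution. Replacing the identity diffusion matrix by a square root of a structured covariance Σ (e.g. of Matérn type),

    \mathrm{d}x_t = -\tfrac{1}{2} \beta(t) \, x_t \, \mathrm{d}t + \sqrt{\beta(t)} \, \Sigma^{1/2} \, \mathrm{d}w_t,

yields a Gauss–Markov process whose stationary distribution is N(0, Σ), since the stationarity condition -\beta(t)\Sigma_\infty + \beta(t)\Sigma = 0 of the corresponding Lyapunov equation is solved by \Sigma_\infty = \Sigma. The reverse-time process then has to account for this structure, which is where the challenges above begin.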

Prerequisites:

  • deep knowledge of SDEs and Gauss–Markov processes/inference
  • deep knowledge and understanding of the inner workings of diffusion models
  • good programming skills (preferably in Python or Julia)


SPDE Priors for Probabilistic PDE and Inverse Problem Solvers (M.Sc. Project or M.Sc. Thesis)

Supervisor: Marvin Pförtner

Partial differential equations (PDEs) are the de facto standard formalism for encoding knowledge about physical processes. In practice, their solutions are approximated using highly specialized numerical routines, which trade off approximation error against computational resources. Recently, probabilistic numerical methods for PDEs have emerged which aim to quantify the approximation error of their classical counterparts. However, since they are mostly based on Gaussian processes (GPs), these methods tend to scale prohibitively with the fidelity of the simulation. In spatial statistics, an established solution to the scalability issues of GP regression is the use of spatial and spatio-temporal Gauss–Markov random field (GMRF) priors, i.e. solutions to linear stochastic PDEs (SPDEs); see the sketch below. Such priors naturally yield sparse precision operators, which can be used for efficient inference in information form. The aim of this project is to generalize GMRF methods for classical GP regression to speed up inference in probabilistic linear PDE and/or linear inverse problem solvers.
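
As a rough sketch of the core idea behind such priors: a GP with Matérn covariance is (up to boundary effects) the stationary solution of the linear SPDE

    (\kappa^2 - \Delta)^{\alpha/2} \, u(x) = \mathcal{W}(x),

where W denotes spatial white noise, κ controls the length scale, and α the smoothness. Discretizing this SPDE, e.g. with finite elements, yields a GMRF whose precision matrix is sparse, so inference scales much more favorably than the cubic cost of dense GP regression.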


Prerequisites:

  • solid coding skills (ideally in Julia)
  • solid knowledge of GP regression
  • (ideally) basic knowledge of Galerkin/Finite Element Methods and/or Functional Analysis

Efficient training of Fourier neural operators (B.Sc. or M.Sc. Project)

Supervisor: Emilia Magnani

There is growing interest in using data-driven deep learning models to solve PDEs, with significant applications in weather forecasting and climate modeling. Once trained, such models have been shown to be faster than traditional numerical methods for solving PDEs. However, they can be expensive to train. Fourier neural operators (FNOs) are especially promising because they learn complex mappings between input and output spaces while leveraging the advantages of Fourier representations (see the sketch below). Typically, the training process for FNOs requires many (functional) observations in time, each on a dense grid; the model then extrapolates in time with its predictions. This process can be expensive and sometimes infeasible due to the cost of collecting the training data. This project aims to investigate how to choose the inputs for FNOs in a smart way, with the goal of reducing the amount of training data required, i.e. achieving the same performance at a lower computational cost. This may involve studying the form of the model's Laplace tangent kernel to examine the learning process, or considering linear observations of the data. Moreover, the project poses questions about how to deal with functional data on sparser grids.
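
For reference, a rough sketch of the FNO architecture as introduced by Li et al.: each layer maps a (discretized) function v_l to

    v_{l+1}(x) = \sigma\big( W v_l(x) + \mathcal{F}^{-1}\big[ R_\phi \cdot \mathcal{F}[v_l] \big](x) \big),

where F is the Fourier transform (an FFT on the grid in practice), R_φ a learned linear transform acting on the lowest Fourier modes, W a pointwise linear map, and σ a nonlinearity. Because the learned weights act on Fourier modes rather than grid points, a trained FNO can in principle be evaluated on grids other than the training grid, which is what makes the questions about sparse and cleverly chosen training inputs interesting.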

Prerequisites:

  • (basic) experience in PyTorch
  •  (basic) knowledge of DNNs

A JAX Library for Practical Hessian Approximations (B.Sc. Project)

Supervisor: Joanna Sliwa

The Hessian of a model stores the second-order partial derivatives of a loss function with respect to the model's weights. In its exact form, it is computationally infeasible for all but the smallest models. This project aims to create a library focused on efficiently approximating Hessian matrices in JAX. The library will implement key approximation techniques, such as the generalized Gauss-Newton (GGN) matrix, Kronecker-Factored Approximate Curvature (KFAC), and the diagonal of the Fisher information matrix (see the sketch below). The primary goals include achieving computational and memory efficiency while maintaining a user-friendly design. By providing practical and accessible implementations within the JAX framework, this project seeks to offer a valuable tool for researchers and practitioners in various applications, including continual learning and model training.
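
To give a flavor of what such a library would provide, here is a minimal, matrix-free sketch of a GGN-vector product in JAX; the function names and signatures are our own assumptions, not an existing API:

    import jax

    def ggn_vector_product(model_fn, loss_fn, params, x, y, v):
        """Compute Gv = J^T H (J v) without materializing the GGN matrix G,
        where J is the Jacobian of the model w.r.t. its parameters and H is
        the Hessian of the loss w.r.t. the model outputs. `v` shares the
        pytree structure of `params`."""
        f = lambda p: model_fn(p, x)
        # Forward mode through the model: J v
        preds, jv = jax.jvp(f, (params,), (v,))
        # Hessian-vector product of the loss in output space: H (J v)
        grad_loss = jax.grad(lambda z: loss_fn(z, y))
        _, h_jv = jax.jvp(grad_loss, (preds,), (jv,))
        # Reverse mode back through the model: J^T (H J v)
        _, vjp_fn = jax.vjp(f, params)
        (gv,) = vjp_fn(h_jv)
        return gv

KFAC and the Fisher diagonal would build on similar Jacobian and Hessian primitives, with the library's job being to hide this plumbing behind a user-friendly interface.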

Prerequisites:

  • experience in JAX
  • knowledge of deep learning
  • knowledge of linear algebra