Bachelor of Science (B.Sc.) & Master of Science (M.Sc.):

Available Thesis Topics

Both Bachelor's and Master's theses are extensive pieces of work that require a full-time commitment over several months. They are thus also deeply personal decisions by each student. This page lists a few topics we are currently seeking to address in the research group. If you find one of them appealing, please contact the person listed for that project. If you have an idea of your own (one that you are passionate about, and which fits into the remit of the Chair for the Methods of Machine Learning), please feel invited to pitch your idea. To do so, first contact Philipp Hennig to make an appointment for a first meeting.

Curvature-based step sizes for SGD (B.Sc. Project)

Supervisor: Lukas Tatzel

Finding appropriate learning rates remains an open problem for deep learning optimizers like SGD and Adam. Their updates have two components: a direction and a step size. The direction is the negative of a (stochastic) estimate of the loss function's gradient. The step size is a hyperparameter that typically requires tuning. In this project, we investigate the use of curvature information to adaptively set the step size used by SGD. We do so based on a (stochastic) quadratic approximation of the loss landscape: under this approximation, the optimal step size for a given direction can be computed cheaply in closed form. We measure the performance of these "optimal" step sizes and investigate an approach where the direction and the step size are computed from two different mini-batches, which eliminates a certain bias.
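To give a flavor of the idea: along a direction d, the quadratic model m(eta) = L + eta g^T d + 0.5 eta^2 d^T H d is minimized by eta* = -(g^T d) / (d^T H d), which requires only one Hessian-vector product. Below is a minimal PyTorch sketch of this computation; all names are chosen here for illustration and are not the project's prescribed implementation.

    import torch

    def curvature_optimal_step(loss, params, direction):
        # `direction` is a list of (detached) tensors matching `params`.
        # Keep the graph alive so we can differentiate the gradient again.
        grads = torch.autograd.grad(loss, params, create_graph=True)
        g = torch.cat([gr.reshape(-1) for gr in grads])
        d = torch.cat([di.reshape(-1) for di in direction])
        # Hessian-vector product H d (Pearlmutter's trick): differentiate g^T d.
        Hd = torch.autograd.grad(g @ d, params)
        Hd = torch.cat([h.reshape(-1) for h in Hd])
        # Minimizer of the quadratic model along d, in closed form.
        return -(g @ d) / (d @ Hd)

In the two-mini-batch variant mentioned above, `loss` would be evaluated on one batch for the direction and on a second batch for the curvature term.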

Prerequisites:

  • (basic) experience in PyTorch
  • (basic) knowledge of deep learning

Leveraging Physical Structures in Probabilistic Linear Solvers (B.Sc. or M.Sc. Project)

Supervisor: Tim Weiland

Partial Differential Equations (PDEs) are a powerful tool to express mechanistic knowledge about the real world.
Gaussian processes (GPs) can be used to solve PDEs numerically by conditioning on linear operator observations involving the PDE.
The textbook method of computing a GP posterior involves a Cholesky factorization, which naively scales cubically in the number of data points and quickly becomes prohibitively expensive for real-world applications.
An alternative is to solve the associated linear system iteratively using a Probabilistic Linear Solver.
In each iteration, such solvers update their belief over the solution of the linear system using a projection of the data called an action.
Prior research suggests that exploiting the structure of PDE systems in particular when choosing these actions can save computation and memory, or yield better uncertainty estimates.
The goal of this project is to design action policies specifically for physics simulations.
For you as a student, it's a great opportunity to dive into the exciting intersection of probabilistic inference, linear algebra, and physics.
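To make the mechanics concrete, here is a heavily simplified sketch of such a solver for a symmetric system A x = b, written in plain NumPy. The policy and all names are illustrative; actual probabilistic linear solvers are considerably richer than this.

    import numpy as np

    def probabilistic_linear_solve(A, b, n_iters, policy):
        # Gaussian belief over the solution x of A x = b (A assumed symmetric).
        n = b.shape[0]
        mean = np.zeros(n)  # prior mean over the solution
        cov = np.eye(n)     # prior covariance over the solution
        for _ in range(n_iters):
            s = policy(A, b, mean)  # action: direction in which to probe the system
            As = A @ s
            residual = b - A @ mean
            # Condition the belief on the scalar observation s^T A x = s^T b.
            gain = cov @ As / (As @ cov @ As)
            mean = mean + gain * (s @ residual)
            cov = cov - np.outer(gain, cov @ As)
        return mean, cov

    # Example policy: probing along the current residual recovers a CG-like method.
    residual_policy = lambda A, b, mean: b - A @ mean

The heart of this project is precisely the `policy`: which actions to play when A comes from a discretized PDE, so that the solver spends its iterations and memory where the physics demands it.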


Prerequisites:

  • Intuition for and appreciation of linear algebra
  • (basic) knowledge of GP regression

AlgoPerf Submissions (M.Sc. Project)

Supervisor: Frank Schneider

At the moment, training a contemporary deep neural network is a time-consuming and resource-intensive process, largely due to the many crucial decisions practitioners must make regarding the underlying training algorithms (e.g., should I use SGD, Adam, or Shampoo? What learning rate (schedule)? Weight decay? etc.).
AlgoPerf, a benchmark (competition) measuring training speed, provides a platform to meaningfully compare various algorithmic choices and thus guide practitioners.
This project aims to prepare submissions for the AlgoPerf benchmark by either implementing and tweaking existing methods or by innovatively combining them.
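For illustration, these are the kinds of algorithmic choices a submission pins down, shown here as a plain PyTorch sketch (AlgoPerf defines its own submission API; see the MLCommons algorithmic-efficiency repository for the actual interface):

    import torch

    def configure_training(model, total_steps):
        # One concrete bundle of algorithmic choices:
        # optimizer, regularization strength, and learning-rate schedule.
        optimizer = torch.optim.AdamW(
            model.parameters(), lr=1e-3, weight_decay=1e-2)
        schedule = torch.optim.lr_scheduler.CosineAnnealingLR(
            optimizer, T_max=total_steps)
        return optimizer, schedule

A submission commits to such choices up front and is then scored by how quickly it trains a range of workloads to their target performance.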

Prerequisites:

  • (basic) experience in PyTorch or JAX
  • (basic) knowledge of deep learning

Efficient training of Fourier neural operators (B.Sc. or M.Sc. Project)

Supervisor: Emilia Magnani

There is growing interest in using data-driven deep learning models to solve PDEs, with significant applications in weather forecasting and climate modeling.
Once trained, such models have been shown to be faster than traditional numerical methods for solving PDEs.
However, they can be expensive to train.
Fourier neural operators (FNOs) are an especially promising instance, because they learn complex mappings between input and output function spaces while leveraging the advantages of Fourier representations.
Typically, training an FNO requires many (functional) observations in time, each on a dense grid; the model then extrapolates in time from these observations.
This process can be expensive, and sometimes infeasible, due to the cost of collecting the training data.
This project investigates how to choose the inputs for FNOs in a smart way, with the goal of reducing the amount of training data required, i.e., achieving the same performance at a lower computational cost.
This may involve studying the form of the FNO's Laplace tangent kernel to examine the learning process, or considering linear observations of the data.
The project also poses the question of how to deal with functional data on sparser grids.
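For orientation, the core building block of an FNO is a spectral convolution: transform the input to Fourier space, linearly mix a truncated set of low-frequency modes, and transform back. Here is a minimal 1D PyTorch sketch, with names and shapes chosen for illustration:

    import torch

    class SpectralConv1d(torch.nn.Module):
        # Sketch of an FNO spectral convolution layer in one spatial dimension.
        def __init__(self, channels, n_modes):
            super().__init__()
            self.n_modes = n_modes
            self.weights = torch.nn.Parameter(
                torch.randn(channels, channels, n_modes, dtype=torch.cfloat)
                / channels)

        def forward(self, x):  # x: (batch, channels, grid)
            x_ft = torch.fft.rfft(x)  # to Fourier space
            out_ft = torch.zeros_like(x_ft)
            # Mix channels on the lowest n_modes frequencies only.
            out_ft[..., :self.n_modes] = torch.einsum(
                "bim,iom->bom", x_ft[..., :self.n_modes], self.weights)
            return torch.fft.irfft(out_ft, n=x.size(-1))  # back to the grid

Because the learned weights live on Fourier modes rather than on grid points, the same trained layer can in principle be evaluated on grids of different resolutions, which is what makes the question of sparser training grids interesting.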

Prerequisites:

  • (basic) experience in PyTorch
  • (basic) knowledge of deep neural networks (DNNs)

A JAX Library for Practical Hessian Approximations (B.Sc. Project, starting earliest 1 January 2025)

Supervisor: Joanna Sliwa

The Hessian of a model stores the second-order partial derivatives of the loss function with respect to the model's weights. Computing and storing it exactly is infeasible for all but the smallest models. This project aims to create a JAX library for efficiently approximating Hessian matrices. The library will implement key approximation techniques such as the Generalized Gauss-Newton (GGN), Kronecker-Factored Approximate Curvature (KFAC), and the diagonal of the Fisher information matrix. The primary goals include computational and memory efficiency combined with a user-friendly design. By providing practical and accessible implementations within the JAX framework, this project seeks to offer a valuable tool for researchers and practitioners in various applications, including continual learning and model training.
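As a flavor of the matrix-free style such a library builds on (a sketch of standard JAX idioms, not the library's eventual API), Hessian- and GGN-vector products can be written without ever materializing the underlying matrices:

    import jax

    def hvp(loss_fn, params, v):
        # Hessian-vector product via forward-over-reverse differentiation.
        return jax.jvp(jax.grad(loss_fn), (params,), (v,))[1]

    def ggn_vp(model_fn, loss_fn, params, x, y, v):
        # Generalized Gauss-Newton product G v = J^T H_out J v, matrix-free.
        outs, Jv = jax.jvp(lambda p: model_fn(p, x), (params,), (v,))  # J v
        HJv = hvp(lambda o: loss_fn(o, y), outs, Jv)                   # H_out (J v)
        _, vjp_fn = jax.vjp(lambda p: model_fn(p, x), params)
        return vjp_fn(HJv)[0]                                          # J^T H_out J v

Kronecker-factored and diagonal approximations then compress the information that such products expose into structured, cheaply storable forms.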

Prerequisites:

  • experience in JAX
  • knowledge of deep learning
  • knowledge of linear algebra