Bachelor of Science (B.Sc.) & Master of Science (M.Sc.):

Available Thesis Topics

Both Bachelor and Master theses are extensive pieces of work that take full-time commitment over several months. They are thus also deeply personal decisions by each student. This page lists a few topics we are currently seeking to address in the research group. If you find one of them appealing, please contact the person listed here for each project. If you have an idea of your own — one that you are passionate about, and which fits into the remit of the Chair for the Methods of Machine Learning — please feel invited to pitch your idea. To do so, first contact Philipp Hennig to make an appointment for a first meeting.

Accelerating Hessian-free Optimization for Physics-informed Neural Networks (M.Sc. Thesis Project)

Supervisors: Marius Zeinhofer (ETH Zürich), Felix Dangel (Vector Institute), Lukas Tatzel

The loss landscape of physics-informed neural networks (PINNs) is notoriously hard to navigate for first-order optimizers like SGD and Adam. Second-order methods can significantly outperform them on small- and medium-sized problems. In particular, the Hessian-free optimizer is a strong baseline that requires almost no tuning. However, as the problem size grows, the Hessian-free optimizer can only complete a few hundred steps within a given compute budget, which diminishes its benefits. The goal of this project is to accelerate the Hessian-free optimizer along three directions: (i) numerical tricks to speed up matrix-vector products and enable efficient preconditioning, (ii) revisiting the recommendations of the seminal work, and (iii) correcting for a recently discovered bias in mini-batch quadratics.
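For context, the computational core of Hessian-free optimization is the matrix-free Hessian-vector product, which automatic differentiation evaluates at roughly the cost of two gradient computations. A minimal sketch in PyTorch (the toy model, data, and loss are placeholders, not the project's PINN setup):

```python
import torch

# Toy stand-ins for a network and loss; any differentiable model works.
model = torch.nn.Sequential(
    torch.nn.Linear(2, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
)
x, y = torch.randn(32, 2), torch.randn(32, 1)
loss = torch.nn.functional.mse_loss(model(x), y)

params = list(model.parameters())
# First backward pass, keeping the graph so we can differentiate again.
grads = torch.autograd.grad(loss, params, create_graph=True)

# Hessian-vector product Hv via a second backward pass (Pearlmutter's trick):
# differentiate the scalar <grad, v> with respect to the parameters.
v = [torch.randn_like(p) for p in params]
grad_dot_v = sum((g * vi).sum() for g, vi in zip(grads, v))
Hv = torch.autograd.grad(grad_dot_v, params)

print([h.shape for h in Hv])  # one block per parameter tensor
```

Speeding up exactly this primitive, and preconditioning the conjugate-gradient iterations built on top of it, is where aspect (i) comes in.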

Prerequisites: 

  • Experience with PyTorch and numerical optimization
  • Interest in PDEs and automatic differentiation

Exploring Conditional Diffusion Models for Solving Spatio-Temporal Inverse Problems (M.Sc. Thesis)

Supervisor: Jonathan Schmidt

Building on:
- [1] https://arxiv.org/abs/2306.10574  
- [2] https://arxiv.org/abs/2412.15361 

The project sets up an inverse problem consisting of
- (a) prior dynamics, and
- (b) observational constraints.

The first step is to train a diffusion model on (a), i.e., on simulations of spatio-temporal dynamics (as done in [1]). The second step is to design a variety of interesting tasks (b) within this setup and to evaluate how well they can be solved using the conditioned diffusion model (as done in [2]).

The (scientific) setting and the tasks to be tackled within it are reasonably open.
The thesis can focus more on the application (with the goal of solving an inverse problem to a satisfying degree), more on the method (with the goal of improving the conditioning mechanism), or on both.
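To give a rough idea of how conditioning can work at sampling time: one common family of mechanisms (reconstruction-style guidance; not necessarily the exact mechanism of [2]) adds the gradient of an observation log-likelihood to the learned prior score. A schematic sketch in PyTorch, where `score_model`, the observation operator `A`, and `sigma_obs` are placeholders:

```python
import torch

def guided_score(score_model, x_t, t, y_obs, A, sigma_obs):
    """Schematic conditional score: learned prior score + observation guidance.

    score_model(x_t, t): unconditional score trained on the prior dynamics (a).
    A: observation operator linking states to measurements (b); a placeholder.
    In practice the likelihood is usually evaluated at a denoised estimate
    of x_t rather than at x_t itself; that refinement is omitted here.
    """
    x_t = x_t.detach().requires_grad_(True)
    # Gaussian observation log-likelihood, differentiated w.r.t. the state.
    log_lik = -((A(x_t) - y_obs) ** 2).sum() / (2 * sigma_obs**2)
    guidance = torch.autograd.grad(log_lik, x_t)[0]
    return score_model(x_t, t) + guidance
```

Improving on this kind of conditioning mechanism is exactly what the method-focused variant of the thesis would target.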

Requirements: 
- Good understanding of Bayesian inference 
- Good coding skills, especially PyTorch (or similar frameworks) 
- Good understanding and some experience in training neural networks and managing experiments 
- Good understanding of Score-Based Modeling (Diffusion Models) 
- A good understanding of Gaussian Processes could prove useful when focusing on the methodological side of the project.

Benchmarking Probabilistic Approaches to Regression (M.Sc. Project)

Supervisor: Thomas Christie

A central promise of probabilistic approaches to regression is the ability to make well-informed decisions based on uncertainty. At the core of typical approaches to such decision-making tasks lies a surrogate model for the function(s) of interest, which yields a distribution over function values at each point in the domain. In recent years, a plethora of model classes and approximations within these classes have been proposed, with examples including Gaussian processes (GPs), Bayesian neural networks (BNNs) and neural processes.

This has led to an increasingly complicated landscape that is difficult for people within the field of machine learning to assess, let alone for the practitioners we hope will use our models! The goal of this project is to provide a central benchmark for evaluating probabilistic models in the context of regression. We would initially focus on a standardised setup for the UCI benchmark, with a leaderboard, and then add some challenging, (ideally) real-world Bayesian optimisation (BO) problems. We will then use the results to answer some questions of interest. Contact Thomas for more details.
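To give a flavour of what a standardised evaluation could report: besides point-prediction error, probabilistic models are usually scored on the test negative log-likelihood, which rewards calibrated uncertainty rather than just accurate means. A minimal sketch (the function and its names are illustrative, assuming Gaussian predictive marginals):

```python
import math
import torch

def score_predictions(pred_mean, pred_var, y_test):
    """Two standard benchmark scores for Gaussian predictive marginals."""
    rmse = torch.sqrt(torch.mean((pred_mean - y_test) ** 2))
    # Mean negative log-likelihood of the test targets under N(mean, var);
    # unlike RMSE, this penalises over- and under-confident predictions.
    nll = torch.mean(
        0.5 * torch.log(2 * math.pi * pred_var)
        + 0.5 * (y_test - pred_mean) ** 2 / pred_var
    )
    return {"rmse": rmse.item(), "nll": nll.item()}
```

A shared, versioned implementation of such metrics (plus data splits) is what makes leaderboard numbers comparable across GPs, BNNs, and neural processes.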

Prerequisites:

  • Familiarity with PyTorch and, more generally, experience working with large codebases.
  • Nice to have: familiarity with at least one of GPs, BNNs, or BO.

Physics-Informed Gaussian Operators (Project or M.Sc.)

Supervisor: Tim Weiland 

The year is 2025. Big tech companies churn out humongous deep learning models on what feels like a weekly basis. In physical applications, people are slowly buying into the idea that all they need is yet another stack of layers in their neural networks. In these dark times, one small and humble hero resists: the Gaussian process. Gaussian processes are naturally able to enforce physical conservation laws and, as such, are often marketed as a great fit for physical applications. Yet at the same time, it is undeniable that the "amortization aspect" of neural operators (i.e. the property that they "learn" from related simulations) is incredibly powerful. The goal of this project is to combine both of these properties in one model: a deep, physics-informed Gaussian operator. Interested? Reach out to Tim for the details.
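To make "GPs can naturally enforce conservation laws" concrete, here is a toy construction (an illustration of the principle, not the project's method): in 2D, any velocity field obtained by rotating the gradient of a scalar stream function is divergence-free, so a GP prior on the stream function induces a GP prior on velocities that conserves mass by construction. A minimal JAX sketch:

```python
import jax
import jax.numpy as jnp

def k_psi(x, xp, lengthscale=1.0):
    """RBF kernel on a scalar stream function psi."""
    return jnp.exp(-jnp.sum((x - xp) ** 2) / (2 * lengthscale**2))

# In 2D, u = (d psi/dy, -d psi/dx) satisfies div u = 0 for any psi.
# If psi ~ GP(0, k), then u is a GP whose matrix-valued covariance follows
# by applying the differential operator to both arguments of the kernel.
P = jnp.array([[0.0, 1.0], [-1.0, 0.0]])  # maps grad(psi) to u

def k_u(x, xp):
    # H[j, l] = d^2 k / (dx_j dxp_l), computed by automatic differentiation.
    H = jax.jacfwd(jax.grad(k_psi, argnums=0), argnums=1)(x, xp)
    return P @ H @ P.T  # 2x2 cross-covariance between velocity components

print(k_u(jnp.array([0.0, 0.0]), jnp.array([0.5, -0.3])))
```

The project's question is how to marry this kind of built-in physics with the amortization that neural operators get from training on related simulations.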

Prerequisites:

  • Solid prior knowledge in Probabilistic Machine Learning, particularly GPs
  • Prior experience in PyTorch / JAX / …; pick your poison

Interactive ODE Filter Visualisation (B.Sc.)

Supervisor: Paul Fischer 

Ordinary differential equations (ODEs) are fundamental to modeling dynamical systems. However, most cannot be solved analytically and thus require numerical approximation. ODE filters are a recent class of probabilistic numerical methods that reinterpret the problem of solving an ODE initial value problem as a Bayesian inference task, which can then be solved using filtering and smoothing algorithms. Unlike classical solvers, ODE filters quantify uncertainty, making them particularly appealing for real-world applications. However, because they were developed only recently, there are not yet many accessible resources or intuitive visual explanations available. In this project, you will address this gap by developing an interactive ODE filter visualization similar to these visualizations of Gaussian processes: https://smlbook.org/GP/ and https://www.infinitecuriosity.org/vizgp/. By doing so, you will contribute to this emerging research field, gain experience in data visualization, and learn about probabilistic machine learning.
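For intuition about what such a visualization would animate, here is a deliberately minimal EK0-style ODE filter for a scalar ODE in plain NumPy (the prior, step size, and example problem are illustrative; a real implementation would add smoothing, calibration, and step-size adaptation):

```python
import numpy as np

def ode_filter(f, x0, t_end, h=0.01, q=1.0):
    """Minimal EK0-style ODE filter for x' = f(x), x(0) = x0.

    State z = (x, x') with a once-integrated Wiener process prior;
    the "observation" at each step is that x' should equal f(x).
    """
    A = np.array([[1.0, h], [0.0, 1.0]])                     # prior transition
    Q = q * np.array([[h**3 / 3, h**2 / 2], [h**2 / 2, h]])  # process noise
    H = np.array([[0.0, 1.0]])                               # observe x'
    m, P = np.array([x0, f(x0)]), np.zeros((2, 2))
    ts, means, stds = [0.0], [m[0]], [0.0]
    for i in range(int(t_end / h)):
        m_pred, P_pred = A @ m, A @ P @ A.T + Q              # predict
        resid = f(m_pred[0]) - (H @ m_pred)[0]               # ODE residual
        S = (H @ P_pred @ H.T)[0, 0]
        K = (P_pred @ H.T / S).ravel()                       # Kalman gain
        m = m_pred + K * resid                               # update mean
        P = P_pred - np.outer(K, H @ P_pred)                 # update covariance
        ts.append((i + 1) * h)
        means.append(m[0])
        stds.append(np.sqrt(max(P[0, 0], 0.0)))
    return np.array(ts), np.array(means), np.array(stds)

# Example: logistic growth; the mean and a +/- 2 std band are what the
# interactive visualization would display and let users explore.
ts, mean, std = ode_filter(lambda x: x * (1 - x), x0=0.1, t_end=10.0)
```

Exposing how the uncertainty band reacts to the step size, the prior, and the ODE itself is precisely what an interactive tool can convey better than static plots.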

Requirements:

  • Interest in probabilistic machine learning
  • Interest in data visualization

Optimal Structural Design with Probabilistic PDE Solvers (Project / M.Sc.)

Supervisor: Bernardo Fichera

Airplane design requires selecting the appropriate structural configuration and materials. Here, appropriate means achieving the desired performance across the intended flight regime. From an aerodynamic perspective, performance is strongly influenced by the structural deformations of the aircraft in flight. The central design challenge, then, is: what structural configuration (e.g., wing length) and material properties (e.g., Young’s modulus) will produce the desired deformed shape during flight? Addressing this challenge amounts to solving the inverse aeroelastic problem: determining the structural and material design from the coupled interaction between aerodynamics and structural deformation. Both the structural and aerodynamic aspects of this problem involve solving PDEs. We aim to leverage probabilistic PDE solvers to tackle such inverse problems.

This approach offers two main advantages:

  • Natural incorporation of uncertainty quantification into the inverse solution.
  • A framework that recasts the inverse problem as a form of Bayesian optimization (see the sketch below).
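To illustrate the Bayesian-optimization view at cartoon level (every name here is hypothetical, and the analytic mismatch function stands in for an expensive query to the coupled PDE solvers): treat the design variable as the optimization input, the simulated-versus-target deformation mismatch as a black-box objective, and let a GP surrogate decide where to simulate next.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    """Squared-exponential kernel between two 1D point sets."""
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ls**2))

def bo_design(objective, bounds, n_init=3, n_iter=10, noise=1e-6):
    """Toy Bayesian-optimization loop over one scalar design variable."""
    X = np.random.uniform(*bounds, size=n_init)
    y = np.array([objective(x) for x in X])
    grid = np.linspace(*bounds, 200)
    for _ in range(n_iter):
        K = rbf(X, X) + noise * np.eye(len(X))
        Ks = rbf(grid, X)
        mu = Ks @ np.linalg.solve(K, y)                  # posterior mean
        var = 1.0 - np.einsum("ij,ij->i", Ks @ np.linalg.inv(K), Ks)
        # Lower-confidence-bound acquisition: query where the surrogate
        # predicts low mismatch or is still very uncertain.
        x_next = grid[np.argmin(mu - 2.0 * np.sqrt(np.maximum(var, 0.0)))]
        X, y = np.append(X, x_next), np.append(y, objective(x_next))
    return X[np.argmin(y)]

# Hypothetical mismatch between simulated and target wing deformation,
# as a function of a single design parameter (e.g., wing length).
best = bo_design(lambda L: (np.sin(3 * L) + 0.5 * L - 0.2) ** 2, (0.0, 2.0))
print(best)
```

In the actual project, each objective evaluation would involve the probabilistic PDE solvers, whose uncertainty estimates can feed directly into the surrogate instead of being discarded.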

Applications of Practical Hessian Approximations in JAX (M.Sc. Project)

Supervisor: Joanna Sliwa

A Hessian matrix captures the second-order partial derivatives of a model’s loss function with respect to its parameters. While it provides valuable information for optimization, calibration, and uncertainty estimation, computing the exact Hessian is infeasible for modern deep networks. This project focuses on evaluating scalable Hessian approximations in JAX. Using the laplax library, we will conduct exploratory studies in applications (depending on interest) such as continual and transfer learning, curvature-aware model merging and decomposition, and LoRA fine-tuning for large language models. The objective is to both showcase and extend the practical use cases of Hessian-based methods by contributing implementations of common applications directly to the laplax library.
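As a flavour of the primitives involved (written in plain JAX, without presuming laplax's actual API): the generalized Gauss-Newton (GGN), a common positive semi-definite Hessian surrogate, admits matrix-free vector products built from one forward-mode and one reverse-mode sweep.

```python
import jax
import jax.numpy as jnp

def ggn_vp(model_fn, loss_fn, params, x, y, v):
    """Generalized Gauss-Newton vector product G v = J^T H_L (J v).

    model_fn(params, x) -> outputs; loss_fn(outputs, y) -> scalar loss.
    Neither J, H_L, nor G is ever materialized.
    """
    f = lambda p: model_fn(p, x)
    outputs, Jv = jax.jvp(f, (params,), (v,))           # forward mode: J v
    _, HJv = jax.jvp(jax.grad(lambda z: loss_fn(z, y)),
                     (outputs,), (Jv,))                 # loss Hessian: H_L (J v)
    _, vjp_fn = jax.vjp(f, params)
    (Gv,) = vjp_fn(HJv)                                 # reverse mode: J^T (...)
    return Gv

# Toy check with a linear model and squared loss, where G = X^T X.
params = jnp.ones(3)
model_fn = lambda p, x: x @ p
loss_fn = lambda out, y: 0.5 * jnp.sum((out - y) ** 2)
x, y = jnp.eye(3), jnp.zeros(3)
print(ggn_vp(model_fn, loss_fn, params, x, y, jnp.array([1.0, 0.0, 0.0])))
```

Products like this one are the building blocks behind the scalable Laplace approximations the project would evaluate and extend.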

Prerequisites:
  • Prior experience with JAX
  • Familiarity with neural network training and optimization
  • Some exposure to second-order methods is beneficial but not strictly required

Towards Full Deep Learning Hessians (B.Sc. / M.Sc.)

Supervisor: Andres Fernandez

The deep learning Hessian is a valuable and interesting linear operator, with applications spanning optimization, pruning, uncertainty quantification, and loss landscape analysis. Unfortunately, it is intractable for most neural networks due to its gargantuan size, and practitioners must resort to tractable approximations that are not necessarily accurate (last-layer, diagonal, GGN, Kronecker-factored). Research suggests, however, that deep learning Hessians may have low-rank structure. The aim of this project is to leverage recent advances in sketched methods from randomized linear algebra to obtain accurate low-rank approximations of Hessians at scale. This will involve working with deep learning setups and performing large-scale experiments on the cluster, comparing different Hessian approximations.
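As a minimal sketch of the kind of randomized method in question (a Halko-style randomized eigendecomposition driven purely by Hessian-vector products; the toy problem, names, and algorithmic choices are illustrative):

```python
import torch

def hvp(loss_fn, params, v):
    """Hessian-vector product via double backward (single parameter tensor)."""
    (g,) = torch.autograd.grad(loss_fn(params), params, create_graph=True)
    (Hv,) = torch.autograd.grad((g * v).sum(), params)
    return Hv

def sketched_eigh(loss_fn, params, rank=10, oversample=5):
    """Randomized low-rank eigendecomposition of the Hessian from HVPs only."""
    d, k = params.numel(), rank + oversample
    omega = torch.randn(d, k)                          # random test matrix
    Y = torch.stack([hvp(loss_fn, params, omega[:, i]) for i in range(k)], dim=1)
    Q, _ = torch.linalg.qr(Y)                          # orthonormal range basis
    B = torch.stack([hvp(loss_fn, params, Q[:, i]) for i in range(k)], dim=1)
    T = Q.T @ B                                        # small k x k projection
    evals, V = torch.linalg.eigh(0.5 * (T + T.T))      # symmetrize, then solve
    return evals[-rank:], Q @ V[:, -rank:]             # approximate top eigenpairs

# Toy quadratic with Hessian diag(1, ..., 5); the top eigenvalues are exact here.
theta = torch.zeros(5, requires_grad=True)
loss_fn = lambda p: 0.5 * (torch.arange(1.0, 6.0) * p**2).sum()
evals, evecs = sketched_eigh(loss_fn, theta, rank=3, oversample=2)
print(evals)  # approximately tensor([3., 4., 5.])
```

Scaling this recipe from a five-parameter quadratic to real networks, and comparing the result against last-layer, diagonal, GGN, and Kronecker-factored baselines, is the heart of the project.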

Prerequisites: 

  • Python and an affinity for scalable and maintainable code
  • Rudiments of linear algebra
  • Rudiments of deep learning
  • (Ideally) rudiments of optimization (convex and non-convex)