Bachelor of Science (B.Sc.) & Master of Science (M.Sc.):

Available Thesis Topics

Both Bachelor and Master theses are extensive pieces of work that take full-time commitment over several months. They are thus also deeply personal decisions by each student. This page lists a few topics we are currently seeking to address in the research group. If you find one of them appealing, please contact the person listed for that project. If you have an idea of your own — one that you are passionate about, and which fits into the remit of the Chair for the Methods of Machine Learning — please feel invited to pitch it. To do so, first contact Philipp Hennig to make an appointment for a first meeting.

AlgoPerf Submissions (M.Sc. Project)

Supervisor: Frank Schneider

At the moment, training a contemporary deep neural network is a time-consuming and resource-intensive process, largely due to the many crucial decisions practitioners must make regarding the underlying training algorithm (e.g. should I use SGD, Adam, or Shampoo? What learning rate (schedule)? How much weight decay? etc.). AlgoPerf, a benchmark (and competition) measuring training speed, provides a platform to compare such algorithmic choices and thus guide practitioners meaningfully. This project aims to prepare a submission for the AlgoPerf benchmark, either by implementing and tweaking existing methods or by combining them in novel ways.
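To make the decision space concrete, here is a minimal, hypothetical PyTorch sketch of the choices a training recipe has to fix (optimizer, learning-rate schedule, weight decay). The model, data, and hyperparameter values are placeholders, and this is not the AlgoPerf submission interface itself:

import torch
from torch import nn

# Placeholder model and loss; in AlgoPerf these are fixed by the workload.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
loss_fn = nn.CrossEntropyLoss()

# The kind of algorithmic decisions a training recipe must commit to:
optimizer = torch.optim.AdamW(      # SGD? Adam? Shampoo? ...
    model.parameters(),
    lr=1e-3,                        # peak learning rate
    weight_decay=1e-2,              # weight decay strength
)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10_000)  # schedule

def train_step(inputs, targets):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()
    scheduler.step()
    return loss.item()

loss = train_step(torch.randn(32, 128), torch.randint(0, 10, (32,)))

A submission then commits to such algorithmic choices, and to how they are tuned, across the benchmark's workloads.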

Possible approaches are:

  • Developing a self-tuning submission based on the winning external tuning submission, Shampoo.
  • Improving the winning self-tuning submission, Schedule-Free AdamW, e.g. by sequentially scheduling a second hyperparameter configuration for training the ImageNet ResNet workload.
  • Benchmarking new algorithms that do not yet have a submission, e.g. AdEMAMix.

Prerequisites:

  • (basic) experience in PyTorch or JAX
  • (basic) knowledge of deep learning

Uncertainty Disentanglement (Project)

Supervisor: Bálint Mucsányi

The field of uncertainty disentanglement (UD) has been steadily gaining traction in recent years. UD aims to construct multiple uncertainty estimators that are each tailored to one and only one source of uncertainty (e.g., epistemic and aleatoric uncertainty). Recently, it has been found that the most widely used uncertainty decomposition formulas fail to disentangle uncertainties. One possible reason is that the different uncertainties are all captured in the output space, which leads to high correlations. This project explores breaking these correlations by calculating the estimates at different points of the computation graph. The primary objective is to equip weight-space uncertainty quantification (UQ) methods – such as the Laplace approximation – with disentanglement capabilities by pushing the uncertainty forward to an intermediate representation for epistemic uncertainty and measuring aleatoric uncertainty in the output space. The new approaches can be evaluated on an existing UD benchmark and compared to latent density methods.
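As a rough illustration of this idea, the following hypothetical PyTorch sketch samples encoder weights from an assumed diagonal weight-space posterior (standing in for a fitted Laplace approximation), reads off epistemic uncertainty as the spread of an intermediate representation, and measures aleatoric uncertainty as expected entropy in the output space. The model, the posterior, and the concrete uncertainty measures are illustrative placeholders, not the project's prescribed method:

import torch
from torch import nn

# Toy two-stage model: encoder -> intermediate representation -> head -> logits.
encoder = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 32))
head = nn.Linear(32, 5)

# Hypothetical diagonal posterior variances, e.g. from a fitted Laplace approximation.
posterior_var = [torch.full_like(p, 1e-3) for p in encoder.parameters()]

@torch.no_grad()
def disentangled_uncertainties(x, n_samples=32):
    feats, probs = [], []
    for _ in range(n_samples):
        # Sample encoder weights from the (assumed) weight-space posterior ...
        original = [p.clone() for p in encoder.parameters()]
        for p, var in zip(encoder.parameters(), posterior_var):
            p.add_(var.sqrt() * torch.randn_like(p))
        z = encoder(x)  # ... and push the uncertainty to the intermediate representation.
        feats.append(z)
        probs.append(head(z).softmax(-1))
        for p, orig in zip(encoder.parameters(), original):
            p.copy_(orig)
    feats, probs = torch.stack(feats), torch.stack(probs)
    epistemic = feats.var(dim=0).mean(dim=-1)                # spread in the intermediate space
    aleatoric = -(probs * probs.log()).sum(-1).mean(dim=0)   # expected entropy in the output space
    return epistemic, aleatoric

epi, ale = disentangled_uncertainties(torch.randn(8, 20))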

Prerequisites:

  • experience with PyTorch

Accelerating Hessian-free Optimization for Physics-informed Neural Networks (M.Sc. Thesis Project)

Supervisors: Marius Zeinhofer (ETH Zürich), Felix Dangel (Vector Institute), Lukas Tatzel

The loss landscape of physics-informed neural networks (PINNs) is notoriously hard to navigate for first-order optimizers like SGD and Adam. Second-order methods can significantly outperform them on small- and medium-sized problems. In particular, the Hessian-free optimizer is a strong baseline that requires almost no tuning. However, as the problem size grows, the Hessian-free optimizer only completes a few hundred steps within a given budget, diminishing its benefits. The goal of this project is to accelerate the Hessian-free optimizer in three ways: (i) numerical tricks to speed up matrix-vector products and enable efficient preconditioning, (ii) revisiting the recommendations of the seminal work, and (iii) correcting a recently discovered bias in mini-batch quadratics.
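One building block that such numerical tricks target is the curvature-vector product inside the Hessian-free optimizer's conjugate-gradient loop. Below is a minimal PyTorch sketch of a Hessian-vector product via double backpropagation (Pearlmutter's trick); the quadratic toy loss is only a placeholder for a PINN objective:

import torch

def hessian_vector_product(loss_fn, params, vec):
    """Hessian-vector product via double backprop (Pearlmutter's trick)."""
    loss = loss_fn(params)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    dot = sum((g * v).sum() for g, v in zip(grads, vec))
    return torch.autograd.grad(dot, params)

# Toy example: quadratic "loss" in a single parameter tensor.
theta = torch.randn(3, requires_grad=True)
A = torch.diag(torch.tensor([2.0, 3.0, 4.0]))
loss = lambda ps: 0.5 * ps[0] @ A @ ps[0]
v = torch.tensor([1.0, 1.0, 1.0])
(hv,) = hessian_vector_product(loss, [theta], [v])
print(hv)  # equals A @ v for this quadratic

In practice, the Hessian-free optimizer applies such products (typically with the Gauss-Newton matrix rather than the full Hessian) many times per step, which is exactly where faster matrix-vector products and good preconditioning pay off.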

Prerequisites: 

  • Experience with PyTorch and numerical optimization
  • Interest in PDEs and automatic differentiation

A JAX Library for Practical Hessian Approximations (B.Sc. Project, starting 1 January 2025 at the earliest)

Supervisor: Joanna Sliwa

The Hessian of a model stores the second-order partial derivatives of the loss function with respect to the model's weights. In its exact form, it is computationally infeasible to compute and store for modern networks. This project aims to create a library focused on efficiently approximating Hessian matrices in JAX. The library will implement key approximation techniques, such as the Generalized Gauss-Newton matrix, Kronecker-Factored Approximate Curvature (KFAC), and the diagonal of the Fisher information matrix. The primary goals include achieving computational and memory efficiency while maintaining a user-friendly design. By providing practical and accessible implementations within the JAX framework, this project seeks to offer a valuable tool for researchers and practitioners in various applications, including continual learning and model training.
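As a sketch of the kind of matrix-free primitive such a library could expose, the following computes a Generalized Gauss-Newton vector product in JAX by composing forward- and reverse-mode differentiation. The model_fn/loss_fn interface and the linear toy model are illustrative assumptions, not the library's actual API:

import jax
import jax.numpy as jnp

def ggn_vector_product(model_fn, loss_fn, params, x, y, vec):
    """Generalized Gauss-Newton vector product J^T H_out J v, without forming any matrix."""
    f = lambda p: model_fn(p, x)
    # J v via a forward-mode Jacobian-vector product through the model.
    out, jv = jax.jvp(f, (params,), (vec,))
    # H_out (J v) via the Hessian of the loss with respect to the model output.
    h_jv = jax.jvp(jax.grad(lambda o: loss_fn(o, y)), (out,), (jv,))[1]
    # J^T (H_out J v) via a reverse-mode vector-Jacobian product.
    _, vjp_fn = jax.vjp(f, params)
    return vjp_fn(h_jv)[0]

# Toy example: linear model with squared-error loss (placeholders for illustration).
model_fn = lambda p, x: x @ p
loss_fn = lambda out, y: 0.5 * jnp.sum((out - y) ** 2)
x = jnp.ones((4, 3))
y = jnp.zeros(4)
params = jnp.array([1.0, 2.0, 3.0])
v = jnp.array([1.0, 0.0, 0.0])
print(ggn_vector_product(model_fn, loss_fn, params, x, y, v))  # equals x^T x v here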

Prerequisites:

  • experience in JAX
  • knowledge of deep learning
  • knowledge of linear algebra