Bachelor of Science (B.Sc.) & Master of Science (M.Sc.):

Available Thesis Topics

Both Bachelor and Master theses are extensive pieces of work that require full-time commitment over several months. They are thus also deeply personal decisions by each student. This page lists a few topics we are currently seeking to address in the research group. If you find one of them appealing, please contact the person listed for that project. If you have an idea of your own, one that you are passionate about and that fits into the remit of the Chair for the Methods of Machine Learning, please feel invited to pitch it. To do so, first contact Philipp Hennig to make an appointment for a first meeting.

Contemporary Optimizer Benchmark (B.Sc. Thesis)

Supervisor: Frank Schneider 

DeepOBS is a recently developed benchmarking suite for TensorFlow and PyTorch that allows the performance of an optimizer to be evaluated rapidly on real-world deep learning problems. In this thesis, DeepOBS should be used to benchmark all currently relevant deep learning optimizers. This list includes, but is not limited to, SGD [1], SGD with momentum [2], SGD with Nesterov acceleration [3], Adagrad [4], Adadelta [5], RMSProp [6], Adam [7], AMSGrad [8] and AdamW [9]. Where necessary, these optimization methods have to be implemented in TensorFlow or PyTorch.
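
The sketch below illustrates roughly what a single benchmark run could look like through the DeepOBS runner interface. It follows the documented PyTorch runner; exact argument names may differ between DeepOBS versions, so treat it as an illustration rather than a fixed recipe:

```python
# Hypothetical benchmark run: SGD with momentum on one DeepOBS test problem.
# The StandardRunner interface follows the DeepOBS (PyTorch) documentation;
# details may vary between releases.
from torch.optim import SGD
from deepobs import pytorch as pt

optimizer_class = SGD
# Hyperparameters the runner should expose in its run() call.
hyperparams = {
    "lr": {"type": float},
    "momentum": {"type": float, "default": 0.9},
    "nesterov": {"type": bool, "default": False},
}

runner = pt.runners.StandardRunner(optimizer_class, hyperparams)
# Train on a test problem and write the results in the DeepOBS output format.
runner.run(testproblem="cifar10_3c3d", hyperparams={"lr": 0.01}, num_epochs=100)
```

Note that current versions of torch.optim already provide most of the listed methods (SGD with momentum and Nesterov acceleration, Adagrad, Adadelta, RMSprop, Adam with its amsgrad option, AdamW), so custom implementations would mainly be needed for variants not shipped with the frameworks.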

Additionally, all optimizers can be analysed in terms of their trajectories, their ability to escape saddle points, etc. Tools such as *mode connectivity* [10] can be used to investigate whether the optimizers behave in structurally different ways.
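
A much simpler relative of mode connectivity already gives a useful first diagnostic: evaluate the loss along the straight line between the solutions found by two optimizers. The plain-PyTorch sketch below shows the idea (function and argument names are placeholders; this is only the linear baseline that the curve-finding procedure of [10] improves upon):

```python
# Loss along the straight line between two trained parameter vectors.
# params_a / params_b are lists of tensors matching model.parameters().
import copy
import torch


def linear_interpolation_losses(model, params_a, params_b, loss_fn, data_loader, steps=21):
    losses = []
    for alpha in torch.linspace(0.0, 1.0, steps):
        probe = copy.deepcopy(model)  # probe network with interpolated weights
        with torch.no_grad():
            for p, pa, pb in zip(probe.parameters(), params_a, params_b):
                p.copy_((1.0 - alpha) * pa + alpha * pb)
            total, n = 0.0, 0
            for inputs, targets in data_loader:
                total += loss_fn(probe(inputs), targets).item() * targets.shape[0]
                n += targets.shape[0]
        losses.append(total / n)
    # A pronounced barrier along the path suggests the two solutions sit in
    # structurally different regions of the loss surface.
    return losses
```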

- [1] H. Robbins and S. Monro, "A Stochastic Approximation Method"
- [2] B. Polyak, "Some methods of speeding up the convergence of iteration methods"
- [3] Y. Nesterov, "A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2)"
- [4] J. Duchi, E. Hazan, & Y. Singer, "Adaptive Subgradient Methods for Online Learning and Stochastic Optimization"
- [5] M. Zeiler, "ADADELTA: An Adaptive Learning Rate Method"
- [6] T. Tieleman and G. Hinton, "RMSProp: Divide the gradient by a running average of its recent magnitude."
- [7] D. Kingma and J. Ba, "Adam: A Method for Stochastic Optimization"
- [8] S. Reddi, S. Kale and S. Kumar, "On the Convergence of Adam and Beyond"
- [9] I. Loshchilov and F. Hutter, "Decoupled Weight Decay Regularization"
- [10] T. Garipov et al., "Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs"


Extending DeepOBS with GANs, NMT, etc. (B.Sc./M.Sc. Thesis)

Supervisor: Frank Schneider 

DeepOBS is a recently developed benchmarking suite for TensorFlow and PyTorch that allows the performance of an optimizer to be evaluated rapidly on real-world deep learning problems. Currently, DeepOBS relies mostly on image data sets, and especially on image classification tasks. In this thesis, the benchmark suite should be extended to include a greater variety of real-world deep learning problems. Possible extensions of DeepOBS include Generative Adversarial Networks (GANs) [1, 2], neural machine translation with attention [3], speech recognition tasks [4], sentiment classifiers or reinforcement learning problems.

The selection and setup of the test problems should take into account the trade-off between providing state-of-the-art problems and allowing evaluations within a reasonable time and hardware budget. Ideally, all test problems should be implemented in both TensorFlow and PyTorch.
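
To illustrate why GANs do not slot directly into a classification-style benchmark, the core of a GAN update is sketched below in plain PyTorch (independent of the DeepOBS test-problem interface; G, D and the batch shapes are placeholders, and D is assumed to output one logit per sample). The point is that there are two coupled objectives, so such a test problem must define explicitly which quantity the benchmark reports as "the" loss:

```python
# Bare-bones GAN update in the spirit of [1]; all names are placeholders.
import torch
import torch.nn.functional as F


def gan_step(G, D, opt_G, opt_D, real_batch, latent_dim=100):
    batch_size = real_batch.shape[0]
    ones = torch.ones(batch_size, 1)
    zeros = torch.zeros(batch_size, 1)

    # Discriminator step: real samples labelled 1, generated samples labelled 0.
    fake_batch = G(torch.randn(batch_size, latent_dim)).detach()
    d_loss = (F.binary_cross_entropy_with_logits(D(real_batch), ones)
              + F.binary_cross_entropy_with_logits(D(fake_batch), zeros))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # Generator step: try to make the discriminator label fakes as real.
    g_loss = F.binary_cross_entropy_with_logits(D(G(torch.randn(batch_size, latent_dim))), ones)
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()

    return d_loss.item(), g_loss.item()
```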

- [1] I. Goodfellow et al., “Generative Adversarial Networks”
- [2] A. Radford, L. Metz and S. Chintala, "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks"
- [3] https://www.tensorflow.org/alpha/tutorials/text/nmt_with_attention
- [4] https://www.tensorflow.org/tutorials/sequences/audio_recognition

Making DeepOBS a Development Tool (B.Sc./M.Sc. Thesis)

Supervisor: Frank Schneider

DeepOBS is a recently developed benchmarking suite for TensorFlow and PyTorch that allows the performance of an optimizer to be evaluated rapidly on real-world deep learning problems. This thesis should extend the use-case of DeepOBS from a pure benchmarking tool to a development tool for stochastic optimization methods. A literature review should determine trackable statistics (e.g. gradient magnitudes, angles between subsequent gradients, height above the valley floor, etc.) and methods (e.g. mode connectivity, loss surface visualizations, etc.) that reveal the inner workings of a novel optimization method. These tools should then be implemented in DeepOBS, in both TensorFlow and PyTorch.
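
To give a flavour of the statistics meant above, the sketch below records the gradient norm and the angle between successive stochastic gradients during training. It is plain PyTorch with placeholder names, not an existing DeepOBS interface:

```python
# Track gradient magnitude and the angle between successive gradients.
import math
import torch


def flat_grad(model):
    return torch.cat([p.grad.detach().reshape(-1)
                      for p in model.parameters() if p.grad is not None])


def train_with_stats(model, loss_fn, optimizer, data_loader):
    prev_grad, stats = None, []
    for inputs, targets in data_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()

        grad = flat_grad(model)
        angle = None
        if prev_grad is not None:
            # Angle (degrees) between the current and the previous gradient.
            cos = torch.dot(grad, prev_grad) / (grad.norm() * prev_grad.norm() + 1e-12)
            angle = math.degrees(math.acos(max(-1.0, min(1.0, cos.item()))))
        stats.append({"loss": loss.item(),
                      "grad_norm": grad.norm().item(),
                      "angle_to_prev_grad": angle})

        prev_grad = grad
        optimizer.step()
    return stats
```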


Laplace Approximations for Exponential Families (M.Sc. Thesis)

Supervisor: Philipp Hennig & N.N.

Laplace approximations are a basic (and historical) approximate inference method, and among the few currently available tools for assigning uncertainty to deep learning models. They are usually employed to approximate a non-Gaussian distribution with a Gaussian one. The goal of this Master project is to generalize this notion (also beyond Laplace's original intent) to a broader family of probability distributions. The project is primarily of a theoretical nature, but also has a connection to applications. Done well, it can lead to a publication in a high-impact venue. Prior experience with approximate inference (in particular, participation in the Lecture Course on Probabilistic Machine Learning) is a prerequisite.
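
For orientation, the classical construction that the project sets out to generalize is a second-order Taylor expansion of the log-density around its mode (a standard textbook identity, stated here only as a reference point):

```latex
% Classical Laplace approximation: expand log p around the mode \hat{\theta},
% where the gradient vanishes, and keep terms up to second order.
\log p(\theta) \approx \log p(\hat{\theta})
    - \tfrac{1}{2} (\theta - \hat{\theta})^\top H (\theta - \hat{\theta}),
\qquad
H = -\nabla^2_\theta \log p(\theta) \big|_{\theta = \hat{\theta}},
\quad\text{so that}\quad
p(\theta) \approx \mathcal{N}\bigl(\theta;\, \hat{\theta},\, H^{-1}\bigr).
```

The thesis asks what the analogous local fit looks like when the Gaussian is replaced by a more general exponential-family distribution.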

Flexible and scalable variational Bayesian neural networks (M.Sc. Thesis)

Supervisor: Agustinus Kristiadi

Variational inference is one of the best-known methods for inferring the posterior distribution over the weights of a neural network. For computational and mathematical convenience, strong assumptions on the variational posterior are usually employed, e.g. a diagonal Gaussian [1]. As a result, the variational posterior fails to approximate the true posterior well, which in turn gives rise to undesirable behavior [2]. The goal of this project is to construct a more flexible, but still scalable, variational posterior that approximates the true posterior better. This topic is a mix of theory and application. Moreover, it is open-ended, and the student will be guided to develop their own ideas.
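
As a concrete reference point for the diagonal-Gaussian baseline of [1], a mean-field variational linear layer with the reparameterization trick can be sketched as follows (a simplified illustration with a standard-normal prior and the bias omitted, not the full training setup of [1]):

```python
# Mean-field (diagonal) Gaussian variational linear layer, in the spirit of [1].
import torch
import torch.nn as nn
import torch.nn.functional as F


class MeanFieldLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.w_mu = nn.Parameter(0.1 * torch.randn(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))

    def forward(self, x):
        w_sigma = F.softplus(self.w_rho)                      # positive std deviations
        w = self.w_mu + w_sigma * torch.randn_like(w_sigma)   # reparameterization trick
        return x @ w.t()

    def kl(self):
        # KL( N(mu, sigma^2) || N(0, 1) ), summed over all weights.
        w_sigma = F.softplus(self.w_rho)
        return (0.5 * (w_sigma ** 2 + self.w_mu ** 2 - 1.0) - torch.log(w_sigma)).sum()
```

Training then minimizes the expected negative log-likelihood plus the sum of the kl() terms over all layers (the negative ELBO); the thesis would replace this diagonal Gaussian family with a richer, yet still scalable, one.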

- [1] C. Blundell, J. Cornebise, K. Kavukcuoglu and D. Wierstra, "Weight Uncertainty in Neural Network", Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:1613-1622, 2015. http://proceedings.mlr.press/v37/blundell15.html
- [2] A. Foong et al., "'In-Between' Uncertainty in Bayesian Neural Networks", arXiv preprint arXiv:1906.11537, 2019. https://arxiv.org/abs/1906.11537

Scalable Laplace approximation for neural networks (M.Sc. Thesis)

Supervisor: Agustinus Kristiadi

Laplace approximations work by fitting a Gaussian at one of the modes of the true posterior. Due to the space complexity of this Gaussian's covariance matrix, which scales as O(n^2) where n is the number of network parameters, an exact Laplace approximation is intractable for large networks. In this project, we are interested in making the Laplace approximation more scalable, while also making it more fine-grained (in terms of its covariance matrix) than recently proposed methods [1]. This topic is a mix of theory and application; it lies at the intersection of Bayesian inference, deep learning and second-order optimization, with a particular emphasis on linear algebra. The topic is open-ended, and the student will be guided to develop their own ideas.
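
As a minimal baseline that illustrates the scaling issue, a diagonal Laplace approximation keeps only one precision value per parameter, so the covariance costs O(n) instead of O(n^2). The sketch below estimates the curvature with the diagonal empirical Fisher (squared gradients), a common but crude surrogate for the Hessian; it is plain PyTorch with placeholder names, not the Kronecker-factored method of [1]:

```python
# Diagonal Laplace approximation: one precision value per network parameter.
import torch


def diagonal_laplace_precisions(model, loss_fn, data_loader, prior_precision=1.0):
    # Start from the prior precision and accumulate squared gradients as a
    # crude diagonal curvature estimate (a careful implementation would use
    # per-example gradients or the generalized Gauss-Newton diagonal).
    precisions = [torch.full_like(p, prior_precision) for p in model.parameters()]
    for inputs, targets in data_loader:
        model.zero_grad()
        loss_fn(model(inputs), targets).backward()
        for prec, p in zip(precisions, model.parameters()):
            if p.grad is not None:
                prec += p.grad.detach() ** 2
    # The approximate posterior marginal std dev of each weight is 1 / sqrt(precision).
    return precisions
```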

- [1] H. Ritter, A. Botev and D. Barber, "A Scalable Laplace Approximation for Neural Networks", ICLR 2018. https://openreview.net/forum?id=Skdvd2xAZ