Large Language Models (LLMs) have rapidly become one of the most relevant scientific topics, both as models of language and cognition and as tools for research. This trend has accelerated further since applications such as ChatGPT became available to the public. This course will first give a basic introduction to LLMs, outlining the core components of their architecture and how they are trained. For this we will take a bird’s eye perspective, focusing on key concepts rather than mathematical details: the training data, the training objective (predicting words in context), the basics of (deep) neural network models, simple language models, the central components of the transformer architecture underlying common LLMs, and the difference between unidirectional and bidirectional models. This part of the course will end with an overview of how users can interact with LLMs beyond chatbot interfaces (for example, via API calls), and how we can access not only their outputs but also some of their internal representations and weights.
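To make the idea of programmatic access concrete, the sketch below assembles an HTTP request of the kind most hosted LLM APIs accept. The endpoint URL, model name, and API key are placeholders (assumptions), not references to a specific provider; the request is built but deliberately not sent.

```python
import json
import urllib.request

# Placeholder endpoint -- most hosted LLM APIs expose a similar
# JSON-over-HTTP chat interface, but the URL and model name here
# are illustrative assumptions, not a real service.
API_URL = "https://api.example.com/v1/chat/completions"

def build_request(prompt: str, api_key: str,
                  model: str = "example-model") -> urllib.request.Request:
    """Assemble (but do not send) a chat-style LLM API request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.0,  # low temperature: more reproducible outputs
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_request("Summarize: LLMs as research tools.", api_key="YOUR_KEY")
```

Sending the request (e.g. with `urllib.request.urlopen`) would return a JSON response containing the model's output; the same pattern is what provider-specific client libraries wrap for you.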
The second part of the course will focus on use cases of LLMs in research, in their role as “research assistant systems” that allow us to automate and “outsource” labor-intensive tasks. This will include using LLMs
• as “pseudo-participants” to pilot or simulate behavioral studies
• to automatically generate and simulate new datasets (or extrapolate and scale up existing datasets)
• to automatically annotate data and automate qualitative analyses
• to analyze open text responses, thus taking a quantitative rather than a qualitative approach to text analysis
• for systematic literature reviews
Prerequisites: The course is aimed at a general audience with no prior expertise in the topic. General coding skills are useful, but knowledge of a specific programming language is not required for participation.