Uni-Tübingen


02.03.2023

Learning the Language of the Past: Historical Linguistics, Natural Language Processing and Machine Learning

Colloquium by Dr. Christin Beck


Time: Thursday, 2nd March 2023 at 1pm (sharp)

Location: Rümelinstraße 23, Room 602 or via Zoom

Speaker: Dr. Christin Beck

Title: Learning the Language of the Past: Historical Linguistics, Natural Language Processing and Machine Learning

Abstract: 
With the recent advances and success of contextualized language models (LMs) in NLP, there has been a surge of interest in applying such models to the study of historical language change. Yet while these models have been applied successfully to research on lexical semantic change, i.e., changes in word meaning over time, it remains unclear how they can be leveraged for historical linguistic research more broadly, e.g., for investigations into syntactic change, i.e., changes involving more functional language structures. In this talk, I aim to shed light on the usability of LMs for historical linguistic research. First, I will present my own work showing how a contextualized language model can be used to investigate lexical semantic change at the word level. Second, I will discuss some of the limitations of these language models in terms of what they really learn, i.e., which kinds of linguistic information are captured by the contextualized embeddings that LMs generate. This discussion draws on my own (collaborative) research, which investigates whether LMs can adequately model the functional nature of function words by examining more closely how function words are contextualized, i.e., captured in the embeddings, and how (or whether) functionality is learned during training. Lastly, with these limitations in mind, I will present my research proposal for my upcoming fellowship at the DFG Center 'Words, Bones, Genes, Tools', where I intend to investigate lexical semantic change across language stages by developing a new methodology that combines LMs with phylogenetic methods for cognate identification.

Everyone is welcome to join via Zoom; the link will be sent out the day before the talk.
