Automatic Loanword Identification using Tree Reconciliation

Colloquium by Marisa Köllner

Title: Automatic Loanword Identification using Tree Reconciliation

Time: Tuesday, 15th December 2020, 1pm sharp. The Zoom link will be announced on Monday, 14th December.

Speaker: Marisa Köllner

The availability of machine-readable data has accelerated the adaptation and development of phylogenetic methods in historical linguistics. While some methods addressing the evolution of languages have been successfully integrated, scarcely any attention has been paid to methods analyzing horizontal transmission. Inspired by the parallel between horizontal gene transfer and borrowing, this work aims to adapt horizontal transfer methods to computational historical linguistics in order to identify borrowing scenarios along with the transferred loanwords. Computational methods modeling horizontal transfer are based on the framework of tree reconciliation. These methods attempt to detect horizontal transfer by fitting the evolutionary history of words to the evolution of their corresponding languages, both represented as phylogenetic trees. Discordance between the two evolutionary scenarios indicates the influence of loanwords due to language contact. The tree reconciliation framework is introduced in a linguistic setting along with an appropriate algorithm, which is applied to linguistic trees to detect loanwords. While the reconstruction of language trees is well established, little research has so far been done on the reconstruction of concept trees, which represent the words’ histories. One major innovation of this work is the introduction of various methods to reconstruct reliable concept trees and determine their stability in order to achieve reasonable results in terms of loanword detection. The results of the tree reconciliation are evaluated against a newly developed gold standard and compared to three methods established for the task of language contact detection in computational historical linguistics.
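
To make the idea of discordance between a concept tree and a language tree concrete, here is a minimal Python sketch. It is not the reconciliation algorithm presented in the talk: it only lists concept-tree clades that do not occur in the language tree, which is a much simpler compatibility check. The tree encoding (nested tuples), the function names, and the toy trees are all assumptions made for illustration, and both trees are assumed to cover the same set of languages.

```python
# Illustrative sketch only: a clade-compatibility check, not a full
# tree reconciliation. Trees are nested tuples; leaves are language names.

def clades(tree):
    """Return the leaf set (as a frozenset) of every internal node of `tree`."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        under = frozenset().union(*(leaves(child) for child in node))
        found.add(under)
        return under

    leaves(tree)
    return found


def discordant_clades(language_tree, concept_tree):
    """Return concept-tree clades that are not clades of the language tree."""
    return clades(concept_tree) - clades(language_tree)


# Hypothetical toy data: in the concept tree, the English word groups
# with the French word instead of with the German one.
language_tree = (("English", "German"), ("French", "Spanish"))
concept_tree = ((("English", "French"), "Spanish"), "German")

conflicts = discordant_clades(language_tree, concept_tree)
for clade in sorted(conflicts, key=len):
    print(sorted(clade))
```

A conflicting clade such as English with French is only a hint of possible borrowing. It can equally result from an unreliable concept tree, which is why the abstract stresses the reconstruction and stability of concept trees.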