Using new statistical methods to detect mismatches between linguistic and genetic histories in East Timor, preliminary findings

Colloquium by Prof. Dr. Marian Klamer and Dr. Phillip Endicott


Time: Tuesday, 8th June 2021, 11am sharp

Speakers: Prof. Dr. Marian Klamer and Dr. Phillip Endicott

Title: Using new statistical methods to detect mismatches between linguistic and genetic histories in East Timor, preliminary findings.

Zoom: https://zoom.us/j/96024533724?pwd=RFNiMGJ4Ky9iU29WaGJ1MGZMWWsrZz09

Meeting-ID: 960 2453 3724
Password: 188918

A compelling motivation for combining the study of population genetics and linguistics is that languages have communities of speakers, and demographic information about those communities can provide crucial insights into the process of language change. To be informative about the history of interaction between different language communities, a model must take into account horizontal transmission in both linguistic and genetic data sets, via the analogous processes of contact-induced change and admixture, respectively.

Contact linguistics recognises a continuum of horizontal transfer scenarios, from simple cultural contact to bilingualism and language shift, each of which should have a demographic correlate amenable to study using population genetics.

The best place to study these processes is in an ongoing contact zone between different language families, where sharp lexical and grammatical contrasts make borrowed words and structures relatively easy to identify. One such place is East Timor, home to a bewildering diversity of languages from two distinct families, where the relative isolation of these small-scale, oral societies is expected to have preserved genetic signals that can reveal mismatches with linguistic histories.

Here, we explore the application of directed acyclic graphs (DAGs) to both genetic and linguistic data from four Timor-Alor-Pantar-speaking and five Malayo-Polynesian-speaking communities of East Timor. These statistical models can overcome the limitations of methods based on bifurcating trees, which often misrepresent population and language histories. DAGs are computationally feasible for a large number of populations, enabling the detection of significant discrepancies between genetic and linguistic histories that can then be investigated using expert modelling.