Seminar für Sprachwissenschaft

Language Tools


NorthEuraLex is a large-scale lexicostatistical database that is being compiled within the Evolaemp Project. It is unique among databases in providing lexical data from more than twenty language families in a unified IPA encoding, which is generated automatically from the orthographies or standard transcriptions and will continue to be improved. The database can be found here.


This website accompanies the paper "Using support vector machines and state-of-the-art algorithms for phonetic alignment to identify cognates in multi-lingual wordlists" by Jäger, List and Sofroniev. The repository contains both the data and the source code used in the paper's experiment. You can find them here.

PMI Distances

Here you can find the results of the various PMI-distance-based studies carried out in the project.