Quantitative Comparison of Natural Languages and Other Sequences

Colloquium by Dr. Christian Bentz

Time: Tuesday, 2nd March 2021, 1pm sharp

Zoom Link
Meeting ID: 981 8606 8141
Passcode: 984465

Dr. Christian Bentz is a former fellow and external member of the Center.


Geometric signs (dots, lines, crosses, etc.) are abundant in Eurasian prehistory. They are present throughout the Upper Paleolithic, the Mesolithic, and the Neolithic, before the earliest writing systems, such as Sumerian cuneiform, emerged around 5,000 years ago. A crucial question is how early geometric signs differ from ancient and modern-day writing, and whether the evolution between these stages proceeded in a linear fashion or rather in sudden bursts. To start addressing these questions statistically, I use data from a collection of Paleolithic mobile objects displaying geometric signs (SignBase), a collection of texts written in ancient scripts, and a collection of currently more than 20K texts from close to 100 typologically diverse modern-day languages (100 Language Corpus). In a preliminary analysis, I illustrate how geometric signs, ancient cuneiform, and modern-day characters can be teased apart in terms of their information-theoretic properties.


We welcome you all to join us via Zoom. We will send around the specific link the day before the talk.