Quantitative Comparison of Natural Languages and Other Sequences

Colloquium by Dr. Christian Bentz

Time: Tuesday, 2nd March 2021, 1pm sharp

Geometric signs (dots, lines, crosses, etc.) are abundant in Eurasian prehistory. They are present throughout the Upper Paleolithic, Mesolithic, and Neolithic, before the earliest writing systems, such as Sumerian cuneiform, emerged around 5,000 years ago. A crucial question is how early geometric signs differ from ancient and modern-day writing, and whether the evolution between these stages proceeded in a linear fashion or rather in sudden bursts. To start addressing these questions in a statistical manner, I use data from a collection of Paleolithic mobile objects displaying geometric signs (SignBase), a collection of texts written in ancient scripts, and a collection of currently more than 20K texts from close to 100 typologically diverse modern-day languages (100 Language Corpus). In a preliminary analysis, I illustrate how geometric signs, ancient cuneiform, and modern-day characters can be teased apart in terms of information-theoretic properties.


