Computational Linguistics Tutorial

Topic: Information-Theoretic Analyses of Natural Languages

  Tuesday, 22.02.2022, 10:00 - 17:00

 The tutorial is open to all registered participants of the annual conference. A separate registration is not required.

The language of the workshop is English.

Christian Bentz (University of Tübingen) & Ximena Gutierrez-Vasques (University of Zürich)

Languages transmit information. They are used to send messages across meters, kilometers, and around the globe. To better understand their information-carrying potential, we can harness information theory. In fact, one of its first applications, back in the early 1950s, was a study estimating the amount of uncertainty in English text. Since then, information-theoretic measures have been applied in a multitude of quantitative, computational, and psycholinguistic studies of natural languages. This workshop will, firstly, give a brief introduction to the conceptual underpinnings of information-theoretic measures such as entropy, conditional entropy, and mutual information. Secondly, some problems, pitfalls, and possible solutions for their estimation will be discussed. Thirdly, we will give some hands-on exercises for using these measures in research on natural languages. The workshop will provide all relevant data and code online. It will not require students to have a strong programming background.
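To give a flavour of the measures named above, here is a minimal Python sketch of naive "plug-in" (relative-frequency) estimates of entropy, conditional entropy, and mutual information. It is an illustration only and not the workshop's own material: the function names and the toy sentence are assumptions made for this example. Plug-in estimates are known to be biased for small samples, which is one of the estimation pitfalls mentioned above.

```python
import math
from collections import Counter


def entropy(items):
    """Plug-in entropy estimate in bits: H(X) = -sum p(x) * log2 p(x)."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(pairs):
    """H(Y | X) = H(X, Y) - H(X), estimated from a list of (x, y) pairs."""
    pairs = list(pairs)
    return entropy(pairs) - entropy([x for x, _ in pairs])


def mutual_information(pairs):
    """I(X; Y) = H(X) + H(Y) - H(X, Y), estimated from a list of (x, y) pairs."""
    pairs = list(pairs)
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    return entropy(xs) + entropy(ys) - entropy(pairs)


if __name__ == "__main__":
    # Hypothetical toy "corpus"; real analyses need far more data.
    tokens = "the cat sat on the mat and the dog sat on the rug".split()
    # Pair each word with the word that follows it.
    bigrams = list(zip(tokens, tokens[1:]))
    print("unigram entropy (bits):", entropy(tokens))
    print("entropy of next word given current word:", conditional_entropy(bigrams))
    print("mutual information of adjacent words:", mutual_information(bigrams))
```

The sketch treats word types as the unit of analysis; the same functions work on characters or any other hashable symbols.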

Christian Bentz (chris@christianbentz.de)
Ximena Gutierrez-Vasques (ximena.gutierrezvasques@uzh.ch)