Tutorial of the Computational Linguistics Section

Topic: Information-Theoretic Analyses of Natural Languages

Tuesday, 22 February 2022, 10:00 - 17:00

The tutorial is open to all registered participants of the annual conference. Separate registration is not required.

The workshop language is English.

Christian Bentz (Universität Tübingen) & Ximena Gutierrez-Vasques (Universität Zürich)

Languages transmit information. They are used to send messages across meters, kilometers, and around the globe. To better understand their information-carrying potential, we can harness information theory. In fact, one of its first applications, back in the early 1950s, was a study estimating the amount of uncertainty in English text. Since then, information-theoretic measures have been applied in a multitude of quantitative, computational, and psycholinguistic studies of natural languages. This workshop will, firstly, give a brief introduction to the conceptual underpinnings of information-theoretic measures such as entropy, conditional entropy, and mutual information. Secondly, some problems, pitfalls, and possible solutions for their estimation will be discussed. Thirdly, we will give some hands-on exercises for using these measures in research on natural languages. The workshop will provide all relevant data and code online. It will not require students to have a strong programming background.
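
To give a flavor of the measures named above, here is a minimal Python sketch. It is not the workshop's own material: the function names and the toy sentence are hypothetical, and it uses the simple plug-in (maximum-likelihood) estimator, which replaces true probabilities with relative frequencies. It estimates the entropy of word tokens, the conditional entropy of a word given the preceding word, and the mutual information between adjacent words.

```python
import math
from collections import Counter


def plugin_entropy(items):
    """Plug-in (maximum-likelihood) entropy estimate in bits."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def bigram_measures(tokens):
    """Return (H(Y|X), I(X;Y)) for adjacent token pairs (X, Y).

    Uses the identities H(Y|X) = H(X,Y) - H(X) and
    I(X;Y) = H(X) + H(Y) - H(X,Y), all estimated from the same pairs.
    """
    pairs = list(zip(tokens, tokens[1:]))
    h_x = plugin_entropy([x for x, _ in pairs])
    h_y = plugin_entropy([y for _, y in pairs])
    h_xy = plugin_entropy(pairs)
    return h_xy - h_x, h_x + h_y - h_xy


if __name__ == "__main__":
    # Hypothetical toy text; real studies use far larger corpora.
    text = "the cat sat on the mat and the dog sat on the rug"
    tokens = text.split()
    cond_entropy, mutual_info = bigram_measures(tokens)
    print("H(word)        =", round(plugin_entropy(tokens), 3), "bits")
    print("H(next | prev) =", round(cond_entropy, 3), "bits")
    print("I(prev; next)  =", round(mutual_info, 3), "bits")
```

One known pitfall of this estimator is that it tends to underestimate entropy when the sample is small relative to the vocabulary, which is why the choice of estimator matters for natural language data.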

Christian Bentz (chris@christianbentz.de)
Ximena Gutierrez-Vasques (ximena.gutierrezvasques@uzh.ch)