Seminar für Sprachwissenschaft

The following is a list of projects completed by Quantitative Linguistics. Current projects can be found under Projects.

BMBF-EML

Wide Incremental learning with Discrimination nEtworks

Cluster of Excellence - Machine Learning for Science (Cluster speakers: Philipp Berens and Ulrike von Luxburg)

Website

Details on BMBF-EML

Innovation Fund Project 1 in research area A - Beyond Prediction, Towards Understanding

In research area A, we will design algorithms that reveal complex structure and causal relationships from data in order to integrate machine learning into the scientific discovery process. Project 1 investigates "Enhancing Machine Learning of Lexical Semantics with Image Mining".

Members

  • Hendrik Lensch (Principal investigator)
  • R. Harald Baayen (Principal investigator)
  • Zohreh Ghaderi (PhD student)
  • Hassan Shahmohammadi (PhD student)

DFG-ART

Wide Incremental learning with Discrimination nEtworks

Principal Investigator: R. Harald Baayen

Homepage

Details on DFG-ART

Project description

Sub-project ART: The articulation of morphologically complex words

ART was a subproject of the research unit "Spoken Morphology: Phonetics and phonology of complex words", funded by the Deutsche Forschungsgemeinschaft (DFG). It investigated the articulation of morphologically complex words using electromagnetic articulography.

Members

  • R. Harald Baayen (Principal Investigator)
  • Benjamin V. Tucker (Mercator Fellow)
  • Fabian Tomaschek (Postdoc)
  • Motoki Saito (Research assistant)

ERC-WIDE

Wide Incremental learning with Discrimination nEtworks

Principal Investigator: R. Harald Baayen

Webpage

Details on ERC-WIDE

Project description

This five-year project aimed to deepen our understanding of how we produce and understand words in everyday speech.

Words in day-to-day conversational speech may differ substantially from how they appear in writing: German "würden" is often pronounced as "wün", Dutch "natuurlijk" ('naturally') can reduce to "tk", and Mandarin 要不然 (jao pu zan, 'otherwise') to "ui". Current theories assume that the sound waves that reach our ears are reduced to sequences of abstract sound units, much like the sequences of letters that make up written words. However, how to align highly reduced forms such as "wün", "tk" and "ui" with their full unreduced variants, the supposed gatekeepers to meaning, is an unsolved computational problem.

The WIDE project made the radical proposal to eliminate letter-like sound units altogether, and instead to zoom in on the rich details of the speech signal itself. Given tens of thousands of smart features representing the richness of the speech signal, artificial neural networks were expected to learn, by trial and error, to identify which meanings are conveyed. Previous research funded by the Alexander von Humboldt Foundation had provided a first proof of concept. In the WIDE project, this approach was developed further and extended from German to other languages, including Mandarin Chinese (a tone language) and Estonian (a morphologically complex language with 28 to 40 different forms for a given noun). The WIDE project also targeted a computational model without sound units for the articulation of words in speech production.

The project's name, "WIDE", highlights a second aspect in which this project made a radical departure from current trends in linguistics and natural language processing. Instead of making use of deep learning networks, the project focused on the potential of 'wide' two-layer networks with tens of thousands of input and output units.

Members

  • R. Harald Baayen (Professor, Principal Investigator)

  • Yu-Ying Chuang (Postdoc)

  • Maja Linke (PhD)

  • Jessie Nixon (Postdoc)

  • Maria Heitmeier (PhD)

  • Tino Sering (PhD)

  • Elnaz Shafaei Bajestan (PhD)

  • Kun Sun (Postdoc)


BMBF-AvH-NDL

Alexander von Humboldt Professorship: Naive discrimination learning

Principal Investigator: R. Harald Baayen

Homepage

Details on BMBF-AvH-NDL

Project description

BMBF-AvH-NDL is a project funded by the Alexander von Humboldt Foundation and the Federal Ministry of Education and Research (BMBF) that investigates language processing from the perspective of Naive Discrimination Learning.

Naive discrimination learning is a computational modeling approach that exploits the rich co-occurrence information in language form to discriminate between different meanings. This project focuses on lexical processing in silent reading, reading aloud, listening, and speech production, but also branches out to language acquisition, bilingualism, syntax, and language change.
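The idea of using co-occurrences between forms and meanings can be illustrated with a minimal sketch. It assumes the Rescorla-Wagner learning rule with letter bigrams as form cues and meanings as outcomes. The toy words, parameter values, and function names are hypothetical and chosen for illustration only; they are not taken from the project.

```python
from collections import defaultdict


def letter_bigrams(word):
    """Return the set of letter bigrams of a word; '#' marks the word edges."""
    padded = f"#{word}#"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def train(events, rate=0.01, lam=1.0, epochs=100):
    """Apply Rescorla-Wagner updates over (cues, outcomes) learning events.

    rate, lam and epochs are illustrative values, not project settings.
    """
    weights = defaultdict(float)  # (cue, outcome) -> association strength
    all_outcomes = sorted({o for _, outcomes in events for o in outcomes})
    for _ in range(epochs):
        for cues, outcomes in events:
            for outcome in all_outcomes:
                # Summed support for this outcome from the cues present now.
                support = sum(weights[(c, outcome)] for c in cues)
                target = lam if outcome in outcomes else 0.0
                delta = rate * (target - support)
                # Only the cues present in this event are adjusted.
                for c in cues:
                    weights[(c, outcome)] += delta
    return weights


def activation(weights, cues, outcome):
    """Summed association strength from a set of cues to one outcome."""
    return sum(weights.get((c, outcome), 0.0) for c in cues)


# Toy learning events: word forms paired with the meanings they express.
events = [
    (letter_bigrams("hand"), {"HAND"}),
    (letter_bigrams("hands"), {"HAND", "PLURAL"}),
    (letter_bigrams("land"), {"LAND"}),
    (letter_bigrams("lands"), {"LAND", "PLURAL"}),
]
weights = train(events)
```

After training on these toy events, the cues of "hand" support the meaning HAND more strongly than LAND, because the bigrams "#h" and "ha" co-occur only with HAND, while shared bigrams such as "an" and "nd" do not help to tell the two meanings apart.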

Presentations: NDL-Humboldt Presentations

Publications: NDL-Humboldt Publications

Team members

  • R. Harald Baayen (project leader)

  • Michael Ramscar (senior researcher)

  • Denis Arnold (post-doctoral researcher)

  • Peter Hendrix (PhD student)

  • June Hendrix-Sun (PhD student)

  • Koji Miwa (post-doctoral researcher)

  • Fabian Tomaschek (post-doctoral researcher)

  • Karlina Denistia (PhD student)

  • Konstantin Sering (PhD student)

  • Ben Tucker (external collaborator)

  • Tineke Baayen-Oudshoorn (administration)

  • Ryan Callihan (research assistant)

  • Rachel Dockweiler (research assistant)

  • Gina Hermann (student assistant)

  • Theresa Schmitt (student assistant)

  • Marc Weitz (student assistant)


SSHRC RESEARCH GRANT

Lexical processing in discourse: a corpus-based approach

Main applicant: R. Harald Baayen

SSHRC Standard Research Grant

Details on SSHRC RESEARCH GRANT

Project description

The SSHRC Research Grant ran from 2008 to 2011.

Team members

  • R. Harald Baayen
  • Peter Hendrix

NWO KLEIN PROGRAMMA

Morphophonological adaption in spoken Dutch: An exemplar-based approach

Main applicant: R. Harald Baayen

NWO Homepage

Details on NWO KLEIN PROGRAMMA

Project description

The central hypothesis of this research project is that morphophonological adjustment processes manifest themselves especially and more pervasively in higher-frequency words.

Team members

  • R. Harald Baayen
  • Mirjam Ernestus
  • Mark Pluymakers

NWO PIONEER

The balance of storage and computation in the lexicon

Principal investigator: R. Harald Baayen

NWO Research Reports

Details on NWO PIONEER

Project description

The NWO PIONEER project investigated the balance of storage and computation in the lexicon. The award funded six scientists.

Team members

  • R. Harald Baayen
  • Mirjam Ernestus
  • Nivja de Jong
  • Rachèl Kemps
  • Andrea Krott
  • Fermín Moscoso del Prado Martín