Project aims
Central to this research project is the observation that there are regularities and systematicities in the spoken language that escape our awareness, that are shielded from us by linguistic traditions and cultural conventions embodied in writing systems, but that nevertheless are detected by our brains, albeit subliminally, and used to optimize lexical processing.
Philosophers such as Immanuel Kant, Edmund Husserl, and Maurice Merleau-Ponty, and more recently the cognitive scientist Hoffman, have called attention to how our perception of reality is shaped by and filtered through our minds and bodies. According to Hoffman, mathematically, fitness beats truth: our perceptions of the world are tuned to our survival. Writing systems are culturally evolved technologies that also hide from our eyes and ears the truth about what we really hear and say. Obviously, in order to work, writing systems must abstract away from the full richness of the spoken word. However, many features of our speech that are masked by writing systems are nevertheless exploited by our cognitive system when we listen or speak. For native speakers, mismatches between speech and writing are relatively unproblematic. For second language acquisition, however, mismatches can render learning unnecessarily difficult.
The research programme addresses this issue for Mandarin Chinese. Two kinds of mismatches will be investigated, using state-of-the-art methods in computational modeling, distributional semantics, and statistical analysis: first, subliminal mismatches between what written words are supposed to sound like and how they are actually spoken; second, subliminal mismatches between how the writing system is supposed to work and how it actually functions and, as a semiotic system of its own, influences thought. These investigations will inform the applied goal of this project: developing ways to enhance vocabulary learning of Mandarin Chinese as a second language.
Tseng, Y.-H., Chen, P.-E., Lian, D.-C., and Hsieh, S.-K. (2024). The Semantic Relations in LLMs: An Information-theoretic Compression Approach. In Dong, T., Hinrichs, E., Han, Z., Liu, K., Song, Y., Cao, Y., Hempelmann, C. F., Sifa, R. (Eds.), Proceedings of the Workshop: Bridging Neurons and Symbols for Natural Language Processing and Knowledge Graphs Reasoning (NeusymBridge) @ LREC-COLING-2024, Italy, 8-21. Torino, Italy: ELRA and ICCL.
Chuang, Y.-Y., Baayen, R. H., and Bell, M. (2023). Do words sing their own tunes? Word-specific pitch realizations in Mandarin and English. In Skarnitzl, R., and Volín, J. (Eds.), Proceedings of the 20th International Congress of Phonetic Sciences, Czech Republic, 1603-1607. Prague, Czech Republic: Guarant International.
Tseng, Y.-H., and Baayen, R. H., Investigating forgetting curves with learning rule-derived interferences, The 31st Annual ACT-R Workshop, Tilburg, the Netherlands, July 23, 2024.
Baayen, R. H., and Heitmeier, M., Linear Discriminative Learning, Workshop at the International Word Processing Conference (WoProc 2024), Belgrade, Serbia, July 6, 2024.
Chuang, Y.-Y., Bell, M. J., Tseng, Y.-H., and Baayen, R. H., Word-specific tonal realizations in Mandarin. International Word Processing Conference (WoProc 2024), Belgrade, Serbia, July 5, 2024.
Tseng, Y.-H., Chen, P.-E., Lian, D.-C., and Hsieh, S.-K., The Semantic Relations in LLMs: An Information-theoretic Compression Approach, Workshop: Bridging Neurons and Symbols for Natural Language Processing and Knowledge Graphs Reasoning (NeusymBridge), Torino, Italy, May 21, 2024.
Baayen, R. H., Modeling Mandarin tones on two-word compounds, Colloquium English Language and Linguistics, Düsseldorf, Germany, January 19, 2024.
Baayen, R. H., Frequency-Informed Learning, Colloquium Out of Our Minds, Birmingham, United Kingdom, October 11, 2023.
Yang, Y., Measure words in Mandarin, 2nd Joint Workshop on Chinese Lexical Semantic Change, Tübingen, Germany, September 6, 2023.
Tseng, Y.-H., Lian, D.-C., and Watty, D., Modeling diachronic semantic change of (Pre-Modern) Mandarin Chinese with contextualized embeddings & Word2Vec, 2nd Joint Workshop on Chinese Lexical Semantic Change, Tübingen, Germany, September 6, 2023.
Yang, Y., and Baayen, R. H., Exploring semantic organization across mental lexicons: Perception verbs in Mandarin and English, International Cognitive Linguistics Conference (ICLC16), Düsseldorf, Germany, August 8, 2023 (poster presentation).
Chuang, Y.-Y., Baayen, R. H., and Bell, M., Do words sing their own tunes? Word-specific pitch realizations in Mandarin and English, 20th International Congress of Phonetic Sciences (ICPhS), Prague, Czech Republic, August 7, 2023 (poster presentation).
R. Harald Baayen (Professor, Principal Investigator)
Xiaoyun Jin (Doctoral researcher)
Yuxin Lu (Doctoral researcher)
Maziyah Mohamed (Postdoctoral researcher)
Motoki Saito (Postdoctoral researcher)
Yu Hsiang Tseng (Postdoctoral researcher)
Yi Yang (Postdoctoral researcher)
Former members