The Oberseminar offers talks by invited speakers and by colleagues from the department. Speakers present current research results from all areas relevant to general and theoretical linguistics. Everyone is warmly invited. Students are especially encouraged to attend in order to experience research talks by specialists first-hand.
Further information on the time and place of this semester's Oberseminar sessions can be found under the tabs for the respective dates.
Jennifer Hu (MIT)
Time: 16.15 - 17.45
Access link: https://zoom.us/my/michael.franke.tuebingen
Title: Investigating ad-hoc scalar implicatures
Abstract: Scalar implicature (SI) is traditionally considered a hallmark of generalized conversational implicature (Levinson, 2000). This view has been challenged by recent studies demonstrating variance in SI rates within and across scales (e.g., Doran et al. 2009; Degen 2015; van Tiel et al. 2016), suggesting that SI depends on context and the structure of the scale itself. However, these studies focus on scales ordered by entailment (e.g., <some, all>, <warm, hot>), while little attention has been given to SIs arising from ad-hoc ordering relationships (Hirschberg 1985), which are common in naturalistic communication. In this ongoing work, we test the hypothesis that SI rates — across both entailment-based and ad-hoc scales — depend on the listener’s confidence in the underlying scale, separately from the relationship between the weak and strong scalar items. First, we use an artificial language model to test whether uncertainty about the scale predicts SI for a set of entailment-based scales (van Tiel et al., 2016). Next, we conduct a pilot behavioral experiment using novel materials featuring ad-hoc scales in naturalistic sentences. Our preliminary results suggest that scalar uncertainty may predict SI for this broad range of scales. This suggests a potential unified mechanism underlying entailment-based and ad-hoc SI, further challenging the distinction between generalized and particularized conversational implicatures.
Colin Twomey (University of Pennsylvania)
Time: 16.15 - 17.45
Access link: https://zoom.us/j/95896132814
Meeting ID: 958 9613 2814
Title: What we talk about when we talk about colors
Abstract: Names for colors vary widely across languages, yet color categories are remarkably consistent. Shared mechanisms of color perception help explain these shared patterns and have been the focus of past work. But the mappings from colors to words are far from identical across languages, which may reflect differences in communicative needs – how often speakers must refer to objects of different colors. A link between compression and categorization in natural language gives us a new way to look for the key factors shaping color vocabularies in 130 languages around the world. We introduce a new approach to inference using this link, and reveal a hidden diversity in communicative needs across linguistic communities. We show that the extensive variation in needs can be explained in part by differences in geographic location and local biogeography, while commonalities in the color regions of greatest need are correlated with the colors of salient objects, including ripe fruits in primate diets. Our work reconciles opposing theories of color naming, while opening new directions to study cross-cultural variation in communicative needs and its impact on the cultural evolution of color categories.
Elisa Kreiss (Stanford University)
Time: 19.15 - 20.45
Access link: https://zoom.us/my/michael.franke.tuebingen
Title: Towards Context-Sensitive Image Descriptions with a Purpose
Abstract: Images are pervasive across the Web, and they are generally tightly connected to natural language that is intended to serve a variety of purposes -- e.g., the articles and tweets they appear in, or the language read out by screen readers to make those images non-visually accessible. We use the term image-based Natural Language Generation to summarize the efforts to artificially generate this class of image-based texts. While these different kinds of texts share a connection to images, summarizing them as a single task, as is suggested in much of the "image captioning" literature, is highly misleading. In this work, we argue that for developing models that can usefully generate this variety of texts in practice, image-based Natural Language Generation needs to be guided by two central considerations: (1) the communicative purpose the specific type of text needs to serve, and (2) the context the image is embedded in. In this talk, I'll describe a new publicly available Wikipedia-based dataset called Concadia and first model results showing the benefit of considering both components, with a special focus on image descriptions for accessibility. Finally, I'll discuss the opportunities of integrating Questions Under Discussion (QUDs) in the description generation process and the challenges that come with such an approach.
Time: 16.15 - 17.45
Access link: https://zoom.us/j/95896132814
Meeting ID: 958 9613 2814
Title: Blue and green, or grue? The color lexicon at the intersection of the environment, genetics and culture
Abstract: The great diversity of the world’s languages is reflected in the way people label the color spectrum. As an example, while some languages have separate words for “green” and “blue”, others lump the two together (“grue”). Earlier proposals suggest that the diversity of the cultural and physical environments in which populations live, as well as individual differences in color perception, may cause this variation. However, the exact role of these factors and their interaction has remained unclear. During this talk, we present the results of analyzing 142 populations, showing that the color lexicon is affected by several aspects of the environment: cultural complexity, distance to water, and - of particular importance - the amount of ultraviolet light. Ultraviolet light negatively affects the lens of the eye cumulatively across the lifespan, resulting in less blue light reaching the retina. Across generations, this ontogenetic effect may generate negative selection against the genetically-determined abnormal red-green vision. Together, this shows that languages can only be understood in the context of their cultural, biological, and physical environments. It also puts forward the idea that differences in physiology might generate biases that, amplified by the repeated use and transmission of language, may contribute to shaping languages.
Sina Zarrieß (Universität Bielefeld)
Time: 16.15 - 17.45
Access link: https://zoom.us/j/99368857478
Meeting ID: 993 6885 7478
Title: Linguistic Variability and Communicative Effectiveness: Challenges in Neural Language Generation
Abstract: When speakers converse, their utterances are remarkably diverse and, at the same time, remarkably precise and effective. For instance, in a widely used corpus of human descriptions of images showing common objects, Devlin et al. (2015) find that 99% of the image captions are unique. Other work that has collected such descriptions in interactive, game-based settings found that speakers often only need a few words to unambiguously refer to objects or generally make themselves understood in a rich, communicative context.
Handling the variability of utterances that speakers are able to use with such a high degree of effectiveness is still a central challenge in conversational systems that generate natural language. In this talk, I will discuss recent attempts in NLG (natural language generation) at making systems more diverse or more effective, showing that these objectives are often at odds. I will discuss some of our recent work on decoding methods for neural NLG that aims at overcoming trade-offs between quality, diversity and communicative effectiveness.
Time: 16.15 - 17.45
Access link: https://zoom.us/j/95896132814
Meeting ID: 958 9613 2814
Title: The emergence of words from iterated sound imitations
Abstract: There has been an ongoing debate concerning the modality in which language emerged (cf. Wacewicz and Żywiczyński 2020). The gestural scenarios of language origins are given more prominence, with experimental studies typically focusing on the role of visual iconicity in the evolution of gestural communication and sign languages (Armstrong & Wilcox 2007; Fay et al. 2014; Goldin-Meadow 2016; Zhang 2016; Silva et al. 2020; Fröhlich et al. 2019).
The current research contributes to the language origins debate by addressing the empirically underexplored vocal-auditory modality and the power of imitation in the development of first words. It follows the novel line of research on spoken languages and sound symbolism (cf. Ćwiek et al., 2019; Edmiston et al., 2018; Perlman et al., 2015; Perniss & Vigliocco, 2014; Imai & Kita, 2014).
Following Edmiston et al. (2018), we use an iterated learning paradigm to scrutinize the emergence of words from uninstructed, repeated imitation of one another’s vocal productions of environmental sounds, e.g., glass breaking or clock ticking (cf. the children’s “Telephone game”). Our project aims at understanding the mechanism behind the evolution of the sound lexicon, and thus at answering the question: Is the emergence of words through iterated vocal imitations of environmental sounds universal, that is, language-independent?
The currently available data show that vocal imitations may stabilize in form and function (as measured by an increase in acoustic and orthographic similarity). Yet we cannot generalize to language groups other than the English speakers investigated so far. The populations tested in our study are native speakers of Polish, German, and Chinese, which will allow us to verify whether the emergence of words through iterated vocal imitations of environmental sounds is a universal phenomenon.
Armstrong, D. F., & Wilcox, S. E. (2007). The gestural origin of language. Oxford University Press.
Brown, S. (1999). The ‘Musilanguage’ model of language evolution. In The Origins of Music, eds. S. Brown, B. Merker, and N. L. Wallin, 271–300. Cambridge, MA: MIT Press.
Blasi, D. E., S. Wichmann, H. Hammarström, P. F. Stadler & M. H. Christiansen. (2016). Sound–meaning association biases evidenced across thousands of languages. Proceedings of the National Academy of Sciences, 113(39), 10818–10823.
Ćwiek, A., Draxler, Ch., Fuchs, S., Kawahara, S., Winter, B., & Perlman, M. (2019). Comprehension of Non-Linguistic Vocalizations across Culture. Presentation at ICLC 2019, Japan.
Edmiston, P., Perlman, M., & Lupyan, G. (2018). Repeated imitation makes human vocalizations more word like. Proceedings of the Royal Society B, 285, 20172709.
Fay et al. (2014). Creating a communication system from scratch: gesture beats vocalization hands down. Frontiers in Psychology, 5(354). doi: 10.3389/fpsyg.2014.00354.
Fröhlich, M., Sievers, C., Townsend, S. W., Gruber, T., & van Schaik, C. P. (2019). Multimodal communication and language origins: integrating gestures and vocalizations. Biological Reviews, 94(5), 1809-1829. https://doi.org/10.1111/brv.12535
Goldin-Meadow, S. (2016). What the hands can tell us about language emergence. Psychonomic Bulletin & Review, 24(1), 1–6.
Imai, M., & Kita, S. (2014). The sound symbolism bootstrapping hypothesis for language acquisition and language evolution. Phil. Trans. R. Soc. B 369, 20130298.
Perniss, P., & Vigliocco, G. (2014). The bridge of iconicity: from a world of experience to the experience of language. Phil. Trans. R. Soc. B 369, 20130300.
Perlman, M., Dale, R., & Lupyan, G. (2015). Iconicity can ground the creation of vocal symbols. Royal Society Open Science, 2(8), 150152–16.
Wacewicz, S., & Żywiczyński, P. (2020). Pantomimic Conceptions of Language Origins. In A. Lock, C. Sinha, N. Gontier (eds.), Oxford Handbook of Human Symbolic Evolution, 2nd edition, Oxford University Press, in press.
Zhang, E. Q. (2016). Why did vocalizations come to predominate gestures in the evolution of language? Ducog 2016, Dubrovnik, Croatia.
Steven Moran (Université de Neuchâtel)
Time: 10.15 - 11.45
Access link: https://zoom.us/j/93972792983
Meeting ID: 939 7279 2983
Title: Human speech sounds: three evolutionary timelines
Abstract: In order to understand the evolutionary origins of speech, and to perhaps shed light on the origins of language, we need to identify the biological and cultural factors that shape it. In my presentation, I will discuss how we can investigate the biological and cultural pressures on speech sound production with respect to three evolutionary timelines. First, what parallels exist between human and nonhuman vocalizations? Second, how can we investigate which speech sounds extinct hominins, such as Neanderthals, could have produced? Third, which speech sounds today are determined by our biology and which are due to cultural pressures? Since very little is known about the origin of language, including whether it evolved suddenly or gradually during the evolution of our species, any insights from studying the evolution of speech may help pinpoint when and how it evolved.
Time: 10.15 - 11.45
Access link: https://zoom.us/j/97040596620
Meeting ID: 970 4059 6620
Title: Grammatical restructuring and population dynamics in northwestern Bantu gender systems
Abstract: This talk investigates animacy-based semantic restructuring in Bantu gender systems and the non-linguistic factors that may favor it. Our focus is on variability in patterns of gender agreement, and we present the results of two studies based on a sample of 179 northwestern Bantu (henceforth NWB) languages. Our dataset consists of information on the kind of marking (syntactic, animacy-based, both, or none) that is found on a set of fifteen agreement targets (such as adnominal modifiers, various kinds of pronouns, and predicates). We first present a bottom-up typology of gender systems in NWB. We find that highly eroded gender systems can be explained as a result of the evolutionary dynamics by which animacy-based semantic agreement rises and spreads in more conservative languages. Our data also confirm that animacy-based semantic agreement is more likely to first appear on predicates and independent personal pronouns before spreading to different types of adnominal modifiers, which is in line with the predictions of the Agreement Hierarchy (Corbett 1979, 2006). The second part of the talk focuses on testing the hypothesis that animacy-based agreement and heavily restructured gender systems in NWB can be explained as a function of population history – at least in part. We find that animacy-based gender systems are more common (1) in NWB languages of wider communication (animacy-based agreement is favored in virtue of its higher learnability) and (2) in NWB languages that are closely situated to non-Bantu languages (animacy-based restructuring is due to intense language contact and shift). To our knowledge, this is the first quantitative cross-linguistic study that confirms the oft-repeated claim that situations of intense language contact favor the restructuring and erosion of grammatical gender.
Time: 10.15 - 11.45
Access link: https://zoom.us/j/97560603153
Meeting ID: 975 6060 3153
Title: Contact-tracing in cultural evolution: a Bayesian mixture model to detect geographic areas of language contact
Abstract: When speakers of different languages interact, they are likely to influence each other: contact leaves traces in the linguistic record, which in turn can reveal geographic areas of past human interaction and migration. However, other factors may contribute to similarities between languages. Inheritance from a shared ancestral language and universal preference for a linguistic property may both overshadow contact signals. How can we find geographic contact areas in language data, while accounting for the confounding effects of inheritance and universal preference? To approach this issue we developed sBayes, an algorithm for Bayesian clustering in the presence of confounding effects. The algorithm learns which similarities are better explained by confounders, and which are due to contact effects. Contact areas are free to take any shape or size, but an explicit geographic prior ensures their spatial coherence. In a first study, we tested sBayes on simulated data and applied it in two case studies to reveal language contact in South America and the Balkans.
Ryan Cotterell (ETH Zürich)
Time: 16.00 - 17.30
Access link: https://zoom.us/j/97572048759
Meeting ID: 975 7204 8759
Title: Meaning to Form: Measuring Systematicity as Information
Abstract: My talk will focus on a longstanding debate in semiotics that centers on the relationship between linguistic signs and their corresponding semantics: is there an arbitrary relationship between a word form and its meaning, or does some systematic phenomenon pervade? For instance, does the character bigram ‘gl’ have any systematic relationship to the meaning of words like ‘glisten’, ‘gleam’ and ‘glow’? We offer a holistic quantification of the systematicity of the sign using mutual information and recurrent neural networks. We employ these in a data-driven and massively multilingual approach to the question, examining 106 languages. We find a statistically significant reduction in entropy when modeling a word form conditioned on its semantic representation. Encouragingly, we also recover well-attested English examples of systematic affixes. We conclude with the meta-point: our approximate effect size (measured in bits) is quite small. Despite some amount of systematicity between form and meaning, an arbitrary relationship and its resulting benefits dominate human language.
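As an illustration of the quantity named in the abstract above, the following minimal Python sketch estimates mutual information between forms and meanings as a reduction in entropy, I(F; M) = H(F) - H(F | M). It is not the speaker's method: the talk uses recurrent neural networks, whereas this sketch uses plain relative-frequency counts. The function names and the toy data (word-initial bigrams paired with a coarse meaning label) are hypothetical.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of the distribution given by a Counter of counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def mutual_information(pairs):
    """Estimate I(F; M) = H(F) - H(F | M) in bits from (form, meaning) pairs."""
    pairs = list(pairs)
    n = len(pairs)
    h_form = entropy(Counter(form for form, _ in pairs))
    # Group form counts by meaning so that H(F | M = m) can be computed per meaning.
    by_meaning = {}
    for form, meaning in pairs:
        by_meaning.setdefault(meaning, Counter())[form] += 1
    # H(F | M) is the average of H(F | M = m), weighted by p(m).
    h_form_given_meaning = sum(
        (sum(counts.values()) / n) * entropy(counts) for counts in by_meaning.values()
    )
    return h_form - h_form_given_meaning


# Hypothetical toy data (an assumption for illustration, not from the study).
toy_pairs = [
    ("gl", "light"), ("gl", "light"), ("gl", "light"), ("st", "light"),
    ("st", "other"), ("st", "other"), ("tr", "other"), ("gl", "other"),
]
mi_bits = mutual_information(toy_pairs)
```

A value of zero bits would correspond to a fully arbitrary form-meaning relationship in the toy data; larger values indicate that knowing the meaning reduces uncertainty about the form.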
Gerrit Bauch (Universität Bielefeld)
Time: 10.15 - 11.45
Access link: https://zoom.us/j/92671301312
Meeting ID: 926 7130 1312
Title: Effects of Noise on the Grammar of Voronoi Languages
Abstract: We study a signaling game of common interest in which stochastic noise perturbs the communication between sender and receiver. Despite this inhibiting factor, efficient languages still exist. In any equilibrium, the sender uses a tessellation consisting of convex cells while the receiver uses Bayesian estimators as interpretations. Low levels of error that respect the distance between words lead to concise interpretations in the decoding process. Comparative statics for increasing noise describe the robustness of different grammatical structures. Evolutionary modeling approaches converge to equilibria, but not every equilibrium is stable.
Shravan Vasishth (Universität Potsdam)
A recording of this Oberseminar session is available.
Time: 11.00 - 12.30
Access link: https://zoom.us/j/96320858663
Meeting ID: 963 2085 8663
Title: Individual differences in cue-weighting in sentence comprehension: An evaluation using Approximate Bayesian Computation
Abstract: Cue-based retrieval theories of sentence processing assume that syntactic dependencies are resolved through a content-addressable search process. An important recent claim is that in certain dependency types, the retrieval cues are weighted such that one cue dominates. This cue-weighting proposal aims to explain the observed average behavior. We show that there is systematic individual-level variation in cue weighting. Using the Lewis and Vasishth cue-based retrieval model, we estimated individual-level parameters for processing speed and cue weighting using data from 13 published reading studies; hierarchical Approximate Bayesian Computation (ABC) with Gibbs sampling was used to estimate the parameters. The modeling reveals a nuanced picture about cue-weighting: we find support for the idea that some participants weight cues, but not all do; and only fast readers tend to have the predicted cue weighting, suggesting that reading proficiency might be associated with cue weighting. A broader achievement of the work is to demonstrate how individual differences can be investigated in computational models of sentence processing using hierarchical ABC.
Johann-Mattis List (MPI für Menschheitsgeschichte Jena)
Time: 11.00 - 12.30
Access link: https://zoom.us/j/94224749668
Meeting ID: 942 2474 9668
Title: Cross-linguistic language technologies. Data, Methods, and Analysis
Abstract: In the past two decades, the number of digitally available linguistic datasets has been constantly increasing, and many new methods have been proposed in order to analyze large cross-linguistic datasets. Unfortunately, however, only a small fraction of the digitally available data is also integrated in the sense that the data for one particular problem or the data from one particular source can be directly compared with data compiled for different problems or from different sources. In order to deal with this problem, new cross-linguistic language technologies are needed. These technologies help to integrate linguistic datasets across many different languages by providing (a) detailed standards for cross-linguistic data formats, (b) new methods with which data can be converted into the required data formats, and (c) new analyses that target cross-linguistically integrated data. In the talk, I will briefly introduce what has been done so far with respect to data integration and cross-linguistic language technologies and then discuss opportunities and challenges for future work.
Michael Franke (Universität Osnabrück)
A recording of this Oberseminar session is available.
Time: 11.00 - 12.30
Access link: https://zoom.us/j/92255632073
Meeting ID: 922 5563 2073
Title: Theory-driven probabilistic modeling of language use: a case study on quantifiers, logic and typicality [joint work with Bob van Tiel (Nijmegen) and Uli Sauerland (Berlin)]
Abstract: Theoretical linguistics postulates abstract structures that successfully explain key aspects of language. However, the precise relation between abstract theoretical ideas and empirical data from language use is not always apparent. Here, we propose to empirically test abstract semantic theories through the lens of probabilistic pragmatic modelling. We consider the historically important case of quantity words (e.g., 'some', 'all'). Data from a large-scale production study seem to suggest that quantity words are understood via prototypes. But based on statistical and empirical model comparison, we show that a probabilistic pragmatic model that embeds a strict truth-conditional notion of meaning explains the data just as well as a model that encodes prototypes into the meaning of quantity words.
Natalia Levshina (MPI for Psycholinguistics Nijmegen)
A recording of this Oberseminar session is available.
Time: 11.00 - 12.30
Access link: https://zoom.us/j/92536167298
Meeting ID: 925 3616 7298
Title: From binary trade-offs to a causal network of Subject and Object cues: A typological corpus-based study based on Universal Dependencies
Abstract: In recent years, the notion of efficient trade-offs between different motivations and types of cues has become very popular in functional and typological approaches to language. The present paper is a case study of different cues that are used to express the core grammatical relationships of Subject and Object, which convey “who did what to whom”. These cues are case marking, rigid word order of Subject and Object, tight semantics and verb-medial order. They are inferred from online language corpora in thirty languages, annotated with the Universal Dependencies. Correlational and causal analyses show that the cues are not used very efficiently. Some of the cues are positively correlated, and the relationships between different cues are not bidirectional. This study suggests that the opportunities given to language users for exhibiting rational behaviour and using their language efficiently are very limited, and shaped by factors very different from rationality and communicative efficiency.
Bill Thompson (Princeton University)
Time: 16.00 - 17.30
Access link: https://zoom.us/j/92895214463
Meeting ID: 928 9521 4463
Title: How Translatable are Common Words? Some Answers from Distributional Semantics
Abstract: We analysed the semantic networks of 1,016 concepts in 41 languages using distributional models of lexical semantics. We examined which semantic domains (e.g. animals, emotions, body parts and numbers) show the most and least alignment between different languages, and whether alignment is greater for more concrete terms (it is not). We examined how alignment varies for different parts of speech, and how it relates to human judgements of similarity and to lexical factors such as frequency and neighborhood density. The alignment between one language and another is statistically related to the cultural and historical relatedness of the languages, offering insights into the processes of cultural evolution that influence natural language semantics.
T. Mark Ellison (Universität zu Köln)
Time: 11.00 - 12.30
Access link: https://zoom.us/j/97783628419
Meeting ID: 977 8362 8419
Title: A Bayesian Model of Prominence in Language
Abstract: This talk comes in two parts. In the first half, I will talk about modelling language in ways that focus on the probabilistic relationship between language and speakers. On the one hand, language arises when two speakers come together and interact. The properties of those speakers determine the distributions of language behaviours that happen between them. On the other hand, we can think of individual speakers as links between their conversations: for example, bilinguals link the speech communities to which they belong, becoming a conduit for influences between their languages. I will present some Bayesian studies of language aimed at showing the history of languages, or of their speaker groups.
The second half of the talk focusses on one feature of language in interaction that arises from the nature of the participants, namely, the use of code prominence. I introduce a simple model of communication in which the listener performs Bayesian inference to determine the intended meaning of a communicated form. The speaker internally simulates listener interpretation - in line with the forward modelling account of Pickering & Garrod (2013) - and varies their choice of linguistic form to ensure that the listener’s confidence in their interpretation remains above a certain threshold. If we assume that speakers ensure this communicative fidelity while nevertheless minimising their articulatory effort, we can explain a number of phenomena associated with prominence. The talk concludes with discussion of how such prominence affects language over time.
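The communication model described in the abstract above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the speaker's implementation: the listener applies Bayes' rule over a fixed, hand-specified likelihood table, and the speaker picks the lowest-effort form for which the simulated listener's confidence in the intended meaning reaches a threshold. All names, probabilities, effort values and the threshold are hypothetical.

```python
def listener_posterior(form, prior, likelihood):
    """Return P(meaning | form) by Bayes' rule.

    prior[meaning] is P(meaning); likelihood[meaning][form] is P(form | meaning).
    """
    joint = {m: prior[m] * likelihood[m].get(form, 0.0) for m in prior}
    normalizer = sum(joint.values())
    if normalizer == 0:
        return {m: 0.0 for m in prior}
    return {m: p / normalizer for m, p in joint.items()}


def choose_form(meaning, prior, likelihood, effort, threshold):
    """Pick the cheapest form whose simulated listener confidence meets the threshold."""
    adequate = [
        form for form in effort
        if listener_posterior(form, prior, likelihood)[meaning] >= threshold
    ]
    # None signals that no available form is informative enough.
    return min(adequate, key=lambda form: effort[form]) if adequate else None


# Hypothetical toy setting: an expected and an unexpected meaning,
# a cheap reduced form and a costlier prominent form.
prior = {"expected": 0.9, "unexpected": 0.1}
likelihood = {
    "expected": {"reduced": 0.8, "prominent": 0.2},
    "unexpected": {"reduced": 0.2, "prominent": 0.8},
}
effort = {"reduced": 1.0, "prominent": 2.0}

form_for_expected = choose_form("expected", prior, likelihood, effort, threshold=0.3)
form_for_unexpected = choose_form("unexpected", prior, likelihood, effort, threshold=0.3)
```

In this toy setting the expected meaning can be conveyed with the cheap reduced form, while the unexpected meaning requires the prominent form to keep the listener's confidence above the threshold.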
Oliver Bott (Universität Bielefeld)
Time: 14.15 - 15.45
Title: Remention Biases Affect the Choice of Anaphoric Form (joint work with Torgrim Solstad)
Abstract: The choice of anaphoric forms (e.g. Mary vs. she) depends on a number of factors such as grammatical function, order of mention or topicality (Arnold 2008). For semantic/pragmatic re-mention biases however, which also impact referent salience, recent research has found conflicting results. Thus, Implicit Causality verbs of Stimulus-Experiencer (e.g., fascinate) and Experiencer-Stimulus type (e.g., admire) display strong preferences for subsequent explanations about the Stimulus argument (Ferstl et al. 2011). Yet, Fukumura & van Gompel (2010) and Rohde & Kehler (2014) found no effect of Implicit Causality on anaphoric form. Kehler & Rohde (2013) a.o. thus claim that the production of anaphoric form is dissociated from the likelihood of mention. On the other hand, Rosa & Arnold (2017) found that Transfer of Possession verbs, with a re-mention bias for goal arguments (e.g., the indirect object of give or the subject of get) do influence the choice of anaphoric form.
Rohde & Kehler (2014), improving on Fukumura & van Gompel's (2010) paradigm, point out that the choice of anaphoric form is especially important in contexts with two same-gender referents as a strategy to avoid ambiguity (cf. Levinson's (1987) m-implicatures). However, unlike Fukumura & van Gompel 2010 and Rosa & Arnold 2017, Rohde & Kehler did not use a forced-reference paradigm, in which participants are prompted to provide continuations for one particular referent. This is of particular importance when comparing bias-congruent and bias-incongruent continuations, though. The present study presents a direct comparison of the effects of the two re-mention biases across a total of four experiments, applying a combination of the forced-reference paradigm with both same-gender and different-gender conditions. We ran the experiments in German, unlike the studies just reviewed, which all tested for influences of pragmatic biases on anaphor production in English. Whereas English has a rather restricted inventory of anaphoric forms available for coreference (Gundel et al. 1993), German has both personal (e.g., er/sie 'he/she') and demonstrative pronoun (e.g., dieser/diese 'this one' and jener/jene 'that one') paradigms, and we hypothesized that this richness in forms could facilitate the elicitation of form-based effects.
In this talk I will present data from four story continuation experiments employing a forced reference paradigm looking more closely into the underlying factors that drive form selection of referential expressions. In sum, the results of our first three experiments show that – modulated by well-known effects of audience design – referential biases affect reference form production across verb classes, including Implicit Causality verbs. This finding adds to the evidence in Rosa & Arnold 2017 and speaks against proposals assuming a general dissociation between likelihood of mention and choice of anaphoric form (Kehler & Rohde 2013). However, given the conflicting evidence from a fourth experiment with respect to the anaphoric category targeted by remention biases (demonstrative pronouns vs. proper names), our experiments call for a pragmatic explanation taking into account not only audience design but also the broader pragmatic context created by the experiment.
Ralf Vogel (Universität Bielefeld)
Title: Syntactic inventions
Abstract: It is a consequence of the widely assumed frequentist approach to grammaticalisation that some inventions that speakers make occur too rarely to induce language change. Still, these inventions are synchronic phenomena that at the same time are based on the language system in its current state, and do not follow from it (completely). I will show with case studies from several phenomena of German morphosyntax (verbal complexes, reflexivisation and case conflicts in German) that this slightly paradoxical idea may help to improve our understanding of those phenomena, and perhaps linguistic competence more broadly.
Of special interest for linguistic theory is the fact that such rare inventions are not arbitrary and do not pose any comprehension problems on the side of the addressee. They are based on the linguistic system in its current historical state, and created by speakers applying general mechanisms in a manner that is transparent to the addressee. These general mechanisms are part of linguistic competence and underlie the speakers’ ability to create new items, words or constructions.
I use the term ad hoc constructions for these special morphosyntactic phenomena.
What makes them particularly interesting are the following properties: 1. they have unexpected features that do not follow from the linguistic units they are based on; 2. they can be shown to occur indeed too rarely to have a chance to grammaticalise; 3. they are nevertheless preferred systematically over alternative variants and have a sufficiently high acceptability rating. Each of these points will be shown to hold for the phenomena I am discussing on the basis of experimental and corpus studies.
The grammatical analysis will combine tools from the theory of generalised conversational implicature, construction grammar and optimality theory.
Elke Teich (Universität des Saarlandes)
Title: Conventionalization in diachronic linguistic change: the case of Scientific English
Abstract: The topic of this talk is conventionalization (i.e. the longer-term linguistic effects of repeated interaction) and its benefits for communication (i.e. message transmission). Widely acknowledged as a relevant process in language change, conventionalization provides a prerequisite for innovation (de Smet 2016) and may lead to grammaticalization as well as the formation of registers (Weinreich et al. 1968, Harris 1991). I will elaborate the idea that conventionalization is a cornerstone in changing language use because it serves the maintenance of communication function by inducing significant surprisal- and entropy-reducing effects. To show this, we pursue an exploratory, corpus-based approach, focusing on scientific writing (Degaetano-Ortlieb and Teich 2019), a well-studied and fairly controlled domain, and its evolution across 250+ years from the mid-17th century onwards. The data set we use is the Royal Society Corpus. To capture lexical and syntactic aspects of changing language use, we employ computational language models; and to evaluate the observed effects, we apply various measures of information content (surprisal, entropy, relative entropy). We find, for instance, that diachronically, within the scientific domain, relative entropy on n-gram models overall decreases, pointing to converging language use over time, but this is more pronounced for the grammatical level than for the lexical level. For qualitative interpretation, we inspect the linguistic items that significantly contribute to the observed trends, looking at their (average) surprisal in syntagmatic context and the entropy of their paradigmatic context over time.
Degaetano-Ortlieb, S. and E. Teich, 2019. Towards an optimal code for communication: the case of scientific English. Corpus Linguistics and Linguistic Theory (open access), DOI: doi.org/10.1515/cllt-2018-0088
De Smet, H., 2016. How gradual change progresses: The interaction between convention and innovation. Language Variation and Change, 28:83–102.
Harris, Z., 1991. A theory of language and information: A mathematical approach. Clarendon Press, Oxford.
Weinreich, U., W. Labov and M. I. Herzog, 1968. Empirical foundations for a theory of language change. In W.P. Lehmann and Y. Malkiel (eds.), Directions for Historical Linguistics. University of Texas Press, Austin, Texas, pp. 95-195.
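As a rough illustration of the information measures named in the abstract above (surprisal, entropy, relative entropy), the following Python sketch may help. It is not the speakers' implementation: the toy token lists, the additive smoothing and all function names are assumptions made for this example, and it uses unigram models rather than the higher-order n-gram models mentioned in the abstract.

```python
import math
from collections import Counter


def unigram_probs(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram probabilities over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def surprisal(p):
    """Surprisal (in bits) of an event with probability p."""
    return -math.log2(p)


def entropy(dist):
    """Shannon entropy (in bits): the expected surprisal under dist."""
    return sum(p * surprisal(p) for p in dist.values())


def relative_entropy(p, q):
    """Relative entropy (Kullback-Leibler divergence) D(p || q) in bits.

    Both distributions must be defined over the same vocabulary and
    q must be non-zero everywhere, which the smoothing above ensures.
    """
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p)


# Hypothetical toy "periods"; real studies would use large corpora.
early = "the air is a body and the body is heavy".split()
late = "the pressure of the gas is the ratio of force to area".split()
vocab = set(early) | set(late)

p_early = unigram_probs(early, vocab)
p_late = unigram_probs(late, vocab)

print("entropy (early):", entropy(p_early))
print("D(late || early):", relative_entropy(p_late, p_early))
```

Relative entropy is asymmetric, so comparing a later period against an earlier one is not the same as the reverse comparison; which direction is appropriate depends on the research question.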
Natasha Korotkova (Universität Konstanz)
Title: Find, must, and conflicting evidence (joint work with Pranav Anand)
Abstract: Recent years have seen a lot of interest in the so-called subjective attitudes: English "find" and its counterparts in, e.g., French, German, or Norwegian. Unlike vanilla doxastics (i.e. "think"), find-verbs have been argued to only allow matters of opinion, rather than fact, in their complements. In this talk, we consider one underanalyzed class of expressions in find-complements: epistemic modals. Those modals have often been analyzed as subjective expressions, which makes them prime candidates for embedding under "find". However, epistemics are prohibited in subjective attitude complements. The "find+must" ban has been attributed to "must" not being subjective in the right way. We argue instead that the real culprit is a matter of evidence: find-verbs require their subject to have direct evidence for the complement, while "must" and its counterparts in other languages require a lack of direct evidence. Therefore, the "find+must" combination yields an evidential clash. We support our claim by novel cross-linguistic data on find-verbs and a range of indirect expressions, including bona fide evidentials, and analyze the "find+must" ban as a semantic contradiction.
Chundra Cathcart (Universität Zürich)
Title: Horizontal and vertical pressures in language change: fleshing out admixture models
Abstract: This talk presents preliminary results from a handful of studies using Bayesian admixture models to analyze cross-linguistic patterns with an eye to understanding shared history and historical contact between languages. Admixture models (such as the Structure algorithm from population genetics) have enjoyed less use in linguistics than phylogenetic methods; they have the advantage of directly modeling historical language contact, but do not explicitly model diachronic transitions between linguistic states. I focus on two extensions to this basic model which work towards bridging this gap: a neural model of sound change across Indo-Aryan dialects which accounts for both historical dialect group-level trends as well as individual language-level idiosyncrasies, and a model of typological distributions which models both spread and match factors by enhancing the admixture model with Markovian dynamics.
Johanna Nichols (University of California, Berkeley)
Title: Better characters/variables for historical linguistics
Abstract: Both wordlist-based and typology-based comparisons have become widely used in the last decade or so, and there has been much discussion of methods and interpretation. Here I discuss three problems I consider more fundamental: (1) The quality and usefulness of the characters or variables themselves; I propose several that promise much more as comparanda. (2) The coding strategies, especially the treatment of "no" answers on yes/no variables, can have major impact on distance measures and NeighborNet trees. (3) Handling synonymy in typologically-defined wordlist studies, where synonymy is common. Preliminary findings show that a combination of etymological and typological variables can yield good results for both typological and historical analysis.
Title: Error-driven Learning in Modeling Spoken Word Recognition
Abstract: Effective linguistic communication relies on the recognition of words (McQueen, 2007). Although spoken word recognition (SWR) is a vital task in speech comprehension, psycholinguists are still debating some fundamental assumptions decades after the first cognitive theories of SWR (Marslen-Wilson and Welsh, 1978) and initial computational models (McClelland and Elman, 1986; Norris, 1994). I present the current state of two projects in which we investigated the theory of error-driven learning, outlined by Rescorla and Wagner (1972) for animal and human learning, as a theory of SWR. Computational models of an excised word recognition task were built using the naive discriminative learning (NDL) and the linear discriminative learning (LDL) frameworks. First, Arnold et al. (2017) and Shafaei-Bajestan and Baayen (2018) applied NDL-based models that iteratively learn to classify German and English words, respectively, from their acoustic representations and reported model performance comparable to human performance. Second, Baayen et al. (2019) estimated the linear mappings between words’ feature vectors and the words’ semantic vectors directly and achieved superior accuracy in recognition of words compared to NDL. Assumptions in the models, issues in model implementation, initial results, and plans for future work are discussed.
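To make the idea of error-driven learning in the abstract above more concrete, here is a minimal Python sketch of the Rescorla-Wagner update rule applied to cue-outcome events. It is only an illustration under stated assumptions: the symbolic cue labels, the learning rate `eta`, the maximum associative strength `lam` and the function names are hypothetical, and the sketch does not reproduce the NDL or LDL models or the acoustic features used in the cited studies.

```python
from collections import defaultdict


def rescorla_wagner(events, eta=0.1, lam=1.0, epochs=1):
    """Learn cue-outcome association weights with the Rescorla-Wagner rule.

    events: list of (cues, outcomes) pairs, each a tuple of distinct labels.
    For every outcome, each cue present in an event is adjusted by
    eta * (target - summed weight of all present cues), where the target
    is lam if the outcome occurred in the event and 0 otherwise.
    """
    weights = defaultdict(float)  # (cue, outcome) -> association weight
    all_outcomes = sorted({o for _, outs in events for o in outs})
    for _ in range(epochs):
        for cues, outs in events:
            for o in all_outcomes:
                prediction = sum(weights[(c, o)] for c in cues)
                target = lam if o in outs else 0.0
                delta = eta * (target - prediction)  # prediction error
                for c in cues:
                    weights[(c, o)] += delta
    return weights


def activation(weights, cues, outcome):
    """Summed support that a set of cues gives to an outcome."""
    return sum(weights.get((c, outcome), 0.0) for c in cues)


# Hypothetical toy events: symbolic cues stand in for acoustic features.
events = [
    (("k", "ae", "t"), ("cat",)),
    (("k", "ae", "p"), ("cap",)),
]
w = rescorla_wagner(events, epochs=200)

print("cat given k-ae-t:", activation(w, ("k", "ae", "t"), "cat"))
print("cap given k-ae-t:", activation(w, ("k", "ae", "t"), "cap"))
```

In this toy setting the shared cues end up with small weights for both words, while the distinguishing final cue carries most of the discriminative load, which is the qualitative behaviour error-driven learning is meant to capture.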
Title: Linguistic politeness as strategic behavior: (some) costs and benefits of polite language use
Abstract: Behavioral ecology explains the behavior of animals by treating them as rational agents driven by a maximisation of their benefit-cost differentials. On the uncontroversial assumptions that humans are animals, and that speaking is a type of behavior, this overall approach should - at least in principle - generalise to language use: it should be possible to understand speakers as “rational decision-makers who make tradeoffs between costs and benefits”.
Despite the obvious difficulties involved in determining both the relevant types of behavior (language use) and the relevant costs and benefits, there has been some success in applying this reasoning to “pockets” of linguistic behavior such as indirect speech (Pinker, Nowak & Lee 2006) or politeness (Quinley 2012). Most recently, the Responsibility Exchange Theory (RET; Chaudhry and Loewenstein 2019) effectively provides a proof of concept of this general approach for a narrow class of dyadic interactions (assignment of responsibility for a positive/negative outcome): it establishes functional classes of linguistic behavior (apologising, thanking) and works out a compelling theory of the associated social costs/benefits. In my talk, I build on the conceptual foundation proposed by Chaudhry and Loewenstein (2019) and look into ways of extending their approach.
Title: Factive presuppositions? An empirical challenge
Abstract: A long-standing and widely-held assumption is that the content of the complement of factive predicates like “know” is presupposed whereas that of non-factive predicates like “think” is not. There is, however, disagreement in the literature about which properties define factive predicates and whether the contents of the complements of particular predicates exhibit the properties attributed to factive predicates. The resulting disagreement about which predicates are factive is troublesome because the distinction between factive and non-factive predicates has played a central role in the study of presuppositions. This talk, which is based on joint work with Judith Degen (Stanford University), investigates properties of the contents of the complements of clause-embedding predicates with the goal of understanding how such predicates can be classified. We argue that predicates presumed factive are more heterogeneous than previously assumed and that there is little empirical support for the assumed categorical distinction between factive and non-factive predicates. We conclude by discussing the implications for future research on and analyses of presuppositions and other projective content.
Title: Why do we have to say certain things? On the obligatorification of dependents
Abstract: A common feature of the grammaticalization of function words is that they develop the requirement for obligatory dependents. For instance, English “the” does not occur except when followed by a nominal construction. This talk offers an account of the historical development of obligatorification - how dependents develop from optional extras to required accompaniments. I will show how the process leading to obligatorification is driven by universal communicative requirements. In this sense, the development in question sets in before grammaticalization “proper”, rather than being a result of grammaticalization as has been widely (if only implicitly) assumed. In fact, specific semantic structures create the need for overt hosts at every synchronic stage of every language. Only rarely does this requirement for an overt dependent develop into a syntactic requirement as a result of grammaticalization. I illustrate this with both diachronic and synchronic examples from diverse parts of speech that stem from several languages.
Title: Pronouns, Descriptions, Bridging
Abstract: Recent work on coreference and non-coreference anaphora makes no clear distinctions between different kinds of non-coreference anaphora and also seems to imply that coreference anaphora is a special kind of non-coreference anaphora. In this talk I take a closer look at the interpretation processes for anaphoric pronouns and definite descriptions from an interpretation-theoretic perspective. I identify three strategies for the interpretation of pronouns and descriptions. The first strategy always leads to coreference, the second can produce coreference effects and the third normally does not, although it may involve coreference in certain special cases. None of these three strategies can be reduced to either of the two others.
The formal framework in which the investigation is conducted is a version of DRT in which definite noun phrases (definite descriptions and pronouns among them) are treated as triggers of ‘identification presuppositions’ – presuppositions whose resolution identifies the referents of their triggers. The framework makes it possible to describe the strategies in precise and unambiguous terms and to ask precise questions about the ways they are related. A good part of the talk will be devoted to discussing this approach to the semantics and pragmatics of definite noun phrases and to showing how it works for anaphoric pronouns and descriptions.
Time permitting, we will have a look at a potential problem for the analysis: English descriptions with head nouns that denote ‘inalienable relations’, like the mouth, the father, the weight. Given the definition of bridging I will be using, these descriptions ought to be perfect bridging descriptions. But in fact they are not. For instance, in normal contexts the sentence ‘No one mentioned the weight’ can’t be understood as meaning that no one mentioned his or her own weight. Nor can, in most contexts, ‘Susan grew up like an orphan. She never even met the father’ be used felicitously to say that Susan never met her own father. The solution of this puzzle has to do with the different processes that are available for reinterpreting relational nouns as non-relational and non-relational nouns as relational.
Title: Linguistic diversity within Chibchan
Abstract: Chibchan languages are spoken at the very heart of the Americas, on the isthmus connecting both continents and in adjacent regions (Panama, Costa Rica, Colombia). Among the language families of Central and South America, the Chibchan family is particularly diverse in typological terms (Adelaar 2007). For instance, the Rama language of Nicaragua has only three phonemic vowels, /a/, /i/, /u/ (Craig 1989: 37), whereas Bribri (Costa Rica) has fourteen (Chevrier 2017: 56). In the domain of verbal person marking, some Chibchan languages use unbound elements, whereas others use prefixes, suffixes, or both (Pache 2015). This talk aims to discuss the following questions: (1) which are the domains of particular variability/relative uniformity within Chibchan? (2) What could have been factors triggering family-internal variability?
Title: Presuppositions, scales, and adjective order
Abstract: Many accounts of language assume that communication is inherently Gricean, and thus that contextually enriched meanings depend in part on a sensitivity to speaker states. However, current models of how core semantic phenomena interact with context often ignore speaker-specific information. For example, in the case of gradable adjectives, research on this topic has mainly focused on the role of extra-linguistic context, such as the distribution of a feature across a domain, or informativeness.
Here, we ask whether, in addition to statistical distributions, listeners’ standards of comparison for adjectives are also sensitive to thresholds communicated by (a) existential presuppositions, to investigate whether listeners accommodate individual differences, and (b) different adjective orderings, to test whether the compositional operations involved in understanding AAN-sequences can help comprehenders decide whether scalar thresholds are affected by the speaker’s statements. The results are jointly informative for recent discussions of scalar vs. absolute adjectives, the question of how scalar thresholds are computed, and the compositional semantics of multi-adjective sequences.
Hizniye Isabella Boga
Title: The languages of Italy - Measuring the similarity between close and distant varieties
Abstract: One of the oldest questions in dialectology is how to define a “language” as opposed to a “dialect” (Gooskens 2018). The theoretical definition of a language as the standardised form and of dialects as sub-categorical varieties of “inferior” character has been assumed for a very long time. Only with J. K. Chambers and Peter Trudgill’s introduction of the definition “language as a collection of mutually intelligible dialects” was an equality of varieties emphasised.
The task of my thesis research revolved around measuring distances and similarities of 58 Romance varieties with a focus on the Italian varieties. The goal is to determine which varieties are closer to each other and can hence be seen as dialects of the same language or whether they are distant enough to be considered independent languages.
The methods at hand are the Levenshtein Distance Normalized Divided (LDND) and the Needleman-Wunsch algorithm Normalized Divided (NWND) with a built-in scorer system of PMI distances. With the resulting distances and similarities determined by the LDND and the NWND method, I used a model-based clustering method to allocate similar varieties into one cluster, dissimilar varieties into another cluster and varieties of mixed and unclear affiliation into a further one. Within those clusters, it is visible which varieties are close enough to be varieties of the same language and which varieties are distant enough to be independent languages.
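To give a sense of how a distance such as LDND can be computed, here is a small Python sketch. It follows the commonly described recipe: the Levenshtein distance is normalised by the length of the longer word (LDN), and the average LDN over word pairs with the same meaning is divided by the average LDN over word pairs with different meanings. This is an illustration only, not the procedure of the thesis: the four-item wordlists (simplified orthographic forms without diacritics), the function names and the omission of the PMI-based NWND scoring are all assumptions made for this example.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def ldn(a, b):
    """Levenshtein distance normalised by the longer word's length."""
    return levenshtein(a, b) / max(len(a), len(b))


def ldnd(list_a, list_b):
    """LDN 'divided': mean same-meaning LDN over mean different-meaning LDN.

    list_a, list_b: dicts mapping a concept to a non-empty word form.
    Requires at least two shared concepts.
    """
    concepts = sorted(set(list_a) & set(list_b))
    same = [ldn(list_a[c], list_b[c]) for c in concepts]
    diff = [ldn(list_a[c1], list_b[c2])
            for c1 in concepts for c2 in concepts if c1 != c2]
    return (sum(same) / len(same)) / (sum(diff) / len(diff))


# Toy wordlists for illustration only; not the data used in the thesis.
italian = {"water": "acqua", "fire": "fuoco", "hand": "mano", "night": "notte"}
spanish = {"water": "agua", "fire": "fuego", "hand": "mano", "night": "noche"}
romanian = {"water": "apa", "fire": "foc", "hand": "mana", "night": "noapte"}

print("Italian-Spanish LDND:", ldnd(italian, spanish))
print("Italian-Romanian LDND:", ldnd(italian, romanian))
```

Dividing by the different-meaning average is meant to correct for chance similarity that arises simply because two varieties have similar sound inventories; the resulting pairwise distances could then be passed to a clustering method.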
Jakub Szymanik (University of Amsterdam)
Title: Ease of learning explains semantic universals
Abstract: Despite extraordinary differences between natural languages, linguists have identified many semantic universals – shared properties of meaning – that are yet to receive a unified explanation. We analyze universals in a domain of content words (color terms) and a domain of function words (quantifiers). Using tools from machine learning, we show that meanings satisfying attested universals are easier to learn than those that are not. Thus, ease of learning can explain the presence of semantic universals in many different linguistic domains.
Ramon Ferrer i Cancho (Universitat Politècnica de Catalunya)
Title: An emerging theory of word order
Word order is a fascinating phenomenon. For decades, researchers have been collecting many word order regularities that have fed theory. Some of these regularities are the Greenbergian universals of word order, consistent branching or the low number of dependency crossings in the syntactic dependency structures of sentences. Here we will argue these regularities can be regarded as adaptations to the limited resources of the human brain with the help of an emergent theory of word order that provides a unified explanation of word variation and word order change. We will discuss the negative consequences of denying or neglecting the role of functional pressures for the construction of a parsimonious theory of language.
Torgrim Solstad (ZAS Berlin)
Title: Predictive Language Processing: The View from Implicit Causality
Prediction in language (Kamide 2008; DeLong et al. 2014), whereby we understand the incorporation of possible (and likely) future information states into processing, still hasn't attracted much attention in theoretical linguistics despite the central status of prediction in human cognition (Bubic et al. 2010; Clark 2013). Bringing together insights from experimental and theoretical research for one particular phenomenon, Implicit Causality, I want to argue that much could be gained by attempting to bridge this gap.
Implicit Causality verbs (e.g. Garvey/Caramazza 1974; Brown/Fish 1983) have been at the core of psycholinguistic research on predictive processing. Selecting for two animate arguments, such verbs display a strong preference for an explanation focusing on one argument, as shown in numerous sentence continuation experiments:
(1) JOHN annoyed Mary because... HE was rude.
(2) John admired MARY because... SHE was clever.
Although there is good evidence as to the processing profile of Implicit Causality, its predictive nature is still insufficiently understood. Some important questions include:
- What is predicted: Is it a particular word (e.g. HE/SHE in (1)/(2)), a referent, or a type of explanation (e.g. a property of John's in (1), and one of Mary's in (2))?
- What triggers the prediction: Is it encoded in lexical semantics (annoy/admire in (1)/(2)) or part of world knowledge?
Based on a formal-semantic theory of Implicit Causality (Bott/Solstad 2014), results from previous experimental research (e.g. Koornneef/van Berkum 2006; Pykkönen/Järvikivi 2010; Featherstone/Sturt 2010) and recent insights into the nature of predictive processing in general (e.g. Kuperberg/Jaeger 2016; Yan et al. 2017), I will propose a framework for predictive processing of Implicit Causality. By bringing together insights from theoretical and experimental research, we can delineate more precisely the top-down and bottom-up processes generating and validating predictions: Which linguistic levels are involved and how do they interact?
Although limited to one particular phenomenon, I contend that approaching predictions in this manner has the potential to mutually benefit both psycholinguistics and theoretical linguistics. On the one hand, a number of aspects concerning prediction processes may be better understood if they are based on more elaborate theoretical linguistic analysis, if only for constraining the possible hypothesis space, thus allowing for more precise experimental predictions and better control of experimental design and materials. On the other hand, research on prediction extends an invitation to reconsider or expand theoretical linguistic assumptions to accommodate the results obtained in experimental research, potentially offering a broader empirical base for linguistic studies, connecting phenomena previously assumed to be unrelated.
Susanne Dietrich (Tübingen)
Title: Processing of presuppositions during speech perception: a functional magnetic resonance imaging (fMRI) study
Abstract: Discourse structure enables us to generate expectations based upon linguistic material that has already been introduced. The present functional magnetic resonance imaging (fMRI) study addresses auditory perception of test-sentences in which discourse coherence was manipulated by using presuppositions (PSP) that either correspond or fail to correspond to items in preceding context-sentences. Thereby, indefinite and definite determiners referring to either (non-)uniqueness or (non-)existence of an item were used as PSP triggers. Discourse violation within the (non-)uniqueness subset yielded hemodynamic activation within the pre-supplementary motor area (pre-SMA) and bilateral inferior frontal gyrus (IFG). Considering the existence subset, these regions were activated only if subjects accommodated the discourse. These findings indicate involvement of (i) the working memory (IFG) referring the PSP to contextual information and (ii) a regulator (pre-SMA) managing the process of comprehension by signaling detected errors to the system. This enables the system to continue the process of comprehension, for example, by updating the context or tolerating slight errors.
Shirley-Ann Rueschemeyer (York)
Title: Perspective taking during language comprehension
Abstract: Humans are constantly engaged in social interactions, and many of these interactions are supported by language. In this talk I will be presenting a series of studies investigating how language and social cognitive mechanisms interact in order to facilitate communication. I will start by showing that embodied lexical-semantic representations are activated by words in a flexible manner that reflects both linguistic and pragmatic constraints. Secondly, I will show the results of studies that suggest that when pragmatic constraints affect semantic processing, this is supported by interactions between neural language and mentalizing systems. Lastly, I will suggest that language comprehension is affected by assumptions we hold about other co-listeners as well as speakers. One key mechanism supporting perspective taking between co-listeners may be simulation. Together the studies presented in this talk provide insight into how high level language and social cognitive processes work in concert during successful communicative acts.