Institut für Medienwissenschaft

The Answering Machine

Social bots, machines that interact with us as social partners, are becoming increasingly common in our everyday lives. What does this mean for us as interaction partners? And how will our interactions with these machines develop further? The project uses the theatre stage as an artistic-scientific experimental platform to investigate these questions from the perspective of applied anthropomorphism, in this case the attribution of human characteristics to the machine. On stage, the four cooperating disciplines (media studies, computational linguistics, theatre studies, and psychology) come together to test specific questions and to build a sensitivity for the challenges that the coexistence of humans and AI will bring in the future. In a four-year series of experiments and performances, actors will interact with social bots. The team's goal is to identify emotional, behavioural, and cognitive patterns in order to understand the conditions of anthropomorphisation in detail. The findings will be used in psychological training, the media-theoretical implications will be reflected upon, and simulations of a general human-machine co-evolution will be created and brought into public discourse.

Prof. Dr. Susanne Marschall

Universität Tübingen
Media Studies
Wilhelmstraße 50
72074 Tübingen
Room 018

Phone: +49 7071 29-72354
Fax: +49 7071 29-4656
susanne.marschall@uni-tuebingen.de

Jun Zhang

Eberhard Karls Universität Tübingen
Institut für Medienwissenschaft
Wilhelmstraße 50
72074 Tübingen
Room 210

Phone: +49 7071 29-72327
jun.zhang@uni-tuebingen.de

Further project participants

Universität Stuttgart: Computational Linguistics

Prof. Dr. Jonas Kuhn (PI)

Simone Beckmann Escandón (doctoral researcher)

Zürcher Hochschule der Künste: Theatre Studies

Dr. Gunter Lösel (PI)

Work Plan

Year 1: Setting up the social bot: Data collection and pipelines

MS1 will focus on the voice-based interplay of human-machine interaction and on concepts of liveness and presence for invisible forms of embodiment on stage. By creating a framework for capturing the performances both two-dimensionally and volumetrically, MS will provide high-quality material for an extensive analysis of human-machine encounters on stage. For this purpose, an elaborated framework will be established for studying the perception of “Applied Anthropomorphism” as an interplay of TTS algorithms, synthetic voices, and concepts of presence and liveness on stage. To elaborate the concept of “Applied Anthropomorphism” on a voice-based level from various perspectives, we will prepare a publication on “Media Voice, Presence and Liveness” in MS2. This book will combine findings from various disciplines on the core aspects of voice-based communication with non-human entities. MS will design and conduct qualitative interviews about the audience’s perception of the performance. In close exchange with PSY, TS, and CL, the conception of the anthology on “Media Voice, Presence and Liveness” will be initiated to deepen the interrelations between all areas. Exploring the historical background in media studies and the roles of the synthetic voice and affective computing will establish a close exchange with TS on staging the human-machine encounters. MS will furthermore explore and adjust state-of-the-art TTS software and synthetic voice software, e.g., sonatic.io and aravoice, for the performance situation in year 3.
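
To make the planned TTS exploration concrete, here is a minimal sketch of how speech rate, volume, and voice selection might be varied for a test utterance. It uses the open-source pyttsx3 library purely as a stand-in, since the tools named above (sonatic.io, aravoice) are commercial and their APIs are not documented here; the utterance, output file name, and chosen voice are illustrative assumptions.

```python
# Minimal sketch: exploring speech rate, volume, and voice selection for a
# synthetic stage voice. pyttsx3 is used here only as a placeholder for the
# commercial TTS tools named in the work plan, whose APIs differ.
import pyttsx3

engine = pyttsx3.init()

# Inspect the voices installed on the system and pick one for the bot persona.
for voice in engine.getProperty("voices"):
    print(voice.id, voice.name)

engine.setProperty("voice", engine.getProperty("voices")[0].id)  # hypothetical choice
engine.setProperty("rate", 150)    # words per minute; a parameter varied in the study
engine.setProperty("volume", 0.9)  # 0.0 to 1.0

# Render a test utterance to a file so it can be played back on stage.
engine.save_to_file("Good evening. I am the answering machine.", "bot_line.wav")
engine.runAndWait()
```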

Output: qualitative research framework for analyzing liveness, presence and the voice-based interplay of HMI, theoretical embedding of synthetic voice into the concept of "Applied Anthropomorphism", 2D and 3D performance capturing.

Year 2: Staging the social bot: Rules and formalisms

MS2 will introduce an interdisciplinary concept for the mediality of AI-based chatbots and synthetic voices called “Invisible embodiment”. This concept extends applied anthropomorphism to the speech level in order to analyze the role of synthetic voice modalities in voice-based HMI. Furthermore, the anthology “Media Voice, Presence and Liveness” will be published, focusing on the socio-historical meaning of artificial and synthetic voices, e.g., talking puppets, the voice-body interrelation, the voice as the first medium for communicating and hearing, and the audio-related Uncanny Valley effect (aUV). Depending on the findings from MS1, the performance-capture methods and the synthetic voice parameters will be adjusted. By analyzing the data from MS1 and the rule-based chatbot modalities, we will adjust the study framework in order to check compliance with the CL2 and TS2 rules during the performance. We will modify the synthetic voice datasets in order to customize the pairing of voice and chatbot dataset according to the findings of year 1.
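
As a concrete illustration of what rule-based chatbot modalities can look like at this stage, the following is a minimal sketch of a pattern-matching rule set in Python. The patterns and responses are invented for illustration and do not represent the actual CL2/TS2 rules.

```python
# Minimal sketch of a rule-based chatbot: ordered (pattern, response) rules,
# matched with regular expressions. The rules below are invented examples and
# do not represent the project's actual CL2/TS2 rule sets.
import re

RULES = [
    (re.compile(r"\b(hello|good evening)\b", re.I),
     "Good evening. I am listening."),
    (re.compile(r"\bwho are you\b", re.I),
     "I am the answering machine, a voice without a body."),
    (re.compile(r"\b(feel|feelings)\b", re.I),
     "I can describe feelings, but whether I have them is your projection."),
]

FALLBACK = "Please say that again in other words."

def respond(utterance: str) -> str:
    """Return the response of the first matching rule, or a fallback."""
    for pattern, response in RULES:
        if pattern.search(utterance):
            return response
    return FALLBACK

if __name__ == "__main__":
    print(respond("Good evening, machine."))
    print(respond("Do you have feelings?"))
```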

Output: qualitative research framework for deepening theoretical concepts on the mediality of AI, theoretical concept of "Invisible embodiment" for embedding the aUV, 2D and 3D performance capturing, publication of the anthology on "Media Voice, Presence and Liveness".

Year 3: Refining the social bot: Facets of being human

MS3 will focus on the voice-related Uncanny Valley effect and on the impact of social cues on the audience’s perception of presence on stage. To explore the limits of the “Invisible embodiment” concept and to deepen the analysis of how AI-based HMI is perceived, our interviews will place a strong focus on social cues, voice pitch, speed, and the sound space. These synthetic voice modalities will also be analyzed to explore gender-based perception biases towards synthetic voices. To facilitate science-communication discussions about the ethical limits of voice-based AI interaction, we will also compare the vocal ranges of natural and synthetic voices in our performance settings. The TTS datasets and their spatial arrangements on stage and during performance capturing will be adjusted and incorporated into the performance. For science communication, expert interviews will be conducted and the captured performances from years 1, 2, and 3 will be juxtaposed. Furthermore, the final publication on the research project as a whole will be conceptualized and initiated by MS.
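
As one way to prototype the sound space of the synthetic voice before full staging, the following sketch pans a rendered utterance across a stereo field using a constant-power panning law. The library choice (numpy, soundfile), the file names, and the pan position are assumptions made for illustration, not part of the project's actual spatial setup.

```python
# Minimal sketch: placing a rendered bot utterance in the stereo "sound space"
# of the stage via constant-power panning. File names and pan position are
# illustrative assumptions.
import numpy as np
import soundfile as sf

data, samplerate = sf.read("bot_line.wav")   # mono TTS output, e.g. from the year-1 sketch
if data.ndim > 1:
    data = data.mean(axis=1)                 # collapse to mono if needed

pan = 0.7  # 0.0 = hard left, 1.0 = hard right; varied per staging condition
theta = pan * np.pi / 2
left = np.cos(theta) * data                  # constant-power panning law
right = np.sin(theta) * data

stereo = np.stack([left, right], axis=1)
sf.write("bot_line_panned.wav", stereo, samplerate)
```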

Output: Theoretical concepts on the synthetic voice incorporating modifications of voice perception, adjustments of the TTS and synthetic voice frameworks for HMI, 2D and 3D performance capturing, concept for the project publication.

Year 4: Co-evolution with social bots: projections, needs and affordances

MS4 will deepen the analysis of the role of social cues and the limits of voice anthropomorphism for the final project publication. Once data acquisition is finalized, the interdependencies among TS, CL, PSY, and MS will be brought together and published in the final project publication. Furthermore, a practice-based PhD thesis on the topic of “Creative Artificial Intelligence on Stage” will be completed, summing up and presenting the potential of staging voice-based human-AI encounters, using hermeneutics as its qualitative research approach.

Output: Contribution to project publication, dissertation.