Carl Friedrich von Weizsäcker-Zentrum

Social Sciences and Computer Science: A Dialogue Around Data and Tools

International Event

Date: Thursday, November 3rd and Friday, November 4th, 2022

Venue: Hörsaal (2nd floor), Carl Friedrich von Weizsäcker Center, Doblerstr. 33, Tübingen

Join via Zoom: https://zoom.us/j/92952164056?pwd=cnQ1K2I4Wk51OFB4b3dqZnVQWVJIUT09

Organizer: Maël Pégny, for the Carl Friedrich von Weizsäcker Center

 

From the digitization of pre-existing archives to digitally native material, by way of new software tools for collecting and processing data, the social sciences have been transformed by the digitalization of information. Conversely, some recent developments have brought to computer scientists’ attention questions that used to be typical of the social sciences. For instance, recent debates on algorithmic fairness, especially in Machine Learning, have underlined the importance of social biases in data, and the possibility of reproducing those biases, and the power imbalances that go with them, through the uncritical application of learning techniques.

The aim of this international event is to bring together computer scientists and social scientists in order to build a common language for talking about digital data and its processing, how it changes the practice of the social sciences, and how the social sciences may contribute to changing computing practices.

Program

Thursday, November 3rd 2022

11:30-12:00 Introduction by the organizer

12:00-12:45 Bob Williamson (Tübingen), “Data, Capta and Constructa”

14:15-15:00 Sébastien Plutniak (Tours), “Past human intelligence meets artificial intelligence: pieces of the history of AI in the historical sciences”

15:00-15:45 Julien Villain (Université d’Evry), “What historians (try to) do with computers. Transforming analog data into digital data: a case study”

15:45-16:15 Break

16:15-17:00 Round Table 1: What Social Sciences Can Do for Computer Science

 

Friday, November 4th 2022

10:00-10:45 Christoph Schommer (University of Luxembourg), “Artificial Creativity and Art Production”

10:45-11:30 Edgar Lejeune (Angers), “How did historians scholarly edit for IBM punched cards? A comparison between two French collectives (1966-1984)”

11:30-11:45 Break

11:45-12:30 Nicolas Chachereau, Bhargav Srinivasa Desikan (EPF Lausanne), “Digital Analysis of Historical US Patents: Field Notes From An Interdisciplinary Project”

14:00-14:45 Simon Dumas Primbault (EPF Lausanne), “‘Methodological Criss-Crossing’: A Mixed Method Ethnography of a Digital Library”

14:45-15:45 Round Table 2: What Computer Science Can Do for Social Sciences

Talk Abstracts

 
Bob Williamson: "Data, Capta and Constructa"

I will examine the machine learning community’s (largely) uncritical use of data, and explore the continuum that runs from data (given or found) through capta (sought or chosen) to constructa (manufactured).

 

Julien Villain: "What Historians (Try to) Do with Computers. Transforming Analog Data into Digital Data: A Case Study"

My presentation will be based on research I carried out on the credit, consumption and business activities of shopkeepers in 18th-century France. I will present a number of sources (e.g. account books or bankruptcy inventories of shopkeepers) and show how I have used them in my research work.

I will try to explain that archives (i.e. documents produced for a specific purpose, with all the biases that this entails, and then classified) only become sources for historical knowledge when they are used as informational media to answer a question (a historical issue). This requires assessing the significance and representativeness of the source, and specifying what information is relevant to answering the question asked.

Analyzing historical data may require computerizing them, possibly in a serial way. Transforming analog data into digital data presupposes, however, a certain level of standardization of the information: establishing an entry protocol (what is entered and how?) implies accepting a certain loss of information, particularly in qualitative terms. It is therefore necessary to determine what information the researcher is willing to lose, and what intellectual cost (for the intelligibility of historical phenomena) this may have.

 

Simon Dumas Primbault: "'Methodological Criss-Crossing': A Mixed Method Ethnography of a Digital Library"

At the crossroads of the ethnography of scholarly practices and digital humanities, this study proposes to consider the exploratory aspect of scientific activity as an intellectual path through a corpus made accessible by an online platform. My case study focuses on the reading paths of Gallica users and combines a qualitative approach (semi-directive interviews and research situations) with a quantitative approach (the analysis of one year of the platform's server logs: extracting paths, modelling them as Markov chains, and identifying typical paths by topological analysis of the data). This method shows the strong complementarity of the qualitative and quantitative approaches: by confronting the material conditions that make navigation possible with the actual material practices of appropriation, in specific social and cultural contexts, this project strikes a middle ground between technological determinism and constructivism.

This exploratory study has led to promising results on the navigation of researchers within the Dewey classification, on the role of “pivotal literature” in moving from one discipline to another, and on a first sketch of a variety of “regimes of navigation” grouped according to their topological characteristics. These initial results have made it possible to document in detail the intertwining of material practices and virtual interfaces, as well as the navigation strategies developed by the researchers, which testify to their desire to find “original branches”. I have therefore been able to highlight the non-linearity and indeterminacy of scholarly practices and the fundamental nature of navigation as an orientation practice.

But such an understanding of digital libraries as spaces, and of their navigation, must not go without a strong epistemological and methodological critique. On the one hand, recognizing the reification (or effet de réel) produced by visual representation, I hold that digital libraries cannot be considered uncharted territories just waiting to be mapped. On the other hand, distrusting the renewed fantasies of a “cyberspace” that could be constructed from scratch, I would like to argue that it is possible to find a middle way between the referential illusion and the demiurgic fantasy and, in this dehiscence, to deploy a practical use of cartography as an object that performs a space of knowledge, but that performs it under constraint and in a critical and reflexive way.

 

Edgar Lejeune: "How Did Historians Scholarly Edit for IBM Punched Cards? A Comparison between Two French Collectives (1966-1984)"

Between 1961 and 1989, numerous computer-assisted historical studies were conducted in France. Historians involved in these projects were members of various historiographical programs (history of mentalities, social history, economic history). They also adopted a wide range of computational methods used by other social sciences (demography, linguistics, sociology), and they dealt with several types of historical sources (political tracts, charters, censuses, etc.). Moreover, these studies took place in different types of institutions (universities, CNRS, EHESS, laboratories, etc.), which shaped the way in which these historians had access to computers and to computer scientists. However, across these diverse configurations, one element appears to be shared by a large number of these scholars: the storage device used for recording the data, IBM punched cards.

My talk aims at comparing two computer-assisted projects on the basis of the text editing methods they developed in relation to that storage technology. The first is an international cooperation conducted by Christiane Klapisch-Zuber and David Herlihy between 1966 and 1978. It aimed at creating a computer edition of a gigantic late medieval Italian archive: the Catasto Fiorentino of 1427. The second is a project conducted at the University Paris 1 Panthéon-Sorbonne by a small group of medievalists, who focused on English political texts from the 13th to the early 16th century. I will show how four different types of elements took shape in the conception of these text editing practices: 1) the collective organization of these groups; 2) the type of documents on which the study relied and 3) the research goals that these medievalists pursued.

 

Nicolas Chachereau, Bhargav Srinivasa Desikan: "Digital Analysis of Historical US Patents: Field Notes from an Interdisciplinary Project"

Patents have long been an obvious and important source for documenting technological change. For instance, economists and economic historians have counted patents granted in certain countries or certain industries, while historians of science and technology have selected and read some of them (e.g. to study the thinking process of inventors). In recent years, the digitization of historical patents has opened up new opportunities, making it possible to ‘read’ at large scale, i.e. to study many patents while still using their full textual content rather than only counting them. Our own digital history project analyzes a very large corpus, all 1.3 million patents issued in the United States between 1836 and 1920, with two major aims: categorizing patents into coherent technical categories, and identifying discourses of safety, reflexivity, and environmental concern in technological innovation. In our presentation, we will discuss the rationale for our research questions, describe our experiments with various methods of natural language processing, and present our early results.

We will then move to an open discussion on the specific dynamic of our interdisciplinary project. Indeed, our team brings together historians with technical skills and computer scientists with an interest or a background in the humanities and social sciences. As such, we are in an ideal position to reflect upon the interaction between the two research strands. In our project, digital methods have influenced the historical approach and made themes such as source criticism more prominent than in many other historical research projects. Conversely, the need for interpretability of the results, as well as the specific nature of the historical documents, constrained the use of the digital methods.

 

Christoph Schommer: "Artificial Creativity and Art Production"

Far-reaching developments in the field of machine learning, the collection and processing of (big) data using modern human-computer interfaces, and increased attention to AI-related topics ('AI for the social good', 'responsible AI', and others) have led to many new applications, including a new form of art production. Deep Learning in particular, based on collections of artificial neural systems, is increasingly serving as a tool for artists to experiment with or explore new artistic paths. In this context, one need only think of novel works in the fields of style transfer, music composition, poetry writing and others. However, not everyone welcomes this trend: artificial creativity, and the arbitrariness and diversity it makes available to practically everyone, also meets with criticism. The presentation will shed light on this topic and on the issue of artificial and natural creativity; some examples will be used to show current developments. A short report on our contribution to Esch2022, the European Capital of Culture, concludes the presentation.