Data Quality for AI Applications (KITQAR)

Funded by the BMAS (Federal Ministry of Labor and Social Affairs), the joint project KITQAR is developing quality requirements for AI training data in the digital work and knowledge society. Together with scientific and technical partners from computer science, law, standardization and practice, the IZEW is developing ethical standards for AI training data. The aim of the project is to develop a scientifically sound and practically applicable framework for the quality of training, validation and test data for artificial intelligence.

IZEW Team

PD Dr. Jessica Heesen (principal investigator)
Dr. Wulf Loh (principal investigator)
Dr. Simon David Hirsbrunner (operative lead)
Dr. Lea Watzinger (researcher)

Duration

01 Dec. 2021 – 31 Dec. 2023

Funding

Denkfabrik (think tank) of the BMAS (Federal Ministry of Labor and Social Affairs)

Partners

Association of Electrical Engineering, Electronics and Information Technology in Germany (VDE e.V.) (project leadership)

Dr. Sebastian Hallensleben & Team: https://www.ai-ethics-impact.org/en/kontakt

Hasso-Plattner-Institute/Information Systems

Prof. Dr. Felix Naumann & Team: https://hpi.de/naumann/people/felix-naumann.html

University of Cologne, Chair of Criminal Law, Criminal Procedure, Legal Philosophy and Comparative Law

Prof. Dr. Dr. Frauke Rostalski & Team: https://rostalski.jura.uni-koeln.de/

The project

High-quality training data is central to establishing trustworthy AI systems, because these systems learn from large amounts of data. But what does 'quality' mean in this context? Which dimensions of quality are relevant for AI? And what requirements arise specifically for the operational use of AI-driven systems?

Biased training data has been identified as one of the causes of algorithmic discrimination. The quality of training data is therefore one of the most important prerequisites for an ethically and legally sound application of AI that neither impairs fundamental rights nor causes security risks.

In the project, the IZEW is particularly concerned with ethical perspectives on training data quality. These include, for example, algorithmic discrimination, the establishment of transparency and explainability, questions of liability for faulty or discriminatory training data, and free access to data. Other topics include the value of co-determination and self-determination, data protection and privacy.

The associated conflicts of goals and values concern more than the trade-off between increasing the quality of training data and the rights of those affected (e.g. employees or consumers) to privacy and informational self-determination. They also pose a challenge to the operationalization of ethical principles in everyday business. To keep the work closely tied to practical application, data sets from different contexts as well as synthetic data will be used. The framework to be developed in the project will thus make various aspects of data quality measurable and testable.

One of the models for the KITQAR framework is the report of the AI Ethics Impact Group (AIEIG), in which IZEW and partner institutions presented the first practically applicable concept for operationalizing AI ethics. KITQAR is led by VDE (Association of Electrical Engineering, Electronics and Information Technology in Germany) and implemented by IZEW together with partners from law (Europa-Universität Viadrina) and computer science (Hasso Plattner Institute).