ExamAI – Testing and Auditing of AI systems

Developing recommendations for action to improve the auditability and controllability of AI-based systems, drawing on use cases from the fields of collaborative industrial production and individual work careers

Artificial Intelligence (AI) methods are increasingly being used in the workplace: human resources departments deploy AI-based matching systems and automated personality assessments, while intelligent collaborative robots and automated guided vehicles operate in production and logistics environments. The reliability, trustworthiness, safety and fairness of such systems therefore need to be tested and certified.

The research project "ExamAI – AI Testing & Auditing" (March 2020 – November 2021), coordinated by the German Informatics Society (GI), investigated how AI systems can be effectively controlled and tested. The interdisciplinary team comprised experts from Saarland University, the Technical University of Kaiserslautern, the Fraunhofer Institute for Experimental Software Engineering (IESE) and the Stiftung Neue Verantwortung.

Within the application areas "human-machine cooperation in industrial production" and "AI systems in human resources and talent management and in recruiting", the team examined eleven use cases that were identified at the beginning of the project (work package 1). The use cases deal, for example, with the misbehaviour of intelligent cobots (collaborative robots) and automated guided vehicles, or with AI-based background checks on applicants and automated assessments of CVs.

Alongside its main focus on ways of testing, auditing and certifying AI systems (work packages 4 and 5), the project also examined the current state of standardisation in the field of AI (work package 2) and the legal framework conditions that apply to the use of AI systems (work package 3). In a final step, the European Commission's proposal for an AI Act was analysed, in particular with regard to its legal requirements for developers and users. At the end of the project, policy recommendations were discussed in interdisciplinary expert workshops and published in two policy briefs (work package 6).
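To illustrate what testing such a system for fairness can mean in practice, the following minimal sketch (in Python, using pandas) computes the demographic parity gap of a hypothetical CV-screening model, i.e. the largest difference between applicant groups in the rate of positive decisions. The model, data and tolerance are illustrative assumptions, not artefacts of the ExamAI project.

    # Minimal sketch of a statistical fairness check. `model` stands for a
    # hypothetical CV-screening classifier with a scikit-learn-style predict()
    # method; `applicants` is a pandas DataFrame of applicant features that
    # includes a protected attribute such as "gender". All names are illustrative.
    import pandas as pd

    def demographic_parity_gap(model, applicants: pd.DataFrame, group_col: str) -> float:
        """Largest difference in positive-decision rates between groups."""
        features = applicants.drop(columns=[group_col])
        decisions = model.predict(features)  # 1 = invite to interview, 0 = reject
        rates = pd.Series(decisions).groupby(applicants[group_col].to_numpy()).mean()
        return float(rates.max() - rates.min())

    # An auditor might flag the model if the gap exceeds an agreed tolerance:
    # assert demographic_parity_gap(model, applicants, "gender") <= 0.05

Demographic parity is only one of several statistical fairness criteria; which criterion and which tolerance are appropriate for a given application is itself part of the auditing question the project addresses.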

The project was funded by the Federal Ministry of Labour and Social Affairs' Policy Lab Digital, Work & Society as part of the Observatory for Artificial Intelligence in Work and Society (AI Observatory). The Policy Lab Digital, Work & Society is an interdisciplinary organisational unit within the ministry's Digitalisation and the Labour Market Department that observes technological, economic and social trends and helps to shape change together with academia, business and the social partners.

Research papers and project reports published in English:

Becker, N., Waltl, B. (2022). Auditing and Testing AI – A Holistic Framework. In: Duffy, V.G. (eds) Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management. Health, Operations Management, and Design. HCII 2022. Lecture Notes in Computer Science, vol 13320. Springer, Cham. doi.org/10.1007/978-3-031-06018-2_20

Jöckel, L., Bauer, T., Kläs, M., Hauer, M. P., Groß, J. (2021). Towards a Common Testing Terminology for Software Engineering and Data Science Experts. 22nd International Conference on Product-Focused Software Process Improvement (PROFES 2021), Turin, Italy.

Kläs, M., Adler, R., Jöckel, L., Groß, J., Reich, J. (2021). Using Complementary Risk Acceptance Criteria to Structure Assurance Cases for Safety-Critical AI Components. AISafety 2021 at the International Joint Conference on Artificial Intelligence (IJCAI), Montreal, Canada.

Kläs, M., Adler, R., Sorokos, I., Jöckel, L., Reich, J. (2021). Handling Uncertainties of Data-Driven Models in Compliance with Safety Constraints for Autonomous Behaviour. Proceedings of the European Dependable Computing Conference (EDCC 2021), Munich, Germany: IEEE.

Hauer, M. P., Adler, R., Zweig, K. (2021). Assuring Fairness of Algorithmic Decision Making. 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW). doi.org/10.1109/icstw52544.2021.00029

Krafft, T. D., Zweig, K. A., König, P. D. (2020). How to regulate algorithmic decision-making: A framework of regulatory requirements for different applications. Regulation & Governance.

Becker, N., Junginger, P., Martinez, L., Krupka, D., Beining, L. (2021). Mitigating safety and discriminatory risk with technical standards. ExamAI – KI Testing & Auditing. Berlin: Gesellschaft für Informatik e.V.

Krafft, T. D., Hauer, M. P., Zweig, K. A. (2020). Why Do We Need to Be Bots? What Prevents Society from Detecting Biases in Recommendation Systems. In: Boratto, L., Faralli, S., Marras, M., Stilo, G. (eds) Bias and Social Aspects in Search and Recommendation. BIAS 2020. Communications in Computer and Information Science, vol 1245. Springer, Cham.