May 27th, 2025
from 13:30 to 15:00
Francesca Doneda
AI vs. Human Candidates: A Performance Comparison in Logic Questions
This study explores intelligence and fallibility by comparing human and AI performance on logic-based assessments. As logic and psychometric tests become key tools for evaluating cognitive abilities, this research builds on Alan Turing’s behavioral perspective to analyze their effectiveness.
Specifically, we examine a case study using data from two editions of the National School of Administration’s pre-selection test, where AI and human candidates answered the same logic questions.
The central question is: What can we learn from directly comparing human and AI performance on identical logic problems? By analyzing these results, we aim to provide a deeper understanding of each group's strengths and limitations in logic-based problem-solving, offering insights into their potential applications and ethical considerations.
Biography: Francesca Doneda is a PhD student in the doctoral programme "The human mind and its explanations: Language, brain and reasoning", offered jointly by Università degli studi di Milano, IUSS di Pavia and Scuola Normale di Pisa. She is a member of the Logic, Uncertainty, Computation and Information Lab (LUCI) in the Department of Philosophy at Università degli studi di Milano.
Her research interests include the development of logical models with an impact on issues of social relevance: symbolic reasoning models for disinformation detection, source trustworthiness assessment, and strategies for analyzing the use of logic in the recruitment procedures of the Italian public administration.
Giuseppe Primiero
From trust evaluation to trust preservation over copies for ML systems
(joint work with Leonardo Ceragioli)
A common practice in ML systems development is the training of the same model on different data sets, and the use of the same (training and test) sets for different learning models. The first case is a desirable practice for identifying high-quality and unbiased training conditions. The second coincides with the search for optimal models under a common training dataset. These differently obtained systems have been considered akin to copies. In the quest for responsible AI, a legitimate but hardly investigated question is how to verify that trustworthiness is preserved by copies. In this paper we introduce a calculus to model and verify probabilistic complex queries over data, define distinct notions of trustworthiness, and show how they compose with each other and under logical operations. The aim is to offer a computational tool to check the trustworthiness of possibly complex systems copied from an original whose behaviour is known.
Biography: Giuseppe Primiero is Professor of Logic with the Logic, Uncertainty, Computation and Information Lab in the Department of Philosophy at the University of Milan, Italy. He acts as Scientific Director for PHILTECH, Research Center for The Philosophy of Technology and as Programme Leader for the Master's Degree in Human-Centered AI. Giuseppe works in the formal modelling and verification of multi-agent systems with applications to symbolic and sub-symbolic AI. His preferred tools are proof-systems, modal and computational logics.