While language-based AI is becoming increasingly popular, ensuring that these systems are socially responsible is essential. Despite their growing impact, large language models (LLMs), the engines of many language-driven applications, remain largely in the black box. Concerns about LLMs reinforcing harmful representations are shared by academia, industries, and the public. In professional contexts, researchers rely on LLMs for computational tasks such as text classification and contextual prediction, during which the risk of perpetuating biases cannot be overlooked. In a broader society where LLM-powered tools are widely accessible, interacting with biased models can shape public perceptions and behaviors, potentially reinforcing problematic social issues over time. This study investigates harmful representations in LLMs, focusing on ethnicity and gender in the Dutch context. Through template-based sentence construction and model probing, we identified potentially harmful representations using both automated and manual content analysis at the lexical and sentence levels, combining quantitative measurements with qualitative insights. Our findings have important ethical, legal, and political implications, challenging the acceptability of such harmful representations and emphasizing the need for effective mitigation strategies.
By Claes de Vreese, Gabriela Trogrlic, Natali Helberger, and Zilin Lin for ACM Digital Library on June 23, 2025