TALES: A Taxonomy and Analysis of Cultural Representations in LLM-generated Stories

Millions of users across the globe turn to AI chatbots for their creative needs, inviting widespread interest in understanding how they represent diverse cultures. However, evaluating cultural representations in open-ended tasks remains challenging and underexplored. In this work, we present TALES, an evaluation of cultural misrepresentations in LLM-generated stories for diverse Indian cultural identities. First, we develop TALES-Tax, a taxonomy of cultural misrepresentations by collating insights from participants with lived experiences in India through focus groups (N=9) and individual surveys (N=15). Using TALES-Tax, we evaluate 6 models through a large-scale annotation study spanning 2,925 annotations from 108 annotators with lived experience and native language proficiency from across 71 regions in India and 14 languages. Concerningly, we find that 88% of the generated stories contain misrepresentations, and such errors are more prevalent in mid- and low-resourced languages and stories based in peri-urban regions in India. We also transform the annotations into TALES-QA, a standalone question bank to evaluate the cultural knowledge of models.

Indian Institute of Science, Bengaluru, Karnataka, India

Carnegie Mellon University, Pittsburgh, Pennsylvania, United States

Indian Institute of Science, Bangalore, India

Cornell University, Ithaca, New York, United States

Google DeepMind, Bangalore, India

Indian Institute of Science, Bengaluru, Karnataka, India

ACM CHI Conference on Human Factors in Computing Systems

P1 - Room 113

6 presentations

Start: 2026-04-17 20:15:00

End: 2026-04-17 21:45:00

Conference: CHI 2026

Session: Margins