Do Entropic Measurements of the Diversity of AI-generated Images Match Human Judgement?

Abstract

This paper proposes that the ability to generate diverse outputs in response to a single prompt is necessary for text-to-image models to become more effective creativity support tools. It formalises the problem of measuring the diversity of generated text and images, with an emphasis on interactive, exploratory use in open-ended and creative tasks. It suggests, motivated by research in the psychology of creativity, that diversity should sit alongside image quality and fit-to-prompt as critical measures in this setting. The paper adapts several diversity measures from the literature to this task, then explores how they compare to human diversity ratings. These evaluations show that algorithmic measures of diversity can be a useful proxy for human ratings, with both declining in accuracy as the difficulty of the task increases. The paper concludes with an exploratory qualitative analysis of the factors involved in human diversity judgments to guide future research in this emerging area.

Authors
Kazjon Grace
The University of Sydney, Sydney, Australia
Francisco J. Ibarrola
The University of Sydney, Sydney, Australia
Jody Watts
The University of Sydney, Sydney, Australia
Shu Takahashi
The University of Sydney, Sydney, Australia
Parth Bhargava
The University of Sydney, Sydney, Australia
Eduardo Velloso
The University of Sydney, Sydney, New South Wales, Australia

Conference: CHI 2026

ACM CHI Conference on Human Factors in Computing Systems

Session: Liars & Deepfakes

P1 - Room 119
7 presentations
2026-04-15 20:15:00
2026-04-15 21:45:00