This paper proposes that the ability to generate diverse outputs in response to a single prompt is necessary for text-to-image models to become more effective creativity support tools. It formalises the problem of measuring the diversity of generated text and images, with an emphasis on interactive, exploratory use in open-ended and creative tasks. It suggests, motivated by research in the psychology of creativity, that diversity should sit alongside image quality and fit-to-prompt as critical measures in this setting. The paper adapts several diversity measures from the literature to this task, then explores how they compare to human diversity ratings. These evaluations show that algorithmic measures of diversity can be a useful proxy for human ratings, with agreement between the two declining as the difficulty of the task increases. The paper concludes with an exploratory qualitative analysis of the factors involved in human diversity judgements, to guide future research in this emerging area.
ACM CHI Conference on Human Factors in Computing Systems