ImageExplorer: Multi-Layered Touch Exploration to Encourage Skepticism Towards Imperfect AI-Generated Image Captions

Abstract

Blind users rely on alternative text (alt-text) to understand an image; however, alt-text is often missing. AI-generated captions are a more scalable alternative, but they often miss crucial details or are completely incorrect, which users may still falsely trust. In this work, we sought to determine how additional information could help users better judge the correctness of AI-generated captions. We developed ImageExplorer, a touch-based multi-layered image exploration system that allows users to explore the spatial layout and information hierarchies of images, and compared it with popular text-based (Facebook) and touch-based (Seeing AI) image exploration systems in a study with 12 blind participants. We found that exploration was generally successful in encouraging skepticism towards imperfect captions. Moreover, many participants preferred ImageExplorer for its multi-layered and spatial information presentation, and Facebook for its summary and ease of use. Finally, we identify design improvements for effective and explainable image exploration systems for blind users.

Authors
Jaewook Lee
University of Illinois at Urbana-Champaign, Urbana, Illinois, United States
Jaylin Herskovitz
University of Michigan, Ann Arbor, Michigan, United States
Yi-Hao Peng
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Anhong Guo
University of Michigan, Ann Arbor, Michigan, United States
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3501966

Conference: CHI 2022

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2022.acm.org/)

Session: Captioning Images, Videos and Applications

293
4 presentations
2022-05-05 01:15:00
2022-05-05 02:30:00