ImageExplorer: Multi-Layered Touch Exploration to Encourage Skepticism Towards Imperfect AI-Generated Image Captions

Blind users rely on alternative text (alt-text) to understand an image; however, alt-text is often missing. AI-generated captions are a more scalable alternative, but they often miss crucial details or are completely incorrect, and users may nonetheless trust them. In this work, we sought to determine how additional information could help users better judge the correctness of AI-generated captions. We developed ImageExplorer, a touch-based multi-layered image exploration system that allows users to explore the spatial layout and information hierarchies of images, and compared it with popular text-based (Facebook) and touch-based (Seeing AI) image exploration systems in a study with 12 blind participants. We found that exploration was generally successful in encouraging skepticism towards imperfect captions. Moreover, many participants preferred ImageExplorer for its multi-layered and spatial information presentation, and Facebook for its summary and ease of use. Finally, we identify design improvements for effective and explainable image exploration systems for blind users.

University of Illinois at Urbana-Champaign, Urbana, Illinois, United States

University of Michigan, Ann Arbor, Michigan, United States

Carnegie Mellon University, Pittsburgh, Pennsylvania, United States

University of Michigan, Ann Arbor, Michigan, United States

https://dl.acm.org/doi/abs/10.1145/3491102.3501966

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2022.acm.org/)


4 presentations

Start time: 2022-05-05 01:15:00

End time: 2022-05-05 02:30:00
