Captioning Images, Videos and Applications

Conference Name
CHI 2022
Watch It, Don't Imagine It: Creating a Better Caption-Occlusion Metric by Collecting More Ecologically Valid Judgments from DHH Viewers
Abstract

Television captions that block visual information cause dissatisfaction among Deaf and Hard of Hearing (DHH) viewers, yet existing caption evaluation metrics do not consider occlusion. To create such a metric, DHH participants in a recent study imagined how bad it would be if captions blocked various on-screen text or visual content. To gather more ecologically valid data for creating an improved metric, we asked 24 DHH participants to give subjective judgments of caption quality after actually watching videos, and a regression analysis revealed which on-screen content's occlusion related to users' judgments. For several video genres, a metric based on our new dataset outperformed the prior state-of-the-art metric, which had been based on that earlier study, at predicting the severity of captions occluding content during videos. We contribute empirical findings for improving DHH viewers' experience, guiding the placement of captions to minimize occlusion, and automatically evaluating captioning quality in television broadcasts.

Authors
Akhter Al Amin
Rochester Institute of Technology, Rochester, New York, United States
Saad Hassan
Rochester Institute of Technology, Rochester, New York, United States
Sooyeon Lee
Rochester Institute of Technology, Rochester, New York, United States
Matt Huenerfauth
Rochester Institute of Technology, Rochester, New York, United States
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3517681

Video
Remotely Co-Designing Features for Communication Applications using Automatic Captioning with Deaf and Hearing Pairs
Abstract

Deaf and Hard-of-Hearing (DHH) users face accessibility challenges during in-person and remote meetings. While the emerging use of applications incorporating automatic speech recognition (ASR) is promising, more user-interface and user-experience research is needed. Although co-design methods could elucidate designs for such applications, COVID-19 has interrupted in-person research. This study describes a novel methodology for conducting online co-design workshops with 18 DHH and hearing participant pairs to investigate ASR-supported mobile and videoconferencing technologies along two design dimensions: correcting errors in ASR output and implementing notification systems for influencing speaker behaviors. Our methodological findings include an analysis of the communication modalities and strategies participants used, their use of an online collaborative whiteboarding tool, and how participants reconciled differences in ideas. Finally, we present guidelines for researchers interested in online DHH co-design methodologies, enabling greater geographic diversity among study participants even beyond the current pandemic.

Award
Honorable Mention
Authors
Matthew Seita
Rochester Institute of Technology, Rochester, New York, United States
Sooyeon Lee
Rochester Institute of Technology, Rochester, New York, United States
Sarah Andrew
Rochester Institute of Technology, Rochester, New York, United States
Kristen Shinohara
Rochester Institute of Technology, Rochester, New York, United States
Matt Huenerfauth
Rochester Institute of Technology, Rochester, New York, United States
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3501843

Video
A Large-Scale Longitudinal Analysis of Missing Label Accessibility Failures in Android Apps
Abstract

We present the first large-scale longitudinal analysis of missing label accessibility failures in Android apps. We developed a crawler and collected monthly snapshots of 312 apps over 16 months. We use this unique dataset in empirical examinations of accessibility not possible with prior datasets. Key large-scale findings include: missing label failures in 55.6% of unique image-based elements; longitudinal improvement in ImageButton elements but not in the more prevalent ImageView elements; 8.8% of unique screens being unreachable without navigating past at least one missing label failure; app failure rates that do not improve with the number of downloads; and effective labeling that is neither limited to nor guaranteed by large software organizations. We then examine longitudinal data in individual apps, presenting illustrative examples of the accessibility impacts of systematic improvements, incomplete improvements, interface redesigns, and accessibility regressions. We discuss these findings and potential opportunities for tools and practices to improve label-based accessibility.

Authors
Raymond Fok
University of Washington, Seattle, Washington, United States
Mingyuan Zhong
University of Washington, Seattle, Washington, United States
Anne Spencer Ross
Bucknell University, Lewisburg, Pennsylvania, United States
James Fogarty
University of Washington, Seattle, Washington, United States
Jacob O. Wobbrock
University of Washington, Seattle, Washington, United States
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3502143

Video
ImageExplorer: Multi-Layered Touch Exploration to Encourage Skepticism Towards Imperfect AI-Generated Image Captions
Abstract

Blind users rely on alternative text (alt-text) to understand an image; however, alt-text is often missing. AI-generated captions are a more scalable alternative, but they often miss crucial details or are completely incorrect, yet users may still place false trust in them. In this work, we sought to determine how additional information could help users better judge the correctness of AI-generated captions. We developed ImageExplorer, a touch-based multi-layered image exploration system that allows users to explore the spatial layout and information hierarchies of images, and compared it with popular text-based (Facebook) and touch-based (Seeing AI) image exploration systems in a study with 12 blind participants. We found that exploration was generally successful in encouraging skepticism towards imperfect captions. Moreover, many participants preferred ImageExplorer for its multi-layered and spatial information presentation, and Facebook for its summary and ease of use. Finally, we identify design improvements for effective and explainable image exploration systems for blind users.

Authors
Jaewook Lee
University of Illinois at Urbana-Champaign, Urbana, Illinois, United States
Jaylin Herskovitz
University of Michigan, Ann Arbor, Michigan, United States
Yi-Hao Peng
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Anhong Guo
University of Michigan, Ann Arbor, Michigan, United States
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3501966

Video