124. Captioning Images, Videos and Applications

Watch It, Don't Imagine It: Creating a Better Caption-Occlusion Metric by Collecting More Ecologically Valid Judgments from DHH Viewers
Description

Television captions that block visual information cause dissatisfaction among Deaf and Hard of Hearing (DHH) viewers, yet existing caption evaluation metrics do not consider occlusion. To create such a metric, DHH participants in a recent study imagined how bad it would be if captions blocked various on-screen text or visual content. To gather more ecologically valid data for creating an improved metric, we asked 24 DHH participants to give subjective judgments of caption quality after actually watching videos, and a regression analysis revealed which on-screen contents’ occlusion related to users’ judgments. For several video genres, a metric based on our new dataset outperformed the prior state-of-the-art metric for predicting the severity of captions occluding content during videos, which had been based on that prior study. We contribute empirical findings for improving DHH viewers’ experience, guiding the placement of captions to minimize occlusions, and automating the evaluation of captioning quality in television broadcasts.

Remotely Co-Designing Features for Communication Applications using Automatic Captioning with Deaf and Hearing Pairs
Description

Deaf and Hard-of-Hearing (DHH) users face accessibility challenges during in-person and remote meetings. While emerging use of applications incorporating automatic speech recognition (ASR) is promising, more user-interface and user-experience research is needed. While co-design methods could elucidate designs for such applications, COVID-19 has interrupted in-person research. This study describes a novel methodology for conducting online co-design workshops with 18 DHH and hearing participant pairs to investigate ASR-supported mobile and videoconferencing technologies along two design dimensions: correcting errors in ASR output and implementing notification systems for influencing speaker behaviors. Our methodological findings include an analysis of communication modalities and strategies participants used, use of an online collaborative whiteboarding tool, and how participants reconciled differences in ideas. Finally, we present guidelines for researchers interested in online DHH co-design methodologies, enabling greater geographic diversity among study participants even beyond the current pandemic.

A Large-Scale Longitudinal Analysis of Missing Label Accessibility Failures in Android Apps
Description

We present the first large-scale longitudinal analysis of missing label accessibility failures in Android apps. We developed a crawler and collected monthly snapshots of 312 apps over 16 months. We use this unique dataset in empirical examinations of accessibility not possible in prior datasets. Key large-scale findings include missing label failures in 55.6% of unique image-based elements, longitudinal improvement in ImageButton elements but not in more prevalent ImageView elements, that 8.8% of unique screens are unreachable without navigating at least one missing label failure, that app failure rate does not improve with number of downloads, and that effective labeling is neither limited to nor guaranteed by large software organizations. We then examine longitudinal data in individual apps, presenting illustrative examples of accessibility impacts of systematic improvements, incomplete improvements, interface redesigns, and accessibility regressions. We discuss these findings and potential opportunities for tools and practices to improve label-based accessibility.

ImageExplorer: Multi-Layered Touch Exploration to Encourage Skepticism Towards Imperfect AI-Generated Image Captions
Description

Blind users rely on alternative text (alt-text) to understand an image; however, alt-text is often missing. AI-generated captions are a more scalable alternative, but they often miss crucial details or are completely incorrect, which users may still falsely trust. In this work, we sought to determine how additional information could help users better judge the correctness of AI-generated captions. We developed ImageExplorer, a touch-based multi-layered image exploration system that allows users to explore the spatial layout and information hierarchies of images, and compared it with popular text-based (Facebook) and touch-based (Seeing AI) image exploration systems in a study with 12 blind participants. We found that exploration was generally successful in encouraging skepticism towards imperfect captions. Moreover, many participants preferred ImageExplorer for its multi-layered and spatial information presentation, and Facebook for its summary and ease of use. Finally, we identify design improvements for effective and explainable image exploration systems for blind users.
