Assistive Interactions: Solutions for d/Deaf and Hard of Hearing Users

Conference Name
CHI 2024
Towards Inclusive Video Commenting: Introducing Signmaku for the Deaf and Hard-of-Hearing
Abstract

Previous research underscored the potential of danmaku, a text-based commenting feature on videos, for engaging hearing audiences. However, many Deaf and hard-of-hearing (DHH) users prioritize American Sign Language (ASL) over English. To improve inclusivity, we introduce Signmaku, a commenting mechanism that uses ASL as a sign language version of danmaku. Through a need-finding study (N=12) and a within-subject experiment (N=20), we evaluated three design styles: real human faces, cartoon-like, and robotic depictions. We found that cartoon signmaku not only provided entertainment but also prompted participants to create and share ASL comments with fewer privacy concerns than the other designs. Conversely, the robotic design's limited accuracy in conveying hand movements and facial expressions increased cognitive demands. Realistic signmaku elicited the lowest cognitive load and was the easiest to understand of the three types. Our findings offer unique design implications for leveraging generative AI to create signmaku comments, enhancing co-learning experiences for DHH users.

Authors
Si Chen
University of Illinois Urbana-Champaign, Champaign, Illinois, United States
Haocong Cheng
University of Illinois Urbana-Champaign, Champaign, Illinois, United States
Jason Situ
University of Illinois Urbana-Champaign, Urbana, Illinois, United States
Desirée Kirst
Gallaudet University, Washington, District of Columbia, United States
Suzy Su
University of Illinois Urbana-Champaign, Champaign, Illinois, United States
Saumya Malhotra
University of Illinois Urbana-Champaign, Champaign, Illinois, United States
Lawrence Angrave
University of Illinois Urbana-Champaign, Urbana, Illinois, United States
Qi Wang
Gallaudet University, Washington, District of Columbia, United States
Yun Huang
University of Illinois Urbana-Champaign, Champaign, Illinois, United States
Paper URL

https://doi.org/10.1145/3613904.3642287

How Users Experience Closed Captions on Live Television: Quality Metrics Remain a Challenge
Abstract

This paper presents a mixed-methods study of how deaf, hard-of-hearing, and hearing viewers perceive live TV caption quality, using captioned video stimuli designed to mirror TV captioning experiences. To assess caption quality, we used four commonly used accuracy-focused quality metrics: word error rate, weighted word error rate, automated caption evaluation (ACE), and its successor ACE2. We calculated the correlation between the four quality metrics and viewer ratings of subjective quality and found it to be weak, revealing that factors besides accuracy affect user ratings. Additionally, even high-quality captions were perceived to have problems, despite controlling for confounding factors. Qualitative analysis of viewer comments revealed three major factors affecting their experience: errors within captions, difficulty in following captions, and caption appearance. The findings raise questions as to how objective caption quality metrics can be reconciled with the user experience across a diverse spectrum of viewers.
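
As a rough illustration of the simplest of these metrics, the sketch below computes word error rate (WER) as token-level edit distance and correlates per-clip WER with subjective viewer ratings. This is not the authors' code: the clips and ratings are invented, and ACE/ACE2 are more involved metrics that are not reproduced here.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: token-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn the first i reference tokens into the first j hypothesis tokens
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient, computed directly."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Invented per-clip data: (reference transcript, caption as aired, mean viewer rating on 1-5).
clips = [
    ("the weather service issued a flood warning", "the weather service issued flood warning", 4.1),
    ("fire crews responded to the scene",          "fire crews responded to the seen",         3.2),
    ("stocks closed higher on friday",             "stocks closed higher on friday",           4.6),
]
wers = [wer(ref, hyp) for ref, hyp, _ in clips]
ratings = [rating for _, _, rating in clips]
print(f"Pearson r between WER and rating: {pearson(wers, ratings):.2f}")
```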

Award
Honorable Mention
Authors
Mariana Arroyo Chavez
Gallaudet University, Washington, District of Columbia, United States
Molly Feanny
Gallaudet University, Washington, District of Columbia, United States
Matthew Seita
Gallaudet University, Washington, District of Columbia, United States
Bernard Thompson
Gallaudet University, Washington, District of Columbia, United States
Keith Delk
Gallaudet University, Washington, District of Columbia, United States
Skyler Officer
Gallaudet University, Washington, District of Columbia, United States
Abraham Glasser
Gallaudet University, Washington, District of Columbia, United States
Raja Kushalnagar
Gallaudet University, Washington, District of Columbia, United States
Christian Vogler
Gallaudet University, Washington, District of Columbia, United States
Paper URL

https://doi.org/10.1145/3613904.3641988

Assessment of Sign Language-Based versus Touch-Based Input for Deaf Users Interacting with Intelligent Personal Assistants
Abstract

With recent advancements in intelligent personal assistants (IPAs), their popularity within households is rapidly increasing, driven by automatic speech recognition. In this study, we used a Wizard-of-Oz methodology to evaluate and compare the usability of American Sign Language (ASL), Tap to Alexa, and smart home apps among 23 deaf participants within a limited-domain smart home environment. Results indicate a slight usability preference for ASL. Linguistic analysis of the participants' signing reveals a diverse range of expressions and vocabulary as they interacted with IPAs in the context of a restricted-domain application. On average, deaf participants exhibited a vocabulary of 47 +/- 17 signs plus an additional 10 +/- 7 fingerspelled words, for a total of 246 distinct signs and 93 distinct fingerspelled words across all participants. We discuss the implications for the design of limited-vocabulary applications as a stepping stone toward general-purpose ASL recognition in the future.
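
The vocabulary statistics reported above (per-participant mean +/- SD and pooled distinct counts) can be tallied as in the minimal sketch below. The glossed transcripts are invented, and the `fs-` prefix for fingerspelled words is an assumed notation, not necessarily the study's.

```python
from statistics import mean, stdev

# Invented glossed transcripts; "fs-" marks fingerspelled words (assumed notation).
participants = {
    "P1": ["LIGHT", "TURN-ON", "fs-ALEXA", "THERMOSTAT", "WARM"],
    "P2": ["LIGHT", "TURN-OFF", "fs-ALEXA", "MUSIC", "PLAY", "STOP"],
    "P3": ["DOOR", "LOCK", "fs-ROOMBA", "VACUUM", "START", "fs-ALEXA"],
}

# Split each participant's vocabulary into lexical signs vs. fingerspelled words.
signs = {p: {g for g in gs if not g.startswith("fs-")} for p, gs in participants.items()}
fingerspelled = {p: {g for g in gs if g.startswith("fs-")} for p, gs in participants.items()}

per_participant = [len(v) for v in signs.values()]
print(f"signs per participant: {mean(per_participant):.0f} +/- {stdev(per_participant):.0f}")
print(f"distinct signs across all participants: {len(set().union(*signs.values()))}")
print(f"distinct fingerspelled words: {len(set().union(*fingerspelled.values()))}")
```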

Authors
Nina Tran
Gallaudet University, Washington, District of Columbia, United States
Paige S. DeVries
Gallaudet University, Washington, District of Columbia, United States
Matthew Seita
Gallaudet University, Washington, District of Columbia, United States
Raja Kushalnagar
Gallaudet University, Washington, District of Columbia, United States
Abraham Glasser
Gallaudet University, Washington, District of Columbia, United States
Christian Vogler
Gallaudet University, Washington, District of Columbia, United States
Paper URL

https://doi.org/10.1145/3613904.3642094

Unspoken Sound: Identifying Trends in Non-Speech Audio Captioning on YouTube
Abstract

High-quality closed captioning of both speech and non-speech elements (e.g., music, sound effects, manner of speaking, and speaker identification) is essential for the accessibility of video content, especially for d/Deaf and hard-of-hearing individuals. While many regions have regulations mandating captioning for television and movies, a regulatory gap remains for the vast amount of web-based video content, including the staggering 500+ hours uploaded to YouTube every minute. Advances in automatic speech recognition have bolstered the presence of captions on YouTube. However, the technology has notable limitations, including the omission of many non-speech elements, which are often crucial for understanding content narratives. This paper examines the contemporary and historical state of non-speech information (NSI) captioning on YouTube through the creation and exploratory analysis of a dataset of over 715k videos. We identify factors that influence NSI caption practices and suggest avenues for future research to enhance the accessibility of online video content.
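
For a concrete sense of what NSI detection can look like, the sketch below flags candidate NSI tags in caption text, assuming the common convention that non-speech elements are set off in square brackets or parentheses (e.g., [music], (applause)). This is an illustrative assumption, not the paper's analysis pipeline.

```python
import re

# Assumed convention: NSI appears in square brackets or parentheses.
NSI_PATTERN = re.compile(r"\[([^\]]+)\]|\(([^)]+)\)")

def extract_nsi(caption_lines: list[str]) -> list[str]:
    """Return normalized non-speech tags found in caption text."""
    tags = []
    for line in caption_lines:
        for match in NSI_PATTERN.finditer(line):
            # One of the two capture groups matches, depending on the delimiter.
            tags.append((match.group(1) or match.group(2)).strip().lower())
    return tags

# Invented caption lines for demonstration.
captions = [
    "[upbeat music]",
    "Welcome back to the channel!",
    "(audience laughs)",
    "So today we're testing... [dramatic pause]",
]
print(extract_nsi(captions))  # ['upbeat music', 'audience laughs', 'dramatic pause']
```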

Authors
Lloyd May
Stanford University, Palo Alto, California, United States
Keita Ohshiro
New Jersey Institute of Technology, Newark, New Jersey, United States
Khang Dang
New Jersey Institute of Technology, Newark, New Jersey, United States
Sripathi Sridhar
New Jersey Institute of Technology, Newark, New Jersey, United States
Jhanvi Pai
New Jersey Institute of Technology, Newark, New Jersey, United States
Magdalena Fuentes
New York University, New York, New York, United States
Sooyeon Lee
New Jersey Institute of Technology, Newark, New Jersey, United States
Mark Cartwright
New York University, New York, New York, United States
Paper URL

https://doi.org/10.1145/3613904.3642162

Towards Co-Creating Access and Inclusion: A Group Autoethnography on a Hearing Individual's Journey Towards Effective Communication in Mixed-Hearing Ability Higher Education Settings
Abstract

We present a group autoethnography detailing a hearing student's journey in adopting communication technologies at a mixed-hearing-ability summer research camp. Our study focuses on how this student, a research assistant with emerging American Sign Language (ASL) skills, (in)effectively communicates with deaf and hard-of-hearing (DHH) peers and faculty during the ten-week program. The DHH members also reflected on their communication with the hearing student. We depict scenarios and analyze the (in)effectiveness of how emerging technologies like live automatic speech recognition (ASR) and typing are utilized to facilitate communication. We outline communication strategies to engage everyone with diverse signing skills in conversations: directing visual attention, pause-for-attention-and-proceed, and back-channeling via expressive body. These strategies promote inclusive collaboration and leverage technology advancements. Furthermore, we delve into the factors that have motivated individuals to embrace more inclusive communication practices and provide design implications for accessible communication technologies within the mixed-hearing-ability context.

Authors
Si Chen
University of Illinois Urbana-Champaign, Champaign, Illinois, United States
James Waller
Gallaudet University, Washington, District of Columbia, United States
Matthew Seita
Rochester Institute of Technology, Rochester, New York, United States
Christian Vogler
Gallaudet University, Washington, District of Columbia, United States
Raja Kushalnagar
Gallaudet University, Washington, District of Columbia, United States
Qi Wang
Gallaudet University, Washington, District of Columbia, United States
Paper URL

https://doi.org/10.1145/3613904.3642017
