Unspoken Sound: Identifying Trends in Non-Speech Audio Captioning on YouTube

Abstract

High-quality closed captioning of both speech and non-speech elements (e.g., music, sound effects, manner of speaking, and speaker identification) is essential for the accessibility of video content, especially for d/Deaf and hard-of-hearing individuals. While many regions have regulations mandating captioning for television and movies, a regulatory gap remains for the vast amount of web-based video content, including the staggering 500+ hours uploaded to YouTube every minute. Advances in automatic speech recognition have bolstered the presence of captions on YouTube. However, the technology has notable limitations, including the omission of many non-speech elements, which are often crucial for understanding content narratives. This paper examines the contemporary and historical state of non-speech information (NSI) captioning on YouTube through the creation and exploratory analysis of a dataset of over 715k videos. We identify factors that influence NSI caption practices and suggest avenues for future research to enhance the accessibility of online video content.

Authors
Lloyd May
Stanford University, Palo Alto, California, United States
Keita Ohshiro
New Jersey Institute of Technology, Newark, New Jersey, United States
Khang Dang
New Jersey Institute of Technology, Newark, New Jersey, United States
Sripathi Sridhar
New Jersey Institute of Technology, Newark, New Jersey, United States
Jhanvi Pai
New Jersey Institute of Technology, Newark, New Jersey, United States
Magdalena Fuentes
New York University, New York, New York, United States
Sooyeon Lee
New Jersey Institute of Technology, Newark, New Jersey, United States
Mark Cartwright
New York University, New York, New York, United States
Paper URL

doi.org/10.1145/3613904.3642162

Conference: CHI 2024

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2024.acm.org/)

Session: Assistive Interactions: Solutions for d/Deaf and Hard of Hearing Users

321
5 presentations
2024-05-15 20:00:00
2024-05-15 21:20:00