µCap: Instrumental Music Captions for Deaf and Hard-of-Hearing Individuals

Abstract

Instrumental music conveys rich affective experiences through acoustic cues, yet instrumental passages often remain inaccessible to Deaf and Hard-of-Hearing (DHH) audiences. Although captioning practices for vocal songs have expanded, instrumental music remains largely uncaptioned, with no established criteria for representing musical content in text. We propose 𝜇Cap (Music Captions), an automatic instrumental music captioning system that transforms instrumental audio into time-aligned, non-lexical textual renderings enhanced with simple visuals. Drawing on preliminary surveys with DHH individuals and expert group discussions, we developed a phonetic-like captioning schema grounded in music sound analysis and linguistics. We then implemented 𝜇Cap using audio feature extraction and a Retrieval-Augmented Generation (RAG) pipeline to produce expressive, sound-mimetic captions. Two user evaluations with DHH participants (n=20 and n=15) showed that 𝜇Cap enhanced music appreciation, immersion, and perceived presence of acoustic detail. This work contributes empirical evidence and insights for designing caption-based visual representations that make instrumental music more accessible.

Award
Best Paper
Authors
SooYeon Ahn
Gwangju Institute of Science and Technology, Gwangju, Korea, Republic of
In-Chang Baek
Gwangju Institute of Science and Technology, Gwangju, Korea, Republic of
KyungJoong Kim
GIST, Gwangju, Korea, Republic of
Khai N. Truong
University of Toronto, Toronto, Ontario, Canada
Jin-Hyuk Hong
Gwangju Institute of Science and Technology, Gwangju, Korea, Republic of

Conference: CHI 2026

ACM CHI Conference on Human Factors in Computing Systems

Session: Designing for Sensory Access

P1 - Room 120
7 presentations
2026-04-17 18:00:00
2026-04-17 19:30:00