Instrumental music conveys rich affective experiences through acoustic cues, yet instrumental passages often remain inaccessible to Deaf and Hard-of-Hearing (DHH) audiences. Although captioning practices for vocal songs have expanded, instrumental music remains largely uncaptioned, with no established criteria for representing musical content in text. We propose 𝜇Cap (Music Captions), an automatic instrumental music captioning system that transforms instrumental audio into time-aligned, non-lexical textual renderings enhanced with simple visuals. Drawing on preliminary surveys with DHH individuals and expert group discussions, we developed a phonetic-like captioning schema grounded in music sound analysis and linguistics. We then implemented 𝜇Cap using audio feature extraction and a Retrieval-Augmented Generation (RAG) pipeline to produce expressive, sound-mimetic captions. Two user evaluations with DHH participants (n=20 and n=15) showed that 𝜇Cap enhanced music appreciation, immersion, and the perceived presence of acoustic detail. This work contributes empirical evidence and design insights for caption-based visual representations that make instrumental music more accessible.
ACM CHI Conference on Human Factors in Computing Systems