CLARIS: Clear and Intelligible Speech from Whispered and Dysarthric Voices

Abstract

Whispered and dysarthric speech hinder effective communication and undermine the reliability of voice-enabled systems. We present CLARIS, a compact speech-to-speech restoration system that turns such atypical input into clear, expressive speech. CLARIS requires no disorder-specific architectural tuning, generalizes across languages, and adapts quickly to new accents and speakers, enabling practical personalization. On whispered English, Hindi, and clinically challenging dysarthric speech, CLARIS delivers state-of-the-art intelligibility and naturalness, with listener studies confirming gains in quality, intelligibility, naturalness, and prosody. The system runs in real time, converting one second of input in about 30 ms, and enables inclusive, private, and personalized voice interaction. Audio samples are available at https://claris-w2s.github.io/CLARIS/

Authors
Neil Shah
TCS Research, Pune, Maharashtra, India
Yash Sonkar
CVIT, IIIT Hyderabad, Hyderabad, Telangana, India
Shirish Subhash Karande
TCS Research, Pune, Maharashtra, India
Vineet Gandhi
IIIT Hyderabad, Hyderabad, India

Conference: CHI 2026

ACM CHI Conference on Human Factors in Computing Systems

Session: Sound, Music, and Dance Accessibility

P1 - Room 120
7 presentations
2026-04-15 20:15:00
2026-04-15 21:45:00