MELDER: The Design and Evaluation of a Real-time Silent Speech Recognizer for Mobile Devices

Abstract

Silent speech is unaffected by ambient noise, increases accessibility, and enhances privacy and security. Yet current silent speech recognizers operate in a phrase-in/phrase-out manner and are therefore slow, error-prone, and impractical for mobile devices. We present MELDER, a Mobile Lip Reader that operates in real time by splitting the input video into smaller temporal segments and processing them individually. An experiment revealed that this substantially reduces computation time, making it suitable for mobile devices. We further optimize the model for everyday use by exploiting the knowledge from a high-resource vocabulary using a transfer learning model. We then compare MELDER in both stationary and mobile settings with two state-of-the-art silent speech recognizers, where MELDER demonstrated superior overall performance. Finally, we compare two visual feedback methods of MELDER with the visual feedback method of Google Assistant. The outcomes shed light on how these proposed feedback methods influence users' perceptions of the model's performance.
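
The segment-wise processing mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed segment length, the function names `temporal_segments` and `recognize_incrementally`, and the `recognize_segment` callback are all hypothetical, chosen only to contrast incremental output with a phrase-in/phrase-out recognizer.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def temporal_segments(frames: Sequence[Frame], segment_len: int) -> Iterator[List[Frame]]:
    """Yield consecutive, non-overlapping windows of at most segment_len frames.

    segment_len is a hypothetical parameter; the abstract does not state
    how MELDER chooses segment boundaries.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    for start in range(0, len(frames), segment_len):
        yield list(frames[start:start + segment_len])


def recognize_incrementally(
    frames: Sequence[Frame],
    segment_len: int,
    recognize_segment: Callable[[List[Frame]], str],
) -> Iterator[str]:
    """Yield the running transcript after each segment is processed.

    recognize_segment stands in for any per-segment recognizer (assumed here);
    output becomes available before the whole phrase has been seen.
    """
    partial: List[str] = []
    for segment in temporal_segments(frames, segment_len):
        partial.append(recognize_segment(segment))
        yield " ".join(partial)
```

With a stub recognizer, for example `lambda seg: str(len(seg))`, iterating over `recognize_incrementally(frames, 3, stub)` yields a growing transcript after every three frames rather than a single result at the end of the phrase.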

Authors
Laxmi Pandey
University of California, Merced, Merced, California, United States
Ahmed Sabbir Arif
University of California, Merced, Merced, California, United States
Paper URL

doi.org/10.1145/3613904.3642348

Conference: CHI 2024

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2024.acm.org/)

Session: Eye and Face

316B
5 presentations
2024-05-14 18:00:00
2024-05-14 19:20:00