EchoSpeech: Continuous Silent Speech Recognition on Minimally-obtrusive Eyewear Powered by Acoustic Sensing

Abstract

We present EchoSpeech, a minimally-obtrusive silent speech interface (SSI) powered by low-power active acoustic sensing. EchoSpeech uses speakers and microphones mounted on a glass-frame and emits inaudible sound waves towards the skin. By analyzing echoes from multiple paths, EchoSpeech captures subtle skin deformations caused by silent utterances and uses them to infer silent speech. In a user study with 12 participants, we demonstrate that EchoSpeech can recognize 31 isolated commands and 3-6 figure connected digits with 4.5% (std 3.5%) and 6.1% (std 4.2%) Word Error Rate (WER), respectively. We further evaluated EchoSpeech under scenarios including walking and noise injection to test its robustness. We then demonstrated EchoSpeech in real-time demo applications operating at 73.3 mW, with the real-time pipeline implemented on a smartphone using only 1-6 minutes of training data. We believe that EchoSpeech takes a solid step towards minimally-obtrusive wearable SSI for real-life deployment.
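
As a rough, illustrative sketch of the sensing principle described above, the snippet below cross-correlates each received microphone frame with the transmitted inaudible chirp to form an echo profile, then differences consecutive profiles so that static reflections cancel and only the echo changes caused by skin deformation remain. This is a minimal reading of an active-acoustic front end, not the paper's implementation; the sampling rate, chirp band, frame length, and function names are all assumed placeholders.

    import numpy as np
    from scipy.signal import chirp, correlate

    FS = 50_000      # assumed sampling rate in Hz (placeholder, not the paper's value)
    FRAME_LEN = 600  # assumed samples per transmitted chirp frame (placeholder)

    # One period of the transmitted inaudible frequency-modulated chirp;
    # the 16-24 kHz band is illustrative only.
    t = np.arange(FRAME_LEN) / FS
    tx_chirp = chirp(t, f0=16_000, f1=24_000, t1=t[-1])

    def echo_profiles(rx: np.ndarray) -> np.ndarray:
        """Cross-correlate each received frame with the transmitted chirp.
        Rows index echo path delay (distance to the reflecting skin);
        columns index time (frame number)."""
        n_frames = len(rx) // FRAME_LEN
        frames = rx[: n_frames * FRAME_LEN].reshape(n_frames, FRAME_LEN)
        return np.stack([correlate(f, tx_chirp, mode="same") for f in frames], axis=1)

    def differential_echo_profiles(profiles: np.ndarray) -> np.ndarray:
        """Frame-to-frame difference: static reflections cancel, leaving
        only echo changes caused by skin deformation during utterances."""
        return np.diff(profiles, axis=1)

In the paper's pipeline, features of this kind from multiple speaker-microphone paths feed a deep learning model that infers the silent speech; the sketch above covers only one plausible feature-extraction step.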

Authors
Ruidong Zhang
Cornell University, Ithaca, New York, United States
Ke Li
Cornell University, Ithaca, New York, United States
Yihong Hao
Cornell University, Ithaca, New York, United States
Yufan Wang
Cornell University, Ithaca, New York, United States
Zhengnan Lai
Cornell University, Ithaca, New York, United States
François Guimbretière
Cornell University, Ithaca, New York, United States
Cheng Zhang
Cornell University, Ithaca, New York, United States
Paper URL

https://doi.org/10.1145/3544548.3580801

Conference: CHI 2023

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2023.acm.org/)

Session: Wearables and Materials

Room Y03+Y04
6 presentations
2023-04-26, 18:00–19:30