Watch Your Mouth: Silent Speech Recognition with Depth Sensing

Abstract

Silent speech recognition is a promising technology that decodes human speech without requiring audio signals, enabling private human-computer interaction. In this paper, we propose Watch Your Mouth, a novel method that leverages depth sensing to enable accurate silent speech recognition. Depth information provides unique resilience against environmental factors such as variations in lighting and device orientation, and it addresses privacy concerns by eliminating the need for sensitive RGB data. We first built a deep-learning model that locates lips using depth data. We then designed a deep-learning pipeline that learns efficiently from point clouds and translates lip movements into commands and sentences. We evaluated our technique and found it effective across diverse sensor locations: On-Head, On-Wrist, and In-Environment. Watch Your Mouth outperformed the state-of-the-art RGB-based method, demonstrating its potential as an accurate and reliable input technique.
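
The abstract describes a pipeline that first localizes the lips in depth data and then learns from the resulting point clouds. As a rough illustration of the kind of preprocessing such a pipeline implies (not the authors' implementation), the sketch below back-projects a depth frame into a 3D point cloud using hypothetical pinhole-camera intrinsics; consult the paper for the actual lip-localization model and recognition network.

```python
# Minimal sketch, not the authors' code: convert a (H, W) depth map into an
# (N, 3) point cloud via pinhole-camera back-projection, the standard step
# before feeding a point-cloud network. Intrinsics below are placeholders.
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (meters) into camera-space 3D points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel grid
    z = depth
    x = (u - cx) * z / fx                           # back-project along x
    y = (v - cy) * z / fy                           # back-project along y
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]                 # drop zero-depth pixels

# Usage with a synthetic 480x640 depth frame and hypothetical intrinsics.
depth = np.random.uniform(0.2, 0.6, size=(480, 640)).astype(np.float32)
cloud = depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
print(cloud.shape)  # (N, 3)
```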

Award
Honorable Mention
Authors
Xue Wang
University of California, Los Angeles, Los Angeles, California, United States
Zixiong Su
The University of Tokyo, Tokyo, Japan
Jun Rekimoto
The University of Tokyo, Tokyo, Japan
Yang Zhang
University of California, Los Angeles, Los Angeles, California, United States
Paper URL

https://doi.org/10.1145/3613904.3642092

Video

Conference: CHI 2024

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2024.acm.org/)

Session: Eye and Face

Room 316B
5 presentations
2024-05-14 18:00–19:20