We introduce M^2Silent, which enables multi-user silent speech interactions in shared spaces using multi-directional speakers. Ensuring privacy during interactions with voice-controlled systems presents significant challenges, particularly in environments with multiple individuals, such as libraries, offices, or vehicles. M^2Silent addresses this by allowing users to communicate silently, without producing audible speech, using acoustic sensing integrated into directional speakers. We leverage frequency-modulated continuous wave (FMCW) signals as audio carriers, simultaneously playing audio and sensing the user's silent speech. To handle the challenge of multiple users interacting simultaneously, we propose time-shifted FMCW signals and blind source separation algorithms, which isolate and accurately recognize the speech features of each user. We also present a deep-learning model for real-time silent speech recognition. M^2Silent achieves a Word Error Rate (WER) of 6.5% and a Sequence Error Rate (SER) of 12.8% in multi-user silent speech recognition while maintaining high audio quality, offering a novel solution for privacy-preserving, multi-user silent interactions in shared spaces.
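To make the carrier design concrete, the following is a minimal sketch of a linear FMCW chirp and of time-shifted copies assigned to different users, so their reflections occupy distinct delay offsets. All parameters (sampling rate, chirp band, duration, shift) are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def fmcw_chirp(f0, bandwidth, duration, fs):
    """Linear FMCW up-chirp: frequency sweeps from f0 to f0 + bandwidth over `duration` seconds."""
    t = np.arange(int(duration * fs)) / fs
    # Instantaneous phase of a linear chirp: f(t) = f0 + (bandwidth/duration) * t
    phase = 2 * np.pi * (f0 * t + 0.5 * (bandwidth / duration) * t**2)
    return np.cos(phase)

fs = 48_000   # audio sampling rate (assumed)
T = 0.02      # 20 ms chirp period (assumed)

# Near-inaudible band riding on the speaker output (assumed band).
chirp = fmcw_chirp(f0=18_000, bandwidth=4_000, duration=T, fs=fs)

# Time-shifted copies for two users: each directional speaker plays the
# same chirp delayed by a fixed offset, so each user's echoes fall at a
# distinct delay and can be separated at the receiver.
shift = int(0.005 * fs)          # 5 ms offset between users (illustrative)
user1_tx = chirp
user2_tx = np.roll(chirp, shift)
```

Dechirping the received signal against each user's reference then maps that user's reflections to a separate band of beat frequencies, which is what makes the per-user separation tractable.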
https://dl.acm.org/doi/10.1145/3706598.3714174
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)