ProxiMic: Convenient Voice Activation via Close-to-Mic Speech Detected by a Single Microphone

Abstract

Wake-up-free techniques (e.g., Raise-to-Speak) are important for improving the voice input experience. We present ProxiMic, a close-to-mic (within 5 cm) speech sensing technique that uses only one microphone. With ProxiMic, a user holds a microphone-embedded device close to the mouth and speaks directly to it, without wake-up phrases or button presses. To detect close-to-mic speech, we use features of the pop noise produced when a user speaks and blows air onto the microphone. Sound input is first passed through a low-pass adaptive threshold filter, then analyzed by a CNN that detects subtle close-to-mic features (mainly pop noise). Our two-stage algorithm achieves 94.1% activation recall and 12.3 False Accepts per Week per User (FAWU) with a 68 KB memory footprint, and runs at 352 fps on a smartphone. The user study shows that ProxiMic is efficient, user-friendly, and practical.
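As a rough illustration of the first stage described above, the following Python sketch gates audio frames on low-frequency energy before any classifier would run. It is not the authors' implementation: the one-pole filter, the frame length, the threshold ratio, the noise-tracking rule, and all function names (`low_pass`, `frame_energies`, `adaptive_gate`, `stage_one`, `synth_demo_signal`) are assumptions chosen for illustration, and the CNN stage is omitted entirely.

```python
import math


def low_pass(samples, alpha=0.05):
    """One-pole IIR low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).

    alpha is an assumed smoothing factor, not a value from the paper.
    """
    y = 0.0
    out = []
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def frame_energies(samples, frame_len=160):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(v * v for v in frame) / frame_len)
    return energies


def adaptive_gate(energies, ratio=8.0, decay=0.95, floor=1e-8):
    """Flag frames whose energy exceeds ratio * a running noise estimate.

    The noise estimate is updated only on frames that are not flagged,
    so a loud burst does not raise its own threshold.
    """
    noise = None
    flags = []
    for e in energies:
        if noise is None:
            noise = max(e, floor)  # seed the estimate from the first frame
        is_candidate = e > ratio * noise
        flags.append(is_candidate)
        if not is_candidate:
            noise = max(decay * noise + (1 - decay) * e, floor)
    return flags


def stage_one(samples):
    """Low-pass, frame, and gate: returns one boolean per frame."""
    return adaptive_gate(frame_energies(low_pass(samples)))


def synth_demo_signal(fs=16000, frame_len=160, n_frames=30):
    """Hypothetical test input: a quiet 4 kHz tone with a strong 40 Hz
    burst in frames 10 to 19, standing in for low-frequency pop noise."""
    samples = []
    for n in range(n_frames * frame_len):
        x = 0.01 * math.sin(2 * math.pi * 4000 * n / fs)
        if 10 * frame_len <= n < 20 * frame_len:
            x += 0.8 * math.sin(2 * math.pi * 40 * n / fs)
        samples.append(x)
    return samples


if __name__ == "__main__":
    flags = stage_one(synth_demo_signal())
    print([i for i, f in enumerate(flags) if f])
```

On this synthetic input the gate stays closed during the quiet opening frames and opens for the frames that contain the low-frequency burst. In a two-stage design like the one the abstract outlines, only such flagged frames would be handed to the heavier classifier, which keeps the always-on cost low.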

Authors
Yue Qin
Tsinghua University, Beijing, China
Chun Yu
Tsinghua University, Beijing, China
Zhaoheng Li
Tsinghua University, Beijing, China
Mingyuan Zhong
University of Washington, Seattle, Washington, United States
Yukang Yan
Tsinghua University, Beijing, China
Yuanchun Shi
Tsinghua University, Beijing, China
DOI

10.1145/3411764.3445687

Paper URL

https://doi.org/10.1145/3411764.3445687

Conference: CHI 2021

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2021.acm.org/)

Session: Vision and Sensing

[A] Paper Room 10, 2021-05-12 17:00:00~2021-05-12 19:00:00 / [B] Paper Room 10, 2021-05-13 01:00:00~2021-05-13 03:00:00 / [C] Paper Room 10, 2021-05-13 09:00:00~2021-05-13 11:00:00
13 presentations