SonicSieve: Bringing Directional Speech Extraction to Smartphones Using Acoustic Microstructures

Abstract

Imagine placing your smartphone on a table in a noisy restaurant and clearly capturing the voices of friends seated around you, or recording a lecturer's voice with clarity in a reverberant auditorium. We introduce SonicSieve, the first intelligent directional speech extraction system for smartphones using a bio-inspired acoustic microstructure. Our passive design embeds directional cues onto incoming speech without any additional electronics: it attaches to the in-line microphone of low-cost wired earphones that plug into smartphones. We present an end-to-end neural network that processes raw audio mixtures in real time on mobile devices. Our results show that SonicSieve achieves a signal quality improvement of 5.0 dB when focusing on a 30° angular region. Additionally, the performance of our system with only two microphones exceeds that of conventional 5-microphone arrays.

Authors
Kuang Yuan
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Yifeng Wang
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Xiyuxing Zhang
Tsinghua University, Beijing, China
Chengyi Shen
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Swarun Kumar
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Justin Chan
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States

Conference: CHI 2026

ACM CHI Conference on Human Factors in Computing Systems

Session: Sensing and Novel Fabrication

P1 - Room 133
7 presentations
2026-04-17, 20:15–21:45