Methods for faithfully capturing a user's holistic pose have immediate uses in AR/VR, ranging from multimodal input to expressive avatars. Although body tracking has received the most attention, the mouth is of particular importance, given that it is the channel for both speech and facial expression. In this work, we describe a new RF-based approach for capturing mouth pose using an antenna integrated into the underside of a VR/AR headset. Our approach sidesteps the privacy issues inherent in camera-based methods, while also supporting silent facial expressions that audio-based methods cannot capture. Further, compared to bio-sensing methods such as EMG and EIT, our method requires no contact with the wearer's body and can be fully self-contained in the headset, offering a high degree of physical robustness and user practicality. We detail our implementation along with results from two user studies, which show a mean 3D error of 2.6 mm for 11 mouth keypoints across worn sessions without re-calibration.
https://doi.org/10.1145/3586183.3606805
ACM Symposium on User Interface Software and Technology