HiSync: Spatio-Temporally Aligning Hand Motion from Wearable IMU and On-Robot Camera for Command Source Identification in Long-Range HRI

Abstract

Long-range Human-Robot Interaction (HRI) remains underexplored. Within it, Command Source Identification (CSI), determining who issued a command, is especially challenging because multiple users are present and distance makes sensor readings ambiguous. We introduce HiSync, an optical-inertial fusion framework that treats hand motion as a binding cue by aligning optical flow from a robot-mounted camera with signals from a hand-worn IMU. We first elicit a user-defined gesture set (N=12) and collect a multimodal command gesture dataset (N=38) in long-range multi-user HRI scenarios. Next, HiSync extracts frequency-domain hand motion features from both camera and IMU data, and a learned CSINet denoises IMU readings, temporally aligns the modalities, and performs distance-aware multi-window fusion to compute the cross-modal similarity of subtle, natural gestures, enabling robust CSI. In three-person scenes at up to 34 m, HiSync achieves 92.32% CSI accuracy, outperforming the prior SOTA by 48.44%. HiSync is also validated in a real-robot deployment. By making CSI reliable and natural, HiSync provides a practical primitive and design guidance for public-space HRI.
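The core idea of matching a wearer's IMU motion against each visible person's optical flow in the frequency domain can be illustrated with a toy sketch. This is not CSINet or the authors' pipeline: it assumes each modality has already been reduced to a 1-D motion-magnitude signal at a shared sample rate, and it swaps the learned denoising, alignment, and distance-aware fusion for a plain DFT magnitude spectrum and cosine similarity. All function names and the synthetic signals are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Naive DFT magnitude spectrum of a 1-D signal, with the mean (DC) removed."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append(abs(coeff))
    return spectrum


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_command_source(imu_motion, candidate_flows):
    """Return the candidate whose optical-flow spectrum best matches the IMU spectrum.

    imu_motion: list of floats (assumed: IMU motion magnitude per sample).
    candidate_flows: dict of person id -> list of floats (assumed: hand-region
    optical-flow magnitude per frame, resampled to the IMU rate).
    """
    imu_spec = magnitude_spectrum(imu_motion)
    scores = {
        person: cosine_similarity(imu_spec, magnitude_spectrum(flow))
        for person, flow in candidate_flows.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Synthetic example: 64 samples; the wearer waves at 4 cycles per window.
N = 64
imu = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
flows = {
    # Same gesture seen by the camera: different scale and phase, same rhythm.
    "person_a": [0.3 * math.sin(2 * math.pi * 4 * t / N + 1.0) for t in range(N)],
    # A bystander moving at a different rhythm.
    "person_b": [math.sin(2 * math.pi * 2 * t / N) for t in range(N)],
    # A bystander standing still.
    "person_c": [0.5 for _ in range(N)],
}
best, scores = identify_command_source(imu, flows)
print(best, scores)
```

Comparing magnitude spectra rather than raw waveforms makes this toy matcher insensitive to amplitude scale and to a constant time offset between the two sensors, which loosely motivates working in the frequency domain. It offers no robustness to noise, occlusion, or distance, which the abstract attributes to the learned components.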

Authors
Chengwen Zhang
Tsinghua University, Beijing, China
Chun Yu
Tsinghua University, Beijing, China
Borong Zhuang
Tsinghua University, Beijing, China
Haopeng Jin
Beijing University of Posts and Telecommunications, Beijing, China
Qingyang Wan
Tsinghua University, Beijing, China
Zhuojun Li
Tsinghua University, Beijing, China
Zhe He
Tsinghua University, Beijing, China
Zhoutong Ye
Tsinghua University, Beijing, China
Yu Mei
Tsinghua University, Beijing, China
Chang Liu
Tsinghua University, Beijing, China
Weinan Shi
Tsinghua University, Beijing, China
Yuanchun Shi
Tsinghua University, Beijing, China

Conference: CHI 2026

ACM CHI Conference on Human Factors in Computing Systems

Session: Modeling Human Performance & Motion

P1 - Room 116
7 presentations
2026-04-15 18:00:00
2026-04-15 19:30:00