Chi-Jung Lee (Cornell University, Ithaca, New York, United States); Ruidong Zhang (Cornell University, Ithaca, New York, United States); Devansh Agarwal (Cornell University, Ithaca, New York, United States); Tianhong Catherine Yu (Cornell University, Ithaca, New York, United States); Vipin Gunda (Cornell University, Ithaca, New York, United States); Oliver Lopez (Cornell University, Ithaca, New York, United States); James Kim (Cornell University, Ithaca, New York, United States); Sicheng Yin (Cornell University, Ithaca, New York, United States); Boao Dong (Cornell University, Ithaca, New York, United States); Ke Li (Cornell University, Ithaca, New York, United States); Mose Sakashita (Cornell University, Ithaca, New York, United States); Francois Guimbretiere (Cornell University, Ithaca, New York, United States); Cheng Zhang (Cornell University, Ithaca, New York, United States)
Our hands are a fundamental means of interacting with the world around us, so understanding hand poses and interaction contexts is critical for human-computer interaction (HCI). We present EchoWrist, a low-power wristband that continuously estimates 3D hand poses and recognizes hand-object interactions using active acoustic sensing. EchoWrist is equipped with two speakers that emit inaudible sound waves toward the hand. These sound waves interact with the hand and its surroundings through reflection and diffraction, and so carry rich information about the hand's shape and the objects it interacts with. The resulting signals, captured by two microphones, are passed to a deep learning inference system that recovers hand poses and identifies various everyday hand activities. Results from two 12-participant user studies show that EchoWrist is effective and efficient at tracking 3D hand poses and recognizing hand-object interactions. Operating at 57.9 mW, EchoWrist can continuously reconstruct the 3D positions of 20 hand joints with a mean joint Euclidean distance error (MJEDE) of 4.81 mm and recognize 12 naturalistic hand-object interactions with 97.6% accuracy.
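To make the pose-tracking metric concrete, the following is a minimal sketch of how an MJEDE-style error could be computed, assuming it is the Euclidean distance between each predicted and ground-truth 3D joint position, averaged over all joints and frames. The function name `mjede` and the input layout (a list of frames, each a list of `(x, y, z)` joint coordinates in millimetres) are illustrative assumptions, not the authors' evaluation code.

```python
import math


def mjede(predicted, ground_truth):
    """Average Euclidean distance between matching 3D joints.

    Both arguments are lists of frames; each frame is a list of
    (x, y, z) joint positions. This layout is an assumption made
    for illustration only.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("frame counts differ")

    total = 0.0
    count = 0
    for pred_frame, true_frame in zip(predicted, ground_truth):
        if len(pred_frame) != len(true_frame):
            raise ValueError("joint counts differ within a frame")
        for pred_joint, true_joint in zip(pred_frame, true_frame):
            # math.dist returns the Euclidean distance between two points.
            total += math.dist(pred_joint, true_joint)
            count += 1

    if count == 0:
        raise ValueError("no joints to evaluate")
    return total / count


if __name__ == "__main__":
    # One frame with two joints: errors of 5 mm and 0 mm.
    pred = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
    true = [[(3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]]
    print(mjede(pred, true))  # 2.5
```

Under this reading, the reported 4.81 mm would be the value of such an average taken over the 20 tracked joints and all evaluated frames.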