AgentHands: Generating Interactive Hand Gestures for Spatially Grounded Agent Conversations in XR

Abstract

Communicating spatial tasks via text or speech creates a "mental mapping gap" that limits an agent's expressiveness. Inspired by co-speech gestures in face-to-face conversation, we propose AgentHands, an LLM-powered XR system that equips agents with hands to make their responses clearer and more engaging. Guided by a design taxonomy distilled from a formative study (N=10), we implement a novel pipeline that generates and renders a hand agent augmenting conversational responses with synchronized, space-aware, and interactive hand gestures: using a meta-instruction, AgentHands generates verbal responses embedded with GestureEvents aligned to specific words; each event specifies a gesture type and its parameters. At runtime, a parser converts these events into time-stamped poses and motions, driving an animation system that renders expressive hands synchronized with speech. In a within-subjects study (N=12), AgentHands increased engagement and made spatially grounded conversations easier to follow compared with a speech-only baseline.
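The paper does not publish its event schema, so the following is a minimal sketch of what a GestureEvent and its runtime parsing step could look like. All field and function names (GestureEvent, word, params, parseEvents, poseFor) are hypothetical assumptions for illustration, not the authors' implementation; word onset times are assumed to come from a TTS engine's word-timing callbacks.

```typescript
// Hypothetical sketch of a GestureEvent embedded in an LLM response,
// and a parser that converts events into time-stamped poses.
// Names and fields are assumptions, not the paper's actual schema.

type GestureType = "point" | "trace" | "beat" | "iconic";

interface GestureEvent {
  word: string;            // word in the verbal response the gesture aligns to
  type: GestureType;       // gesture category from the design taxonomy
  params: {
    target?: [number, number, number]; // world-space position to gesture toward
    durationMs?: number;               // how long the gesture should play
  };
}

interface TimedPose {
  timeMs: number;          // offset from utterance start
  joints: number[];        // flattened hand-joint rotations (placeholder)
}

// Align each event with the onset time of its anchor word, then emit
// poses in playback order for the animation system to consume.
function parseEvents(
  events: GestureEvent[],
  wordTimings: Map<string, number>,       // word -> onset time in ms
  poseFor: (e: GestureEvent) => number[], // pose lookup for a gesture
): TimedPose[] {
  const poses: TimedPose[] = [];
  for (const e of events) {
    const onset = wordTimings.get(e.word);
    if (onset === undefined) continue;    // skip events with no spoken anchor
    poses.push({ timeMs: onset, joints: poseFor(e) });
  }
  return poses.sort((a, b) => a.timeMs - b.timeMs);
}
```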

Authors
Ziyi Liu
Purdue University, West Lafayette, Indiana, United States
David Li
Google, Mountain View, California, United States
Zhongyi Zhou
Google, Tokyo, Japan
David Kim
Google Research, Zurich, Switzerland
Ruofei Du
Google XR, San Francisco, California, United States
Xun Qian
Google, Mountain View, California, United States

Conference: CHI 2026

ACM CHI Conference on Human Factors in Computing Systems

Session: Thermal and Gestural Interaction

P1 - Room 133
7 presentations
2026-04-15, 20:15–21:45