Persistent Assistant: Seamless Everyday AI Interactions via Intent Grounding and Multimodal Feedback

Current AI assistants predominantly rely on natural language interaction, which can be time-consuming and cognitively demanding, especially for frequent, repetitive tasks in daily life. We propose Persistent Assistant, a framework for seamless and unobtrusive interactions with AI assistants. The framework has three key functionalities: (1) efficient intent specification through grounded interactions, (2) seamless target referencing through embodied input, and (3) intuitive response comprehension through multimodal perceptible feedback. We developed a proof-of-concept system for everyday decision-making tasks, in which users can easily repeat queries over multiple objects using eye gaze and pinch gestures and receive multimodal haptic and speech feedback. Our study shows that multimodal feedback enhances user experience and preference by reducing physical demand, increasing perceived speed, and enabling intuitive and instinctive human-AI assistant interaction. We discuss how our framework can be applied to build seamless and unobtrusive AI assistants for everyday persistent tasks.

Meta Inc., Redmond, Washington, United States

Meta, Seattle, Washington, United States

Reality Labs Research, Meta Inc., Redmond, Washington, United States

Carnegie Mellon University, Pittsburgh, Pennsylvania, United States

Meta Inc., Redmond, Washington, United States

Reality Labs Research, Redmond, Washington, United States

10.1145/3706598.3714317

https://dl.acm.org/doi/10.1145/3706598.3714317

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)

G301

7 presentations

Start time: 2025-05-01 01:20:00