Agent Design

Conference Name
CHI 2025
"You Can Fool Me, You Can't Fool Her!": Autoethnographic Insights from Equine-Assisted Interventions to Inform Therapeutic Robot Design
Abstract

Equine-Assisted Interventions (EAIs) aim to improve participant health and well-being through the development of a therapeutic relationship with a trained horse. These interventions leverage the horse’s ability to provide emotional feedback, as it responds to negative non-verbal cues with reciprocal negativity, thereby encouraging participants to regulate their emotions and achieve attunement with the horse. Despite their benefits, EAIs face significant challenges, including logistical, financial, and resource constraints, which hinder their widespread adoption and accessibility. To address these issues, we conducted an autoethnographic study of the lead researcher’s engagement in an EAI to investigate the underlying mechanisms and explore potential technological alternatives. Our findings suggest that the reciprocal and responsive non-verbal communication, combined with the horse’s considerable physical presence, supports the potential of an embodied robotic system as a viable alternative. Such a system could offer a scalable and sustainable solution to the current limitations of EAIs.

Authors
Ellen Weir
University of Bristol, Bristol, United Kingdom
Ute Leonards
University of Bristol, Bristol, United Kingdom
Anne Roudaut
University of Bristol, Bristol, United Kingdom
DOI

10.1145/3706598.3714311

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714311

Video
Leveraging AI-Generated Emotional Self-Voice to Nudge People towards their Ideal Selves
Abstract

Emotions, shaped by past experiences, significantly influence decision-making and goal pursuit. Traditional cognitive-behavioral techniques for personal development rely on mental imagery to envision ideal selves, but may be less effective for individuals who struggle with visualization. This paper introduces Emotional Self-Voice (ESV), a novel system combining emotionally expressive language models and voice cloning technologies to render customized responses in the user's own voice. We investigate the potential of ESV to nudge individuals towards their ideal selves in a study with 60 participants. Across all three conditions (ESV, text-only, and mental imagination), we observed an increase in resilience, confidence, motivation, and goal commitment, and the ESV condition was perceived as uniquely engaging and personalized. We discuss the implications of designing generated self-voice systems as a personalized behavioral intervention for different scenarios.
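The abstract describes ESV as a two-stage pipeline: an emotionally expressive language model drafts a message from the user's ideal self, and voice cloning renders that message in the user's own voice. The sketch below is not code from the paper; the data structures and the stubbed generation and synthesis functions are hypothetical stand-ins for whichever LLM and voice-cloning services a re-implementation would use.

```python
# Illustrative sketch only; the paper does not release code. The LLM and
# voice-cloning calls are stubbed, since ESV's components are described only
# as "emotionally expressive language models" and "voice cloning technologies".

from dataclasses import dataclass


@dataclass
class UserProfile:
    name: str
    goal: str            # e.g. "speak confidently at next week's meeting"
    voice_sample: bytes  # short recording used to clone the user's own voice


def generate_ideal_self_text(profile: UserProfile) -> str:
    """Stub for the LLM stage: write a first-person message from the ideal self."""
    # A real system would call a chat-completion API with a prompt such as:
    #   "As {name}'s ideal self who has already achieved {goal}, write a short,
    #    emotionally expressive message of encouragement in the first person."
    return f"I am {profile.name}, and I have already achieved this: {profile.goal}."


def synthesize_in_own_voice(text: str, voice_sample: bytes) -> bytes:
    """Stub for the voice-cloning TTS stage: render `text` in the user's voice."""
    return text.encode("utf-8")  # placeholder for synthesized audio bytes


def emotional_self_voice(profile: UserProfile) -> bytes:
    """ESV pipeline sketch: LLM text generation, then cloned-voice synthesis."""
    text = generate_ideal_self_text(profile)
    return synthesize_in_own_voice(text, profile.voice_sample)
```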

Authors
Cathy Mengying Fang
MIT Media Lab, Cambridge, Massachusetts, United States
Phoebe Chua
School of Computing, National University of Singapore, Singapore, Singapore
Samantha W. T. Chan
Nanyang Technological University, Singapore, Singapore
Joanne Leong
MIT, Cambridge, Massachusetts, United States
Andria Bao
Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
Pattie Maes
MIT Media Lab, Cambridge, Massachusetts, United States
DOI

10.1145/3706598.3713359

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713359

Video
Should Voice Agents Be Polite in an Emergency? Investigating Effects of Speech Style and Voice Tone in Emergency Simulation
Abstract

Research in human-agent interaction highlights the significance of agents’ politeness in enhancing social engagement and interaction satisfaction. It remains unclear, however, whether agents should maintain politeness even in time-constrained situations. This study explores how a voice agent should deliver instructions for emergency evacuation using a between-subjects experiment in which we manipulated agent speech style (politeness: positive vs. negative vs. direct) and voice tone (urgency: high vs. low) and measured the effects on users’ perceptions of the agent and their cognitive workload. We found that the urgency of the agent's tone had a positive effect on the perceived anthropomorphism, likability, and intelligence of the agent while reducing the effort and frustration required to complete the tasks. Urgent voices increased the cognitive trust and likability of the agent when the agent used negative politeness for instructions. Our findings provide guidelines for designing voice agents for emergencies.

Authors
Jieun Kim
Cornell University, Ithaca, New York, United States
Susan R. Fussell
Cornell University, Ithaca, New York, United States
DOI

10.1145/3706598.3714203

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714203

Video
DreamGarden: A Designer Assistant for Growing Games from a Single Prompt
Abstract

Coding assistants are increasingly leveraged in game design, both generating code and making high-level plans. To what degree can these tools align with developer workflows, and what new modes of human-computer interaction can emerge from their use? We present DreamGarden, an AI system capable of assisting with the development of diverse game environments in Unreal Engine. At the core of our method is an LLM-driven planner, capable of breaking down a single, high-level prompt---a dream, memory, or imagined scenario provided by a human user---into a hierarchical action plan, which is then distributed across specialized submodules facilitating concrete implementation. This system is presented to the user as a garden of plans and actions, both growing independently and responding to user intervention via seed prompts, pruning, and feedback. Through a user study, we explore design implications of this system, charting courses for future work in semi-autonomous assistants and open-ended simulation design.
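DreamGarden's core, as described above, is an LLM-driven planner that decomposes a single high-level prompt into a hierarchical action plan which the user can grow, prune, and steer. The sketch below is a minimal illustration of that shape, not the paper's implementation; the decompose() stub stands in for the real LLM planning call, and the goals are invented examples.

```python
# Illustrative sketch only: an LLM-driven hierarchical planner whose nodes can
# be grown from a seed prompt, pruned, or given feedback. decompose() is a stub
# standing in for a real LLM request.

from dataclasses import dataclass, field


@dataclass
class PlanNode:
    goal: str                                        # natural-language step
    children: list["PlanNode"] = field(default_factory=list)

    def grow(self, depth: int = 2) -> None:
        """Recursively expand this node into sub-plans until `depth` is exhausted."""
        if depth == 0:
            return
        for sub_goal in decompose(self.goal):
            child = PlanNode(sub_goal)
            child.grow(depth - 1)
            self.children.append(child)

    def prune(self, unwanted: str) -> None:
        """User intervention: drop any subtree whose goal mentions `unwanted`."""
        self.children = [c for c in self.children if unwanted not in c.goal]
        for c in self.children:
            c.prune(unwanted)


def decompose(goal: str) -> list[str]:
    """Stub for the LLM planning call that splits a goal into concrete sub-goals."""
    return [f"{goal}: terrain", f"{goal}: props", f"{goal}: lighting"]


# A seed prompt grows into a garden of plans the user can inspect and prune.
garden = PlanNode("Build a foggy coastal village I remember from a dream")
garden.grow()
garden.prune("lighting")
```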

Award
Best Paper
Authors
Sam Earle
New York University, Brooklyn, New York, United States
Samyak Parajuli
The University of Texas at Austin, Austin, Texas, United States
Andrzej Banburski-Fahey
Microsoft, Redmond, Washington, United States
DOI

10.1145/3706598.3714233

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714233

Video
Persistent Assistant: Seamless Everyday AI Interactions via Intent Grounding and Multimodal Feedback
Abstract

Current AI assistants predominantly use natural language interactions, which can be time-consuming and cognitively demanding, especially for frequent, repetitive tasks in daily life. We propose Persistent Assistant, a framework for seamless and unobtrusive interactions with AI assistants. The framework has three key functionalities: (1) efficient intent specification through grounded interactions, (2) seamless target referencing through embodied input, and (3) intuitive response comprehension through multimodal perceptible feedback. We developed a proof-of-concept system for everyday decision-making tasks, where users can easily repeat queries over multiple objects using eye gaze and a pinch gesture and receive multimodal haptic and speech feedback. Our study shows that multimodal feedback enhances user experience and preference by reducing physical demand, increasing perceived speed, and enabling intuitive and instinctive human-AI assistant interaction. We discuss how our framework can be applied to build seamless and unobtrusive AI assistants for everyday persistent tasks.
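The interaction the abstract describes can be read as a small event loop: the query is specified once, and each subsequent gaze-plus-pinch applies it to a new target with haptic and spoken feedback. The following is a minimal sketch under that reading, not the paper's system; gaze tracking, haptics, and speech output are stubbed with prints, and the calorie-lookup query is an invented example of an everyday decision-making task.

```python
# Illustrative sketch only: a persistent query applied to whichever object the
# user is gazing at when they pinch, with multimodal feedback stubbed out.

from dataclasses import dataclass


@dataclass
class WorldObject:
    name: str
    calories: int  # example attribute for an everyday decision-making query


class PersistentAssistant:
    def __init__(self, query: str):
        self.query = query  # specified once, then reused across objects

    def on_pinch(self, gazed_object: WorldObject) -> None:
        """Embodied input: the pinch confirms the gazed-at object as the target."""
        answer = self.answer(gazed_object)
        self.haptic_pulse()   # quick perceptible confirmation
        self.speak(answer)    # spoken response for comprehension

    def answer(self, obj: WorldObject) -> str:
        # Stub for the assistant's reasoning over the persistent query.
        return f"{obj.name}: about {obj.calories} calories."

    def haptic_pulse(self) -> None:
        print("[haptic] tick")         # placeholder for a wristband pulse

    def speak(self, text: str) -> None:
        print(f"[speech] {text}")      # placeholder for TTS output


assistant = PersistentAssistant(query="How many calories is this?")
assistant.on_pinch(WorldObject("apple", 95))
assistant.on_pinch(WorldObject("muffin", 420))  # same query, new target, no re-typing
```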

Authors
Hyunsung Cho
Meta Inc., Redmond, Washington, United States
Jacqui Fashimpaur
Meta Inc., Redmond, Washington, United States
Naveen Sendhilnathan
Meta, Seattle, Washington, United States
Jonathan Browder
Reality Labs Research, Meta Inc., Redmond, Washington, United States
David Lindlbauer
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Tanya R. Jonker
Meta Inc., Redmond, Washington, United States
Kashyap Todi
Reality Labs Research, Redmond, Washington, United States
DOI

10.1145/3706598.3714317

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714317

Video
Prompting an Embodied AI Agent: How Embodiment and Multimodal Signaling Affects Prompting Behaviour
Abstract

Current voice agents wait for a user to complete their verbal instruction before responding; yet, this is misaligned with how humans engage in everyday conversational interaction, where interlocutors use multimodal signaling (e.g., nodding, grunting, or looking at referred-to objects) to ensure conversational grounding. We designed an embodied VR agent that exhibits multimodal signaling behaviors in response to situated prompts, by turning its head or by visually highlighting objects being discussed or referred to. We explore how people prompt this agent to design and manipulate the objects in a VR scene. Through a Wizard of Oz study, we found that participants interacting with an agent that indicated its understanding of spatial and action references were able to prevent errors 30% of the time, and were more satisfied and confident in the agent's abilities. These findings underscore the importance of designing multimodal signaling communication techniques for future embodied agents.
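In the study this signaling was produced by a wizard rather than an implemented agent, so the sketch below is purely illustrative of the idea: while the user is still speaking, any recognized object reference triggers a grounding signal (head turn, highlight), giving the user a chance to correct misunderstandings before the instruction completes. The scene vocabulary and matching logic are hypothetical.

```python
# Illustrative sketch only: signal grounding mid-utterance by highlighting and
# turning toward any scene object the partial instruction mentions.

SCENE_OBJECTS = {"red chair", "round table", "floor lamp"}


def objects_mentioned(partial_utterance: str) -> set[str]:
    """Naive incremental reference resolution over a fixed scene vocabulary."""
    return {name for name in SCENE_OBJECTS if name in partial_utterance.lower()}


def signal_grounding(partial_utterance: str) -> None:
    """Emit multimodal grounding signals before the instruction is complete."""
    for obj in objects_mentioned(partial_utterance):
        print(f"[agent] turns head toward the {obj}")
        print(f"[agent] highlights the {obj}")


# The user gets feedback mid-utterance and can correct the agent early,
# which is how such signaling can prevent errors before execution.
signal_grounding("move the red chair next to the round ta")
signal_grounding("move the red chair next to the round table")
```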

Award
Honorable Mention
Authors
Tianyi Zhang
Singapore Management University, Singapore, Singapore
Colin Au Yeung
University of Calgary, Calgary, Alberta, Canada
Emily Aurelia
Singapore Management University, Singapore, Singapore
Yuki Onishi
Tohoku University, Sendai, Japan
Neil Chulpongsatorn
University of Calgary, Calgary, Alberta, Canada
Jiannan Li
Singapore Management University, Singapore, Singapore
Anthony Tang
Singapore Management University, Singapore, Singapore
DOI

10.1145/3706598.3713110

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713110

Video
The People Behind the Robots: How Wizards Wrangle Robots in Public Deployments
Abstract

In the Wizard-of-Oz study paradigm, human "wizards" perform not-yet-implemented system behavior, simulating, for example, how autonomous robots could interact in public in order to see how unwitting bystanders respond. This paper analyzes a 60-minute video recording of two wizards in a public plaza who are operating two trash-collecting robots within their line of sight. We take an ethnomethodology and conversation analysis perspective to scrutinize interactions between the wizards and the people in the plaza, focusing on critical instances where one robot gets stuck and requires collaborative intervention by the wizards. Our analysis unpacks how the wizards deal with emergent problems by pushing one robot into the other, how they manage front- and backstage interactions, and how they monitor the location of each other's robots. We discuss how scrutinizing the work of wizards can inform explorative Wizard-of-Oz paradigms, the design of multi-agent robot systems, and the operation of urban robots from a distance.

Authors
Hannah R. M. Pelikan
Linköping University, Linköping, Sweden
Fanjun Bu
Cornell Tech, New York, New York, United States
Wendy Ju
Cornell Tech, New York, New York, United States
DOI

10.1145/3706598.3713237

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713237

Video