AI Collaboration in Practice

Conference
CHI 2026
Cocoa: Co-Planning and Co-Execution with AI Agents
Abstract

As AI agents take on increasingly long-running tasks involving sophisticated planning and execution, there is a corresponding need for novel interaction designs that enable deeper human-agent collaboration. However, most prior work leverages human interaction to fix "autonomous" workflows that have yet to become fully autonomous, or rigidly treats planning and execution as separate stages. Based on a formative study with 9 researchers using AI to support their work, we propose a design that affords greater flexibility in collaboration, so that users can 1) delegate agency to the user or agent via a collaborative plan where individual steps can be assigned; and 2) interleave planning and execution so that plans can adjust after partial execution. We introduce Cocoa, a system that takes design inspiration from computational notebooks to support complex research tasks. A lab study (n=16) found that Cocoa enabled steerability without sacrificing ease-of-use, and a week-long field deployment (n=7) showed how researchers collaborated with Cocoa to accomplish real-world tasks.
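The two design ideas in this abstract, a collaborative plan whose individual steps can be assigned to either the user or the agent, and interleaved planning and execution that lets the plan change after partial execution, can be illustrated with a minimal sketch. This is a hypothetical data structure, not Cocoa's actual implementation; the names `Step`, `Plan`, and `run` are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Step:
    description: str
    assignee: str = "agent"        # each step is delegated to "agent" or "user"
    done: bool = False
    result: Optional[str] = None

@dataclass
class Plan:
    steps: List[Step] = field(default_factory=list)

    def next_pending(self) -> Optional[Step]:
        return next((s for s in self.steps if not s.done), None)

    def insert_after(self, step: Step, new_step: Step) -> None:
        # Revise the plan mid-execution: interleaving planning and execution
        # means completed steps can motivate new or reordered steps.
        idx = self.steps.index(step)
        self.steps.insert(idx + 1, new_step)

def run(plan: Plan,
        agent_exec: Callable[[Step], str],
        user_exec: Callable[[Step], str]) -> None:
    # Execute one step at a time; after each step, the plan may have changed.
    while (step := plan.next_pending()) is not None:
        executor = agent_exec if step.assignee == "agent" else user_exec
        step.result = executor(step)
        step.done = True
```

The key point of the sketch is that the loop re-queries the plan after every step, so revisions made during execution take effect immediately, rather than planning being frozen before execution begins.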

Award
Best Paper
Authors
K. J. Kevin Feng
University of Washington, Seattle, Washington, United States
Kevin Pu
University of Toronto, Toronto, Ontario, Canada
Matt Latzke
Allen Institute for AI, Seattle, Washington, United States
Tal August
University of Illinois Urbana-Champaign, Urbana, Illinois, United States
Pao Siangliulue
Allen Institute for AI, Seattle, Washington, United States
Jonathan Bragg
Allen Institute for Artificial Intelligence, Seattle, Washington, United States
Daniel S. Weld
Allen Institute for Artificial Intelligence, Seattle, Washington, United States
Amy X. Zhang
University of Washington, Seattle, Washington, United States
Joseph Chee Chang
Allen Institute for AI, Seattle, Washington, United States
From Overload to Convergence: Supporting Multi-Issue Human–AI Negotiation with Bayesian Visualization
Abstract

As AI systems increasingly mediate negotiations, understanding how the number of negotiated issues impacts human performance is crucial for maintaining human agency. We designed a human–AI negotiation case study in a realistic property rental scenario, varying the number of negotiated issues; empirical findings show that without support, performance stays stable up to three issues but declines as additional issues increase cognitive load. To address this, we introduce a novel uncertainty-based visualization driven by Bayesian estimation of agreement probability. It shows how the space of mutually acceptable agreements narrows as negotiation progresses, helping users identify promising options. In a within-subjects experiment (N=32), it improved human outcomes and efficiency, preserved human control, and avoided redistributing value. Our findings surface practical limits on the complexity people can manage in human–AI negotiation, advance theory on human performance in complex negotiations, and offer validated design guidance for interactive systems.
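The abstract's "Bayesian estimation of agreement probability" can be illustrated with a simple conjugate model. This is an illustrative Beta-Bernoulli sketch, not the paper's actual model: it treats each accepted or rejected proposal as a Bernoulli observation of whether an offer falls in the mutually acceptable region, and tracks the posterior. The class name `AgreementBelief` is invented for illustration.

```python
import math

class AgreementBelief:
    """Beta posterior over the probability that a proposal is acceptable."""

    def __init__(self, alpha: float = 1.0, beta: float = 1.0):
        self.alpha, self.beta = alpha, beta  # Beta(1, 1) = uniform prior

    def observe(self, accepted: bool) -> None:
        # Conjugate update: an accept increments alpha, a reject increments beta.
        if accepted:
            self.alpha += 1
        else:
            self.beta += 1

    def mean(self) -> float:
        # Posterior mean of the agreement probability.
        return self.alpha / (self.alpha + self.beta)

    def std(self) -> float:
        # Posterior standard deviation; it shrinks as offers accumulate,
        # which is what a narrowing uncertainty visualization would render.
        a, b = self.alpha, self.beta
        return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
```

The narrowing of the visualized agreement space corresponds to this posterior tightening: every response from the counterpart adds evidence, so the standard deviation monotonically shrinks while the mean tracks the observed accept rate.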

Award
Best Paper
Authors
Mehul Parmar
Asian Institute of Technology, Bangkok, Thailand
Chaklam Silpasuwanchai
Asian Institute of Technology, Pathumthani, Thailand
Video
Seeing Eye to Eye: Enabling Cognitive Alignment Through Shared First-Person Perspective in Human–AI Collaboration
Abstract

Despite advances in multimodal AI, current vision-based assistants often remain inefficient in collaborative tasks. We identify two key gulfs: a communication gulf, where users must translate rich parallel intentions into verbal commands due to the channel mismatch, and an understanding gulf, where AI struggles to interpret subtle embodied cues. To address these, we propose Eye2Eye, a framework that leverages first-person perspective as a channel for human-AI cognitive alignment. It integrates three components: (1) joint attention coordination for fluid focus alignment, (2) revisable memory to maintain evolving common ground, and (3) reflective feedback allowing users to clarify and refine AI's understanding. We implement this framework in an AR prototype and evaluate it through a user study and a post-hoc pipeline evaluation. Results show that Eye2Eye significantly reduces task completion time and interaction load while increasing trust, demonstrating its components work in concert to improve collaboration.

Authors
Zhuyu Teng
Zhejiang University, Hangzhou, China
Pei Chen
Zhejiang University, Hangzhou, China
Yichen Cai
Zhejiang University, Hangzhou, China
Ruoqing Lu
Zhejiang University, Hangzhou, China
Zhaoqu Jiang
Zhejiang University, Hangzhou, China
Jiayang Li
Zhejiang University, Hangzhou, China
Weitao You
College of Computer Science and Technology, Hangzhou, Zhejiang, China
Lingyun Sun
Zhejiang University, Hangzhou, China
Can LLM-Simulated Practice and Feedback Upskill Human Counselors? A Randomized Study with 90+ Novice Counselors
Abstract

The growing demand for accessible mental health support requires training more counselors, yet existing approaches remain resource-intensive and difficult to scale. LLMs can realistically simulate patients and generate actionable feedback for training, but their actual impact on novice counselor skill development remains unknown. We developed an LLM-simulated practice and feedback system and conducted a randomized study with 94 novice counselors, comparing practice alone versus practice with feedback. We evaluated behavioral performance, self-efficacy, and qualitative reflections. Results showed the practice-and-feedback group improved in client-centered microskills (reflections, questions), while the practice-alone group showed no improvements. For empathy, the practice-alone group declined over time and performed significantly worse than the feedback group. Qualitative interviews reinforced these findings: feedback helped participants adopt a client-centered listening approach, while practice-alone participants remained solution-oriented. These results suggest LLM-based training systems can promote effective skill development, and combining simulated practice with structured feedback is critical for meaningful improvement.

Award
Honorable Mention
Authors
Ryan Louie
Stanford University, Stanford, California, United States
Raj Sanjay Shah
Georgia Institute of Technology, Atlanta, Georgia, United States
Ifdita Hasan Orney
Stanford University, Stanford, California, United States
Juan Pablo Pacheco
Stanford University, Stanford, California, United States
Emma Brunskill
Stanford University, Stanford, California, United States
Diyi Yang
Stanford University, Stanford, California, United States
Towards Fluent Interaction with Cyber-Physical Architecture
Abstract

What happens when your walls begin to move? This paper explores the design of human-robot interaction for architectural-scale, shape-changing environments. We present findings from two studies: (1) a series of speculative design workshops (N=20) that uncovered aspirational visions for these spaces, and (2) a task-based Wizard-of-Oz elicitation study (N=12) that grounded these visions in the challenges of practical interaction. Our workshop findings reveal a complex landscape of user desires, exposing critical tensions between proactive automation and the preservation of user autonomy, and between personalization and public ownership. Our elicitation study reveals a set of core interaction challenges related to multimodal collaboration and, most critically, suggests the need for a modality-agnostic model of evolving user intent. We conclude with a set of grounded proposals for creating robotic environments that are collaborative and trusted partners in everyday life.

Award
Best Paper
Authors
Jesse T. Gonzalez
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Neeta M. Khanuja
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Michael Mingxuan Li
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Maggie Guo
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Layomi Olaitan
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Emily Lau
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Jenny Pugh
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Alexandra Ion
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Scott E. Hudson
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Results-Actionability Gap: Understanding How Practitioners Evaluate LLM Products in the Wild
Abstract

How do product teams evaluate LLM-powered products? As organizations integrate large language models (LLMs) into digital products, their unpredictable nature makes traditional evaluation approaches inadequate, yet little is known about how practitioners navigate this challenge. Through interviews with nineteen practitioners across diverse sectors, we identify ten evaluation practices spanning informal 'vibe checks' to organizational meta-work. Beyond confirming four documented challenges, we introduce a novel fifth challenge we call the results-actionability gap, in which practitioners gather evaluation data but cannot translate findings into concrete improvements. Drawing on patterns from successful teams, we contribute strategies to bridge this gap, supporting practitioners' formalization journey from ad-hoc interpretive practices (e.g., vibe checks) toward systematic evaluation. Our analysis suggests these interpretive practices are necessary adaptations to LLM characteristics rather than methodological failures. For HCI researchers, this presents a research opportunity to support practitioners in systematizing emerging practices rather than developing new evaluation frameworks.

Authors
Willem van der Maden
IT University of Copenhagen, Copenhagen, Denmark
Malak Sadek
Cambridge University, Cambridge, United Kingdom
Ziang Xiao
Johns Hopkins University, Baltimore, Maryland, United States
Aske Mottelson
IT University of Copenhagen, Copenhagen, Denmark
Q. Vera Liao
University of Michigan, Ann Arbor, Ann Arbor, Michigan, United States
Jichen Zhu
IT University of Copenhagen, Copenhagen, Denmark
TermSight: Making Service Contracts Approachable
Abstract

Legal contracts govern much of our society, but their specialized language is difficult for non-experts to read. While AI has enabled simplification of complex language, legal contracts pose unique challenges because of their connection to readers' values, ambiguity, and legally binding nature. Based on a formative study (N=20) using Terms of Service (ToS) as example contracts to study challenges in contract reading, we developed TermSight, an intelligent reading interface to probe the opportunities and challenges of designing augmentations for legal text. TermSight guides readers to relevant clauses with color-coded plain-language snippets of information and contextualizes ambiguous language with definitions and hypothetical scenarios. Importantly, TermSight's features always foreground the original, legally-binding contract text (e.g., linking to associated clauses). Our within-subjects study (N=20) demonstrated the opportunities of TermSight in making ToS significantly easier to read and navigate while revealing the challenges of augmenting service contracts such as ToS.

Authors
Ziheng Huang
University of Illinois Urbana-Champaign, Urbana, Illinois, United States
Tal August
University of Illinois Urbana-Champaign, Urbana, Illinois, United States
Hari Sundaram
University of Illinois, Urbana, Illinois, United States