LLM Interaction & Conversational Agents

Conference Name
CHI 2026
Sensemaking in Multi-Agent LLM Interfaces: How Users Interpret Transparency and Trustworthiness Cues
Abstract

As multi-agent Large Language Models (LLMs) gain traction, designers must consider how to surface their internal reasoning in ways that foster appropriate trust. We present a design-led, qualitative, comparative structured observation study exploring how users interpret and evaluate transparency in multi-agent LLMs. Participants interacted with five interface variants, each instantiating different combinations of transparency-related design dimensions, across two task types: information-seeking and logical reasoning. We surface participants’ mental models, the cues they interpret as signals of transparency and trustworthiness, and how they weigh the costs and benefits of increasing process visibility. Transparency needs were dynamic and context-sensitive, with the ideal "Goldilocks" (i.e., "just right") level of transparency shaped jointly by task demands, interface affordances, and user characteristics such as task expertise and dispositional AI trust. We highlight tensions between process visibility, information sufficiency, and cognitive effort, and synthesise these insights into design considerations for aligning transparency with user needs in future multi-agent LLM interfaces.

Authors
Saumya Pareek
University of Melbourne, Melbourne, Victoria, Australia
Jarod Govers
University of Melbourne, Melbourne, Victoria, Australia
Naja Kathrine Kollerup
Department of Computer Science, Aalborg University, Aalborg, Denmark
Emily Wong
The University of Melbourne, Melbourne, Victoria, Australia
Eduardo Velloso
The University of Sydney, Sydney, New South Wales, Australia
Jorge Goncalves
University of Melbourne, Melbourne, Victoria, Australia
Chaplains' Reflections on the Design and Usage of AI for Conversational Care
Abstract

Despite growing recognition that responsible AI requires domain knowledge, current work on conversational AI primarily draws on clinical expertise that prioritises diagnosis and intervention. However, many everyday emotional support needs arise in non-clinical contexts and therefore require different conversational approaches. We examine how chaplains, who guide individuals through personal crises, grief, and reflection, perceive and engage with conversational AI. We recruited eighteen chaplains to build AI chatbots. While some chaplains viewed chatbots with cautious optimism, the majority pointed to limitations in chatbots’ ability to support everyday well-being. Our analysis reveals how chaplains perceive their pastoral care duties and areas where AI chatbots fall short, along the themes of Listening, Connecting, Carrying, and Wanting. These themes resonate with the idea of attunement, recently highlighted as a relational lens for understanding the delicate experiences care technologies provide. This perspective informs chatbot design aimed at supporting well-being in non-clinical contexts.

Authors
Joel Wester
University of Copenhagen, Copenhagen, Denmark
Samuel Rhys Cox
Aalborg University, Aalborg, Denmark
Henning Pohl
Aalborg University, Aalborg, Denmark
Niels van Berkel
Aalborg University, Aalborg, Denmark
“It Became My Buddy, But I’m Not Afraid to Disagree”: A Multi-Session Study of UX Evaluators Collaborating with Conversational AI Assistants
Abstract

AI-assisted usability analysis can potentially reduce the time and effort of finding usability problems, yet little is known about how AI's perceived expertise influences evaluators' analytic strategies and perceptions over time. We ran a within-subjects, five-session study (six hours per participant) with 12 professional UX evaluators who worked with two conversational assistants (CAs) designed to appear novice- or expert-like (differing in suggestion quantity and response accuracy). We logged behavioral measures (number of passes, suggestion acceptance rate), collected subjective ratings (trust, perceived efficiency), and conducted semi-structured interviews. Participants experienced an initial novelty effect and a subsequent dip in trust that recovered over time. Their efficiency improved as they shifted from a two-pass to a one-pass video inspection approach. Evaluators ultimately rated the expert-like CA as significantly more efficient, trustworthy, and comprehensive, despite not perceiving expertise differences early on. We conclude with design implications for adapting AI expertise to enable calibrated human-AI collaboration.

Authors
Emily Kuang
York University, Toronto, Ontario, Canada
Ehsan Jahangirzadeh Soure
University of Waterloo, Waterloo, Ontario, Canada
Luyao Shen
Computational Media and Arts Thrust, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Nitesh Goyal
Google Research, New York, New York, United States
Mingming Fan
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Kristen Shinohara
Rochester Institute of Technology, Rochester, New York, United States
Exploring the Effects of Support Type and Task Difficulty for Virtual Assistants as Social Companions
Abstract

Virtual assistants (VAs) are increasingly positioned not just as tools, but as potential social companions capable of offering either emotional or informational support. Yet, how these forms of support should adapt to varying task difficulties and embodiment styles remains underexplored. We conducted two user studies with cognitive and physical tasks to investigate how support type (emotional vs. informational) shapes user perceptions across variations in task difficulty (easy vs. hard) and embodiment (non-embodied vs. embodied). In Study 1, emotional support positively influenced users' impressions of the VA in easy tasks, while informational support was more effective in difficult tasks. In Study 2, participants also preferred emotional support for easy tasks, but differences between support types were less pronounced for difficult tasks. Notably, embodiment exerted no significant influence in either study. These findings underscore the role of context in shaping effective support strategies, offering design insights for VAs as social companions.

Authors
Sei Kang
Chonnam National University, Gwangju, Korea, Republic of
YunSu Lee
Chonnam National University, Gwangju, Korea, Republic of
Gun A. Lee
Adelaide University, Adelaide, South Australia, Australia
Hyung-Jeong Yang
Chonnam National University, Gwangju, Korea, Republic of
Soo-Hyung Kim
Chonnam National University, Gwangju, Korea, Republic of
Ji-eun Shin
Chonnam National University, Gwangju, Korea, Republic of
Seungwon Kim
Chonnam National University, Gwangju, Korea, Republic of
Relational Gains, Privacy Strains: Exploring Users’ Perceptions and Experiences with ChatGPT’s Memory Feature
Abstract

ChatGPT’s memory feature is designed to provide users with greater control and more helpful responses. Yet, it remains unclear how users perceive this feature in relation to privacy. To address this gap, we conducted interviews with 20 ChatGPT users from diverse backgrounds. Our findings revealed four major characteristics that distinguish ChatGPT's memory from human memory: perceived unforgetfulness, detailedness, accuracy, and lack of emotions, highlighting the machine-like nature of AI memory. Moreover, both ChatGPT's memory and human memory were perceived as beneficial for relationship building. Notably, most participants experienced negative expectancy violations after learning what ChatGPT remembered about them. They expressed a strong need for greater visibility, accessibility, transparency, and user control in the design of future memory features. Drawing on users' suggestions and theoretical frameworks on privacy management, we provide design implications for developing a more transparent, responsible, and user-aligned memory experience that helps users navigate privacy-personalization trade-offs when interacting with LLM-based memories.

Authors
Cheng Chen
Oregon State University, Corvallis, Oregon, United States
Maria D. Molina
Michigan State University, East Lansing, Michigan, United States
Mengqi Liao
University of Georgia, Athens, Georgia, United States
Eugene Cho Snyder
New Jersey Institute of Technology, Newark, New Jersey, United States
"I Wouldn't Really Use It as a Practice Tool": Understanding Medical Students’ Perspectives and Needs on LLM-Enhanced Clinical Skills Training
Abstract

Large Language Models (LLMs) are expected to enhance medical education through personalized clinical skills training. However, their practical application from the student user experience perspective remains underexplored. This gap is critical because without understanding students' needs, LLM-based tools risk poor adoption and suboptimal learning outcomes. This study explores medical students' challenges and expectations when using LLM-based clinical skills training through a two-phase investigation involving 14 medical students. We integrated five Type 2 Diabetes cases into a probe platform and conducted probe-based studies followed by co-design workshops. We identified challenges across three categories: dialogue content (lack of realism, insufficient knowledge depth differentiation); dialogue presentation (information overload, single modality limitations); and dialogue interaction (inadequate guidance and feedback). Co-design workshops revealed expectations for enhanced patient modeling, personalized content delivery, structured presentation frameworks, and collaborative features. These findings provide design considerations for developing more effective, user-centered LLM-based medical education systems.

Authors
Yuru Huang
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Chao LIU
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Yunna Cai
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Lina Xu
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Yun Hou
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Mingming Fan
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
From Human Pragmatic Language Skills to Conversational Agent Design: A Systematic Review of Transfer Strategies
Abstract

While conversational agents’ (CAs) semantic and syntactic capabilities have advanced, their pragmatic skills (i.e., using language appropriately in context) have emerged as a critical focus in practical applications. Hence, scholars integrate conversational skills derived from human-human interaction into CA designs. However, existing research mainly adopts an empirical approach and focuses on specific CA deployments, making it challenging to identify overarching patterns or develop a comprehensive methodology for transferring human pragmatic skills to CA design. Thus, we conducted a systematic review of 85 studies from primary databases (e.g., ACM, IEEE), focusing on designing CAs with human-derived conversational skills. We identified skill categories (verbal, paralinguistic, nonverbal), transfer strategies (from dialog data, from theories, and via co-design), implementations, and evaluation metrics. We consolidated these insights into a four-stage design process: human skill exploration, definition, transfer, and iterative evaluation. Future research can leverage this to design CAs that achieve conversational goals through contextually appropriate language use.

Authors
Jiaxiong Hu
The Hong Kong University of Science and Technology, Hong Kong SAR, China
Xiwen Yao
Tsinghua University, Beijing, China
Zeyu Huang
The Hong Kong University of Science and Technology, New Territories, Hong Kong, China
Danxuan LIANG
Hong Kong University of Science and Technology, Hong Kong, Hong Kong, China
Dongjie Yang
Hong Kong University of Science and Technology, Hong Kong, Hong Kong, China
Dingdong Liu
The Hong Kong University of Science and Technology, Hong Kong, China
Junze Li
The Hong Kong University of Science and Technology, Hong Kong, Hong Kong
Yuanhao Zhang
Hong Kong University of Science and Technology, Hong Kong, China
Xiaojuan Ma
Hong Kong University of Science and Technology, Hong Kong, Hong Kong