LLM for Health

Conference Name
CHI 2025
Private Yet Social: How LLM Chatbots Support and Challenge Eating Disorder Recovery
Abstract

Eating disorders (ED) are complex mental health conditions that require long-term management and support. Recent advances in large language model (LLM)-based chatbots offer the potential to provide individuals with immediate support, yet concerns remain about their reliability and safety in sensitive contexts such as ED. We explore the opportunities and potential harms of using LLM-based chatbots for ED recovery. Over 10 days, we observed interactions between 26 participants with ED and WellnessBot, an LLM-based chatbot designed to support ED recovery. We found that participants felt empowered in their recovery by discussing ED-related stories with the chatbot, which served as a personal yet social avenue. However, we also identified chatbot responses that are harmful, especially for individuals with ED, and that went unnoticed partly because of participants' unquestioning trust in the chatbot's reliability. Based on these findings, we provide design implications for safe and effective LLM-based interventions in ED management.

Award
Honorable Mention
Authors
Ryuhaerang Choi
KAIST, Daejeon, Korea, Republic of
Taehan Kim
KAIST, Daejeon, Korea, Republic of
Subin Park
KAIST, Daejeon, Korea, Republic of
Jennifer G. Kim
Georgia Institute of Technology, Atlanta, Georgia, United States
Sung-Ju Lee
KAIST, Daejeon, Korea, Republic of
DOI

10.1145/3706598.3713485

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713485

Video
Deconstructing Depression Stigma: Integrating AI-driven Data Collection and Analysis with Causal Knowledge Graphs
Abstract

Mental-illness stigma is a persistent social problem, hampering both treatment-seeking and recovery. Accordingly, there is a pressing need to understand it more clearly, but analyzing the relevant data is highly labor-intensive. Therefore, we designed a chatbot to engage participants in conversations; coded those conversations qualitatively with AI assistance; and, based on those coding results, built causal knowledge graphs to decode stigma. The results we obtained from 1,002 participants demonstrate that conversation with our chatbot can elicit rich information about people’s attitudes toward depression, while our AI-assisted coding was strongly consistent with human-expert coding. Our novel approach combining large language models (LLMs) and causal knowledge graphs uncovered patterns in individual responses and illustrated the interrelationships of psychological constructs in the dataset as a whole. The paper also discusses these findings’ implications for HCI researchers in developing digital interventions, decomposing human psychological constructs, and fostering inclusive attitudes.
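
The pipeline outlined above (conversational elicitation, AI-assisted coding, then a causal knowledge graph) can be illustrated with a minimal sketch. The snippet below assumes the coding step has already produced (cause, effect) pairs of psychological constructs; the construct names and pair data are hypothetical, and the aggregation into a weighted directed graph with networkx is an illustration rather than the authors' implementation.

```python
from collections import Counter
import networkx as nx

# Hypothetical output of AI-assisted coding: one (cause, effect) pair of
# psychological constructs per coded statement. Construct names are illustrative.
coded_pairs = [
    ("perceived_dangerousness", "desire_for_distance"),
    ("attribution_of_controllability", "blame"),
    ("blame", "desire_for_distance"),
    ("perceived_dangerousness", "desire_for_distance"),
]

# Aggregate identical cause -> effect links and use the count as edge weight.
graph = nx.DiGraph()
for (cause, effect), count in Counter(coded_pairs).items():
    graph.add_edge(cause, effect, weight=count)

# Inspect the resulting causal structure.
for cause, effect, data in graph.edges(data=True):
    print(f"{cause} -> {effect} (n={data['weight']})")
```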

Authors
Han Meng
National University of Singapore, Singapore, Singapore
Renwen Zhang
National University of Singapore, Singapore, Singapore
Ganyi Wang
National University of Singapore, Singapore, Singapore
Yitian Yang
National University of Singapore, Singapore, Singapore
Peinuan Qin
National University of Singapore, Singapore, Singapore
Jungup Lee
National University of Singapore, Singapore, Singapore
Yi-Chieh Lee
National University of Singapore, Singapore, Singapore
DOI

10.1145/3706598.3714255

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714255

Video
The Last JITAI? Exploring Large Language Models for Issuing Just-in-Time Adaptive Interventions: Fostering Physical Activity in a Prospective Cardiac Rehabilitation Setting
Abstract

We evaluated the viability of using Large Language Models (LLMs) to trigger and personalize content in Just-in-Time Adaptive Interventions (JITAIs) in digital health. As an interaction pattern representative of context-aware computing, JITAIs are being explored for their potential to support sustainable behavior change, adapting interventions to an individual’s current context and needs. Challenging traditional JITAI implementation models, which face severe scalability and flexibility limitations, we tested GPT-4 for suggesting JITAIs in the use case of heart-healthy activity in cardiac rehabilitation. Using three personas representing patients affected by CVD with varying severity and five context sets per persona, we generated 450 JITAI decisions and messages. These were systematically evaluated against those created by 10 laypersons (LayPs) and 10 healthcare professionals (HCPs). GPT-4-generated JITAIs surpassed human-generated intervention suggestions, outperforming both LayPs and HCPs across all metrics (i.e., appropriateness, engagement, effectiveness, and professionalism). These results highlight the potential of LLMs to enhance JITAI implementations in personalized health interventions, demonstrating how generative AI could revolutionize context-aware computing.
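
As a rough illustration of the generation step described above, the sketch below combines a persona and a context set into a single GPT-4 prompt that asks for a JITAI decision and message. It assumes the openai Python SDK (>=1.0) and an API key in the environment; the persona, context variables, and prompt wording are invented for illustration and do not reproduce the study's materials.

```python
from openai import OpenAI  # assumes the openai>=1.0 Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative persona and context set; the study's actual personas,
# context variables, and prompt wording are not reproduced here.
persona = "65-year-old cardiac rehabilitation patient, moderate CVD severity, low baseline activity."
context = {"time": "10:30", "weather": "sunny, 18°C", "steps_today": 1200, "last_activity": "none since yesterday"}

prompt = (
    "You support heart-healthy physical activity in cardiac rehabilitation.\n"
    f"Patient persona: {persona}\n"
    f"Current context: {context}\n"
    "Decide whether a just-in-time intervention is appropriate right now. "
    "Answer with INTERVENE or WAIT, and if INTERVENE, add a short, encouraging, "
    "medically cautious activity suggestion."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```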

Authors
David Haag
Ludwig Boltzmann Institute for Digital Health and Prevention, Salzburg, Austria
Devender Kumar
Ludwig Boltzmann Institute for Digital Health and Prevention, Salzburg, Austria
Sebastian Gruber
Johannes Kepler University Linz, Linz, Austria
Dominik P. Hofer
Ludwig Boltzmann Institute for Digital Health and Prevention, Salzburg, Austria
Mahdi Sareban
University Institute of Sports Medicine, Prevention and Rehabilitation, Salzburg, Austria
Gunnar Treff
Ludwig Boltzmann Institute for Digital Health and Prevention, Salzburg, Austria
Josef Niebauer
Ludwig Boltzmann Institute for Digital Health and Prevention, Salzburg, Austria
Christopher N. Bull
Newcastle University, Newcastle, Tyne and Wear, United Kingdom
Albrecht Schmidt
LMU Munich, Munich, Germany
Jan David Smeddinck
Ludwig Boltzmann Institute for Digital Health and Prevention, Salzburg, Austria
DOI

10.1145/3706598.3713307

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713307

Video
A Comparative Analysis of Information Gathering by Chatbots, Questionnaires, and Humans in Clinical Pre-Consultation
Abstract

Information gathering is an important capability that allows chatbots to understand and respond to users' needs, yet the effectiveness of LLM-powered chatbots at this task remains underexplored. Our work investigates this question in the context of clinical pre-consultation, wherein patients provide information to an intermediary before meeting with a physician to facilitate communication and reduce consultation inefficiencies. We conducted a study at a walk-in clinic with 45 patients who interacted with one of three conversational agents: an LLM-powered chatbot, a questionnaire, and a Wizard-of-Oz agent. We analyzed patients' messages using metrics adapted from Grice's maxims to assess the quality of information gathered at each conversation turn. We found that the Wizard and LLM were more successful than the questionnaire because they modified questions and asked follow-ups when participants provided unsatisfactory answers. However, the LLM did not ask nearly as many follow-up questions as the Wizard, particularly when participants provided unclear answers.
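
As a hypothetical illustration of turn-level scoring in the spirit of the abstract, the sketch below records one score per Gricean maxim for each patient turn and averages them over a conversation. The maxim labels, score ranges, and aggregation are assumptions; the paper's actual metric definitions are not reproduced here.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical turn-level scores loosely inspired by Grice's maxims; the
# paper's actual metric definitions and scoring rules are not reproduced here.
@dataclass
class TurnScore:
    quantity: float  # gave enough (but not excessive) information
    quality: float   # information appears accurate and consistent
    relation: float  # answer is relevant to the question asked
    manner: float    # answer is clear and unambiguous

def conversation_summary(turns: list[TurnScore]) -> dict[str, float]:
    """Average each maxim-style score across all patient turns."""
    return {
        maxim: mean(getattr(t, maxim) for t in turns)
        for maxim in ("quantity", "quality", "relation", "manner")
    }

turns = [TurnScore(0.8, 1.0, 1.0, 0.6), TurnScore(0.5, 1.0, 0.7, 0.9)]
print(conversation_summary(turns))
```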

Authors
Brenna Li
University of Toronto, Toronto, Ontario, Canada
Saba Tauseef
Independent Researcher, Brampton, Ontario, Canada
Khai N. Truong
University of Toronto, Toronto, Ontario, Canada
Alex Mariakakis
University of Toronto, Toronto, Ontario, Canada
DOI

10.1145/3706598.3713613

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713613

Video
Beyond the Dialogue: Multi-chatbot Group Motivational Interviewing for Premenstrual Syndrome (PMS) Management
Abstract

Premenstrual syndrome (PMS) is a prevalent disorder among women, often exacerbated by a lack of peer support due to associated stigmatization. Drawing inspiration from the established benefits of group therapy, particularly the sense of belonging it fosters, we developed a multi-chatbot group motivational interviewing system. The system consists of a facilitator bot and two peer bots, and simulates a group counseling environment for PMS management using Large Language Models (LLMs). We conducted a study with 63 participants, assigning them to one of three conditions (no intervention, 1-on-1 chatbot, group chatbots) over two menstrual cycles for evaluation. Our findings revealed that participants in the group chat condition exhibited higher levels of engagement and language convergence with the chatbots. These participants were also able to engage in social learning and demonstrated motivation in coping through interactions with the chatbots. Finally, we discuss design implications for multi-chatbot interactions in supporting mental health.
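
A minimal sketch of how a facilitator bot and two peer bots could share one group-chat history is shown below, assuming each bot is backed by an LLM call with its own role prompt. The role descriptions and the generate() stub are placeholders, not the paper's system.

```python
from typing import Dict, List

# Placeholder for an LLM call; in a real system each bot would be backed by a
# model such as GPT-4 with its own role-specific system prompt.
def generate(system_prompt: str, history: List[Dict[str, str]]) -> str:
    return f"[{system_prompt.split(':')[0]} reply based on {len(history)} prior turns]"

# Illustrative role prompts for one facilitator and two peers.
BOTS = {
    "Facilitator": "Facilitator: guide the group with motivational interviewing questions.",
    "Peer A": "Peer A: share relatable PMS coping experiences.",
    "Peer B": "Peer B: offer encouragement and reflect on others' messages.",
}

history: List[Dict[str, str]] = []

def user_turn(text: str) -> None:
    """Append the participant's message, then let each bot respond in turn."""
    history.append({"speaker": "Participant", "text": text})
    for name, role_prompt in BOTS.items():
        reply = generate(role_prompt, history)
        history.append({"speaker": name, "text": reply})
        print(f"{name}: {reply}")

user_turn("I've been feeling irritable and isolated before my period.")
```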

Authors
Shixian Geng
The University of Tokyo, Tokyo, Japan
Remi Inayoshi
The University of Tokyo, Tokyo, Japan
Chi-Lan Yang
The University of Tokyo, Tokyo, Japan
Zefan Sramek
The University of Tokyo, Tokyo, Japan
Yuya Umeda
The University of Tokyo, Tokyo, Japan
Chiaki Kasahara
Ochanomizu University, Tokyo, Japan
Arissa J. Sato
University of Wisconsin-Madison, Madison, Wisconsin, United States
Simo Hosio
University of Oulu, Oulu, Finland
Koji Yatani
The University of Tokyo, Tokyo, Japan
DOI

10.1145/3706598.3713918

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713918

Video
Scaffolded Turns and Logical Conversations: Designing Humanized LLM-Powered Conversational Agents for Hospital Admission Interviews
Abstract

Hospital admission interviews are critical for patient care but strain nurses' capacity due to time constraints and staffing shortages. While LLM-powered conversational agents (CAs) offer automation potential, their rigid sequencing and lack of humanized communication skills risk misunderstandings and incomplete data capture. Through participatory design with clinicians and volunteers, we identified essential communication strategies and developed a novel CA that implements these strategies through: (1) dynamic topic management using graph-based conversation flows, and (2) context-aware scaffolding with few-shot prompt tuning. Technical evaluation on an admission interview dataset showed our system achieving performance comparable to or surpassing human-written ground truth, while outperforming prompt-engineered baselines. A between-subject study (N=44) demonstrated significantly improved user experience and data collection accuracy compared to existing solutions. We contribute a framework for humanizing medical CAs by translating clinician expertise into algorithmic strategies, alongside empirical insights for balancing efficiency and empathy in healthcare interactions, and considerations for generalizability.
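
To make the two mechanisms named above more concrete, the sketch below approximates dynamic topic management with a small topic graph and context-aware scaffolding with a prompt wrapper. Topic names, graph edges, and the scaffolding wording are invented for illustration and are not the authors' implementation.

```python
# Hypothetical admission-interview topic graph: edges point from a topic to
# topics that naturally follow once it has been covered.
TOPIC_GRAPH = {
    "chief_complaint": ["symptom_history", "current_medications"],
    "symptom_history": ["allergies", "current_medications"],
    "current_medications": ["allergies"],
    "allergies": ["lifestyle"],
    "lifestyle": [],
}

def next_topic(covered: set[str]) -> str | None:
    """Pick the next uncovered topic reachable from what has been discussed."""
    for topic in covered:
        for candidate in TOPIC_GRAPH.get(topic, []):
            if candidate not in covered:
                return candidate
    # Fall back to any uncovered topic (e.g., at the start of the interview).
    remaining = [t for t in TOPIC_GRAPH if t not in covered]
    return remaining[0] if remaining else None

def scaffolded_question(topic: str, last_answer: str) -> str:
    """Wrap the topic in a context-aware, humanized instruction for the LLM to phrase."""
    return (f"The patient just said: '{last_answer}'. Acknowledge this briefly, "
            f"then ask one clear question about {topic.replace('_', ' ')}.")

covered = {"chief_complaint"}
topic = next_topic(covered)
print(scaffolded_question(topic, "I've had chest tightness since Monday."))
```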

Authors
Dingdong Liu
The Hong Kong University of Science and Technology, Hong Kong, China
Yujing Zhang
KTH Royal Institute of Technology, Stockholm, Stockholm, Sweden
Bolin Zhao
The Hong Kong University of Science and Technology, Hong Kong SAR, China
Shuai Ma
The Hong Kong University of Science and Technology, Hong Kong, China
Chuhan Shi
Southeast University, Nanjing, China
Xiaojuan Ma
Hong Kong University of Science and Technology, Hong Kong, Hong Kong
DOI

10.1145/3706598.3714196

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714196

Video