3. LLM: New applications

Conference Name
UIST 2024
VoicePilot: Harnessing LLMs as Speech Interfaces for Assistive Robotics
Abstract

Physically assistive robots present an opportunity to significantly increase the well-being and independence of individuals with motor impairments or other forms of disability who are unable to complete activities of daily living. Speech interfaces, especially ones that utilize Large Language Models (LLMs), can enable individuals to effectively and naturally communicate high-level commands and nuanced preferences to robots. Frameworks for integrating LLMs as interfaces to robots for high-level task planning and code generation have been proposed, but fail to incorporate human-centric considerations, which are essential when developing assistive interfaces. In this work, we present a framework for incorporating LLMs as speech interfaces for physically assistive robots, constructed iteratively with 3 stages of testing involving a feeding robot and culminating in an evaluation with 11 older adults at an independent living facility. We use both quantitative and qualitative data from the final study to validate our framework and additionally provide design guidelines for using LLMs as speech interfaces for assistive robots. Videos, code, and supporting files are located on our project website: https://sites.google.com/andrew.cmu.edu/voicepilot/
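The abstract describes an LLM acting as a speech interface that turns spoken requests into high-level robot commands. A minimal sketch of that pipeline might look as follows; the primitive names, prompt wording, and parsing logic are all hypothetical illustrations, not the paper's actual framework, and a canned string stands in for the real speech-recognition and LLM calls.

```python
# Hypothetical sketch of an LLM speech-interface pipeline for an assistive
# feeding robot. Primitive names and helper functions are invented for
# illustration; a canned reply stands in for a real LLM call.

PRIMITIVES = {
    "scoop": "Scoop food from the bowl",
    "feed": "Bring the utensil to the user's mouth",
    "wait": "Pause until the user asks to continue",
}

def build_prompt(utterance: str) -> str:
    """Compose a prompt that grounds the LLM in the robot's action space."""
    menu = "\n".join(f"- {name}: {desc}" for name, desc in PRIMITIVES.items())
    return (
        "You control a feeding robot. Available primitives:\n"
        f"{menu}\n"
        f'User said: "{utterance}"\n'
        "Reply with a comma-separated list of primitives."
    )

def parse_plan(reply: str) -> list[str]:
    """Keep only tokens that are valid primitives, preserving order."""
    tokens = [t.strip().lower() for t in reply.split(",")]
    return [t for t in tokens if t in PRIMITIVES]

# A canned LLM reply stands in for a real model call.
fake_reply = "scoop, feed, teleport"   # 'teleport' is not a primitive
plan = parse_plan(fake_reply)
print(plan)  # ['scoop', 'feed']
```

Validating the model's output against a fixed action vocabulary, as `parse_plan` does, is one common way to keep an LLM-driven robot from executing unsupported commands.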

Authors
Akhil Padmanabha
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Jessie Yuan
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Janavi Gupta
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Zulekha Karachiwalla
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Carmel Majidi
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Henny Admoni
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Zackory Erickson
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Paper URL

https://doi.org/10.1145/3654777.3676401

Video
ComPeer: A Generative Conversational Agent for Proactive Peer Support
Abstract

Conversational Agents (CAs) acting as peer supporters have been widely studied and demonstrated beneficial for people's mental health. However, previous peer support CAs either are user-initiated or follow predefined rules to initiate the conversations, which may discourage users from engaging and building relationships with the CAs for long-term benefits. In this paper, we develop ComPeer, a generative CA that can proactively offer adaptive peer support to users. ComPeer leverages large language models to detect and reflect significant events in the dialogue, enabling it to strategically plan the timing and content of proactive care. In addition, ComPeer incorporates peer support strategies, conversation history, and its persona into the generative messages. Our one-week between-subjects study (N=24) demonstrates ComPeer's strength in providing peer support over time and boosting users' engagement compared to a baseline user-initiated CA. We report users' interaction patterns with ComPeer and discuss implications for designing proactive generative agents to promote people's well-being.
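The core loop the abstract describes, detecting a significant event in the dialogue and then planning the timing and content of a proactive message, could be sketched as below. The keyword-based detector is only a stand-in for the paper's LLM-based event detection, and all names and the one-day follow-up delay are invented for illustration.

```python
# Illustrative sketch of a proactive peer-support loop in the spirit of
# the ComPeer abstract. The keyword detector is a stub for LLM-based
# event detection; function names and the scheduling rule are hypothetical.
from datetime import datetime, timedelta
from typing import Optional

EVENT_KEYWORDS = {"exam": "exam", "interview": "job interview"}

def detect_event(message: str) -> Optional[str]:
    """Stand-in for LLM-based detection of significant events in dialogue."""
    for keyword, event in EVENT_KEYWORDS.items():
        if keyword in message.lower():
            return event
    return None

def plan_proactive_care(message: str, now: datetime) -> Optional[dict]:
    """If an event is found, plan the timing and content of proactive care."""
    event = detect_event(message)
    if event is None:
        return None
    return {
        "send_at": now + timedelta(days=1),   # check in the next day
        "content": f"Hey, just checking in. How did the {event} go?",
    }

plan = plan_proactive_care("My exam is tomorrow...", datetime(2024, 10, 1))
print(plan["content"])  # Hey, just checking in. How did the exam go?
```

Separating event detection from message planning, as sketched here, makes it possible to condition the proactive message on strategy, history, and persona independently of when it is sent.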

Authors
Tianjian Liu
Sun Yat-sen University, Guangzhou, Guangdong Province, China
Hongzheng Zhao
School of Physics and Astronomy, Zhuhai, China
Yuheng Liu
Sun Yat-sen University, Guangzhou, China
Xingbo Wang
Cornell University, New York, New York, United States
Zhenhui Peng
Sun Yat-sen University, Zhuhai, Guangdong Province, China
Paper URL

https://doi.org/10.1145/3654777.3676430

Video
SHAPE-IT: Exploring Text-to-Shape-Display for Generative Shape-Changing Behaviors with LLMs
Abstract

This paper introduces text-to-shape-display, a novel approach to generating dynamic shape changes in pin-based shape displays through natural language commands. By leveraging large language models (LLMs) and AI-chaining, our approach allows users to author shape-changing behaviors on demand through text prompts without programming. We describe the foundational aspects necessary for such a system, including the identification of key generative elements (primitive, animation, and interaction) and design requirements to enhance user interaction, based on formative exploration and iterative design processes. Based on these insights, we develop SHAPE-IT, an LLM-based authoring tool for a 24 x 24 shape display, which translates the user's textual command into executable code and allows for quick exploration through a web-based control interface. We evaluate the effectiveness of SHAPE-IT in two ways: 1) performance evaluation and 2) user evaluation (N=10). The study conclusions highlight the ability to facilitate rapid ideation of a wide range of shape-changing behaviors with AI. However, the findings also expose accuracy-related challenges and limitations, prompting further exploration into refining the framework for leveraging AI to better suit the unique requirements of shape-changing systems.
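SHAPE-IT's described pipeline translates a text prompt into executable code that drives a 24 x 24 pin display. A minimal sketch of the kind of animation code such a tool might emit is shown below; the frame representation (heights in [0, 1]) and all function names are assumptions for illustration, not SHAPE-IT's actual output format.

```python
# Hypothetical example of code an LLM-based tool might generate for a
# 24 x 24 pin-based shape display: a travelling sine wave animation.
# Pin heights are floats in [0, 1]; the frame format is invented.
import math

SIZE = 24  # the display in the paper is 24 x 24 pins

def wave_frame(t: float) -> list[list[float]]:
    """One animation frame: a sine wave travelling across the grid."""
    frame = []
    for y in range(SIZE):
        row = []
        for x in range(SIZE):
            h = 0.5 + 0.5 * math.sin(0.5 * x + t)   # height in [0, 1]
            row.append(round(h, 3))
        frame.append(row)
    return frame

# Ten frames of the animation, advancing the wave phase each step.
frames = [wave_frame(t * 0.1) for t in range(10)]
print(len(frames), len(frames[0]), len(frames[0][0]))  # 10 24 24
```

Generating a parametric frame function rather than a fixed frame list is what lets a single generated snippet cover the paper's "animation" element: the authoring tool only needs to sample it over time.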

Authors
Wanli Qian
University of Chicago, Chicago, Illinois, United States
Chenfeng Gao
University of Chicago, Chicago, Illinois, United States
Anup Sathya
University of Chicago, Chicago, Illinois, United States
Ryo Suzuki
University of Calgary, Calgary, Alberta, Canada
Ken Nakagaki
University of Chicago, Chicago, Illinois, United States
Paper URL

https://doi.org/10.1145/3654777.3676348

Video
WaitGPT: Monitoring and Steering Conversational LLM Agent in Data Analysis with On-the-Fly Code Visualization
Abstract

Large language models (LLMs) support data analysis through conversational user interfaces, as exemplified in OpenAI's ChatGPT (formerly known as Advanced Data Analysis or Code Interpreter). Essentially, LLMs produce code for accomplishing diverse analysis tasks. However, presenting raw code can obscure the logic and hinder user verification. To empower users with enhanced comprehension and augmented control over analysis conducted by LLMs, we propose a novel approach to transform LLM-generated code into an interactive visual representation. In the approach, users are provided with a clear, step-by-step visualization of the LLM-generated code in real time, allowing them to understand, verify, and modify individual data operations in the analysis. Our design decisions are informed by a formative study (N=8) probing into user practice and challenges. We further developed a prototype named WaitGPT and conducted a user study (N=12) to evaluate its usability and effectiveness. The findings from the user study reveal that WaitGPT facilitates monitoring and steering of data analysis performed by LLMs, enabling participants to enhance error detection and increase their overall confidence in the results.
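The transformation the abstract describes, from raw LLM-generated analysis code to discrete, inspectable data operations, can be approximated with static parsing. The toy parser below uses Python's standard-library `ast` module to pull out method-call "steps" (e.g. pandas operations) in source order; this is an assumed simplification of WaitGPT's approach, not its actual implementation, and the step schema is invented.

```python
# Sketch of the core idea behind WaitGPT-style visualization: extract the
# individual data operations from LLM-generated analysis code so each step
# can be shown and verified separately. Uses only the stdlib `ast` module;
# the notion of a "step" as a method-call name is a simplification.
import ast

def extract_steps(code: str) -> list[str]:
    """List method-call steps in generated code, ordered by source position."""
    found = []
    for node in ast.walk(ast.parse(code)):
        # Method calls like df.groupby(...) have an Attribute as their func.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            found.append((node.end_lineno, node.end_col_offset, node.func.attr))
    # Chained calls nest inner-first in the AST; sorting by end position
    # restores the left-to-right order a reader sees.
    return [name for _, _, name in sorted(found)]

generated = (
    "df = pd.read_csv('sales.csv')\n"
    "out = df.groupby('region').agg({'revenue': 'sum'}).reset_index()\n"
)
print(extract_steps(generated))  # ['read_csv', 'groupby', 'agg', 'reset_index']
```

Each extracted step name could then be mapped to a visual node (load, group, aggregate, ...) that the user can inspect or edit before the code runs.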

Authors
Liwenhan Xie
The Hong Kong University of Science and Technology, Hong Kong, China
Chengbo Zheng
The Hong Kong University of Science and Technology, Hong Kong, China
Haijun Xia
University of California, San Diego, San Diego, California, United States
Huamin Qu
The Hong Kong University of Science and Technology, Hong Kong, China
Chen Zhu-Tian
University of Minnesota-Twin Cities, Minneapolis, Minnesota, United States
Paper URL

https://doi.org/10.1145/3654777.3676374

Video