This study session has ended. Thank you for participating.
Apple’s 1987 Knowledge Navigator video presents a vision of a sophisticated digital personal assistant, but the natural human-agent conversational dialog it depicts does not currently exist. To investigate why, the authors analyzed the video using three theoretical frameworks: the DiCoT (Distributed Cognition for Teamwork) framework, the Human-Agent Teaming (HAT) Game Analysis framework, and the Flows of Power framework. These were used to codify the human-agent interactions and classify the agent’s capabilities. While some barriers to creating such agents are technological, others arise from privacy, social and situational factors, trust, and the financial business case. The social roles and asymmetric interactions of the human and agent are discussed in the broader context of HAT research, along with the need for a new term for these agents that does not rely on a human social relationship metaphor. This research offers designers of conversational agents a research roadmap to build more highly capable and trusted non-human teammates.
Large Language Models (LLMs) have created opportunities for designing chatbots that can support complex question-answering (QA) scenarios and improve news audience engagement. However, we still lack an understanding of what roles journalists and readers deem fit for such a chatbot in newsrooms. To address this gap, we first interviewed six journalists to understand how they answer questions from readers currently and how they want to use a QA chatbot for this purpose. To understand how readers want to interact with a QA chatbot, we then conducted an online experiment (N=124) where we asked each participant to read three news articles and ask questions to either the author(s) of the articles or a chatbot. By combining results from the studies, we present alignments and discrepancies between how journalists and readers want to use QA chatbots and propose a framework for designing effective QA chatbots in newsrooms.
Voice Agents (VAs) are touted as being able to help users with complex tasks such as cooking, acting as a conversational partner that provides information and advice while the task is ongoing. Through conversation analysis of 7 cooking sessions with a commercial VA, we identify challenges caused by a lack of contextual awareness, leading to irrelevant responses, misinterpretation of requests, and information overload. Informed by this, we evaluated 16 cooking sessions with a wizard-led context-aware VA. We observed more fluent interaction between humans and agents, including more complex requests, explicit grounding within utterances, and complex social responses. We discuss reasons for this, the potential for personalisation, and the division of labour in VA communication and proactivity. We then discuss recent advances in generative models and the interaction challenges of VAs. We propose limited context awareness in VAs as a step toward explainable, explorable conversational interfaces.
The widespread use of Large Language Model (LLM)-based conversational agents (CAs), especially in high-stakes domains, raises many privacy concerns. Building ethical LLM-based CAs that respect user privacy requires an in-depth understanding of the privacy risks that concern users the most. However, existing research, primarily model-centered, does not provide insight into users' perspectives. To bridge this gap, we analyzed sensitive disclosures in real-world ChatGPT conversations and conducted semi-structured interviews with 19 LLM-based CA users. We found that users constantly face trade-offs between privacy, utility, and convenience when using LLM-based CAs. However, users' erroneous mental models and dark patterns in system design limited their awareness and comprehension of the privacy risks. Additionally, the human-like interactions encouraged more sensitive disclosures, which complicated users' ability to navigate the trade-offs. We discuss practical design guidelines and the need for paradigm shifts to protect the privacy of LLM-based CA users.
We explore a range of different metaphors used for Voice User Interfaces (VUIs) by designers, end-users, manufacturers, and researchers, using a novel framework derived from semi-structured interviews and a literature review. We focus less on the well-established idea of metaphors as a way for interface designers to help novice users learn how to interact with novel technology, and more on other ways metaphors can be used. We find that the metaphors people use are contextually fluid, can change with the mode of conversation, and can reveal differences in how people perceive VUIs compared to other devices. Not all metaphors are helpful, and some may be offensive. Analyzing this broader class of metaphors can help us understand, and perhaps even predict, problems. Metaphor analysis can be a low-cost tool to inspire design creativity and facilitate complex discussions about sociotechnical issues, enabling us to spot potential opportunities and problems in the situated use of technologies.