Large language models (LLMs) are prone to hallucinations, i.e., nonsensical, unfaithful, and undesirable text. Users tend to overrely on LLMs and corresponding hallucinations which can lead to misinterpretations and errors. To tackle the problem of overreliance, we propose HILL, the Hallucination Identifier for Large Language Models. First, we identified design features for HILL with a Wizard of Oz approach with nine participants. Subsequently, we implemented HILL based on the identified design features and evaluated HILL's interface design by surveying 17 participants. Further, we investigated HILL's functionality to identify hallucinations based on an existing question-answering dataset and five user interviews. We find that HILL can correctly identify and highlight hallucinations in LLM responses which enables users to handle LLM responses with more caution. With that, we propose an easy-to-implement adaptation to existing LLMs and demonstrate the relevance of user-centered designs of AI artifacts.
https://doi.org/10.1145/3613904.3642428
In contrast to dialogue, wherein the exchange of completed messages occurs through turn-taking, synlogue is a mode of conversation characterized by co-creative processes, such as mutually complementing incomplete utterances and cooperative overlaps of backchannelings. Such co-creative conversations have the potential to alleviate social divisions in contemporary information environments. This study proposed the design concept of a synlogue based on literature in linguistics and anthropology and explored features that facilitate synlogic interactions in computer-mediated interfaces. Through an experiment, we focused on aizuchi, an important backchanneling element that drives synlogic conversation, and compared the speech and perceptual changes of participants when a bot dynamically uttered aizuchi or otherwise silent in a situation simulating an online video call. Consequently, we discussed the implications for interaction design based on our qualitative and quantitative analysis of the experiment. The synlogic perspective presented in this study is expected to facilitate HCI researchers to achieve more convivial forms of communication.
https://doi.org/10.1145/3613904.3642046
While conversational agents based on Large Language Models (LLMs) can drive progress in many domains, they are prone to generating faulty information. To ensure an efficient, safe, and satisfactory user experience maximizing benefits of these systems, users must be empowered to judge the reliability of system outputs. In this, both disclaimers and agents' communicative style are pivotal design instances. In an online study with 594 participants, we investigated how these affect users' trust and a mock-up agent's persuasiveness, based on an established framework from social psychology. While prior information on potential inaccuracies or faulty information did not affect trust, an authoritative communicative style elicited more trust. Also, a trusted agent was more persuasive resulting in more positive attitudes regarding the subject of the conversation. Results imply that disclaimers on agents' limitations fail to effectively alter users' trust but can be supported by appropriate communicative style during interaction.
https://doi.org/10.1145/3613904.3642122
With their generative capabilities, large language models (LLMs) have transformed the role of technological writing assistants from simple editors to writing collaborators. Such a transition emphasizes the need for understanding user perception and experience, such as balancing user intent and the involvement of LLMs across various writing domains in designing writing assistants. In this study, we delve into the less explored domain of personal writing, focusing on the use of LLMs in introspective activities. Specifically, we designed DiaryMate, a system that assists users in journal writing with LLM. Through a 10-day field study (N=24), we observed that participants used the diverse sentences generated by the LLM to reflect on their past experiences from multiple perspectives. However, we also observed that they are over-relying on the LLM, often prioritizing its emotional expressions over their own. Drawing from these findings, we discuss design considerations when leveraging LLMs in a personal writing practice.
https://doi.org/10.1145/3613904.3642693