Large language models (LLMs) increasingly support heterogeneous tasks within a single interface, requiring users to form, update, and act upon beliefs about one system across domains with different reliability profiles. Understanding how such beliefs transfer across tasks and shape delegation is critical for the design of multipurpose AI systems. We report a preregistered experiment (N = 240, 7,200 trials) in which participants interacted with a controlled AI simulation across grammar checking, travel planning, and visual question answering. Delegation was operationalized as a binary reliance decision—accepting the AI’s output versus acting independently—and belief dynamics were evaluated against Bayesian benchmarks. We find three main results. First, participants do not reset beliefs between tasks, instead carrying expectations from prior interactions. Second, within tasks, belief updating follows the Bayesian direction but is substantially conservative. Third, delegation is driven primarily by subjective beliefs about AI accuracy rather than self-confidence, though confidence independently reduces reliance when beliefs are held constant. Based on these results, we discuss implications for expectation calibration, reliance design, and the risks of belief spillovers in deployed LLM-based interfaces.
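The contrast between Bayesian and "conservative" updating in this first study can be made concrete in log-odds form. The sketch below follows the classic conservatism formulation from the belief-updating literature and is an illustrative assumption, not necessarily the exact benchmark the study implements; the damping parameter $\gamma$ is introduced here purely for exposition:
\[
\log\frac{P(H \mid d)}{P(\neg H \mid d)} \;=\; \log\frac{P(H)}{P(\neg H)} \;+\; \gamma \,\log\frac{P(d \mid H)}{P(d \mid \neg H)},
\]
where $H$ is the hypothesis that the AI is accurate on the current task and $d$ is the observed outcome of a trial. Setting $\gamma = 1$ recovers exact Bayesian updating, while $0 < \gamma < 1$ yields updates in the Bayesian direction but of smaller magnitude, matching the conservative pattern reported above.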
How might messages about large language models (LLMs) found in public discourse influence the way people think about and interact with these models? To explore this question, we randomly assigned participants (N = 470) to watch short informational videos presenting LLMs as either machines, tools, or companions---or to watch no video. We then assessed how strongly they believed LLMs to possess various mental capacities, such as the ability to have intentions or remember things. We found that participants who watched videos presenting LLMs as companions reported believing that LLMs possessed these capacities more fully than did participants in the other groups. In a follow-up study (N = 604), we replicated these findings and found that the videos also had nuanced effects on people's reliance on LLM-generated responses when seeking factual information. Together, these studies suggest that messages about LLMs---beyond technical advances---may shape what people believe about these systems and how they rely on LLM-generated responses.
Human autonomy is a core concept that helps explain the acceptance of and interaction with computer systems and AI technology. However, autonomy is often vaguely defined and conflated with related constructs. This paper disentangles autonomy by integrating the dualistic nature of positive and negative liberty from the perspective of political philosophy. Using an online vignette study with N=194 participants, we show that positive and negative liberty act as correlated but distinct dimensions of the autonomy foundation. While negative liberty predicts the sense of agency, positive liberty is a key dimension for people’s willingness to use technology. We argue that this dualistic stance (positive liberty as the freedom to pursue authentic goals, negative liberty as the freedom from external constraints) offers a valuable and actionable perspective on human autonomy that can inform future system design and better answer the ambivalent question “how much autonomy is enough?”
AI demonstrates unprecedented reasoning capabilities, but its increasing integration into human reasoning via automated reading and summarization has provoked debate about its use for cultural interpretation. Close reading---the practice of understanding, analyzing, and critiquing cultural texts for pleasure---is a skill at the core of such interpretation, traditionally seen as exclusive to humans. To test AI's impact on close reading, in terms of both interpretative performance and pleasure, we conducted a preregistered randomized experiment (n = 400) in which participants closely read poems with a single AI interpretation, multiple AI interpretations, or no AI assistance. We found that a single AI interpretation boosted both performance and pleasure, while multiple AI interpretations improved performance only. Further exploration revealed a trade-off: participants who relied heavily on AI performed better on the task but reported lower pleasure. Our results contribute to the discussion of whether and how to calibrate AI assistance for cultural interpretation: “less is more.”
Human-like behavior in Artificial Intelligence (AI) increasingly affects human–AI interaction, leading users to attribute consciousness to these systems. Yet the factors shaping how such attributions arise remain largely unexplored. We report findings from an online survey (N=553) whose participants were primarily academics from the formal sciences, natural sciences, and humanities, whose educational backgrounds provide comparatively accurate mental models within their fields of study, alongside participants from diverse backgrounds. Respondents evaluated their perceptions of consciousness (self-defined) in Large Language Models (LLMs) they had previously interacted with, consciousness in future AI, and related ethical considerations. The results show that, across groups, around half of the participants attributed some degree of consciousness to LLMs. Individual traits such as gender, as well as participants’ conceptual positions regarding consciousness and its link to intelligence, influence consciousness perceptions, outweighing the effects of technical knowledge or system transparency. Beyond shaping academic discussions, these perspectives inform how AI is designed, governed, and integrated into everyday interactions.
The application of generative artificial intelligence in Creativity Support Tools (CSTs) presents the challenge of interfacing two black boxes: the user's mind and the machine engine. The Artificial Cognition approach addresses this challenge with theories, methods, and constructs originally developed to study human creativity. Accordingly, this paper investigates the relationship between semantic network organisation and idea originality in Large Language Models. Data were collected by administering a set of standardised tests to ChatGPT-4o and to 81 psychology students, divided into higher- and lower-creativity groups. The expected relationship was confirmed in the comparison between ChatGPT-4o and the higher-creativity humans. However, despite having a more rigid semantic network, ChatGPT-4o emerged as more original than the lower-creativity humans. We attribute this difference to human motivational processes and to model hyperparameters, and we advance a research agenda for the study of artificial creativity. In conclusion, we illustrate the potential of this construct for designing and evaluating CSTs.