Large language models (LLMs) increasingly support heterogeneous tasks within a single interface, requiring users to form, update, and act upon beliefs about one system across domains with different reliability profiles. Understanding how such beliefs transfer across tasks and shape delegation is critical for the design of multipurpose AI systems. We report a preregistered experiment (N = 240, 7,200 trials) in which participants interacted with a controlled AI simulation across grammar checking, travel planning, and visual question answering. Delegation was operationalized as a binary reliance decision (accepting the AI’s output versus acting independently), and belief dynamics were evaluated against Bayesian benchmarks. We report three main findings. First, participants do not reset their beliefs between tasks but instead carry expectations forward from prior interactions. Second, within tasks, belief updating follows the Bayesian direction but is substantially conservative. Third, delegation is driven primarily by subjective beliefs about AI accuracy rather than by self-confidence, although self-confidence independently reduces reliance when beliefs are held constant. Based on these results, we discuss implications for expectation calibration, reliance design, and the risks of belief spillovers in deployed LLM-based interfaces.
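The abstract does not specify the form of the Bayesian benchmark. As a minimal sketch, assuming beliefs about the AI's reliability R on a task are updated in log-odds after each observed outcome s (correct or incorrect), a conservative-updating comparison could take the form below; the weighting parameter c is an illustrative assumption, not taken from the paper.

% Hypothetical illustration, not the paper's model: a conservative Bayesian
% update of the belief that the AI is reliable (R) on a task, after observing
% a correct (s = 1) or incorrect (s = 0) output.
\[
  \log\frac{P(R \mid s)}{P(\lnot R \mid s)}
  \;=\;
  \log\frac{P(R)}{P(\lnot R)}
  \;+\;
  c \,\log\frac{P(s \mid R)}{P(s \mid \lnot R)},
  \qquad 0 < c \le 1,
\]
% where c = 1 is the exact Bayesian benchmark and c < 1 corresponds to the
% "substantially conservative" within-task updating described in the abstract.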