Generative chatbots promise to scale personalized learning, yet most publicly available generative chatbots are designed to produce confident, eloquent responses by default, even when hallucinating. Prior work has observed that learners using such chatbots often engage shallowly and fail to detect chatbot errors due to overtrust, cognitive overload, and prioritization of short-term gains. To address these challenges, this work examines two chatbot design options in a STEM learning context: introducing verbal uncertainty and reducing response verbosity. Using Bayesian causal inference and thematic analysis in a quasi-experimental setting, we found that a less verbose chatbot improved the detection of errors containing logical fallacies but did not increase the use of alternative resources. A chatbot that always expressed uncertainty reduced the adoption of incorrect chatbot responses but had mixed effects on learning outcomes, suggesting the need to increase the credibility of uncertainty signals and to keep learners engaged in the learning process even when they disuse the chatbot.
ACM CHI Conference on Human Factors in Computing Systems