EvaluAId: Human-AI Collaborative Evaluation of Open-Ended Student Essays

Abstract

Open-ended writing assignments are central to higher education, yet heterogeneous submissions and scale make evaluation difficult. Automated writing evaluation (AWE) promises speed but often trades away transparency and sidelines human judgment. This paper repositions AI as an on-demand collaborator that can provide specific, targeted support. In a formative study, we identify leverage points across three cognitive dimensions: evidence identification, comparative judgment, and feedback composition. Guided by these insights, we build EvaluAId, which supports interactive rubric-content mapping, adaptive benchmarking and self-calibration, and personalized, rubric-aligned feedback synthesis. In a within-subjects study with 12 TAs, we evaluated how this approach supports grading compared with a rubric+LLM chatbot and an LLM-based AWE system; EvaluAId improved alignment with expert ratings and increased graders' satisfaction. Finally, interviews with TAs, instructors, and students underscored the value of the thoughtful evaluation EvaluAId supports, while surfacing practical considerations for classroom integration. Together, our results argue for deliberate, evidence-first, human-in-the-loop evaluation.

Authors
Chao Zhang
Cornell University, Ithaca, New York, United States
Kexin Phyllis Ju
University of Michigan, Ann Arbor, Michigan, United States
Xinyi Lu
University of Michigan, Ann Arbor, Michigan, United States
Grace Yu-Chun Yen
National Yang-Ming Chiao Tung University, Hsinchu, Taiwan
Jeffrey M. Rzeszotarski
Loyola University Maryland, Baltimore, Maryland, United States

Conference: CHI 2026

ACM CHI Conference on Human Factors in Computing Systems

Session: Intelligent Feedback & Learning Design

P1 - Room 129
6 presentations
2026-04-15, 20:15–21:45