Open-ended writing assignments are central to higher education, yet heterogeneous submissions and scale make evaluation difficult. Automated writing evaluation (AWE) promises speed but often trades away transparency and sidelines human judgment. This paper repositions AI as an on-demand collaborator that provides specific, targeted support. In a formative study, we identify leverage points in three cognitive dimensions: evidence identification, comparative judgment, and feedback composition. Guided by these insights, we built EvaluAId, which supports interactive rubric-content mapping, adaptive benchmarking and self-calibration, and personalized, rubric-aligned feedback synthesis. In a within-subjects study with 12 TAs, we evaluated how this approach supports grading compared with a rubric+LLM chatbot and an LLM-based AWE; EvaluAId improved alignment with expert ratings and increased graders' satisfaction. Finally, interviews with TAs, instructors, and students underscored the value of the thoughtful evaluation EvaluAId supports while surfacing practical considerations for integration into the classroom. Together, our results argue for deliberate, evidence-first, human-in-the-loop evaluation.