AI coding assistants help, but developers still spend substantial effort verifying model output. To isolate interface effects, we held a single LLM fixed while N=60 participants solved three Python tasks using Inline, Chat, or Structured prompting, plus a no-AI control. AI assistance reduced workload by 18.2 TLX points, cut completion time by 22% (25.0 vs. 32.1 min), and improved correctness (OR=1.71). Among the AI modes, Inline was fastest and lowest-load on simple tasks; Chat yielded higher correctness beyond a per-observation complexity threshold (z ≈ +0.41) with no time cost; Structured benefited novices at mid complexity. We introduce a mode-agnostic verification-load index (failures, time-to-first-compile, churn, pauses, and switches) that partially mediates rising stress and fatigue across tasks. We translate these findings into design guidance (adaptive mode orchestration, transparency on demand, and verification-aware packaging) and propose reporting verification load alongside outcome measures to evaluate interfaces as models evolve.
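As an illustration of how such a composite index could be operationalized, the following is a minimal sketch assuming each of the five components is z-scored across observations and averaged with equal weights; the component keys, units, and weighting here are illustrative assumptions, not the paper's exact specification:

```python
import statistics

COMPONENTS = ["failures", "time_to_first_compile", "churn", "pauses", "switches"]

def verification_load_index(observations):
    """Composite verification-load index (illustrative): z-score each
    component across observations, then average the five z-scores per
    observation. Higher values indicate heavier verification load."""
    # Per-component mean and standard deviation across all observations.
    stats = {}
    for c in COMPONENTS:
        vals = [obs[c] for obs in observations]
        stats[c] = (statistics.mean(vals), statistics.stdev(vals))
    # Equal-weight average of the z-scored components for each observation.
    index = []
    for obs in observations:
        zs = [(obs[c] - stats[c][0]) / stats[c][1] for c in COMPONENTS]
        index.append(sum(zs) / len(zs))
    return index

# Hypothetical per-task observations (counts, seconds, edited lines,
# pause count, mode switches) for demonstration only.
obs = [
    {"failures": 2, "time_to_first_compile": 90, "churn": 40, "pauses": 5, "switches": 3},
    {"failures": 0, "time_to_first_compile": 45, "churn": 15, "pauses": 2, "switches": 1},
    {"failures": 4, "time_to_first_compile": 150, "churn": 70, "pauses": 9, "switches": 6},
]
print(verification_load_index(obs))  # one load score per observation
```

Equal weighting keeps the index mode-agnostic, since no component presumes a particular interface; any validated weighting scheme from the study would replace the simple average.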
ACM CHI Conference on Human Factors in Computing Systems