Desirable Unfamiliarity: Insights from Eye Movements on Engagement and Readability of Dictation Interfaces

Abstract

Transcripts displayed on dictation interfaces can be hard to read due to recognition errors and disfluencies. LLM-based text auto-correction could help, but changing the text during production could cause distraction and unintended phrasing. To understand how to balance readability, attention, and accuracy, we conducted an eye-tracking experiment with 20 participants to compare five dictation interfaces: PLAIN (real-time transcription), AOC (periodic corrections), RAKE (keyword highlights), GP-TSM (grammar-preserving highlights), and SUMMARY (LLM-generated abstractive summary). By analyzing participants’ gaze patterns during the speech composition and reviewing processes, we found that during composition, participants spent only 7%–11% of their time in active reading regardless of the interface. Although SUMMARY introduced unfamiliar words and phrasing during composition, it was easier to read and preferred by participants. Our findings suggest a high user tolerance for altering spoken words in LLM-enabled dictation interfaces.

Authors
Zhaohui Liang
University of Chinese Academy of Sciences, Beijing, China
Yonglin Chen
Southern University of Science and Technology, Shenzhen, China
Naser Al Madi
Colby College, Waterville, Maine, United States
Can Liu
City University of Hong Kong, Hong Kong, China

Conference: CHI 2026

ACM CHI Conference on Human Factors in Computing Systems

Session: Modeling Spatial, Linguistic, and Sensory Errors

P1 - Room 128
6 presentations
2026-04-14, 20:15–21:45