Transcripts displayed on dictation interfaces can be hard to read due to recognition errors and disfluencies. LLM-based text auto-correction could help, but changing the text while it is being produced could cause distraction and introduce unintended phrasing. To understand how to balance readability, attention, and accuracy, we conducted an eye-tracking experiment with 20 participants to compare five dictation interfaces: PLAIN (real-time transcription), AOC (periodic corrections), RAKE (keyword highlights), GP-TSM (grammar-preserving highlights), and SUMMARY (LLM-generated abstractive summary). By analyzing participants’ gaze patterns during the composition and reviewing processes, we found that during composition, participants spent only 7%–11% of their time in active reading, regardless of the interface. Although SUMMARY introduced unfamiliar words and phrasing during composition, it was easier to read and preferred by participants. Our findings suggest a high user tolerance for altering spoken words in LLM-enabled dictation interfaces.
ACM CHI Conference on Human Factors in Computing Systems