LLM-Powered Text Entry Decoding and Flexible Typing on Smartphones

Abstract

Large language models (LLMs) have shown exceptional performance in various language-related tasks. However, their application to keyboard decoding, which converts input signals (e.g., taps and gestures) into text, remains underexplored. This paper presents a fine-tuned FLAN-T5 model for decoding. It achieves 93.1% top-1 accuracy on user-drawn gestures, outperforming the widely adopted SHARK2 decoder, and 95.4% on real-world tap typing data. In particular, our decoder supports Flexible Typing, allowing users to enter a word with taps, gestures, multi-stroke gestures, or tap-gesture combinations. User study results show that Flexible Typing is beneficial and well received by participants: 35.9% of words were entered with word gestures, 29.0% with taps, 6.1% with multi-stroke gestures, and the remaining 29.0% with tap-gesture combinations. Our investigation suggests that the LLM-based decoder improves decoding accuracy over existing word gesture decoders while enabling the Flexible Typing method, which enhances the overall typing experience and accommodates diverse user preferences.
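As context for the approach, below is a minimal sketch of seq2seq keyboard decoding with an off-the-shelf FLAN-T5 checkpoint via the Hugging Face transformers library. The prompt format (serializing a tap or gesture trace as the sequence of nearest keys) and the decode_trace helper are illustrative assumptions; the paper's fine-tuned checkpoint and actual input encoding are not reproduced here.

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Off-the-shelf checkpoint; the paper fine-tunes FLAN-T5 on typing data.
MODEL_NAME = "google/flan-t5-base"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def decode_trace(key_sequence: str, n_best: int = 5) -> list[str]:
    # `key_sequence` is a hypothetical serialization of the input signal:
    # the key nearest to each sampled touch point, e.g. "h e k l o" for a
    # noisy tap sequence, or a longer run of swept-over keys for a gesture.
    prompt = f"Decode this keyboard trace into an English word: {key_sequence}"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        num_beams=n_best,
        num_return_sequences=n_best,
        max_new_tokens=8,
    )
    # Beam search yields an n-best candidate list; top-1 accuracy as
    # reported in the abstract would be measured on the first candidate.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

print(decode_trace("h e k l o"))  # noisy taps intended as "hello"

Because the same serialized-trace interface can carry taps, gestures, or a mix of both, a single decoder of this form is what makes the Flexible Typing interaction possible.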

Authors
Yan Ma
Stony Brook University, Stony Brook, New York, United States
Dan Zhang
Stony Brook University, New York City, New York, United States
IV Ramakrishnan
Stony Brook University, Stony Brook, New York, United States
Xiaojun Bi
Stony Brook University, Stony Brook, New York, United States
DOI

10.1145/3706598.3714314

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714314

Conference: CHI 2025

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)

Session: Mobile Input

Room: G416+G417
7 presentations
2025-05-01 18:00–19:30