LLM Powered Text Entry Decoding and Flexible Typing on Smartphones

Large language models (LLMs) have shown exceptional performance in various language-related tasks. However, their application in keyboard decoding, which involves converting input signals (e.g., taps and gestures) into text, remains underexplored. This paper presents a fine-tuned FLAN-T5 model for decoding. It achieves 93.1% top-1 accuracy on user-drawn gestures, outperforming the widely adopted SHARK2 decoder, and 95.4% on real-word tap typing data. In particular, our decoder supports Flexible Typing, allowing users to enter a word with taps, gestures, multi-stroke gestures, or tap-gesture combinations. User study results show that Flexible Typing is beneficial and well received by participants: 35.9% of words were entered using word gestures, 29.0% with taps, 6.1% with multi-stroke gestures, and the remaining 29.0% with tap-gesture combinations. Our investigation suggests that the LLM-based decoder improves decoding accuracy over existing word gesture decoders while enabling the Flexible Typing method, which enhances the overall typing experience and accommodates diverse user preferences.

Stony Brook University, Stony Brook, New York, United States

Stony Brook University, New York City, New York, United States

Stony Brook University, Stony Brook, New York, United States

10.1145/3706598.3714314

https://dl.acm.org/doi/10.1145/3706598.3714314

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)

G416+G417

7 presentations

Start time: 2025-05-01 18:00:00

End time: 2025-05-01 19:30:00
