Mobile Input

Conference Name
CHI 2025
LLM-Powered Text Entry Decoding and Flexible Typing on Smartphones
Abstract

Large language models (LLMs) have shown exceptional performance in various language-related tasks. However, their application in keyboard decoding, which converts input signals (e.g., taps and gestures) into text, remains underexplored. This paper presents a fine-tuned FLAN-T5 model for decoding. It achieves 93.1% top-1 accuracy on user-drawn gestures, outperforming the widely adopted SHARK2 decoder, and 95.4% on real-world tap typing data. In particular, our decoder supports Flexible Typing, allowing users to enter a word with taps, gestures, multi-stroke gestures, or tap-gesture combinations. User study results show that Flexible Typing is beneficial and well received by participants: 35.9% of words were entered with word gestures, 29.0% with taps, 6.1% with multi-stroke gestures, and the remaining 29.0% with tap-gesture combinations. Our investigation suggests that the LLM-based decoder improves decoding accuracy over existing word gesture decoders while enabling Flexible Typing, which enhances the overall typing experience and accommodates diverse user preferences.
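
The abstract does not describe the model's input format, so the sketch below is only a guess at the general recipe: serialize a touch trace into a text prompt and decode it with a seq2seq model. The coordinate encoding, prompt wording, and the stock google/flan-t5-base checkpoint are illustrative assumptions; the paper fine-tunes FLAN-T5 on typing data with its own representation.

```python
# Minimal sketch, NOT the paper's pipeline: serialize a tap/gesture trace as
# text and decode it with an off-the-shelf seq2seq model. A real decoder would
# be fine-tuned on (trace, word) pairs, as the paper does with FLAN-T5.
from transformers import T5ForConditionalGeneration, T5Tokenizer

def serialize_trace(points, kind):
    """Encode a trace as text. `points` are (x, y) keyboard coordinates
    normalized to [0, 1]; `kind` is 'tap' or 'gesture'. Format is assumed."""
    coords = " ".join(f"{x:.2f},{y:.2f}" for x, y in points)
    return f"decode {kind}: {coords}"

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base")

prompt = serialize_trace([(0.12, 0.85), (0.34, 0.40), (0.56, 0.42)], "gesture")
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Without fine-tuning, the stock checkpoint will not produce meaningful words from such prompts; the sketch only shows the plumbing, and beam search over candidates would be needed to score top-1 accuracy as the abstract reports.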

Authors
Yan Ma
Stony Brook University, Stony Brook, New York, United States
Dan Zhang
Stony Brook University, Stony Brook, New York, United States
IV Ramakrishnan
Stony Brook University, Stony Brook, New York, United States
Xiaojun Bi
Stony Brook University, Stony Brook, New York, United States
DOI

10.1145/3706598.3714314

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714314

Enhancing Smartphone Eye Tracking with Cursor-Based Interactive Implicit Calibration
Abstract

The limited accuracy of eye tracking on smartphones restricts its use. Existing RGB-camera-based eye tracking relies on extensive datasets and could be enhanced by continuous fine-tuning on calibration data implicitly collected during interaction. In this context, we propose COMETIC (Cursor Operation Mediated Eye-Tracking Implicit Calibration), which introduces a cursor-based interaction and exploits the inherent correlation between cursor and eye movement. By filtering valid cursor coordinates as proxies for the ground truth of gaze and fine-tuning the eye-tracking model with the corresponding images, COMETIC improves accuracy during interaction. Both filtering and fine-tuning use pre-trained models and can be facilitated by personalized, dynamically updated data. Results show COMETIC achieves an average eye-tracking error of 278.3 px (1.60 cm, 2.29°), a 27.2% improvement over the model without fine-tuning. We found that filtering cursor points whose actual distance to gaze is within 150.0 px (0.86 cm) yields the best eye-tracking results.
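
As a rough illustration of the implicit-calibration loop, the sketch below keeps cursor samples that are nearly stationary and treats them as proxy gaze labels. COMETIC's actual filter is a pre-trained model; the dwell-speed heuristic and threshold here are stand-in assumptions.

```python
# A minimal sketch of the implicit-calibration idea: treat cursor positions
# as proxy gaze labels and keep only "valid" samples. COMETIC filters with a
# pre-trained model; the dwell/velocity heuristic below is a simplified stand-in.
import numpy as np

def filter_proxy_labels(cursor_xy, timestamps, max_speed_px_s=50.0):
    """Keep cursor samples where the cursor is nearly stationary, on the
    assumption that the user is most likely fixating the cursor then."""
    xy = np.asarray(cursor_xy, dtype=float)
    t = np.asarray(timestamps, dtype=float)
    speed = np.linalg.norm(np.diff(xy, axis=0), axis=1) / np.diff(t)
    keep = np.concatenate([[False], speed < max_speed_px_s])
    return xy[keep], keep

# Usage: pair the kept cursor points with the camera frames captured at the
# same timestamps, then fine-tune the gaze model on (frame, cursor_xy) pairs.
```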

Authors
Chang Liu
Tsinghua University, Beijing, China
Xiangyang Wang
Tsinghua University, Beijing, China
Chun Yu
Tsinghua University, Beijing, China
Yingtian Shi
Georgia Institute of Technology, Atlanta, Georgia, United States
Chongyang Wang
Sichuan University, Chengdu, Sichuan, China
Ziqi Liu
Tsinghua University, Beijing, China
Chen Liang
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong, China
Yuanchun Shi
Qinghai University, Xining, Qinghai, China
DOI

10.1145/3706598.3713936

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713936

Everything to Gain: Combining Area Cursors with Increased Control-Display Gain for Fast and Accurate Touchless Input
Abstract

Touchless displays often use mid-air gestures to control on-screen cursors for pointer interactions. Area cursors can simplify touchless cursor input by implicitly targeting nearby widgets without the cursor entering the target. However, on displays with dense target layouts the cursor still has to arrive close to the widget, so the benefits of area cursors for time-to-target and effort are diminished. Through two experiments, we demonstrate for the first time that fine-tuning the mapping between hand and cursor movements (control-display gain, CDG) can address the deficiencies of area cursors and improve the performance of touchless interaction. Across several display sizes and target densities (representative of the myriad public displays used in retail, transport, museums, etc.), our findings show that the forgiving nature of an area cursor compensates for the imprecision of a high CDG, helping users interact more effectively with smaller and more controlled hand and arm movements.
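
A toy sketch of how the two mechanisms compose, with assumed parameter values: a constant control-display gain scales hand motion into cursor motion, and an area cursor acquires the nearest target within its radius, so the cursor never has to enter the target itself.

```python
# Sketch of area cursor + high CDG; the gain and radius values are assumptions.
import math

CDG = 3.0            # assumed gain: 1 cm of hand motion -> 3 cm of cursor motion
AREA_RADIUS = 120.0  # assumed area-cursor radius in display px

def update_cursor(cursor, hand_delta):
    """Scale a hand movement delta (dx, dy) by the control-display gain."""
    return (cursor[0] + CDG * hand_delta[0], cursor[1] + CDG * hand_delta[1])

def area_cursor_target(cursor, targets):
    """Return the nearest target centre within the area cursor, or None."""
    best, best_d = None, AREA_RADIUS
    for t in targets:
        d = math.dist(cursor, t)
        if d < best_d:
            best, best_d = t, d
    return best

# Usage: a small, controlled hand movement covers a lot of screen at high CDG,
# and the area cursor forgives the resulting imprecision around the target.
cursor = update_cursor((400.0, 300.0), hand_delta=(15.0, -10.0))
print(area_cursor_target(cursor, [(450.0, 280.0), (900.0, 600.0)]))
```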

Authors
Kieran Waugh
University of Glasgow, Glasgow, Scotland, United Kingdom
Mark McGill
University of Glasgow, Glasgow, Lanarkshire, United Kingdom
Euan Freeman
University of Glasgow, Glasgow, United Kingdom
DOI

10.1145/3706598.3714021

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714021

WritingRing: Enabling Natural Handwriting Input with a Single IMU Ring
Abstract

Tracking continuous 2D handwriting trajectories accurately with a single IMU ring is extremely challenging due to the significant displacement between the IMU's wearing position and the location of the tracked fingertip. We propose WritingRing, a system that uses a single IMU ring worn at the base of the finger to support natural handwriting input and provide real-time 2D trajectories. To achieve this, we first built a handwriting dataset using a touchpad and an IMU ring (N=20). Next, we improved an LSTM model by incorporating streaming input and a TCN network, significantly enhancing accuracy and computational efficiency and achieving an average trajectory accuracy of 1.63 mm. Real-time usability studies demonstrated that the system achieved 88.7% letter recognition accuracy and 68.2% word recognition accuracy, which rose to 84.36% when the output was restricted to a 3,000-word vocabulary. WritingRing can also be embedded into existing ring systems, providing a natural, real-time solution for various applications.
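
The abstract names the ingredients (streaming input plus a TCN) without architectural details, so the following is a toy stand-in rather than the authors' model: causal dilated convolutions regress per-step fingertip deltas from IMU frames, and integrating the deltas yields a 2D trajectory. Channel counts and layer sizes are invented.

```python
# Toy sketch (not the paper's architecture): a small TCN regressing per-step
# (dx, dy) fingertip deltas from a stream of 6-axis IMU samples.
import torch
import torch.nn as nn

class TinyTCN(nn.Module):
    def __init__(self, imu_channels=6, hidden=64):
        super().__init__()
        # Dilated convolutions with the padding trimmed from the right keep
        # the model causal: each output step only sees past IMU samples,
        # which is what makes streaming (real-time) inference possible.
        self.net = nn.Sequential(
            nn.Conv1d(imu_channels, hidden, kernel_size=3, padding=2, dilation=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=4, dilation=2),
            nn.ReLU(),
            nn.Conv1d(hidden, 2, kernel_size=1),  # per-step (dx, dy)
        )

    def forward(self, imu):                  # imu: (batch, channels, time)
        out = self.net(imu)
        return out[..., :imu.shape[-1]]      # trim the causal padding overhang

deltas = TinyTCN()(torch.randn(1, 6, 100))  # -> (1, 2, 100) fingertip deltas
trajectory = deltas.cumsum(dim=-1)          # integrate deltas into a 2D path
```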

Authors
Zhe He
Tsinghua University, Beijing, China
Zixuan Wang
Tsinghua University, Beijing, China
Chun Yu
Tsinghua University, Beijing, China
Chengwen Zhang
Tsinghua University, Beijing, China
Xiyuan Shen
University of Washington, Seattle, Washington, United States
Yuanchun Shi
Tsinghua University, Beijing, China
DOI

10.1145/3706598.3714066

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714066

MagPie: Extending a Smartphone’s Interaction Space via a Customizable Magnetic Back-of-Device Input Accessory
Abstract

Back-of-Device (BoD) interfaces have emerged as a promising solution to free up screen real estate in smartphones by offloading interactions from the display to the back, thereby reducing reliance on on-screen interfaces. However, existing BoD solutions face limitations, such as requiring specialized hardware, consuming excessive power, or offering limited input vocabularies. We introduce MagPie, a novel BoD interface that leverages the magnetic phenomenon induced by MagSafe, part of the wireless charging standard. Users can seamlessly attach MagPie to MagSafe-enabled smartphones and interact using tangible, modular interfaces that generate unique magnetic signals upon activation. MagPie then detects these signals and recognizes the input through magnetic sensing. Our experiments with real-world users demonstrate that i) MagPie achieves high performance in accuracy, usability, deployability, responsiveness, and robustness across diverse environments, and ii) its tangible, intuitive, and customizable design opens up possibilities for a whole new class of smartphone interaction scenarios.
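
A simplified sketch of the sensing idea, under loud assumptions: each tangible control perturbs the phone's magnetometer with a distinctive signature, and a window of samples is classified by nearest-centroid matching. The template values, baseline handling, and rejection threshold are all invented for illustration; the abstract does not describe MagPie's recognition pipeline at this level.

```python
# Sketch of magnetic-signal input recognition. Templates and threshold are
# hypothetical; MagPie's actual pipeline is not specified in the abstract.
import numpy as np

# Hypothetical per-control signatures: 3-axis field change (uT) on activation
TEMPLATES = {
    "button_a": np.array([12.0, -3.5, 0.8]),
    "button_b": np.array([-7.2, 9.1, 2.4]),
    "dial_cw":  np.array([1.5, 4.0, -11.3]),
}

def recognize(window):
    """Classify a magnetometer window (n, 3): compare the field change from
    the start to the end of the window against the stored templates."""
    window = np.asarray(window, dtype=float)
    delta = window[-1] - window[0]   # crude baseline subtraction
    name, d = min(((k, np.linalg.norm(delta - v)) for k, v in TEMPLATES.items()),
                  key=lambda kv: kv[1])
    return name if d < 5.0 else None  # assumed rejection threshold

# Usage: a window that ends with roughly button_a's field change applied
samples = [[30.0, 20.0, 40.0]] * 5 + [[41.8, 16.6, 40.9]] * 5
print(recognize(samples))   # -> "button_a"
```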

Authors
Insu Kim
Chung-Ang University, Seoul, Korea, Republic of
Suhyeon Shin
Chung-Ang University, Seoul, Korea, Republic of
Jaemin Choi
Chung-Ang University, Seoul, Korea, Republic of
Junseob Kim
Chung-Ang University, Seoul, Korea, Republic of
Junhyub Lee
Chung-Ang University, Seoul, Korea, Republic of
Sangeun Oh
Korea University, Seoul, Korea, Republic of
Eunji Park
Chung-Ang University, Seoul, Korea, Republic of
Hyosu Kim
Chung-Ang University, Seoul, Korea, Republic of
DOI

10.1145/3706598.3713956

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713956

PropType: Everyday Props as Typing Surfaces in Augmented Reality
Abstract

We introduce PropType, an interactive interface that transforms everyday objects into typing surfaces within an Augmented Reality (AR) environment. Users can interact with nearby props, such as cups, water bottles, boxes, and various other objects, utilizing them as on-the-go keyboards. To develop PropType, we conducted three studies. The first study involved observing users to understand how they naturally engage with prop surfaces for typing. The second study assessed the reachability and efficiency of touch input across four props with different sizes and shapes. Based on these insights, we designed customized keyboard layouts for each prop. In the third study, we evaluated typing performance using PropType, achieving an average typing speed of up to 26.1 words per minute (WPM) with 2.2% corrected error rate (CER) and 1.1% uncorrected error rate (UER). Finally, we present a PropType editing tool that allows users to customize keyboard layouts and visual effects for prop-based typing.
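
As an illustration of what prop-specific keyboard mapping might involve (not taken from the paper), the sketch below unwraps a touch on a cylindrical prop such as a cup into normalized 2D keyboard coordinates and quantizes them into an assumed grid layout.

```python
# Illustrative only: mapping a touch on a cylindrical prop to a 2D key grid.
# The grid layout and coordinate conventions are assumptions.
import math

def unwrap_cylinder(touch_xyz, height):
    """Map a 3D touch point on an upright cylindrical prop (axis = z) to
    normalized (u, v) keyboard coordinates in [0, 1]: angle around the
    cylinder becomes the horizontal axis, height the vertical axis."""
    x, y, z = touch_xyz
    u = (math.atan2(y, x) + math.pi) / (2 * math.pi)
    v = max(0.0, min(1.0, z / height))
    return u, v

def hit_key(u, v, rows=3, cols=10):
    """Quantize (u, v) into a cell of an assumed rows x cols key grid."""
    return min(int(v * rows), rows - 1), min(int(u * cols), cols - 1)

# Usage: a touch three-quarters of the way around a 10 cm tall cup, 6 cm up.
u, v = unwrap_cylinder((0.0, 4.0, 6.0), height=10.0)
print(hit_key(u, v))   # -> (row, col) of the assumed layout
```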

Award
Honorable Mention
Authors
Hyunjae Gil
The University of Texas at Dallas, Allen, Texas, United States
Ashish Pratap
University of Texas at Dallas, Richardson, Texas, United States
Iniyan Joseph
University of Texas at Dallas, Richardson, Texas, United States
Jin Ryong Kim
University of Texas at Dallas, Richardson, Texas, United States
DOI

10.1145/3706598.3714056

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714056

Exploring Mobile Touch Interaction with Large Language Models
Abstract

Interacting with Large Language Models (LLMs) for text editing on mobile devices currently requires users to break out of their writing environment and switch to a conversational AI interface. In this paper, we propose to control the LLM via touch gestures performed directly on the text. We first chart a design space that covers fundamental touch input and text transformations. In this space, we then concretely explore two control mappings: spread-to-generate and pinch-to-shorten, with visual feedback loops. We evaluate this concept in a user study (N=14) that compares three feedback designs: no visualisation, text length indicator, and length + word indicator. The results demonstrate that touch-based control of LLMs is both feasible and user-friendly, with the length + word indicator proving most effective for managing text generation. This work lays the foundation for further research into gesture-based interaction with LLMs on touch devices.
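
A small sketch of what a spread-to-generate / pinch-to-shorten mapping could look like. The linear scaling rule and prompt wording are assumptions for illustration; the paper defines its own mappings and pairs them with visual feedback (e.g., the length + word indicator) during generation.

```python
# Sketch: map a pinch/spread gesture to a target text length, then build an
# LLM prompt. The scaling rule and prompt are assumptions, not the paper's.
def target_length(current_words, pinch_scale):
    """Map a gesture scale factor (e.g., 0.5 = pinch in, 2.0 = spread out)
    to a target word count for the transformed text."""
    return max(1, round(current_words * pinch_scale))

def build_prompt(text, pinch_scale):
    n = target_length(len(text.split()), pinch_scale)
    verb = "Expand" if pinch_scale > 1.0 else "Shorten"
    return f"{verb} the following text to about {n} words:\n\n{text}"

# Example: a spread gesture that doubles the selection's on-screen extent
print(build_prompt("Meet at the cafe at noon.", pinch_scale=2.0))
```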

Authors
Tim Zindulka
University of Bayreuth, Bayreuth, Germany
Jannek Maximilian Sekowski
University of Bayreuth, Bayreuth, Germany
Florian Lehmann
University of Bayreuth, Bayreuth, Germany
Daniel Buschek
University of Bayreuth, Bayreuth, Germany
DOI

10.1145/3706598.3713554

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713554
