FineType: Fine-grained Tapping Gesture Recognition for Text Entry
Description

With the rise of mixed reality (MR) and augmented reality (AR) applications, efficient text input in these environments remains challenging. We propose FineType, a text entry system that uses tapping gestures with finger combinations and postures on any flat surface. Using a wristband with an IMU and an infrared camera, we detect tapping events and employ a multi-task convolutional neural network to predict these gestures, enabling a nearly full keyboard mapping (letters, numbers, and symbols) with one hand.

We collected gestures from participants (N=28) covering 10 finger combinations and 3 finger postures for training. Cross-user validation showed accuracies of 98.26% for combinations, 95.53% for postures, and 94.19% for all categories. For 8 newly defined finger combinations and their postures, classification accuracies were 91.27% and 93.86%, respectively. Using user-adaptive few-shot learning, we improved the finger combination accuracy to 97.05%. These results demonstrate the potential of our approach to map tapping gestures composed of all finger combinations and three postures. Our user study (N=10) showed an average typing speed of 35.1 WPM with a character error rate of 5.1% after two hours of practice.
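For reference, the reported speed and error figures follow the standard text-entry metrics. A minimal sketch of the conventional definitions (five characters per "word" for WPM, and an edit-distance-based character error rate); this is the usual convention in text-entry research, not code from the paper:

```python
def wpm(transcribed, seconds):
    # Standard text-entry metric: one "word" is 5 characters, and the
    # first character carries no timing information, hence len - 1.
    return ((len(transcribed) - 1) / seconds) * 60 / 5

def cer(presented, transcribed):
    # Character error rate: minimum edit (Levenshtein) distance between
    # the presented and transcribed strings, as a percentage of the
    # longer string's length. Uses a rolling single-row DP table.
    m, n = len(presented), len(transcribed)
    d = list(range(n + 1))
    for i in range(1, m + 1):
        prev, d[0] = d[0], i
        for j in range(1, n + 1):
            cost = 0 if presented[i - 1] == transcribed[j - 1] else 1
            prev, d[j] = d[j], min(d[j] + 1,      # deletion
                                   d[j - 1] + 1,  # insertion
                                   prev + cost)   # substitution/match
    return 100 * d[n] / max(m, n)
```

For example, transcribing an 11-character phrase in 60 seconds gives (11 − 1)/5 ÷ 1 minute = 2.0 WPM.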

Textoshop: Interactions Inspired by Drawing Software to Facilitate Text Editing
Description

We explore how interactions inspired by drawing software can help edit text. Making an analogy between visual and text editing, we consider words as pixels, sentences as regions, and tones as colours. For instance, direct manipulations move, shorten, expand, and reorder text; tools change number, tense, and grammar; colours map to tones explored along three dimensions in a tone picker; and layers help organize and version text. This analogy also leads to new workflows, such as boolean operations on text fragments to construct more elaborate text. A study shows participants were more successful at editing text and preferred using the proposed interface over existing solutions. Broadly, our work highlights the potential of interaction analogies to rethink existing workflows, while capitalizing on familiar features.
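As a loose illustration of what boolean operations on text fragments could look like, here is a hypothetical word-level sketch; the operator names and semantics are our own assumptions for illustration, not Textoshop's implementation:

```python
def text_difference(fragment, remove):
    # Hypothetical "boolean difference": drop the words of `remove`
    # from `fragment`, preserving the order of the remaining words.
    drop = set(remove.lower().split())
    return " ".join(w for w in fragment.split() if w.lower() not in drop)

def text_union(a, b):
    # Hypothetical "boolean union": append the words of `b` that
    # `a` does not already contain.
    seen = set(a.lower().split())
    extra = [w for w in b.split() if w.lower() not in seen]
    return " ".join([a] + extra) if extra else a
```

Real boolean operations on prose would of course need linguistic rather than purely set-based merging; this only conveys the shape of the analogy.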

There Is More to Dwell Than Meets the Eye: Toward Better Gaze-Based Text Entry Systems With Multi-Threshold Dwell
Description

Dwell-based text entry appears to peak at around 20 words per minute (WPM). Yet little is known about the factors contributing to this limit, except that reaching it requires extensive training. We therefore conducted a longitudinal study, broke the overall dwell-based selection time into six components, and identified several design challenges and opportunities. We then designed two novel dwell keyboards that use multiple, much shorter dwell thresholds: Dual-Threshold Dwell (DTD) and Multi-Threshold Dwell (MTD). Performance analysis showed that MTD (18.3 WPM) outperformed both DTD (15.3 WPM) and the conventional Constant-Threshold Dwell (12.9 WPM). Notably, absolute novices achieved these speeds within just 30 phrases. MTD's 18.3 WPM is also the fastest average text entry speed reported to date for gaze-based keyboards. Finally, we discuss how our chosen parameters can be further optimized to pave the way toward more efficient dwell-based text entry.
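The abstract does not specify how the multiple thresholds are assigned. One plausible sketch, assuming the dwell threshold for a key shrinks with the likelihood of that key being intended (an illustrative policy and made-up constants, not necessarily the paper's):

```python
def dwell_threshold(likelihood, t_min=0.25, t_max=0.60):
    # Hypothetical multi-threshold rule: more likely next characters get
    # shorter dwell times, interpolating linearly between t_min and
    # t_max seconds. The 0.25/0.60 bounds are illustrative only.
    likelihood = max(0.0, min(1.0, likelihood))
    return t_max - (t_max - t_min) * likelihood

def select(key_gaze_time, likelihood):
    # A key is selected once gaze has dwelt on it past its own threshold.
    return key_gaze_time >= dwell_threshold(likelihood)
```

Under this policy a highly probable key fires after 0.25 s of gaze, while an unlikely one still needs the full 0.60 s, which is one way multiple short thresholds could beat a single conservative constant threshold.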

Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation
Description

This study introduces InteractEval, a framework that integrates the outcomes of Think-Aloud (TA) sessions conducted by humans and LLMs to generate attributes for checklist-based text evaluation. By combining humans' flexibility and high-level reasoning with LLMs' consistency and extensive knowledge, InteractEval outperforms text evaluation baselines on a text summarization benchmark (SummEval) and an essay scoring benchmark (ELLIPSE). Furthermore, an in-depth analysis shows that it promotes divergent thinking in both humans and LLMs, leading to a wider range of relevant attributes and improved text evaluation performance. A subsequent comparative analysis reveals that humans excel at identifying attributes related to internal quality (Coherence and Fluency), while LLMs perform better on attributes related to external alignment (Consistency and Relevance). Consequently, leveraging humans and LLMs together produces the best evaluation outcomes, highlighting the necessity of effectively combining the two in automated checklist-based text evaluation.

Plume: Scaffolding Text Composition in Dashboards
Description

Text in dashboards plays multiple critical roles, including providing context, offering insights, guiding interactions, and summarizing key information. Despite its importance, most dashboarding tools focus on visualizations and offer limited support for text authoring. To address this gap, we developed Plume, a system to help authors craft effective dashboard text. Through a formative review of exemplar dashboards, we created a typology of text parameters and articulated the relationship between visual placement and semantic connections, which informed Plume’s design. Plume employs large language models (LLMs) to generate contextually appropriate content and provides guidelines for writing clear, readable text. A preliminary evaluation with 12 dashboard authors explored how assisted text authoring integrates into workflows, revealing strengths and limitations of LLM-generated text and the value of our human-in-the-loop approach. Our findings suggest opportunities to improve dashboard authoring tools by better supporting the diverse roles that text plays in conveying insights.

Simulating Errors in Touchscreen Typing
Description

Empirical evidence shows that typing on touchscreen devices is prone to errors and that correcting them poses a major detriment to users' performance. Designing text entry systems that better serve users across their broad range of capabilities requires understanding the cognitive mechanisms that underpin these errors. However, prior models of typing cover only motor slips. This paper extends the scope of computational modeling of typing to cover the cognitive mechanisms behind the three main types of error: slips (inaccurate execution), lapses (forgetting), and mistakes (incorrect knowledge). Given a phrase, a keyboard, and user parameters, Typoist simulates eye and finger movements while making human-like insertion, omission, substitution, and transposition errors. Its main technical contribution is the formulation of a supervisory control problem wherein the controller allocates cognitive resources to detect and fix errors generated by the various mechanisms. The model generates predictions of typing performance that can inform the design of better text entry systems.
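The four surface error types can be illustrated with a simple noise model. This sketch is not Typoist itself, which ties errors to cognitive mechanisms and a supervisory controller; it is only a hypothetical injector for the same four categories:

```python
import random

def inject_errors(text, rate=0.05, seed=None):
    # Toy injector for the four error types named in the abstract:
    # insertion, omission, substitution, transposition. Each character
    # independently triggers one error with probability `rate`.
    rng = random.Random(seed)  # seeded for reproducible noise
    alphabet = "abcdefghijklmnopqrstuvwxyz "
    chars = list(text)
    out, i = [], 0
    while i < len(chars):
        c = chars[i]
        if rng.random() < rate:
            kind = rng.choice(
                ["insertion", "omission", "substitution", "transposition"])
            if kind == "insertion":            # a stray character slips in
                out.extend([rng.choice(alphabet), c])
            elif kind == "omission":           # the character is skipped
                pass
            elif kind == "substitution":       # a wrong key is hit instead
                out.append(rng.choice(alphabet))
            elif i + 1 < len(chars):           # transposition: swap the pair
                out.extend([chars[i + 1], c])
                i += 1
            else:                              # nothing left to swap with
                out.append(c)
        else:
            out.append(c)
        i += 1
    return "".join(out)
```

A model like Typoist goes well beyond this by deciding *when* each mechanism fires and whether the simulated user notices and corrects the result.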

Investigating Context-Aware Collaborative Text Entry on Smartphones using Large Language Models
Description

Text entry is a fundamental and ubiquitous task, but users often face challenges such as situational impairments or difficulties in sentence formulation. Motivated by this, we explore the potential of large language models (LLMs) to assist with text entry in real-world contexts. We propose a collaborative smartphone-based text entry system, CATIA, that leverages LLMs to provide text suggestions based on contextual factors including screen content, time, location, and activity. In a 7-day in-the-wild study with 36 participants, the system offered appropriate text suggestions in over 80% of cases. Users exhibited different collaborative behaviors depending on whether they were composing text for interpersonal communication or for information services. Additionally, the relevance of contextual factors beyond screen content varied across scenarios. We identified two distinct mental models: AI as a supportive facilitator, or as a more equal collaborator. These findings outline the design space for human-AI collaborative text entry on smartphones.
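A context-conditioned suggestion system of this kind plausibly assembles the contextual factors into an LLM prompt. The template below is a hypothetical sketch; the function name and wording are ours, not CATIA's actual prompt:

```python
def build_prompt(screen_text, time, location, activity, draft):
    # Hypothetical prompt assembly: condition the LLM's suggestion on
    # the contextual factors the abstract lists (screen content, time,
    # location, activity) plus the user's partial draft.
    return (
        "You are helping a smartphone user compose text.\n"
        f"Current screen content: {screen_text}\n"
        f"Time: {time} | Location: {location} | Activity: {activity}\n"
        f"User's draft so far: {draft}\n"
        "Suggest a short, contextually appropriate completion."
    )
```

The returned string would then be sent to an LLM API of choice; the study's finding that non-screen factors matter only in some scenarios suggests such a template might also gate which fields are included.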
