Digital Dexterity: Touching and Typing Techniques

Conference Name
UIST 2023
Robust Finger Interactions with COTS Smartwatches via Unsupervised Siamese Adaptation
Abstract

Wearable devices like smartwatches and smart wristbands have gained substantial popularity in recent years. However, their small interfaces create inconvenience and limit computing functionality. To fill this gap, we propose ViWatch, which enables robust finger interactions under deployment variations, and relies on a single IMU sensor that is ubiquitous in COTS smartwatches. To this end, we design an unsupervised Siamese adversarial learning method. We built a real-time system on commodity smartwatches and tested it with over one hundred volunteers. Results show that the system accuracy is about 97% over a week. In addition, it is resistant to deployment variations such as different hand shapes, finger activity strengths, and smartwatch positions on the wrist. We also developed a number of mobile applications using our interactive system and conducted a user study where all participants preferred our unsupervised approach to supervised calibration. The demonstration of ViWatch is shown at https://youtu.be/N5-ggvy2qfI

Authors
Wenqiang Chen
Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, United States
Ziqi Wang
University of California, Los Angeles, Los Angeles, California, United States
Pengrui Quan
University of California, Los Angeles, Los Angeles, California, United States
Zhencan Peng
Shenzhen University, Shenzhen, China
Shupei Lin
VibInt Limited, Hong Kong, China
Mani Srivastava
University of California, Los Angeles, Los Angeles, California, United States
Wojciech Matusik
MIT, Cambridge, Massachusetts, United States
John Stankovic
University of Virginia, Charlottesville, Virginia, United States
Paper URL

https://doi.org/10.1145/3586183.3606794

Video
Structured Light Speckle: Joint egocentric depth estimation and low-latency contact detection via remote vibrometry
Abstract

Despite advancements in egocentric hand tracking using head-mounted cameras, contact detection with real-world objects remains challenging, particularly for the quick motions often performed during interaction in Mixed Reality. In this paper, we introduce a novel method for detecting touch on discovered physical surfaces purely from an egocentric perspective using optical sensing. We leverage structured laser light to detect real-world surfaces from the disparity of reflections in real-time and, at the same time, extract a time series of remote vibrometry sensations from laser speckle motions. The pattern caused by structured laser light reflections enables us to simultaneously sample the mechanical vibrations that propagate through the user's hand and the surface upon touch. We integrated Structured Light Speckle into TapLight, a prototype system that is a simple add-on to Mixed Reality headsets. In our evaluation with a Quest 2, TapLight---while moving---reliably detected horizontal and vertical surfaces across a range of surface materials. TapLight also reliably detected rapid touch contact and robustly discarded other hand motions to prevent triggering spurious input events. Despite the remote sensing principle of Structured Light Speckle, our method achieved a latency for event detection in realistic settings that matches body-worn inertial sensing without needing such additional instrumentation. We conclude with a series of VR demonstrations for situated interaction that leverage the quick touch interaction TapLight supports.

Authors
Paul Streli
ETH Zürich, Zurich, Switzerland
Jiaxi Jiang
ETH Zürich, Zurich, Switzerland
Juliete Rossie
ETH Zürich, Zurich, Switzerland
Christian Holz
ETH Zürich, Zurich, Switzerland
Paper URL

https://doi.org/10.1145/3586183.3606749

Video
ShadowTouch: Enabling Free-Form Touch-Based Hand-to-Surface Interaction with Wrist-Mounted Illuminant by Shadow Projection
Abstract

We present ShadowTouch, a novel sensing method that recognizes the subtle hand-to-surface touch state of individual fingers with optical assistance. ShadowTouch mounts a forward-facing light source on the user's wrist so that fingers cast shadows on the surface in front of them as they approach it. With this optical design, the subtle vertical movements of near-surface fingers are magnified into shadow features cast on the surface, which computer vision algorithms can recognize. To efficiently recognize the touch state of each finger, we devised a two-stage CNN-based algorithm that first extracts all fingertip regions from each frame and then classifies the touch state of each region from the cropped consecutive frames. Evaluations showed that our touch state detection algorithm achieved a recognition accuracy of 99.1% and an F-1 score of 96.8% in a leave-one-out cross-user setting. We further outlined the hand-to-surface interaction space enabled by ShadowTouch's sensing capability in terms of touch-based interaction, stroke-based interaction, and out-of-surface information, and developed four application prototypes to showcase its interaction potential. A usability study showed the advantages of ShadowTouch over threshold-based techniques: lower mental demand, effort, and frustration, together with greater willingness to use, ease of use, perceived integrity, and confidence.
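The optical principle behind the abstract can be illustrated with a toy sketch (not the authors' CNN pipeline; the brightness profile and threshold below are invented for illustration): the shadow cast ahead of a finger merges with the finger silhouette at the moment of touch, so the bright gap between the two dark regions shrinks to zero.

```python
import numpy as np

def shadow_gap(profile, dark=0.3):
    """Width in pixels of the largest bright gap between dark runs
    (finger silhouette and cast shadow); 0 means they have merged,
    i.e. the finger touches the surface."""
    dark_idx = np.flatnonzero(np.asarray(profile) < dark)
    if len(dark_idx) < 2:
        return None  # no finger/shadow pair visible
    return int((np.diff(dark_idx) - 1).max())

# One synthetic image column through a fingertip, as brightness in [0, 1]:
hovering = [0.1, 0.1, 0.9, 0.9, 0.9, 0.2, 0.2]  # finger | bright gap | shadow
touching = [0.1, 0.1, 0.1, 0.2, 0.2]            # shadow merged with finger

shadow_gap(hovering)  # -> 3 (pixels of gap: hovering)
shadow_gap(touching)  # -> 0 (touch)
```

In the real system this cue is learned by the two-stage CNN from cropped fingertip regions rather than thresholded directly.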

Authors
Chen Liang
Tsinghua University, Beijing, China
Xutong Wang
Tsinghua University, Beijing, China
Zisu Li
The Hong Kong University of Science and Technology, Hong Kong SAR, China
Chi Hsia
Tsinghua University, Beijing, China
Mingming Fan
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Chun Yu
Tsinghua University, Beijing, China
Yuanchun Shi
Tsinghua University, Beijing, China
Paper URL

https://doi.org/10.1145/3586183.3606785

Video
Stereoscopic Viewing and Monoscopic Touching: Selecting Distant Objects in VR Through a Mobile Device
Abstract

In this study, we explore a new way to complementarily utilize the immersive visual output of VR and the physical haptic input of a smartphone. In particular, we focus on interacting with distant virtual objects using a smartphone in a through-plane manner and present a novel selection technique that overcomes the binocular parallax that occurs in such an arrangement. In our proposed technique, when a user in the stereoscopic viewing mode needs to perform a distant selection, the user brings the fingertip near the screen of the mobile device, triggering a smoothly animated transition to the monoscopic touching mode. Using a novel proof-of-concept implementation that utilizes a transparent acrylic panel, we conducted a user study and found that the proposed technique is significantly quicker, more precise, more direct, and more intuitive compared to the ray casting baseline. Subsequently, we created VR applications that explore the rich and interesting use cases of the proposed technique.

Authors
Joon Hyub Lee
KAIST, Daejeon, Korea, Republic of
Taegyu Jin
KAIST, Daejeon, Korea, Republic of
Sang-Hyun Lee
KAIST, Daejeon, Korea, Republic of
Seung-Jun Lee
KAIST, Daejeon, Korea, Republic of
Seok-Hyung Bae
KAIST, Daejeon, Korea, Republic of
Paper URL

https://doi.org/10.1145/3586183.3606809

Video
TouchType-GAN: Modeling Touch Typing with Generative Adversarial Network
Abstract

Models that can generate touch typing tasks are important to the development of touch typing keyboards. We propose TouchType-GAN, a Conditional Generative Adversarial Network that can simulate locations and time stamps of touch points in touch typing. TouchType-GAN takes arbitrary text as input to generate realistic touch typing both spatially (i.e., (x, y) coordinates of touch points) and temporally (i.e., timestamps of touch points). TouchType-GAN introduces a variational generator that estimates Gaussian distributions for every target letter to prevent mode collapse. Our experiments on a dataset with 3k typed sentences show that TouchType-GAN outperforms existing touch typing models, including the Rotational Dual Gaussian model for simulating the distribution of touch points, and the Finger-Fitts Euclidean Model for simulating typing time. Overall, our research demonstrates that the proposed GAN structure can learn the distribution of user-typed touch points, and the resulting TouchType-GAN can also estimate typing movements. TouchType-GAN can serve as a valuable tool for designing and evaluating touch typing input systems.
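The per-letter Gaussian idea can be sketched roughly as follows (an illustration under invented key centers and parameters, not the trained network): the generator emits a mean and spread for each target letter and draws a touch point via the reparameterization trick z = mu + sigma * eps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical key centers in millimetres for three keys; a trained
# TouchType-GAN would learn the per-letter offset and spread from data.
KEY_CENTERS = {"t": (40.0, 10.0), "h": (50.0, 20.0), "e": (22.0, 10.0)}

def sample_touch(letter, mu_offset=(0.0, 1.5), sigma=(2.0, 3.0)):
    """Draw one (x, y) touch point for `letter` from a per-letter Gaussian
    via reparameterization: z = mu + sigma * eps, eps ~ N(0, I)."""
    cx, cy = KEY_CENTERS[letter]
    mu = np.array([cx + mu_offset[0], cy + mu_offset[1]])
    eps = rng.standard_normal(2)
    return mu + np.array(sigma) * eps

# Simulate touch points for typing the word "the":
touches = np.array([sample_touch(ch) for ch in "the"])
```

Reparameterizing keeps sampling differentiable, which is what lets such a generator be trained adversarially.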

Authors
Jeremy Chu
Stony Brook University, Stony Brook, New York, United States
Yan Ma
Stony Brook University, Stony Brook, New York, United States
Shumin Zhai
Google, Mountain View, California, United States
Xianfeng David Gu
Stony Brook University, Stony Brook, New York, United States
Xiaojun Bi
Stony Brook University, Stony Brook, New York, United States
Paper URL

https://doi.org/10.1145/3586183.3606760

Video
C-PAK: Correcting and Completing Variable-length Prefix-based Abbreviated Keystrokes
Abstract

Improving keystroke savings is a long-term goal of text input research. We present a study into the design space of an abbreviated style of text input called C-PAK (Correcting and completing variable-length Prefix-based Abbreviated Keystrokes) for text entry on mobile devices. Given a variable-length and potentially inaccurate input string (e.g., 'li g t m'), C-PAK aims to expand it into a complete phrase (e.g., 'looks good to me'). We develop a C-PAK prototype keyboard, PhraseWriter, based on a current state-of-the-art mobile keyboard consisting of 1.3 million n-grams and 164,000 words. Using computational simulations on a large dataset of realistic input text, we found that, in comparison to conventional single-word suggestions, PhraseWriter improves the maximum keystroke savings rate by 6.7% (from 46.3% to 49.4%), reduces the word error rate by 14.7%, and is particularly advantageous for common phrases. We conducted a lab study of novice user behavior and performance, which found that users could quickly utilize the C-PAK style abbreviations implemented in PhraseWriter, achieving a higher keystroke savings rate than forward suggestions (25% vs. 16%). Furthermore, they intuitively and successfully abbreviated more with common phrases. However, users had a lower overall text entry rate due to their limited experience with the system (28.5 words per minute vs. 37.7). We outline future technical directions to improve C-PAK over the PhraseWriter baseline, and further opportunities to study the perceptual, cognitive, and physical action trade-offs that underlie the learning curve of C-PAK systems.
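The completion half of C-PAK can be sketched as prefix matching over candidate phrases (a toy with an invented phrase list; the real PhraseWriter decodes against a 1.3-million n-gram language model and also corrects noisy prefixes such as 'li' typed for 'lo'):

```python
# Toy prefix-based phrase expansion; PHRASES is invented for illustration.
PHRASES = ["looks good to me", "let me know", "thank you so much"]

def expand(abbrev):
    """Expand space-separated prefix tokens into the first candidate phrase
    whose words each start with the corresponding token."""
    toks = abbrev.split()
    for phrase in PHRASES:
        words = phrase.split()
        if len(words) == len(toks) and all(
            w.startswith(t) for w, t in zip(words, toks)
        ):
            return phrase
    return None

expand("lo g t m")  # -> "looks good to me"
```

Note that exact matching fails on the paper's noisy example 'li g t m'; handling such errors is precisely the "correcting" part of C-PAK that this sketch omits.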

Authors
Tianshi Li
Philip Quinn
Shumin Zhai