Input Techniques

Conference Name
CHI 2022
A Conversational Approach for Modifying Service Mashups in IoT Environments
Abstract

Existing conversational approaches for Internet of Things (IoT) service mashups do not support modification because of usability challenges, even though it is common for users to modify service mashups in IoT environments. To support the modification of IoT service mashups through conversational interfaces in a usable manner, we propose the conversational mashup modification agent (CoMMA). Users can modify IoT service mashups with CoMMA through natural language conversations. CoMMA has a two-step mashup modification interaction: an implicature-based localization step and a modification step with a disambiguation strategy. The localization step allows users to easily search for a mashup by vocalizing expressions about the environment. The modification step helps users modify mashups by speaking simple modification commands. We conducted a user study, and the results show that CoMMA is as effective as visual approaches in terms of task completion time and perceived task workload for modifying IoT service mashups.
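The two-step interaction described in the abstract, first localizing a mashup from a free-form utterance and then applying a spoken command with a clarifying question on ambiguous matches, can be sketched roughly as follows. The mashup data, keyword matching, and command grammar are all illustrative assumptions, not CoMMA's actual design.

```python
# Illustrative sketch of a two-step conversational mashup modification flow:
# step 1 localizes a mashup from a free-form utterance, step 2 applies a
# simple modification command, asking a disambiguation question on ties.

MASHUPS = {
    "hallway-light": {"trigger": "motion in hallway", "action": "turn on hallway light"},
    "bedroom-light": {"trigger": "motion in bedroom", "action": "turn on bedroom light"},
}

STOPWORDS = {"the", "in", "on", "turn"}

def localize(utterance):
    """Return mashup ids whose trigger/action words overlap the utterance."""
    words = set(utterance.lower().split())
    hits = []
    for mid, mashup in MASHUPS.items():
        desc = set((mashup["trigger"] + " " + mashup["action"]).split())
        if words & (desc - STOPWORDS):
            hits.append(mid)
    return hits

def modify(mashup_id, command):
    """Apply a 'change action to ...' style command to the located mashup."""
    prefix = "change action to "
    if command.startswith(prefix):
        MASHUPS[mashup_id]["action"] = command[len(prefix):]
        return f"ok, {mashup_id} now does: {MASHUPS[mashup_id]['action']}"
    return "sorry, I only understand 'change action to <action>'"

def converse(utterance, command):
    candidates = localize(utterance)
    if len(candidates) != 1:  # disambiguation step
        return f"which one do you mean: {sorted(candidates)}?"
    return modify(candidates[0], command)
```

For example, "the hallway one" localizes to a single mashup and the command is applied, whereas "the light" matches both mashups and triggers the disambiguation question.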

Authors
Sanghoon Kim
Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea, Republic of
In-Young Ko
Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea, Republic of
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3517655

Video
Improving Finger Stroke Recognition Rate for Eyes-Free Mid-Air Typing in VR
Abstract

We examine mid-air typing data collected from touch typists to evaluate features and classification models for recognizing finger strokes. A large number of finger movement traces were collected using finger motion capture systems, labeled into individual finger strokes, and characterized by several key features. We test finger kinematic features, including 3D position, velocity, and acceleration, as well as temporal features, including previous fingers and keys. Based on this analysis, we assess the performance of various classifiers, including Naive Bayes, Random Forest, Support Vector Machines, and Deep Neural Networks, in terms of accuracy in correctly classifying keystrokes. We finally incorporate a linguistic heuristic to explore the effectiveness of a character prediction model and improve the total accuracy.
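A minimal sketch of the kinematic features named in the abstract: deriving per-sample velocity and acceleration from a 3D fingertip trace by finite differences. The 120 Hz sampling rate and the trace itself are made-up assumptions, not the paper's data or pipeline.

```python
# Derive per-sample velocity and acceleration features from a 3D fingertip
# trace via finite differences (illustrative; not the paper's pipeline).

def diff(trace, dt):
    """First difference of a list of 3D points, scaled by 1/dt."""
    return [
        tuple((b - a) / dt for a, b in zip(p0, p1))
        for p0, p1 in zip(trace, trace[1:])
    ]

def kinematic_features(trace, rate_hz=120.0):
    """Return (velocity, acceleration) sequences for a position trace."""
    dt = 1.0 / rate_hz
    velocity = diff(trace, dt)
    acceleration = diff(velocity, dt)
    return velocity, acceleration
```

A fingertip moving a constant 1 mm per frame along x at 120 Hz yields a velocity of about 120 mm/s and zero acceleration; in a real pipeline these per-sample features would be fed to the classifiers named in the abstract.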

Authors
Yatharth Singhal
University of Texas at Dallas, Dallas, Texas, United States
Richard Huynh Noeske
The University of Texas at Dallas, Richardson, Texas, United States
Ayush Bhardwaj
The University of Texas at Dallas, Richardson, Texas, United States
Jin Ryong Kim
The University of Texas at Dallas, Richardson, Texas, United States
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3502100

Video
TriboTouch: Micro-Patterned Surfaces for Low Latency Touchscreens
Abstract

Touchscreen tracking latency, often 80 ms or more, creates a rubber-banding effect in everyday direct manipulation tasks such as dragging, scrolling, and drawing. This has been shown to decrease system preference, user performance, and the overall realism of these interfaces. In this research, we demonstrate how the addition of a thin, 2D micro-patterned surface with 5-micron-spaced features can be used to reduce motor-visual touchscreen latency. When a finger, stylus, or tangible is translated across this textured surface, frictional forces induce acoustic vibrations that naturally encode sliding velocity. This acoustic signal is sampled at 192 kHz using a conventional audio interface pipeline with an average latency of 28 ms. When fused with conventional low-speed but high-spatial-accuracy 2D touch position data, our machine learning model can make accurate predictions of real-time touch location.
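The fusion idea can be illustrated as simple dead reckoning: extrapolate the last low-rate touch position forward using the fresher acoustically derived sliding velocity. The function and the numbers below are assumptions for illustration, not TriboTouch's machine learning model.

```python
# Dead-reckoning sketch: extrapolate the last (slow) touchscreen position
# forward using the (fast) acoustically derived sliding velocity.

def fuse(last_position, last_position_time, velocity, now):
    """Predict the current touch location from a stale position sample
    plus a fresh velocity estimate."""
    dt = now - last_position_time
    return tuple(p + v * dt for p, v in zip(last_position, velocity))

# Touch position reported 80 ms ago; finger sliding at 100 px/s along x.
pred = fuse((10.0, 20.0), 0.0, (100.0, 0.0), 0.080)
```

With an 80 ms-old position of (10, 20) px and a 100 px/s sliding velocity, the prediction lands 8 px further along x, illustrating how velocity can compensate for stale position samples.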

Award
Honorable Mention
Authors
Craig Shultz
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Daehwa Kim
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Karan Ahuja
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Chris Harrison
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3502069

Video
HybridTrak: Adding Full-Body Tracking to VR Using an Off-the-Shelf Webcam
Abstract

Full-body tracking in virtual reality improves presence, allows interaction via body postures, and facilitates better social expression among users. However, full-body tracking systems today require a complex setup fixed to the environment (e.g., multiple lighthouses/cameras) and a laborious calibration process, which goes against the desire to make VR systems more portable and integrated. We present HybridTrak, which provides accurate, real-time full-body tracking by augmenting inside-out upper-body VR tracking systems with a single external off-the-shelf RGB web camera. HybridTrak converts users' 2D full-body poses from the webcam to 3D poses, leveraging the inside-out upper-body tracking data with a full-neural solution. We showed that HybridTrak is more accurate than RGB or depth-based tracking methods on the MPI-INF-3DHP dataset. We also tested HybridTrak in the popular VRChat app and showed that body postures presented by HybridTrak are more distinguishable and more natural than those of a solution using an RGBD camera.
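A much simpler stand-in for the lifting idea: anchor a 2D webcam pose to metric space using two joints whose 3D positions the headset already tracks, via a scale-and-offset fit. This is only a geometric sketch under invented data structures; the paper's actual solution is fully neural.

```python
# Rough sketch of anchoring a 2D webcam pose to metric space using two
# joints whose 3D positions the VR system already tracks (illustrative;
# HybridTrak itself uses a neural model, per the abstract).

def anchor_2d_pose(pose_2d, tracked, body_plane_z=0.0):
    """Map all 2D joints to 3D via a scale/offset fit on two tracked joints.

    pose_2d: {joint: (u, v)} pixel coordinates from the webcam
    tracked: {joint: (x, y)} metric coordinates for exactly two known joints
             (assumed vertically separated in the image)
    """
    j1, j2 = sorted(tracked)
    (u1, v1), (u2, v2) = pose_2d[j1], pose_2d[j2]
    (x1, y1), (x2, y2) = tracked[j1], tracked[j2]
    scale = (y2 - y1) / (v2 - v1)  # metres per pixel (vertical fit)
    return {
        j: (x1 + (u - u1) * scale, y1 + (v - v1) * scale, body_plane_z)
        for j, (u, v) in pose_2d.items()
    }
```

Given tracked head and waist positions, untracked lower-body joints (e.g., a knee detected only in the webcam image) inherit plausible metric coordinates; the depth axis is left on a fixed plane, which is exactly the ambiguity the paper's neural approach resolves.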

Authors
Jackie (Junrui) Yang
Stanford University, Stanford, California, United States
Tuochao Chen
EECS, Beijing, Beijing, China
Fang Qin
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Monica Lam
Stanford University, Stanford, California, United States
James A. Landay
Stanford University, Stanford, California, United States
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3502045

Video
Integrating Gaze and Speech for Enabling Implicit Interactions
Abstract

Gaze and speech are rich contextual sources of information that, when combined, can result in effective and rich multimodal interactions. This paper proposes a machine learning-based pipeline that leverages and combines users' natural gaze activity, the semantic knowledge from their vocal utterances, and the synchronicity between gaze and speech data to facilitate users' interaction. We evaluated our proposed approach on an existing dataset in which 32 participants recorded voice notes while reading an academic paper. Using a logistic regression classifier, we demonstrate that our proposed multimodal approach maps voice notes to the correct text passages with an average F1-score of 0.90. Our proposed pipeline motivates the design of multimodal interfaces that combine natural gaze and speech patterns to enable robust interactions.
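One crude way to see the gaze-speech synchronicity signal: score each text passage by how much gaze dwell time overlaps the voice note's time window. This heuristic and its data structures are assumptions for illustration; the paper instead trains a logistic regression classifier on richer gaze, semantic, and synchronicity features.

```python
# Map a voice note to the passage the user was looking at while speaking,
# by summing gaze-fixation time that overlaps the note's time window.
# (Heuristic illustration only; not the paper's learned pipeline.)

def overlap(a_start, a_end, b_start, b_end):
    """Length of the overlap between two time intervals, in seconds."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def match_passage(note_window, fixations):
    """fixations: list of (passage_id, start_s, end_s) gaze fixations.
    Returns the passage with the most dwell time inside note_window."""
    dwell = {}
    for passage, start, end in fixations:
        dwell[passage] = dwell.get(passage, 0.0) + overlap(start, end, *note_window)
    return max(dwell, key=dwell.get)
```

For instance, a note spoken from 2.5 s to 6.5 s maps to whichever passage accumulated the most fixation time inside that window, even if the reader glanced elsewhere briefly.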

Authors
Anam Ahmad Khan
The University of Melbourne, Melbourne, Victoria, Australia
Joshua Newn
The University of Melbourne, Melbourne, VIC, Australia
James Bailey
The University of Melbourne, Melbourne, Victoria, Australia
Eduardo Velloso
University of Melbourne, Melbourne, Victoria, Australia
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3502134

Video