This paper investigates integration, coordination, and transition strategies of gaze and hand input for 3D object manipulation in VR. Specifically, this work aims to understand whether incorporating gaze input can benefit VR object manipulation tasks, and how it should be combined with hand input for improved usability and efficiency. We designed four gaze-supported techniques that leverage different combination strategies for object manipulation and evaluated them in two user studies. Overall, we show that gaze did not offer significant performance benefits for transforming objects in the primary working space, where all objects were located in front of the user and within the arm-reach distance, but can be useful for a larger environment with distant targets. We further offer insights regarding combination strategies of gaze and hand input, and derive implications that can help guide the design of future VR systems that incorporate gaze input for 3D object manipulation.
QWERTY is the primary smartphone text input keyboard configuration. However, insertion and substitution errors caused by hand tremors, often experienced by users with Parkinson's disease, can severely affect typing efficiency and user experience. In this paper, we investigated Parkinson's users' typing behavior on smartphones. In particular, we identified and compared the typing characteristics generated by users with and without Parkinson's symptoms. We then proposed an elastic probabilistic model for input prediction. By incorporating both spatial and temporal features, this model generalized the classical statistical decoding algorithm to correct insertion, substitution and omission errors, while maintaining direct physical interpretation. User study results confirmed that the proposed algorithm outperformed baseline techniques: users reached 22.8 WPM typing speed with a significantly lower error rate and higher user-perceived performance and preference. We concluded that our method could effectively improve the text entry experience on smartphones for users with Parkinson's disease.
We propose a nail-mounted foldable haptic device that provides tactile feedback to mixed reality (MR) environments by pressing against the user’s fingerpad when a user touches a virtual object. What is novel in our device is that it quickly tucks away when the user interacts with real-world objects. Its design allows it to fold back on top of the user’s nail when not in use, keeping the user’s fingerpad free to, for instance, manipulate handheld tools and other objects while in MR. To achieve this, we engineered a wireless and self-contained haptic device, which measures 24×24×41 mm and weighs 9.5 g. Furthermore, our foldable end-effector also features a linear resonant actuator, allowing it to render not only touch contacts (i.e., pressure) but also textures (i.e., vibrations). We demonstrate how our device renders contacts with MR surfaces, buttons, low- and high-frequency textures. In our first user study, we found that participants perceived our device to be more realistic than a previous haptic device that also leaves the fingerpad free (i.e., fingernail vibration). In our second user study, we investigated the participants’ experience while using our device in a real-world task that involved physical objects. We found that our device allowed participants to use the same finger to manipulate handheld tools, small objects, and even feel textures and liquids, without much hindrance to their dexterity, while feeling haptic feedback when touching MR interfaces.
We designed a mid-air input space for restful interactions on the couch. We observed people gesturing in various postures on a couch and found that posture affects the choice of arm motions when no constraints are imposed by a system. Study participants that sat with the arm rested were more likely to use the forearm and wrist, as opposed to the whole arm. We investigate how a spherical input space, where forearm angles are mapped to screen coordinates, can facilitate restful mid-air input in multiple postures. We present two controlled studies. In the first, we examine how a spherical space compares with a planar space in an elbow-anchored setup, with a shoulder-level input space as baseline. In the second, we examine the performance of a spherical input space in four common couch postures that set unique constraints to the arm. We observe that a spherical model that captures forearm movement facilitates comfortable input across different seated postures.
Aerial hoops are circular, hanging devices for both acrobatic exercise and artistic performance that let us explore the role of interactive sonification in physical activity. We present SonicHoop, an augmented aerial hoop that generates auditory feedback via capacitive touch sensing, thus becoming a digital musical instrument that performers can play with their bodies. We compare three sonification strategies through a structured observation study with two professional aerial hoop performers. Results show that SonicHoop fundamentally changes their perception and choreographic processes: instead of translating music into movement, they search for bodily expressions that compose music. Different sound designs affect their movement differently, and auditory feedback, regardless of type of sound, improves movement quality. We discuss opportunities for using SonicHoop as an aerial hoop training tool, as a digital musical instrument, and as a creative object; as well as using interactive sonification in other acrobatic practices to explore full-body vertical interaction.
This work explores the design of marking menus for gaze-based AR/VR menu selection by expert and novice users. It first identifies and explains the challenges inherent in ocular motor control and current eye tracking hardware, including overshooting, incorrect selections, and false activations. Through three empirical studies, we optimized and validated design parameters to mitigate these errors while reducing completion time, task load, and eye fatigue. Based on the findings from these studies, we derived a set of design guidelines to support gaze-based marking menus in AR/VR. To overcome the overshoot errors found with eye-based expert marking menu behaviour, we developed StickyPie, a marking menu technique that enables scale-independent marking input by estimating saccade landing positions. An evaluation of StickyPie revealed that StickyPie was easier to learn than the traditional technique (i.e., RegularPie) and was 10% more efficient after 3 sessions.
Eye gaze and head movement are attractive for hands-free 3D interaction in head-mounted displays, but existing interfaces afford only limited control. Radi-Eye is a novel pop-up radial interface designed to maximise expressiveness with input from only the eyes and head. Radi-Eye provides widgets for discrete and continuous input and scales to support larger feature sets. Widgets can be selected with Look & Cross, using gaze for pre-selection followed by head-crossing as trigger and for manipulation. The technique leverages natural eye-head coordination where eye and head move at an offset unless explicitly brought into alignment, enabling interaction without risk of unintended input. We explore Radi-Eye in three augmented and virtual reality applications, and evaluate the effect of radial interface scale and orientation on performance with Look & Cross. The results show that Radi-Eye provides users with fast and accurate input while opening up a new design space for hands-free fluid interaction.
Text entry by gaze is a useful means of hands-free interaction that is applicable in settings where dictation suffers from poor voice recognition or where spoken words and sentences jeopardize privacy or confidentiality. However, text entry by gaze still shows inferior performance and it quickly exhausts its users. We introduce text entry by gaze and hum as a novel hands-free text entry. We review related literature to converge to word-level text entry by analysis of gaze paths that are temporally constrained by humming. We develop and evaluate two design choices: “HumHum” and “Hummer.” The first method requires short hums to indicate the start and end of a word. The second method interprets one continuous humming as an indication of the start and end of a word. In an experiment with 12 participants, Hummer achieved a commendable text entry rate of 20.8 words per minute and outperformed HumHum and the gaze-only method EyeSwipe in both quantitative and qualitative measures.
There have been promising studies that show a potential of providing social signal feedback to improve communication skills. However, these studies have primarily focused on unimodal methods of feedback. In addition to this, studies do not assess whether skills are maintained after a given time. With a sample size of 22 this paper investigates whether multimodal social signal feedback is an effective method of improving communication in the context of media interviews. A pre-post experimental evaluation of media skills training intervention is presented which compares standard feedback with augmented feedback based on automated recognition of multimodal social signals. Results revealed significantly different training effects between the two conditions. However, the initial experiment study failed to show significant differences in human judgement of performance. A 6-month follow-up study revealed human judgement ratings were higher for the experiment group. This study suggests that augmented selective multimodal social signal feedback is an effective method for communication skills training.
We explore how discreet input can be provided using the tensor tympani - a small muscle in the middle ear that some people can voluntarily contract to induce a dull rumbling sound. We investigate the prevalence and ability to control the muscle through an online questionnaire (N=192) in which 43.2% of respondents reported the ability to "ear rumble". Data collected from participants (N=16) shows how in-ear barometry can be used to detect voluntary tensor tympani contraction in the sealed ear canal. This data was used to train a classifier based on three simple ear rumble "gestures" which achieved 95% accuracy. Finally, we evaluate the use of ear rumbling for interaction, grounded in three manual, dual-task application scenarios (N=8). This highlights the applicability of EarRumble as a low-effort and discreet eyes- and hands-free interaction technique that users found "magical" and "almost telepathic".
Using microgestures, prior work has successfully enabled gestural interactions while holding objects. Yet, these existing methods are prone to false activations caused by natural finger movements while holding or manipulating the object. We address this issue with SoloFinger, a novel concept that allows design of microgestures that are robust against movements that naturally occur during primary activities. Using a data-driven approach, we establish that single-finger movements are rare in everyday hand-object actions and infer a single-finger input technique resilient to false activation. We demonstrate this concept's robustness using a white-box classifier on a pre-existing dataset comprising 36 everyday hand-object actions. Our findings validate that simple SoloFinger gestures can relieve the need for complex finger configurations or delimiting gestures and that SoloFinger is applicable to diverse hand-object actions. Finally, we demonstrate SoloFinger's high performance on commodity hardware using random forest classifiers.
Inefficient menu interfaces lead to system and application commands being tedious to execute in Immersive Environments. HoloBar is a novel approach to ease the interaction with multi-level menus in immersive environments: with HoloBar, the hierarchical menu splits between the field of view (FoV) of the Head Mounted Display and the smartphone (SP). Command execution is based on around-the-FoV interaction with the SP, and touch input on the SP display. The HoloBar offers a unique combination of features, namely rapid mid-air activation, implicit selection of top-level items and preview of second-level items on the SP, ensuring rapid access to commands. In a first study we validate its activation method, which consists in bringing the SP within an activation distance from the FoV. In a second study, we compare the HoloBar to two alternatives, including the standard HoloLens menu. Results show that the HoloBar shortens each step of a multi-level menu interaction (menu activation, top-level item selection, second-level item selection and validation), with a high success rate. A follow-up study confirms that these results remain valid when compared with the two validation mechanisms of HoloLens (Air-Tap and clicker).
Learning a musical instrument requires regular exercise. However, students are often on their own during their practice sessions due to the limited time with their teachers, which increases the likelihood of mislearning playing techniques. To address this issue, we present Let's Frets - a modular guitar learning system that provides visual indicators and capturing of finger positions on a 3D-printed capacitive guitar fretboard. We based the design of Let's Frets on requirements collected through in-depth interviews with professional guitarists and teachers. In a user study (N=24), we evaluated the feedback modules of Let's Frets against fretboard charts. Our results show that visual indicators require the least time to realize new finger positions while a combination of visual indicators and position capturing yielded the highest playing accuracy. We conclude how Let's Frets enables independent practice sessions that can be translated to other musical instruments.