Virtual Reality (VR) offers portable and flexible workspaces. However, enabling efficient and comfortable interactions without external input devices remains challenging. We propose leveraging redirected input to enable comfortable, touch-like interaction for quick and intuitive control. Our design study revealed that while touch interaction performs well with direct input, its performance degrades significantly under input redirection. In contrast, pinch gestures improve redirected input by providing self-haptic feedback and reducing input dimensionality, thereby compensating for spatial discrepancies. Based on these findings, we introduce Redirected Pinch, a bare-hand interaction technique that combines input redirection with pinch confirmation. It creates a virtual plane at waist height, remapping hand movements on the plane to a vertical window, with pinch gestures used for confirmation. A user study demonstrated that Redirected Pinch achieves a strong balance of accuracy, efficiency, comfort, and sense of agency across fundamental interactions.
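As a minimal sketch of the plane-to-window remapping and pinch confirmation described above, the snippet below assumes an axis-aligned waist-height plane mapped onto a vertical window; the function names, axes, and thresholds are illustrative assumptions, not the authors' implementation.

\begin{verbatim}
# Hypothetical sketch: remap a hand position on a horizontal plane at
# waist height onto a cursor position on a vertical window, and detect
# a pinch for confirmation. Axes and dimensions are assumptions.

def remap_to_window(hand_xyz, plane_origin, plane_size,
                    window_origin, window_size):
    # Normalized hand position on the horizontal plane (x = left-right,
    # z = forward-back), clamped so the cursor stays inside the window.
    u = (hand_xyz[0] - plane_origin[0]) / plane_size[0]
    v = (hand_xyz[2] - plane_origin[2]) / plane_size[1]
    u, v = min(max(u, 0.0), 1.0), min(max(v, 0.0), 1.0)
    # Forward motion on the plane maps to upward motion on the window.
    return (window_origin[0] + u * window_size[0],
            window_origin[1] + v * window_size[1])

def is_pinch(thumb_tip, index_tip, threshold=0.02):
    # Confirmation: thumb-index distance below ~2 cm counts as a pinch.
    d = sum((a - b) ** 2 for a, b in zip(thumb_tip, index_tip)) ** 0.5
    return d < threshold
\end{verbatim}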
Target disambiguation is crucial in resolving input ambiguity in augmented reality (AR), especially for queries over distant objects or cluttered scenes on the go. Yet, visual feedforward techniques that support this process remain underexplored. We present Uncertain Pointer, a systematic exploration of feedforward visualizations that annotate multiple candidate targets before user confirmation, either by adding distinct visual identities (e.g., colors) to support disambiguation or by modulating visual intensity (e.g., opacity) to convey system uncertainty. First, we construct a pointer space of 25 pointers by analyzing existing placement strategies and visual signifiers used in target visualizations across 30 years of relevant literature. We then evaluate them through two online experiments (n = 60 and 40), measuring user preference, confidence, mental ease, target visibility, and identifiability across varying object distances and sparsities. Finally, from the results, we derive design recommendations for choosing among Uncertain Pointers based on AR context and disambiguation techniques.
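As one concrete reading of the intensity-modulated pointers mentioned above, the sketch below maps per-candidate probabilities to opacities so that less likely targets fade out; the normalization and alpha range are illustrative assumptions and do not correspond to any specific pointer in the evaluated set.

\begin{verbatim}
# Hypothetical sketch: convey system uncertainty by modulating opacity.
# Each candidate target gets an alpha proportional to its estimated
# probability of being the intended target.

def candidate_opacities(scores, alpha_min=0.2, alpha_max=1.0):
    total = sum(scores)
    probs = ([s / total for s in scores] if total > 0
             else [1.0 / len(scores)] * len(scores))
    top = max(probs)
    # The most likely candidate is fully opaque; others fade with probability.
    return [alpha_min + (alpha_max - alpha_min) * (p / top) for p in probs]

print(candidate_opacities([0.6, 0.3, 0.1]))  # -> [1.0, 0.6, 0.33...]
\end{verbatim}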
Selecting out-of-reach objects is a fundamental task in mixed reality (MR).
Existing methods rely on a single cue or deterministically fuse multiple cues, leading to performance degradation when the dominant cue becomes unreliable.
In this work, we introduce a probabilistic cue integration framework that enables flexible combination of multiple user-generated cues for intent inference.
Inspired by natural grasping behavior, we instantiate the framework with pointing direction and grasp gestures as a new interaction technique, \textsc{Point\&Grasp}.
To this end, we collect the \datasetfullname~(\dataset) dataset to train a robust likelihood model of the gestural cue, which captures grasping patterns not present in existing in-reach datasets.
User studies demonstrate that our selection method with cue integration not only improves accuracy and speed over single-cue baselines, but also remains effective in practice compared to state-of-the-art methods across various sources of ambiguity. The dataset and code are available at \url{https://github.com/drlxj/point-and-grasp}.
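As a minimal sketch of the kind of probabilistic cue integration described above, the snippet below combines a prior over candidate targets with independent per-cue likelihoods in a naive-Bayes fashion; the Gaussian pointing model and the stubbed grasp-gesture likelihoods are illustrative stand-ins, not the trained models from the paper.

\begin{verbatim}
import math

# Hypothetical sketch: the posterior over candidate targets is proportional
# to the prior times the product of per-cue likelihoods (cues assumed
# conditionally independent given the target).

def pointing_likelihood(angle_error_rad, sigma=0.1):
    # Angular error between the pointing ray and a candidate, modeled as a
    # zero-mean Gaussian (assumption).
    return math.exp(-0.5 * (angle_error_rad / sigma) ** 2)

def integrate_cues(prior, cue_likelihoods):
    # cue_likelihoods[c][t]: likelihood of cue c's observation given target t.
    posterior = list(prior)
    for likelihoods in cue_likelihoods:
        posterior = [p * l for p, l in zip(posterior, likelihoods)]
    total = sum(posterior)
    return [p / total for p in posterior] if total > 0 else prior

# Example: two candidates, a pointing cue and a (stubbed) grasp-gesture cue.
prior = [0.5, 0.5]
pointing = [pointing_likelihood(0.05), pointing_likelihood(0.20)]
grasp = [0.3, 0.7]   # e.g. output of a learned grasp-likelihood model
print(integrate_cues(prior, [pointing, grasp]))
\end{verbatim}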
Disambiguating distal target selection in dense and occluded virtual environments has been a challenge for Virtual Reality (VR) interaction design. While raycasting is a widely used interaction technique for selecting distant objects, it defaults to the first intersected target, forcing users into disambiguation phases, which can disrupt presence, increase cognitive load, and slow interaction. We introduce VoiceRay, a voice-based target selection technique that allows users to specify the ordinal position of the intended target along the ray (e.g., “second object”) without altering the scene or requiring additional inputs from the user. In a study with 24 participants, VoiceRay was compared against five existing techniques: AlphaCursor, LassoGrid, RayCursor, BubbleRay, and Raycasting. Results showed that VoiceRay significantly decreased selection time, maintained presence, increased usability, and reduced cognitive load. These findings demonstrate that voice-based interaction offers an effective, easy-to-use alternative for resolving 3D selection ambiguity in dense and occluded VR environments.
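As a minimal sketch of ordinal selection along a ray, the snippet below assumes that the ray-scene intersections are already available and that the speech recognizer returns a single ordinal word; the names and fallback behavior are illustrative assumptions.

\begin{verbatim}
# Hypothetical sketch: pick the object whose order along the selection ray
# matches the spoken ordinal ("first", "second", ...).

ORDINALS = {"first": 1, "second": 2, "third": 3, "fourth": 4, "fifth": 5}

def select_by_ordinal(hits, spoken_word):
    # hits: list of (distance_along_ray, object_id) for every intersection.
    rank = ORDINALS.get(spoken_word.lower())
    if rank is None:
        return None              # unrecognized word: fall back to raycasting
    ordered = sorted(hits, key=lambda h: h[0])   # near-to-far along the ray
    return ordered[rank - 1][1] if rank <= len(ordered) else None

# A ray pierces three stacked objects; "second" selects the middle one.
print(select_by_ordinal([(2.1, "A"), (3.4, "B"), (5.0, "C")], "second"))
\end{verbatim}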
Dexterous freehand manipulation in virtual reality offers rich interaction but is limited by physical reach. Existing indirect remote manipulation techniques often sacrifice this dexterity. We address this by defining "virtually direct" manipulation, a conceptual framework for techniques that break from a purely direct or indirect model by decoupling the virtual hand from the physical one. Within this framework, we present Hitchhiking Hands (HH), a novel implementation designed to preserve the rich dexterity of direct manipulation at a distance. HH allows users to instantly switch control between multiple pre-defined virtual hands using gaze. We evaluated HH in two user studies. A first study showed that our approach, which consistently maintains direct-touch properties, surpasses an established baseline in 6DoF manipulation performance and embodiment. A second qualitative study revealed that HH excels in structured spaces but is ill-suited for unstructured global tasks, highlighting a trade-off between flexibility and learnability.
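As a minimal sketch of gaze-based switching between predefined virtual hands, the snippet below hands control to whichever virtual hand lies closest to the gaze ray; the angular threshold and the absence of dwell-time handling are simplifying assumptions, not the authors' implementation.

\begin{verbatim}
import math

# Hypothetical sketch: switch control to the predefined virtual hand the
# user is looking at. gaze_dir is assumed to be a unit vector.

def angle_to(gaze_origin, gaze_dir, point):
    to_p = [p - o for p, o in zip(point, gaze_origin)]
    norm = math.sqrt(sum(c * c for c in to_p)) or 1e-9
    dot = sum(d * c / norm for d, c in zip(gaze_dir, to_p))
    return math.acos(max(-1.0, min(1.0, dot)))

def pick_virtual_hand(gaze_origin, gaze_dir, hand_anchors, max_angle=0.15):
    # Return the index of the virtual hand closest to the gaze ray, or None
    # if no hand lies within the angular threshold (~8.6 degrees).
    best, best_angle = None, max_angle
    for i, anchor in enumerate(hand_anchors):
        a = angle_to(gaze_origin, gaze_dir, anchor)
        if a < best_angle:
            best, best_angle = i, a
    return best
\end{verbatim}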
Virtual hand selection techniques in AR/VR face a persistent challenge due to the inherent speed–accuracy trade-off. Although target prediction offers a promising direction, its practical adoption is limited by the inevitable errors of predictive models. We present Motion-Touch, a selection technique that integrates a Kinematics-Based Adaptive Switch (KBAS) with deep-learning-based target prediction. KBAS adaptively switches between the two phases of the pointing process under distinct kinematic conditions: an untriggerable ballistic phase and a corrective phase in which only the AI-predicted target can be triggered through Touch. We collected a hand kinematics dataset from 20 participants to support model training and mechanism calibration. Compared to baseline techniques, Motion-Touch achieves selection times statistically comparable to the fastest reliable controller, while offering controller-free, error-free selection with minimal trigger effort. Our findings demonstrate how Motion-Touch achieves a near-optimal compromise for the speed–accuracy trade-off in virtual hand selection.
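As a minimal sketch of a kinematics-based adaptive switch, the snippet below uses a hand-speed threshold with hysteresis to decide when selection may be triggered; the thresholds are illustrative assumptions, and the deep-learning target predictor is outside the scope of the snippet.

\begin{verbatim}
# Hypothetical sketch: keep selection untriggerable during the fast
# ballistic phase and allow touch confirmation of the AI-predicted target
# only during the slow corrective phase. Thresholds are assumptions (m/s).

BALLISTIC, CORRECTIVE = "ballistic", "corrective"

class KinematicSwitch:
    def __init__(self, enter_corrective=0.05, exit_corrective=0.25):
        # Hysteresis: drop below enter_corrective to allow triggering,
        # rise above exit_corrective to lock it again.
        self.enter_corrective = enter_corrective
        self.exit_corrective = exit_corrective
        self.phase = BALLISTIC

    def update(self, hand_speed):
        if self.phase == BALLISTIC and hand_speed < self.enter_corrective:
            self.phase = CORRECTIVE
        elif self.phase == CORRECTIVE and hand_speed > self.exit_corrective:
            self.phase = BALLISTIC
        return self.phase

    def can_trigger(self, touched_object, predicted_target):
        # Only the predicted target can be triggered, and only while the
        # hand is in the corrective phase.
        return self.phase == CORRECTIVE and touched_object == predicted_target
\end{verbatim}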
Perceptual grouping enables people to organize elements into units according to intrinsic (e.g., proximity) and extrinsic (e.g., common region) principles. However, the role of physical surfaces as extrinsic grouping cues for virtual elements in Augmented Reality (AR) remains unclear. To provide a deeper understanding, we conducted two within-subject studies. The first study (N = 24), using repetition discrimination tasks, revealed that surfaces can act as common-region cues in 3D, with their influence depending on their distance to target objects along the viewing direction. Building on these findings, the second study (N = 24) employed both objective and subjective measures to capture the interaction between proximity and common-region cues in AR. Results indicate that competing cues reduce grouping clarity, and they allow us to distill people's strategies for improving clarity by leveraging their physical and virtual environments. Finally, we propose design recommendations for future AR systems that assist grouping tasks.