Automated biomechanical testing has great potential for the development of VR applications, as initial insights into user behaviour can be gained in silico early in the design process. In particular, it allows prediction of user movements and ergonomic variables, such as fatigue, prior to conducting user studies. However, there is a fundamental disconnect between simulators hosting state-of-the-art biomechanical user models and simulators used to develop and run VR applications. Existing user simulators often struggle to capture the intricacies of real-world VR applications, reducing the ecological validity of user predictions. In this paper, we introduce SIM2VR, a system that aligns user simulation with a given VR application by establishing a continuous closed loop between the two processes. This, for the first time, enables training simulated users directly in the same VR application that real users interact with. We demonstrate that SIM2VR can predict differences in user performance, ergonomics and strategies in a fast-paced, dynamic arcade game. In order to expand the scope of automated biomechanical testing beyond simple visuomotor tasks, advances in cognitive models and reward function design will be needed.
https://doi.org/10.1145/3654777.3676452
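A minimal sketch of the closed loop SIM2VR describes: the running VR application is wrapped as a reinforcement-learning environment in which a simulated biomechanical user acts. All class and function names below are hypothetical placeholders rather than the authors' implementation, and the environment dynamics are stubbed out for illustration.

```python
import numpy as np

class VRAppEnv:
    """Hypothetical wrapper exposing a running VR application as an RL
    environment; observations and rewards here are stubs for illustration."""

    def reset(self):
        return np.zeros(16)                        # placeholder observation

    def step(self, action):
        obs = np.random.randn(16)                  # next observation from the VR app
        reward = -1e-3 * np.linalg.norm(action)    # placeholder reward (effort penalty)
        done = np.random.rand() < 0.01             # episode ends on task completion
        return obs, reward, done

def train(env, policy, episodes=10):
    """Closed loop: the simulated user is trained in the same application
    that real users interact with."""
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            action = policy(obs)                   # muscle-level control of the user model
            obs, reward, done = env.step(action)
            # ...update the policy from (obs, action, reward) with an RL algorithm

train(VRAppEnv(), policy=lambda obs: np.zeros(8))
```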
Extended Reality (XR) systems with hand-tracking support direct manipulation of objects with both hands. A common interaction in this context is for the non-dominant hand (NDH) to orient an object for input by the dominant hand (DH). We explore bimanual interaction with gaze through three new modes of interaction where the input of the NDH, DH, or both hands is indirect based on Gaze+Pinch. These modes enable a new dynamic interplay between our hands, allowing flexible alternation between and pairing of complementary operations. Through applications, we demonstrate several use cases in the context of 3D modelling, where users exploit occlusion-free, low-effort, and fluid two-handed manipulation. To gain a deeper understanding of each mode, we present a user study on an asymmetric rotate-translate task. Most participants preferred indirect input with both hands for lower physical effort, without a penalty on user performance. Otherwise, they preferred modes where the NDH oriented the object directly, supporting preshaping of the hand, which is more challenging with indirect gestures. The insights gained are of relevance for the design of XR interfaces that aim to leverage eye and hand input in tandem.
https://doi.org/10.1145/3654777.3676331
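The three modes can be pictured as a small dispatch rule: depending on which hands are configured as indirect, a pinch acts either on the object the hand touches directly or on the object the eyes currently fixate. The sketch below uses hypothetical names and illustrates the idea only; it is not the paper's implementation.

```python
from enum import Enum

class Mode(Enum):
    NDH_INDIRECT = 1    # non-dominant hand acts on the gazed object
    DH_INDIRECT = 2     # dominant hand acts on the gazed object
    BOTH_INDIRECT = 3   # both hands act on the gazed object

def target_for(hand, mode, gazed_object, touched_object):
    """Resolve which object a pinching hand manipulates under a given mode."""
    indirect = (
        mode is Mode.BOTH_INDIRECT
        or (mode is Mode.NDH_INDIRECT and hand == "ndh")
        or (mode is Mode.DH_INDIRECT and hand == "dh")
    )
    return gazed_object if indirect else touched_object

# e.g. with both hands indirect, a pinch grabs whatever the eyes rest on:
print(target_for("dh", Mode.BOTH_INDIRECT, gazed_object="cube", touched_object=None))
```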
We introduce Pro-Tact, a novel eyes-free pointing technique for interacting with out-of-view (OoV) VR menus. This technique combines rapid rough pointing using proprioception with fine-grain adjustments through tactile exploration, enabling menu interaction without visual attention. Our user study demonstrated that Pro-Tact allows users to select menu items accurately (95% accuracy for 54 items) in an eyes-free manner, with reduced fatigue and sickness compared to eyes-engaged interaction. Additionally, we observed that participants voluntarily interacted with OoV menus eyes-free when Pro-Tact's tactile feedback was provided in practical VR application usage contexts. This research contributes by introducing the novel interaction technique, Pro-Tact, and quantitatively evaluating its benefits in terms of performance, user experience, and user preference in OoV menu interactions.
https://doi.org/10.1145/3654777.3676324
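One way to read Pro-Tact is as a two-phase selection: a ballistic, proprioception-driven reach toward the out-of-view menu, followed by tactile exploration that snaps to individual items and confirms one of them. The sketch below is a hypothetical reconstruction of that control flow; the grid-shaped menu, the sample format, and the feedback hook are assumptions, not the paper's design.

```python
def select_item(samples, menu_grid, cell_size, pulse=lambda item: None):
    """Eyes-free selection over a stream of ((x, y), confirmed) hand readings.

    menu_grid : dict mapping (col, row) -> item label of the out-of-view menu
    cell_size : edge length of one menu cell, in the same units as positions
    pulse     : tactile feedback hook, fired whenever the explored item changes
    """
    current = None
    for (x, y), confirmed in samples:
        # Phase 1 happens implicitly: proprioception brings the hand near the menu.
        # Phase 2: tactile exploration, snapping the position to discrete cells.
        cell = (int(x // cell_size), int(y // cell_size))
        item = menu_grid.get(cell)
        if item is not None and item != current:
            pulse(item)          # tactile tick signals crossing into a new item
            current = item
        if confirmed:
            return current
    return None

# Example: a 3x3 menu; the hand drifts across two cells and confirms on the second.
menu = {(c, r): f"item{r * 3 + c}" for c in range(3) for r in range(3)}
trace = [((0.02, 0.02), False), ((0.13, 0.02), False), ((0.13, 0.02), True)]
print(select_item(trace, menu, cell_size=0.1))   # -> item1
```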
We present GradualReality, a novel interface enabling a Cross Reality experience that includes gradual interaction with physical objects in a virtual environment and supports both presence and usability. Daily Cross Reality interaction is challenging as the user's physical object interaction state is continuously changing over time, causing their attention to frequently shift between the virtual and physical worlds. As such, presence in the virtual environment and seamless usability for interacting with physical objects should be maintained at a high level. To address this issue, we present an Interaction State-Aware Blending approach that (i) balances immersion and interaction capability and (ii) provides a fine-grained, gradual transition between virtual and physical worlds. The key idea includes categorizing the flow of physical object interaction into multiple states and designing novel blending methods that offer optimal presence and sufficient physical awareness at each state. We performed extensive user studies and interviews with a working prototype and demonstrated that GradualReality provides better Cross Reality experiences compared to baselines.
https://doi.org/10.1145/3654777.3676463
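The core of Interaction State-Aware Blending can be pictured as mapping each stage of a physical-object interaction to how much passthrough of the physical world is blended into the headset view, with smooth transitions between stages. The state names, blend values, and rate-limiting below are an illustrative guess, not the paper's actual design.

```python
from enum import Enum, auto

class InteractionState(Enum):
    IDLE = auto()       # no intent to touch a physical object
    APPROACH = auto()   # hand moving toward a physical object
    CONTACT = auto()    # object grasped or in use
    RELEASE = auto()    # object put down, returning to VR

# Fraction of physical-world passthrough blended into the virtual view per state.
BLEND_LEVEL = {
    InteractionState.IDLE: 0.0,       # full immersion
    InteractionState.APPROACH: 0.4,   # partial passthrough around the object
    InteractionState.CONTACT: 0.8,    # strong physical awareness while handling it
    InteractionState.RELEASE: 0.3,    # gradual return to the virtual scene
}

def update_blend(current, state, dt, speed=2.0):
    """Move the current blend toward the state's target at a limited rate,
    so transitions between worlds stay gradual rather than abrupt."""
    target = BLEND_LEVEL[state]
    step = speed * dt
    return min(target, current + step) if target > current else max(target, current - step)

# Example: one 16 ms frame after the hand starts approaching a physical object.
print(update_blend(0.0, InteractionState.APPROACH, dt=0.016))
```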
Text input is a critical component of any general-purpose computing system, yet efficient and natural text input remains a challenge in AR and VR. Headset-based hand-tracking has recently become pervasive among consumer VR devices and affords the opportunity to enable touch typing on virtual keyboards. We present an approach for decoding touch typing on uninstrumented flat surfaces using only egocentric camera-based hand-tracking as input. While egocentric hand-tracking accuracy is limited by issues like self-occlusion and image fidelity, we show that a sufficiently diverse training set of hand motions paired with typed text can enable a deep learning model to extract signal from this noisy input. Furthermore, by carefully designing a closed-loop data collection process, we can train an end-to-end text decoder that accounts for natural sloppy typing on virtual keyboards. We evaluate our work with a user study (n=18) showing a mean online throughput of 42.4 WPM with an uncorrected error rate (UER) of 7% for our method, compared to a physical keyboard baseline of 74.5 WPM at 0.8% UER, showing progress towards unlocking productivity and high-throughput use cases in AR/VR.
https://doi.org/10.1145/3654777.3676343
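To make "end-to-end text decoder" concrete, one plausible shape for such a model is a recurrent network over per-frame hand keypoints trained with a CTC objective, which aligns noisy, sloppily timed keystrokes with the typed text without frame-level labels. The architecture, feature layout, and hyperparameters below are assumptions for illustration; the paper's actual model may differ substantially.

```python
import torch
import torch.nn as nn

class TouchTypingDecoder(nn.Module):
    """Illustrative sequence decoder: per-frame hand keypoints in,
    per-frame character log-probabilities out."""

    def __init__(self, n_keypoints=21, dims=3, hidden=256, n_chars=30):
        super().__init__()
        self.rnn = nn.GRU(n_keypoints * dims * 2,   # both hands, flattened per frame
                          hidden, num_layers=2, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_chars + 1)   # +1 for the CTC blank

    def forward(self, frames):                      # frames: (batch, time, features)
        feats, _ = self.rnn(frames)
        return self.head(feats).log_softmax(-1)

model = TouchTypingDecoder()
frames = torch.randn(4, 200, 21 * 3 * 2)            # 4 sequences of 200 tracked frames
log_probs = model(frames).permute(1, 0, 2)           # CTCLoss expects (time, batch, chars)
targets = torch.randint(1, 31, (4, 12))              # dummy character indices of typed text
loss = nn.CTCLoss(blank=0)(log_probs, targets,
                           input_lengths=torch.full((4,), 200),
                           target_lengths=torch.full((4,), 12))
```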
Hand-tracking in Extended Reality (XR) enables moving objects in near space with direct hand gestures, to pick, drag and drop objects in 3D. In this work, we investigate the use of eye-tracking to reduce the effort involved in this interaction. As the eyes naturally look ahead to the target for a drag operation, the principal idea is to map the translation of the object in the image plane to gaze, such that the hand only needs to control the depth component of the operation. We have implemented four techniques that explore two factors: the use of gaze only to move objects in X-Y vs. extra refinement by hand, and the use of hand input in the Z axis to directly move objects vs. indirectly via a transfer function. We compared all four techniques in a user study (N=24) against baselines of direct and indirect hand input. We detail user performance, effort and experience trade-offs and show that all eye-hand techniques significantly reduce physical effort over direct gestures, pointing toward effortless drag-and-drop for XR environments.
https://doi.org/10.1145/3654777.3676446
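The central mapping of these gaze-supported techniques, gaze carrying the object in the image plane while the hand contributes only depth, can be sketched in a few lines. The function name, the clamping, and the simple gain-based transfer function are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np

def gaze_hand_drag(eye_pos, gaze_dir, current_depth, hand_z_delta, gain=2.0):
    """Place the dragged object along the gaze ray; the hand only changes depth.

    eye_pos, gaze_dir : 3D eye position and normalized gaze direction
    current_depth     : current distance of the object along the gaze ray
    hand_z_delta      : hand displacement along the depth axis this frame
    gain              : depth transfer function for the indirect variants
                        (a gain of 1.0 corresponds to direct depth control)
    """
    new_depth = max(0.1, current_depth + gain * hand_z_delta)   # keep the object in front
    object_pos = np.asarray(eye_pos) + new_depth * np.asarray(gaze_dir)
    return object_pos, new_depth

# Example: object 1 m ahead along the gaze ray, hand pulled 5 cm toward the body.
pos, depth = gaze_hand_drag([0.0, 1.6, 0.0], [0.0, 0.0, 1.0], 1.0, -0.05)
print(pos, depth)
```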