Sketch mapping is an effective technique for externalizing and communicating spatial information. However, it has been limited to 2D media, making it difficult to represent 3D information, particularly for terrains with elevation changes. We present Sketch2Terrain, an intuitive generative-3D-sketch-mapping system that combines freehand sketching with generative Artificial Intelligence and radically changes how sketch maps are created and represented in Augmented Reality. Sketch2Terrain empowers non-experts to create unambiguous sketch maps of natural environments and provides a homogeneous interface for researchers to collect data and conduct experiments. A between-subject study (N=36) revealed that generative-3D-sketch-mapping improved efficiency by 38.4%, terrain-topology accuracy by 12.5%, and landmark accuracy by up to 12.1%, with only a 4.7% trade-off in terrain-elevation accuracy compared to freehand 3D-sketch-mapping. Additionally, generative-3D-sketch-mapping reduced perceived strain by 60.5% and stress by 39.5% compared to 2D-sketch-mapping. These findings underscore potential applications of generative-3D-sketch-mapping for in-depth understanding and accurate representation of vertically complex environments. The implementation is publicly available.
As augmented reality devices (e.g., smartphones and headsets) proliferate in the market, multi-user AR scenarios are set to become more common. Co-located users will want to share coherent and synchronized AR experiences, but this is surprisingly cumbersome with current methods. In response, we developed PatternTrack, a novel tracking approach that repurposes the structured infrared light patterns emitted by VCSEL-driven depth sensors, like those found in the Apple Vision Pro, iPhone, iPad, and Meta Quest 3. Our approach is infrastructure-free, requires no pre-registration, works on featureless surfaces, and provides the real-time 3D position and orientation of other users' devices. In our evaluation --- tested on six different surfaces and with inter-device distances of up to 260 cm --- we found a mean 3D positional tracking error of 11.02 cm and a mean angular error of 6.81°.
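To make the geometric core concrete, the sketch below illustrates the standard building block for recovering another device's 6-DoF pose from its projected dot pattern: a Perspective-n-Point solve over known pattern geometry and detected image points. This is an illustrative simplification, not the PatternTrack pipeline; the pattern layout, detected pixel coordinates, and camera intrinsics are hypothetical placeholders, and dot detection and correspondence matching are assumed to have been done already.

```python
# Minimal sketch: estimate an emitting device's pose from its structured-light
# dots observed in our own camera frame. All data below is hypothetical.
import numpy as np
import cv2

# Known layout of the emitter's dot pattern on a reference plane (metres).
# In practice this would come from characterizing the VCSEL projector.
PATTERN_POINTS_3D = np.array([
    [0.00, 0.00, 0.0],
    [0.05, 0.00, 0.0],
    [0.00, 0.05, 0.0],
    [0.05, 0.05, 0.0],
    [0.025, 0.10, 0.0],
    [0.10, 0.025, 0.0],
], dtype=np.float64)

# Pixel locations of the same dots detected in our camera frame (hypothetical).
observed_points_2d = np.array([
    [412.0, 305.0], [468.0, 301.0], [415.0, 362.0],
    [471.0, 358.0], [447.0, 420.0], [530.0, 331.0],
], dtype=np.float64)

# Placeholder pinhole intrinsics for our IR camera; assume undistorted input.
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(5)

# Solve Perspective-n-Point: pose of the dot pattern relative to our camera.
ok, rvec, tvec = cv2.solvePnP(PATTERN_POINTS_3D, observed_points_2d,
                              K, dist_coeffs, flags=cv2.SOLVEPNP_ITERATIVE)
if ok:
    R, _ = cv2.Rodrigues(rvec)           # 3x3 rotation matrix
    print("relative translation (m):", tvec.ravel())
    print("relative rotation:\n", R)
```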
Unlike other inputs for extended reality (XR) that work out of the box, eye tracking typically requires custom calibration per user or session. We present a multimodal-input approach for implicit calibration of eye trackers in VR, leveraging UI interaction for continuous, background calibration. Our method analyzes gaze data alongside controller interactions with UI elements and employs machine learning techniques to continuously refine the calibration matrix without interrupting users from their current tasks, potentially eliminating the need for explicit calibration. We demonstrate the accuracy and effectiveness of this implicit approach across various tasks and real-time applications, achieving eye-tracking accuracy comparable to native, explicit calibration. While our evaluation focuses on VR and controller-based interactions, we anticipate the broader applicability of this approach to various XR devices and input modalities.
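As an illustration of the general idea (not the paper's exact model), implicit calibration can be framed as continuously re-fitting a correction from raw gaze estimates to the UI elements users confirm with the controller. The affine correction model, sample buffer size, and minimum-sample heuristic below are assumptions made for this sketch.

```python
# Sketch: background refinement of a 2D gaze-calibration matrix from
# (raw gaze, clicked UI element centre) pairs gathered during normal use.
# The affine model and ring-buffer size are illustrative assumptions.
import numpy as np
from collections import deque

class ImplicitGazeCalibrator:
    def __init__(self, max_samples=200):
        self.samples = deque(maxlen=max_samples)  # (raw_xy, target_xy) pairs
        self.A = np.eye(3)                        # homogeneous 2D affine correction

    def on_ui_click(self, raw_gaze_xy, ui_target_xy):
        """Call whenever the controller confirms a UI element; we assume the
        user was looking at that element at the moment of selection."""
        self.samples.append((np.asarray(raw_gaze_xy, float),
                             np.asarray(ui_target_xy, float)))
        if len(self.samples) >= 6:                # wait for a few points before fitting
            self._refit()

    def _refit(self):
        raw = np.array([s[0] for s in self.samples])
        tgt = np.array([s[1] for s in self.samples])
        X = np.hstack([raw, np.ones((len(raw), 1))])   # rows of [x, y, 1]
        # Least-squares affine fit: X @ M ~= tgt, with M of shape 3x2.
        M, *_ = np.linalg.lstsq(X, tgt, rcond=None)
        self.A = np.vstack([M.T, [0.0, 0.0, 1.0]])     # store as 3x3

    def correct(self, raw_gaze_xy):
        x, y = raw_gaze_xy
        cx, cy, _ = self.A @ np.array([x, y, 1.0])
        return cx, cy

# Usage: feed clicks as they happen, apply the correction to every gaze sample.
cal = ImplicitGazeCalibrator()
cal.on_ui_click(raw_gaze_xy=(0.42, 0.31), ui_target_xy=(0.45, 0.30))
print(cal.correct((0.42, 0.31)))
```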
Olfactory experiences are increasingly in demand due to their immersive benefits. However, most interaction implementations are passive and rely on conventions established for other modalities. In this work, we investigated proactive olfactory interactions, where users actively engage with scents, focusing on mid-air gestures as an input modality that mimics real-world object and scent manipulation, e.g., fanning away an odor. In our study, participants developed a user-defined gesture set for interacting with scents in Virtual Reality (VR), covering various object types (solid, liquid, gas) and interaction modes (out-of-reach, not graspable, graspable). Participants then compared interacting with scents in VR using traditional controllers versus proactive gestures, revealing that proactive gestures enhanced user experience, presence, and task performance. Finally, an exploratory study showed strong participant preferences for personalization, enhanced interaction capabilities, and multi-sensory integration. Based on these findings, we propose design guidelines and applications for proactive interactions with scents.
Sensing touch on arbitrary surfaces has long been a goal of ubiquitous computing, but it often requires instrumenting the surface. Depth-camera-based systems have emerged as a promising way to minimize instrumentation, but at the cost of high touch-down detection error rates, high touch latency, and a high minimum hover distance, limiting them to basic tasks. We developed HaloTouch, a vision-based system that exploits a multipath interference effect from an off-the-shelf time-of-flight depth camera to enable fast, accurate touch interactions on general surfaces. HaloTouch achieves 99.2% touch-down detection accuracy across various materials, with a motion-to-photon latency of 150 ms. With a brief (20 s) user-specific calibration, HaloTouch supports millimeter-accurate hover sensing as well as continuous pressure sensing. We conducted a user study with 12 participants, including a typing task that demonstrated text input at 26.3 AWPM. HaloTouch shows promise for more robust, dynamic touch interactions without instrumenting surfaces or adding hardware to users.
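The abstract does not spell out the detection model, so the following is a purely illustrative sketch of the underlying intuition: near contact, multipath reflections between finger and surface distort the readings around the fingertip, so a ring-region distortion score against a per-surface baseline can be thresholded for touch-down. The ring geometry, baseline capture, frame format, and threshold are hypothetical, not the published HaloTouch method.

```python
# Illustrative sketch only: classify touch-down from multipath-induced depth
# distortion around the fingertip in a time-of-flight depth frame (mm).
import numpy as np

def ring_mask(shape, center, r_inner, r_outer):
    """Boolean mask of pixels in an annulus around the fingertip (row, col)."""
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    d = np.hypot(yy - center[0], xx - center[1])
    return (d >= r_inner) & (d <= r_outer)

def multipath_distortion(depth_frame, baseline_depth, fingertip_px,
                         r_inner=6, r_outer=18):
    """Mean absolute depth deviation (mm) in a ring around the fingertip,
    relative to a per-surface baseline captured with no hand present."""
    m = ring_mask(depth_frame.shape, fingertip_px, r_inner, r_outer)
    return float(np.abs(depth_frame[m] - baseline_depth[m]).mean())

def is_touch_down(depth_frame, baseline_depth, fingertip_px, threshold_mm=4.0):
    # Near contact, reflections between finger and surface distort the ring
    # readings far more than during hover; threshold on that score.
    return multipath_distortion(depth_frame, baseline_depth, fingertip_px) > threshold_mm

# Hypothetical usage with synthetic 480x640 depth maps.
baseline = np.full((480, 640), 800.0)
frame = baseline.copy()
frame[230:260, 310:340] -= 9.0          # fake distortion blob near the fingertip
print(is_touch_down(frame, baseline, fingertip_px=(245, 325)))
```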
Handheld-style head-mounted displays (HMDs) are becoming increasingly popular as a convenient option for onsite exhibitions. However, they lack established practices for basic interactions, particularly pointing methods. Through a formative study involving practitioners, we found that controllers and hand gestures are the primary pointing methods in use. Building on these findings, we conducted a usability study of seven pointing methods, incorporating insights from the formative study and current virtual reality (VR) practices. The results showed that while controllers remain a viable option, hand gestures are not recommended. Notably, dwell-time-based methods, which are neither fast nor commonly recognized by practitioners, demonstrated high usability and user confidence, particularly for inexperienced VR users. We therefore recommend dwell-based methods for onsite exhibition contexts. This research provides insights for the adoption of handheld-style HMDs and lays the groundwork for improving user interaction in exhibition environments, thereby potentially enhancing visitor experiences.
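For readers unfamiliar with dwell-based pointing, the minimal sketch below shows the interaction logic: a selection fires once the pointer has rested on the same target for a fixed duration. The one-second dwell threshold is an illustrative assumption, not a value from the study.

```python
# Sketch: dwell-based selection. Call update() every frame with whatever
# target the pointing ray currently hits (or None). The dwell duration is
# an illustrative default.
import time

class DwellSelector:
    def __init__(self, dwell_seconds=1.0):
        self.dwell_seconds = dwell_seconds
        self.current_target = None
        self.enter_time = None

    def update(self, hovered_target):
        """Return the target once its dwell time elapses, else None."""
        now = time.monotonic()
        if hovered_target != self.current_target:
            # Pointer moved to a new target (or left all targets): restart timer.
            self.current_target = hovered_target
            self.enter_time = now
            return None
        if hovered_target is not None and now - self.enter_time >= self.dwell_seconds:
            self.enter_time = float("inf")   # fire once until the target changes
            return hovered_target
        return None
```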
Index-to-palm interaction plays a crucial role in Mixed Reality (MR) interactions. However, achieving a satisfactory inter-hand interaction experience is challenging with existing vision-based hand-tracking technologies, especially in scenarios where only a single camera is available. We therefore introduce Palmpad, a novel sensing method that uses a single RGB camera to detect the touch of an index finger on the opposite palm. Our exploration reveals that incorporating optical-flow techniques to extract motion information between consecutive frames for the index finger and palm leads to a significant improvement in touch-status determination. With this approach, our CNN model achieves 97.0% recognition accuracy and a 96.1% F1 score. In a usability evaluation, we compared Palmpad with Quest's built-in hand-gesture algorithms. Palmpad not only delivers superior accuracy (95.3%) but also reduces operational demands and significantly improves users' willingness and confidence. Palmpad aims to enable accurate touch detection for lightweight MR devices.
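The key signal described, relative motion between the index finger and palm across consecutive frames, can be sketched as dense optical flow around the contact region stacked with the current frame and classified by a small CNN. The crop size, channel layout, and tiny network below are assumptions made for illustration, not Palmpad's published architecture.

```python
# Sketch: optical-flow + CNN touch/no-touch classifier in the spirit of the
# described approach. Crop sizes, channels, and the network are illustrative.
import cv2
import numpy as np
import torch
import torch.nn as nn

def flow_crop(prev_gray, curr_gray, center, size=96):
    """Dense Farneback optical flow in a square crop around `center` (x, y)."""
    x, y = int(center[0]), int(center[1])
    h = size // 2
    p = prev_gray[y - h:y + h, x - h:x + h]
    c = curr_gray[y - h:y + h, x - h:x + h]
    flow = cv2.calcOpticalFlowFarneback(p, c, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    return flow  # (size, size, 2): per-pixel (dx, dy)

class TouchCNN(nn.Module):
    """Tiny binary classifier over stacked [flow_x, flow_y, gray] channels."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 2)  # touch vs. no-touch

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

# Hypothetical usage with synthetic frames; in practice the crop centre would
# come from the hand tracker's index-fingertip landmark.
prev_gray = np.random.randint(0, 255, (480, 640), np.uint8)
curr_gray = np.random.randint(0, 255, (480, 640), np.uint8)
flow = flow_crop(prev_gray, curr_gray, center=(320, 240))
gray_crop = curr_gray[240 - 48:240 + 48, 320 - 48:320 + 48] / 255.0
inp = np.dstack([flow, gray_crop])
logits = TouchCNN()(torch.tensor(inp, dtype=torch.float32).permute(2, 0, 1)[None])
print(logits.softmax(-1))  # untrained; shown only to illustrate the data flow
```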