Immersive Touch and Gesture Interaction

Conference Name
CHI 2025
Sketch2Terrain: AI-Driven Real-Time Terrain Sketch Mapping in Augmented Reality
Abstract

Sketch mapping is an effective technique to externalize and communicate spatial information. However, it has been limited to 2D mediums, making it difficult to represent 3D information, particularly for terrains with elevation changes. We present Sketch2Terrain, an intuitive generative-3D-sketch-mapping system combining freehand sketching with generative Artificial Intelligence that radically changes sketch map creation and representation using Augmented Reality. Sketch2Terrain empowers non-experts to create unambiguous sketch maps of natural environments and provides a homogeneous interface for researchers to collect data and conduct experiments. A between-subject study (N=36) revealed that generative-3D-sketch-mapping improved efficiency by 38.4%, terrain-topology accuracy by 12.5%, and landmark accuracy by up to 12.1%, with only a 4.7% trade-off in terrain-elevation accuracy compared to freehand 3D-sketch-mapping. Additionally, generative-3D-sketch-mapping reduced perceived strain by 60.5% and stress by 39.5% over 2D-sketch-mapping. These findings underscore potential applications of generative-3D-sketch-mapping for in-depth understanding and accurate representation of vertically complex environments. The implementation is publicly available.
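
As a purely illustrative sketch (not the authors' pipeline), the snippet below shows one plausible preprocessing step for generative 3D sketch mapping: rasterizing freehand 3D strokes drawn in AR into a sparse heightmap whose empty cells a generative terrain model could then fill in. The grid resolution, the strokes_to_sparse_heightmap helper, and the bounds format are assumptions made for illustration.

```python
# Hedged sketch: turn freehand 3D strokes (ridge lines, peaks) into a sparse
# heightmap that a generative terrain model could complete into a full DEM.
# Grid size and the data layout are illustrative; this is not the authors' code.
import numpy as np

GRID = 256  # output heightmap resolution (assumption)

def strokes_to_sparse_heightmap(strokes, bounds):
    """strokes: list of (N, 3) arrays of x, y, z points; bounds: ((xmin, xmax), (ymin, ymax))."""
    (xmin, xmax), (ymin, ymax) = bounds
    height = np.full((GRID, GRID), np.nan)          # NaN = unconstrained cell
    for pts in strokes:
        ix = ((pts[:, 0] - xmin) / (xmax - xmin) * (GRID - 1)).astype(int)
        iy = ((pts[:, 1] - ymin) / (ymax - ymin) * (GRID - 1)).astype(int)
        height[iy.clip(0, GRID - 1), ix.clip(0, GRID - 1)] = pts[:, 2]
    return height  # sparse elevation constraints for a generative model to fill in
```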

Authors
Tianyi Xiao
Institute of Cartography and Geoinformation, ETH Zurich, Zürich, Switzerland
Yizi Chen
IKG, Zurich, Switzerland
Sailin Zhong
ETH Zürich, Zürich, Switzerland
Peter Kiefer
ETH Zurich, Zurich, ZH, Switzerland
Jakub Krukar
University of Muenster, Muenster, Germany
Kevin Gonyop Kim
FHNW University of Applied Sciences and Arts Northwestern Switzerland, Windisch, Switzerland
Lorenz Hurni
ETH Zurich, Zurich, Switzerland
Angela Schwering
University of Muenster, Muenster, Germany
Martin Raubal
ETH Zurich, Zurich, ZH, Switzerland
DOI

10.1145/3706598.3713467

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713467

PatternTrack: Multi-Device Tracking Using Infrared, Structured-Light Projections from Built-in LiDAR
Abstract

As augmented reality devices (e.g., smartphones and headsets) proliferate in the market, multi-user AR scenarios are set to become more common. Co-located users will want to share coherent and synchronized AR experiences, but this is surprisingly cumbersome with current methods. In response, we developed PatternTrack, a novel tracking approach that repurposes the structured infrared light patterns emitted by VCSEL-driven depth sensors, like those found in the Apple Vision Pro, iPhone, iPad, and Meta Quest 3. Our approach is infrastructure-free, requires no pre-registration, works on featureless surfaces, and provides the real-time 3D position and orientation of other users' devices. In our evaluation --- tested on six different surfaces and with inter-device distances of up to 260 cm --- we found a mean 3D positional tracking error of 11.02 cm and a mean angular error of 6.81°.
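
A minimal sketch of the general pose-from-projected-pattern idea (not PatternTrack's actual pipeline, which also exploits depth data): assuming the 2D centroids of the observed IR dots have already been detected and matched to known 3D points of the emitting device's projection pattern, a standard PnP solve recovers that device's position and orientation relative to our camera. All function names and the camera intrinsics below are hypothetical.

```python
# Hedged sketch: recover another device's pose from its projected IR dot pattern.
# Dot detection and pattern matching (the hard part) are assumed done upstream.
import numpy as np
import cv2

def estimate_emitter_pose(pattern_points_3d, observed_dots_2d, K, dist_coeffs=None):
    """Estimate rotation/translation of the emitting device relative to our camera."""
    if dist_coeffs is None:
        dist_coeffs = np.zeros(5)
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(pattern_points_3d, dtype=np.float64),  # Nx3, emitter frame
        np.asarray(observed_dots_2d, dtype=np.float64),   # Nx2, our IR image
        K, dist_coeffs, flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)       # 3x3 rotation matrix
    return R, tvec.reshape(3)        # emitter pose in our camera frame

# Placeholder intrinsics for an IR camera (fx, fy, cx, cy are assumptions).
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
```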

Authors
Daehwa Kim
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Robert Xiao
University of British Columbia, Vancouver, British Columbia, Canada
Chris Harrison
Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
DOI

10.1145/3706598.3713388

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713388

Online-EYE: Multimodal Implicit Eye Tracking Calibration for XR
Abstract

Unlike other inputs for extended reality (XR) that work out of the box, eye tracking typically requires custom calibration per user or session. We present a multimodal-input approach for implicit calibration of eye trackers in VR, leveraging UI interaction for continuous, background calibration. Our method analyzes gaze data alongside controller interactions with UI elements and employs machine learning techniques to continuously refine the calibration matrix without interrupting users' current tasks, potentially eliminating the need for explicit calibration. We demonstrate the accuracy and effectiveness of this implicit approach across various tasks and real-time applications, achieving eye tracking accuracy comparable to native, explicit calibration. While our evaluation focuses on VR and controller-based interactions, we anticipate the broader applicability of this approach to various XR devices and input modalities.
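
The core idea can be illustrated with a tiny stand-in for the paper's ML-based refinement: whenever the user clicks a UI element with the controller, treat that element's position as the ground-truth gaze target and periodically refit a correction that maps raw gaze samples onto those targets. The simple 2D affine least-squares fit and all names below are illustrative assumptions, not the authors' method.

```python
# Hedged sketch of implicit calibration from UI clicks: fit a correction that
# maps raw gaze points onto the UI targets the user actually selected.
import numpy as np

def fit_affine_correction(raw_gaze_xy, target_xy):
    """Least-squares 2D affine map A such that [x, y, 1] @ A approximates the target."""
    raw = np.asarray(raw_gaze_xy, dtype=float)
    tgt = np.asarray(target_xy, dtype=float)
    X = np.hstack([raw, np.ones((len(raw), 1))])   # homogeneous coords, shape (N, 3)
    A, *_ = np.linalg.lstsq(X, tgt, rcond=None)    # correction matrix, shape (3, 2)
    return A

def correct_gaze(raw_xy, A):
    return np.array([raw_xy[0], raw_xy[1], 1.0]) @ A

# Background use: append a (raw gaze, clicked-target) pair on every UI click
# and refit periodically, so calibration improves without an explicit routine.
```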

Authors
Baosheng James HOU
Google, Seattle, Washington, United States
Lucy Abramyan
Google, Mountain View, California, United States
Prasanthi Gurumurthy
Google, Mountain View, California, United States
Haley Adams
Google, Mountain View, California, United States
Ivana Tosic Rodgers
Google, Mountain View, California, United States
Eric J. Gonzalez
Google, Seattle, Washington, United States
Khushman Patel
Google Inc, Mountain View, California, United States
Andrea Colaço
Google, Mountain View, California, United States
Ken Pfeuffer
Aarhus University, Aarhus, Denmark
Hans Gellersen
Lancaster University, Lancaster, United Kingdom
Karan Ahuja
Google, Seattle, Washington, United States
Mar Gonzalez-Franco
Google, Seattle, Washington, United States
DOI

10.1145/3706598.3713461

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713461

Mid-Air Gestures for Proactive Olfactory Interactions in Virtual Reality
Abstract

Olfactory experiences are increasingly in demand due to their immersive benefits. However, most interaction implementations are passive and rely on conventions established for other modalities. In this work, we investigated proactive olfactory interactions, where users actively engage with scents, focusing on mid-air gestures as an input modality that mimics real-world object and scent manipulation, e.g., fanning away an odor. In our study, participants first developed a user-defined gesture set for interacting with scents in Virtual Reality (VR), covering various object types (solid, liquid, gas) and interaction modes (out-of-reach, not graspable, graspable). Participants then compared interacting with scents in VR using traditional controllers versus proactive gestures, revealing that proactive gestures enhanced user experience, presence, and task performance. Finally, an exploratory study showed strong participant preferences for personalization, enhanced interaction capabilities, and multi-sensory integration. Based on these findings, we propose design guidelines and applications for proactive interactions with scents.

Authors
Junxian Li
Zhejiang University, Hangzhou, Zhejiang, China
Yanan Wang
Donghua University, Shanghai, China
Zhitong Cui
Zhejiang University, Hangzhou, Zhejiang, China
Jas Brooks
University of Chicago, Chicago, Illinois, United States
Yifan Yan
Donghua University, Shanghai, China
Zhengyu Lou
College of Fashion and Design, Shanghai, Shanghai, China
Yucheng Li
Donghua University, Shanghai, China
DOI

10.1145/3706598.3713964

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713964

HaloTouch: Using IR Multi-Path Interference to Support Touch Interactions with General Surfaces
Abstract

Sensing touch on arbitrary surfaces has long been a goal of ubiquitous computing, but often requires instrumenting the surface. Depth camera-based systems have emerged as a promising solution for minimizing instrumentation, but at the cost of high touch-down detection error rates, high touch latency, and high minimum hover distance, limiting them to basic tasks. We developed HaloTouch, a vision-based system which exploits a multipath interference effect from an off-the-shelf time-of-flight depth camera to enable fast, accurate touch interactions on general surfaces. HaloTouch achieves a 99.2% touch-down detection accuracy across various materials, with a motion-to-photon latency of 150 ms. With a brief (20s) user-specific calibration, HaloTouch supports millimeter-accurate hover sensing as well as continuous pressure sensing. We conducted a user study with 12 participants, including a typing task demonstrating text input at 26.3 AWPM. HaloTouch shows promise for more robust, dynamic touch interactions without instrumenting surfaces or adding hardware to users.
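
To make the touch-down idea concrete, here is a heavily simplified, hypothetical detector. HaloTouch exploits multipath-interference artifacts in the time-of-flight signal near contact; the sketch below stands in for that by thresholding how far the current depth patch around the fingertip deviates from a per-surface baseline captured during the brief calibration. The patch size and threshold are placeholders, not values from the paper.

```python
# Hedged sketch of a threshold-style touch-down detector standing in for
# HaloTouch's learned model. All constants are illustrative assumptions.
import numpy as np

PATCH = 15          # patch half-size in pixels (assumption)
TOUCH_THRESH = 4.0  # mean depth deviation in mm treated as contact (assumption)

def touch_down(depth_frame, baseline, fingertip_px):
    """Return True if the depth distortion around the fingertip suggests contact."""
    u, v = fingertip_px
    cur = depth_frame[v - PATCH:v + PATCH, u - PATCH:u + PATCH].astype(float)
    ref = baseline[v - PATCH:v + PATCH, u - PATCH:u + PATCH].astype(float)
    deviation = np.abs(cur - ref).mean()   # multipath distortion grows near contact
    return deviation > TOUCH_THRESH
```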

Authors
Ziyi Xia
University of British Columbia, Vancouver, British Columbia, Canada
Xincheng Huang
University of British Columbia, Vancouver, British Columbia, Canada
Sidney S. Fels
University of British Columbia, Vancouver, British Columbia, Canada
Robert Xiao
University of British Columbia, Vancouver, British Columbia, Canada
DOI

10.1145/3706598.3714179

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714179

Understanding Usability of VR Pointing Methods with a Handheld-style HMD for Onsite Exhibitions
Abstract

Handheld-style head-mounted displays (HMDs) are becoming increasingly popular as a convenient option for onsite exhibitions. However, they lack established practices for basic interactions, particularly pointing methods. Through our formative study involving practitioners, we discovered that controllers and hand gestures are the primary pointing methods being utilized. Building upon these findings, we conducted a usability study to explore seven different pointing methods, incorporating insights from the formative study and current virtual reality (VR) practices. The results showed that while controllers remain a viable option, hand gestures are not recommended. Notably, dwell-time-based methods, which are neither fast nor commonly recognized by practitioners, demonstrated high usability and user confidence, particularly for inexperienced VR users. We recommend dwell-based methods for onsite exhibition contexts. This research provides insights for the adoption of handheld-style HMDs, laying the groundwork for improving user interaction in exhibition environments and thereby potentially enhancing visitor experiences.
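
For readers unfamiliar with the dwell-based technique the study recommends, the sketch below shows the basic selection logic: a target is selected once the pointing ray has rested on it for a fixed dwell duration. The 1.0 s dwell time and the class API are illustrative assumptions, not parameters from the study.

```python
# Hedged sketch of dwell-time selection for a pointing ray.
import time

class DwellSelector:
    def __init__(self, dwell_seconds=1.0):
        self.dwell = dwell_seconds   # dwell duration (assumption)
        self.current = None          # target currently under the ray
        self.since = None            # when the ray first landed on it

    def update(self, hovered_target):
        """Call every frame with the target under the pointing ray (or None)."""
        now = time.monotonic()
        if hovered_target != self.current:
            self.current, self.since = hovered_target, now   # new target: restart timer
            return None
        if self.current is not None and now - self.since >= self.dwell:
            selected, self.current, self.since = self.current, None, None
            return selected          # fire the selection once, then reset
        return None
```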

Award
Honorable Mention
Authors
Yuki Abe
Hokkaido University, Sapporo, Japan
Kan Kusakabe
Hokkaido University, Sapporo, Japan
Myungguen Choi
University of Tsukuba, Tsukuba, Ibaraki, Japan
Daisuke Sakamoto
Hokkaido University, Sapporo, Japan
Tetsuo Ono
Hokkaido University, Sapporo, Hokkaido, Japan
DOI

10.1145/3706598.3713874

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713874

Palmpad: Enabling Real-Time Index-to-Palm Touch Interaction with a Single RGB Camera
Abstract

Index-to-palm interaction plays a crucial role in Mixed Reality (MR) interactions. However, achieving a satisfactory inter-hand interaction experience is challenging with existing vision-based hand tracking technologies, especially in scenarios where only a single camera is available. Therefore, we introduce Palmpad, a novel sensing method utilizing a single RGB camera to detect the touch of an index finger on the opposite palm. Our exploration reveals that incorporating optical flow techniques to extract motion information between consecutive frames for the index finger and palm leads to a significant improvement in touch status determination. By doing so, our CNN model achieves 97.0% recognition accuracy and a 96.1% F1 score. In a usability evaluation, we compared Palmpad with the Quest's built-in hand gesture algorithms. Palmpad not only delivers superior accuracy (95.3%) but also reduces operational demands and significantly improves users' willingness and confidence. Palmpad aims to enable accurate touch detection for lightweight MR devices.
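
A rough sketch of the optical-flow feature step described in the abstract, assuming fingertip coordinates come from an off-the-shelf hand tracker: dense flow between consecutive grayscale frames is cropped around the index fingertip and would then be fed, together with the corresponding image crop, to a small CNN touch classifier. The crop size and the cnn.predict call are hypothetical; this is not the authors' model or code.

```python
# Hedged sketch of flow-based feature extraction for index-to-palm touch detection.
import cv2
import numpy as np

CROP = 32  # half-size of the region around the fingertip, in pixels (assumption)

def flow_patch(prev_gray, cur_gray, fingertip_px):
    """Dense Farneback flow cropped around the fingertip, shape (2*CROP, 2*CROP, 2)."""
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, cur_gray, None,
        0.5, 3, 15, 3, 5, 1.2, 0)   # pyr_scale, levels, winsize, iters, poly_n, poly_sigma, flags
    u, v = fingertip_px
    return flow[v - CROP:v + CROP, u - CROP:u + CROP]  # per-pixel (dx, dy) motion

# The flow patch (plus an RGB crop) would then go to a CNN touch classifier,
# e.g. touching = cnn.predict(flow_patch(prev, cur, tip))  # hypothetical model
```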

Authors
Zhe He
Tsinghua University, Beijing, Beijing, China
Xiangyang Wang
Tsinghua University, Beijing, China
Yuanchun Shi
Tsinghua University, Beijing, China
Chi Hsia
Tsinghua University, Beijing, China
Chen Liang
The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, Guangdong, China
Chun Yu
Tsinghua University, Beijing, China
DOI

10.1145/3706598.3714130

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714130
