Mixed-Reality Systems for Spatial Understanding and Navigation

Conference Name
CHI 2026
MoXaRt: Audio-Visual Object-Guided Sound Interaction for XR
Abstract

In Extended Reality (XR), complex acoustic environments often overwhelm users: entangled sound sources compromise both scene awareness and social engagement. We introduce MoXaRt, a real-time XR system that uses audio-visual cues to separate these sources and enable fine-grained sound interaction. At MoXaRt's core is a cascaded architecture that performs coarse, audio-only separation in parallel with visual detection of sources (e.g., faces, instruments). These visual anchors then guide refinement networks to isolate individual sources, separating complex mixes of up to 5 concurrent sources (e.g., 2 voices + 3 instruments) with ~2-second processing latency. We validate MoXaRt through a technical evaluation on a new dataset of 30 one-minute recordings featuring concurrent speech and music, and through a 22-participant user study. Empirical results indicate that our system significantly enhances speech intelligibility, yielding a 36.2% increase in listening comprehension (p < 0.01) in adversarial acoustic environments while substantially reducing cognitive load (p < 0.001), paving the way for more perceptive and socially adept XR experiences.
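
The abstract gives no implementation details, so the following Python sketch only illustrates the data flow of the cascaded architecture it describes; every class and function here (coarse_separate, detect_anchors, refine) is a hypothetical placeholder, not MoXaRt's actual code.

```python
# Hypothetical structural sketch of a cascaded audio-visual separation
# pipeline, per the MoXaRt abstract. All components are stand-ins.
from dataclasses import dataclass
import numpy as np

@dataclass
class VisualAnchor:
    """A visually detected source (e.g., a face or an instrument)."""
    label: str              # e.g., "face", "guitar"
    embedding: np.ndarray   # visual feature used to condition refinement

def coarse_separate(mixture: np.ndarray, max_sources: int = 5) -> list:
    """Audio-only stage: split the mixture into rough stems.
    Placeholder: a real system would run a separation network here."""
    return [mixture / max_sources for _ in range(max_sources)]

def detect_anchors(frame: np.ndarray) -> list:
    """Visual stage, run in parallel with coarse separation (placeholder)."""
    return [VisualAnchor("face", np.zeros(128)),
            VisualAnchor("guitar", np.zeros(128))]

def refine(stem: np.ndarray, anchor: VisualAnchor) -> np.ndarray:
    """Anchor-guided refinement of one coarse stem (placeholder)."""
    return stem

def separate(mixture: np.ndarray, frame: np.ndarray) -> dict:
    """Cascade: coarse stems + visual anchors -> isolated per-source audio.
    Pairing stems with anchors by zip is a simplification; a real system
    would match them, e.g., by audio-visual correspondence."""
    stems = coarse_separate(mixture)
    anchors = detect_anchors(frame)
    return {a.label: refine(s, a) for s, a in zip(stems, anchors)}

mix = np.random.randn(16000)       # 1 s of mono audio at 16 kHz
frame = np.zeros((480, 640, 3))    # current camera frame
isolated = separate(mix, frame)    # {"face": ..., "guitar": ...}
```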

Authors
Tianyu Xu
Google, Mountain View, California, United States
Qianhui Zheng
University of Michigan, Ann Arbor, Michigan, United States
Sieun Kim
University of Michigan, Ann Arbor, Michigan, United States
Ruoyu Xu
Columbia University, New York, New York, United States
Tejasvi Ravi
Google, San Francisco, California, United States
Anuva Kulkarni
Google, Mountain View, California, United States
Katrina Passarella-Ward
Google, San Francisco, California, United States
Junyi Zhu
University of Michigan, Ann Arbor, Michigan, United States
Adarsh Kowdle
Google, San Francisco, California, United States
Chromotion: Controlling Motion-Induced Color on Object Motion Paths via High-Speed Temporal Additive Projection
Abstract

We present Chromotion, a high-speed projection method that renders intended colors along the motion trajectories of moving objects. When an object moves across a temporally multiplexed sequence, its occlusion of the projected patterns can, through persistence of vision, produce motion-dependent colors along its path. Chromotion exploits this phenomenon by decomposing each static image into a short sequence in which target-color frames are interleaved with a single complementary-color frame. This temporal design lets moving objects sample the sequence so that the perceived color along their motion paths converges to the target color, while stationary regions still integrate to the original static color. We built a prototype and conducted a camera-based technical evaluation and user evaluations. The results show that Chromotion reliably produces the target color on motion trajectories without degrading static color fidelity. Because the approach requires no body or gaze tracking and no decoding of embedded information, it scales to public settings and supports multi-user and multimodal interactions. We also discuss limitations and outline application scenarios, such as public ambient displays that blend into the environment.
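
The decomposition implies a simple temporal-averaging constraint: if a static color S is shown as k frames of the target color T plus one complementary frame C, then (k*T + C)/(k+1) must equal S, so C = (k+1)*S - k*T. A minimal numpy sketch of that constraint, assuming a linear projector response (real projectors need gamma handling, and out-of-gamut complements limit the attainable targets):

```python
import numpy as np

def complementary_frame(static_img: np.ndarray,
                        target_color: np.ndarray,
                        k: int) -> np.ndarray:
    """Per-pixel complementary frame C for k target-color frames T,
    so stationary regions integrate back to the static color S:
        (k * T + C) / (k + 1) = S  =>  C = (k + 1) * S - k * T
    """
    c = (k + 1) * static_img - k * target_color
    return np.clip(c, 0.0, 1.0)   # clipping bounds which targets are attainable

# Example: mid-gray static image, a warm target color along motion paths.
static = np.full((4, 4, 3), 0.5)      # S, values in [0, 1]
target = np.array([0.6, 0.4, 0.5])    # T, chosen so C stays in gamut
comp = complementary_frame(static, target, k=2)
assert np.allclose((2 * target + comp) / 3, static)   # static color preserved
```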

Award
Honorable Mention
Authors
Shio Miyafuji
Institute of Science Tokyo, Tokyo, Japan
Arisa Kohtani
Institute of Science Tokyo, Tokyo, Japan
Hideki Koike
Institute of Science Tokyo, Tokyo, Japan
PlayWrite: A Multimodal System for AI Supported Narrative Co-Authoring Through Play in XR
Abstract

Current AI writing tools, which rely on text prompts, poorly support the spatial and interactive nature of storytelling, where ideas emerge from direct manipulation and play. We present PlayWrite, a mixed-reality system in which users author stories by directly manipulating virtual characters and props. A multi-agent AI pipeline interprets these actions into Intent Frames: structured narrative beats visualized as rearrangeable story marbles on a timeline. A large language model then transforms the user's assembled sequence into a final narrative. A user study (N=13) with writers from a range of domains found that PlayWrite fosters a highly improvisational and playful process. Users treated the AI as a collaborative partner, using its unexpected responses to spark new ideas and overcome creative blocks. PlayWrite demonstrates an approach for co-creative systems that move beyond text to embrace direct manipulation and play as core interaction modalities.
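
As a rough illustration of the pipeline's middle representation, here is a hypothetical data model for Intent Frames and the rearrangeable timeline; the term "Intent Frame" is from the abstract, but all field names and the prompt format are assumptions, not the paper's design.

```python
from dataclasses import dataclass, field

@dataclass
class IntentFrame:
    """One narrative beat inferred from manipulating characters and props."""
    actors: list            # e.g., ["knight", "dragon"]
    action: str             # e.g., "confront each other"
    location: str           # e.g., "castle gate"
    mood: str = "neutral"

@dataclass
class Timeline:
    """Rearrangeable sequence of beats (the 'story marbles')."""
    beats: list = field(default_factory=list)

    def move(self, src: int, dst: int) -> None:
        """Reorder a marble on the timeline."""
        self.beats.insert(dst, self.beats.pop(src))

    def to_prompt(self) -> str:
        """Serialize the assembled sequence for a narrative-drafting LLM."""
        lines = [
            f"{i + 1}. {' and '.join(b.actors)} {b.action} "
            f"at the {b.location} (mood: {b.mood})"
            for i, b in enumerate(self.beats)
        ]
        return "Write a story from these beats:\n" + "\n".join(lines)

tl = Timeline([
    IntentFrame(["knight", "dragon"], "confront each other", "castle gate", "tense"),
    IntentFrame(["knight"], "retreats", "forest", "somber"),
])
tl.move(1, 0)          # the user drags the second marble to the front
print(tl.to_prompt())
```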

Authors
Esen K. Tütüncü
Autodesk Research, San Francisco, California, United States
Qian Zhou
Autodesk Research, San Francisco, California, United States
Frederik Brudy
Autodesk Research, Toronto, Ontario, Canada
George Fitzmaurice
Autodesk Research, Toronto, Ontario, Canada
Fraser Anderson
Autodesk Research, Toronto, Ontario, Canada
Understanding Spatiotemporal-Aware Multimodal Conversational Search in the Outdoor Urban Space
Abstract

Emerging multimodal conversational search (MCS) tools (e.g., Gemini Live) allow users to search for spatiotemporal information through natural language dialogues as they move through urban space. Despite the growing popularity of these tools, there is limited understanding of how people engage with this technology. To address this gap, we developed UrbanSearch, an MCS technology probe designed to capture the user's current geolocation, time, and visual surroundings. A contextual inquiry (N=23) revealed that MCS tools provide two core values: requiring little effort to form queries while offering highly relevant responses, and functioning as a central information gateway. MCS shows promise for supporting environmental learning, in-situ decision-making, and personalized navigation. Participants also reported unmet needs for spatial reasoning and transparent integration of multi-source information, along with concerns about peripheral awareness, social context, and personal space. Drawing on these findings, we discuss design implications for future MCS tools in urban spaces.
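
A minimal sketch of the per-query context such a probe might bundle, based only on the three signals the abstract names (geolocation, time, and visual surroundings); the schema is an assumption, since the abstract does not specify UrbanSearch's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class MCSQueryContext:
    """Spatiotemporal context bundled with one conversational query."""
    latitude: float
    longitude: float
    timestamp: datetime
    camera_frame: bytes    # snapshot of the visual surroundings (e.g., JPEG)
    utterance: str         # the user's spoken query

def build_context(lat: float, lon: float,
                  frame: bytes, utterance: str) -> MCSQueryContext:
    """Attach current location, time, and view to the query."""
    return MCSQueryContext(lat, lon, datetime.now(timezone.utc),
                           frame, utterance)

ctx = build_context(43.1566, -77.6088, b"",          # placeholder frame
                    "What is the history of this building?")
```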

Authors
Jiangnan Xu
Rochester Institute of Technology, West Henrietta, New York, United States
Suyeon Seo
Yonsei University, Seoul, Korea, Republic of
Joni Salminen
University of Vaasa, Vaasa, Finland
Michael Saker
City University London, London, United Kingdom
Joongi Shin
Aalto University, Espoo, Finland
Alan Chamberlain
University of Nottingham, Nottingham, United Kingdom
Konstantinos Papangelis
Rochester Institute of Technology, Rochester, New York, United States
Dae Hyun Kim
Yonsei University, Seoul, Korea, Republic of
CineCraft: Unified Shot Planning, Capture, and Post-Processing for Mobile Cinematography
Abstract

We present CineCraft, an interactive mobile application that unifies planning, capture, and post-processing for cinematography on a single device. Our key design insight is to use a storyboard-like shot plan as a persistent representation that connects different stages of the filmmaking process, emulating coordination strategies used by professional film crews. Our shot plans extend common storyboarding conventions to encode time-varying parameters (e.g., camera movement, focus, and zoom) on a shared timeline, enabling previsualization during planning and precise synchronization during capture. CineCraft uses shot plans to generate camera movement instructions, provide augmented-reality (AR) framing guidance during filming, automate focus and zoom, and organize takes for review and rough-cut assembly. By consolidating stages that are often fragmented across separate mobile apps and ad hoc workflows, our system enables rapid on-location iteration with immediate playback. We demonstrate our system through a range of examples and two user studies.
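
As an illustration of what a machine-readable shot plan could look like, here is a hypothetical keyframe representation with linear interpolation for two of the time-varying parameters the abstract names (focus and zoom); the field names and sampling scheme are assumptions, not CineCraft's actual format.

```python
from dataclasses import dataclass
import bisect

@dataclass
class Keyframe:
    t: float        # seconds on the shot's shared timeline
    focus_m: float  # focus distance in meters
    zoom: float     # zoom factor

@dataclass
class ShotPlan:
    keyframes: list  # Keyframes sorted by t

    def sample(self, t: float) -> tuple:
        """Linearly interpolate (focus, zoom) at time t during capture."""
        ks = self.keyframes
        i = bisect.bisect_right([k.t for k in ks], t)
        if i == 0:
            return ks[0].focus_m, ks[0].zoom
        if i == len(ks):
            return ks[-1].focus_m, ks[-1].zoom
        a, b = ks[i - 1], ks[i]
        w = (t - a.t) / (b.t - a.t)
        return (a.focus_m + w * (b.focus_m - a.focus_m),
                a.zoom + w * (b.zoom - a.zoom))

# A 4-second rack focus from 1 m to 3 m while zooming in slightly.
plan = ShotPlan([Keyframe(0.0, 1.0, 1.0), Keyframe(4.0, 3.0, 1.5)])
print(plan.sample(2.0))   # -> (2.0, 1.25) halfway through the move
```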

Authors
Nhan Tran
Cornell University, Ithaca, New York, United States
Sam Belliveau
Cornell University, Ithaca, New York, United States
Zixin Xu
Cornell University, Ithaca, New York, United States
Abe Davis
Cornell University, New York, New York, United States
Continuous Measurement Methods for Transient Physiological Discomfort in VR Locomotion
Abstract

Motion sickness, beyond its persistent long-term effects, also exhibits short-term effects characterized as transient physiological discomfort, which changes rapidly with variations in locomotion. Such discomfort is, however, difficult to assess with current subjective scales and objective physiological measurements. To tackle this issue, this paper proposes continuous measurement methods designed specifically for evaluating transient physiological discomfort during VR locomotion. Through a user-elicitation study, three preferred measurement methods were identified: 'squeezing ball', 'sliding thumb', and 'rubbing thigh'. These techniques were then evaluated for reliability, validity, attention, presence, and workload, with 'sliding thumb' emerging as the most effective option. The paper extends traditional measurement methods to capture users' physiological experiences in VR interactions, offering practical choices for researchers in this field along with an in-depth discussion of design considerations, detailed implementation guidelines, and potential ways to optimize VR experiences using the measurement data.
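
As a rough sketch of how a continuous 'sliding thumb' report might be logged and smoothed into a discomfort signal; the sampling rate and smoothing constant below are assumptions, since the abstract does not describe the paper's own signal processing.

```python
import math

def ema_smooth(samples: list, dt: float, tau: float = 0.5) -> list:
    """Exponential moving average: damps jitter while keeping transients.

    samples: thumb position in [0, 1] (0 = no discomfort, 1 = severe),
    dt: sampling interval in seconds, tau: smoothing time constant.
    """
    alpha = 1.0 - math.exp(-dt / tau)
    out, y = [], samples[0]
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out

# 10 Hz trace: discomfort spikes during a sharp virtual turn, then decays.
trace = [0.1, 0.1, 0.2, 0.6, 0.9, 0.8, 0.5, 0.3, 0.2, 0.1]
smoothed = ema_smooth(trace, dt=0.1)
print([round(v, 2) for v in smoothed])
```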

Authors
Tianren Luo
Institute of Software, Chinese Academy of Sciences, Beijing, China
Pengxiang Wang
Information Engineering College, Capital Normal University, Beijing, China
Shuting Chang
Tongji University College of Design and Innovation, Shanghai, China
Ke Zhou
Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Nianlong Li
Institute of Software, Chinese Academy of Sciences, Beijing, China
Yulong Bian
Shandong University, Weihai, Shandong, China
Xiaohui Tan
Capital Normal University, Beijing, China
Qi Wang
Tongji University, Shanghai, China
Teng Han
Institute of Software, Chinese Academy of Sciences, Beijing, China
Feng Tian
Institute of Software, Chinese Academy of Sciences, Beijing, China