Mixed-Reality Systems for Spatial Understanding and Navigation

Conference Name
CHI 2026
MoXaRt: Audio-Visual Object-Guided Sound Interaction for XR
Abstract

In Extended Reality (XR), complex acoustic environments often overwhelm users: entangled sound sources compromise both scene awareness and social engagement. We introduce MoXaRt, a real-time XR system that uses audio-visual cues to separate these sources and enable fine-grained sound interaction. At MoXaRt's core is a cascaded architecture that performs coarse, audio-only separation in parallel with visual detection of sources (e.g., faces, instruments). These visual anchors then guide refinement networks to isolate individual sources, separating complex mixes of up to 5 concurrent sources (e.g., 2 voices + 3 instruments) with ~2-second processing latency. We validate MoXaRt through a technical evaluation on a new dataset of 30 one-minute recordings featuring concurrent speech and music, and through a 22-participant user study. Empirical results indicate that our system significantly enhances speech intelligibility, yielding a 36.2% increase in listening comprehension (p < 0.01) in adversarial acoustic environments while substantially reducing cognitive load (p < 0.001), paving the way for more perceptive and socially adept XR experiences.
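
The abstract gives no implementation details, so the following Python sketch only illustrates the data flow of the cascaded architecture it describes; every class and function here (coarse_separate, detect_anchors, refine) is a hypothetical placeholder, not MoXaRt's actual code.

```python
# Hypothetical structural sketch of a cascaded audio-visual separation
# pipeline, per the MoXaRt abstract. All components are stand-ins.
from dataclasses import dataclass
import numpy as np

@dataclass
class VisualAnchor:
    """A visually detected source (e.g., a face or an instrument)."""
    label: str              # e.g., "face", "guitar"
    embedding: np.ndarray   # visual feature used to condition refinement

def coarse_separate(mixture: np.ndarray, max_sources: int = 5) -> list:
    """Audio-only stage: split the mixture into rough stems.
    Placeholder: a real system would run a separation network here."""
    return [mixture / max_sources for _ in range(max_sources)]

def detect_anchors(frame: np.ndarray) -> list:
    """Visual stage, run in parallel with coarse separation (placeholder)."""
    return [VisualAnchor("face", np.zeros(128)),
            VisualAnchor("guitar", np.zeros(128))]

def refine(stem: np.ndarray, anchor: VisualAnchor) -> np.ndarray:
    """Anchor-guided refinement of one coarse stem (placeholder)."""
    return stem

def separate(mixture: np.ndarray, frame: np.ndarray) -> dict:
    """Cascade: coarse stems + visual anchors -> isolated per-source audio.
    Pairing stems with anchors by zip is a simplification; a real system
    would match them, e.g., by audio-visual correspondence."""
    stems = coarse_separate(mixture)
    anchors = detect_anchors(frame)
    return {a.label: refine(s, a) for s, a in zip(stems, anchors)}

mix = np.random.randn(16000)       # 1 s of mono audio at 16 kHz
frame = np.zeros((480, 640, 3))    # current camera frame
isolated = separate(mix, frame)    # {"face": ..., "guitar": ...}
```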

Authors
Tianyu Xu
Google, Mountain View, California, United States
Qianhui Zheng
University of Michigan, Ann Arbor, Michigan, United States
Sieun Kim
University of Michigan, Ann Arbor, Michigan, United States
Ruoyu Xu
Columbia University, New York, New York, United States
Tejasvi Ravi
Google, San Francisco, California, United States
Anuva Kulkarni
Google, Mountain View, California, United States
Katrina Passarella-Ward
Google, San Francisco, California, United States
Junyi Zhu
University of Michigan, Ann Arbor, Michigan, United States
Adarsh Kowdle
Google, San Francisco, California, United States
Chromotion: Controlling Motion-Induced Color on Object Motion Paths via High-Speed Temporal Additive Projection
Abstract

We present Chromotion, a high-speed projection method that renders intended colors along the motion trajectories of moving objects. When an object moves across a temporally multiplexed sequence, its occlusion of the projected patterns can, through persistence of vision, produce motion-dependent colors along its path. Chromotion exploits this phenomenon by decomposing each static image into a short sequence in which target-color frames are interleaved with a single complementary-color frame. This temporal design lets moving objects sample the sequence so that the perceived color along their motion paths converges to the target color, while stationary regions still integrate to the original static color. We built a prototype and conducted a camera-based technical evaluation and user evaluations. The results show that Chromotion reliably produces the target color on motion trajectories without degrading static color fidelity. Because the approach requires no body or gaze tracking and no decoding of embedded information, it scales to public settings and supports multi-user and multimodal interactions. We also discuss limitations and outline application scenarios, such as public ambient displays that blend into the environment.
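
The decomposition implies a simple temporal-averaging constraint: if a static color S is shown as k frames of the target color T plus one complementary frame C, then (k*T + C)/(k+1) must equal S, so C = (k+1)*S - k*T. A minimal numpy sketch of that constraint, assuming a linear projector response (real projectors need gamma handling, and out-of-gamut complements limit the attainable targets):

```python
import numpy as np

def complementary_frame(static_img: np.ndarray,
                        target_color: np.ndarray,
                        k: int) -> np.ndarray:
    """Per-pixel complementary frame C for k target-color frames T,
    so stationary regions integrate back to the static color S:
        (k * T + C) / (k + 1) = S  =>  C = (k + 1) * S - k * T
    """
    c = (k + 1) * static_img - k * target_color
    return np.clip(c, 0.0, 1.0)   # clipping bounds which targets are attainable

# Example: mid-gray static image, a warm target color along motion paths.
static = np.full((4, 4, 3), 0.5)      # S, values in [0, 1]
target = np.array([0.6, 0.4, 0.5])    # T, chosen so C stays in gamut
comp = complementary_frame(static, target, k=2)
assert np.allclose((2 * target + comp) / 3, static)   # static color preserved
```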

Award
Honorable Mention
Authors
Shio Miyafuji
Institute of Science Tokyo, Tokyo, Japan
Arisa Kohtani
Institute of Science Tokyo, Tokyo, Japan
Hideki Koike
Institute of Science Tokyo, Tokyo, Japan
PlayWrite: A Multimodal System for AI Supported Narrative Co-Authoring Through Play in XR
Abstract

Current AI writing tools, which rely on text prompts, poorly support the spatial and interactive nature of storytelling, where ideas emerge from direct manipulation and play. We present PlayWrite, a mixed-reality system in which users author stories by directly manipulating virtual characters and props. A multi-agent AI pipeline interprets these actions into Intent Frames: structured narrative beats visualized as rearrangeable story marbles on a timeline. A large language model then transforms the user's assembled sequence into a final narrative. A user study (N=13) with writers from a range of domains found that PlayWrite fosters a highly improvisational and playful process. Users treated the AI as a collaborative partner, using its unexpected responses to spark new ideas and overcome creative blocks. PlayWrite demonstrates an approach for co-creative systems that move beyond text to embrace direct manipulation and play as core interaction modalities.
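
As a rough illustration of the pipeline's middle representation, here is a hypothetical data model for Intent Frames and the rearrangeable timeline; the term "Intent Frame" is from the abstract, but all field names and the prompt format are assumptions, not the paper's design.

```python
from dataclasses import dataclass, field

@dataclass
class IntentFrame:
    """One narrative beat inferred from manipulating characters and props."""
    actors: list            # e.g., ["knight", "dragon"]
    action: str             # e.g., "confront each other"
    location: str           # e.g., "castle gate"
    mood: str = "neutral"

@dataclass
class Timeline:
    """Rearrangeable sequence of beats (the 'story marbles')."""
    beats: list = field(default_factory=list)

    def move(self, src: int, dst: int) -> None:
        """Reorder a marble on the timeline."""
        self.beats.insert(dst, self.beats.pop(src))

    def to_prompt(self) -> str:
        """Serialize the assembled sequence for a narrative-drafting LLM."""
        lines = [
            f"{i + 1}. {' and '.join(b.actors)} {b.action} "
            f"at the {b.location} (mood: {b.mood})"
            for i, b in enumerate(self.beats)
        ]
        return "Write a story from these beats:\n" + "\n".join(lines)

tl = Timeline([
    IntentFrame(["knight", "dragon"], "confront each other", "castle gate", "tense"),
    IntentFrame(["knight"], "retreats", "forest", "somber"),
])
tl.move(1, 0)          # the user drags the second marble to the front
print(tl.to_prompt())
```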

Authors
Esen K. Tütüncü
Autodesk Research, San Francisco, California, United States
Qian Zhou
Autodesk Research, San Francisco, California, United States
Frederik Brudy
Autodesk Research, Toronto, Ontario, Canada
George Fitzmaurice
Autodesk Research, Toronto, Ontario, Canada
Fraser Anderson
Autodesk Research, Toronto, Ontario, Canada
Understanding Spatiotemporal-Aware Multimodal Conversational Search in the Outdoor Urban Space
Abstract

Emerging multimodal conversational search (MCS) tools (e.g., Gemini Live) allow users to search for spatiotemporal information through natural language dialogues as they move through urban space. Despite the growing popularity of these tools, there is limited understanding of how people engage with this technology. To address this gap, we developed UrbanSearch, an MCS technology probe designed to capture the user's current geolocation, time, and visual surroundings. A contextual inquiry (N=23) revealed that MCS tools provide two core values: requiring little effort to form queries while offering highly relevant responses, and functioning as a central information gateway. MCS shows promise for supporting environmental learning, in-situ decision-making, and personalized navigation. Participants also reported unmet needs for spatial reasoning and transparent integration of multi-source information, along with concerns about peripheral awareness, social context, and personal space. Drawing on these findings, we discuss design implications for future MCS tools in urban spaces.
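
A minimal sketch of the per-query context such a probe might bundle, based only on the three signals the abstract names (geolocation, time, and visual surroundings); the schema is an assumption, since the abstract does not specify UrbanSearch's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class MCSQueryContext:
    """Spatiotemporal context bundled with one conversational query."""
    latitude: float
    longitude: float
    timestamp: datetime
    camera_frame: bytes    # snapshot of the visual surroundings (e.g., JPEG)
    utterance: str         # the user's spoken query

def build_context(lat: float, lon: float,
                  frame: bytes, utterance: str) -> MCSQueryContext:
    """Attach current location, time, and view to the query."""
    return MCSQueryContext(lat, lon, datetime.now(timezone.utc),
                           frame, utterance)

ctx = build_context(43.1566, -77.6088, b"",          # placeholder frame
                    "What is the history of this building?")
```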

Authors
Jiangnan Xu
Rochester Institute of Technology, West Henrietta, New York, United States
Suyeon Seo
Yonsei University, Seoul, Korea, Republic of
Joni Salminen
University of Vaasa, Vaasa, Finland
Michael Saker
City University London, London, United Kingdom
Joongi Shin
Aalto University, Espoo, Finland
Alan Chamberlain
University of Nottingham, Nottingham, United Kingdom
Konstantinos Papangelis
Rochester Institute of Technology, Rochester, New York, United States
Dae Hyun Kim
Yonsei University, Seoul, Korea, Republic of
CineCraft: Unified Shot Planning, Capture, and Post-Processing for Mobile Cinematography
Abstract

We present CineCraft, an interactive mobile application that unifies planning, capture, and post-processing for cinematography on a single device. Our key design insight is to use a storyboard-like shot plan as a persistent representation that connects different stages of the filmmaking process, emulating coordination strategies used by professional film crews. Our shot plans extend common storyboarding conventions to encode time-varying parameters (e.g., camera movement, focus, and zoom) on a shared timeline, enabling previsualization during planning and precise synchronization during capture. CineCraft uses shot plans to generate camera movement instructions, provide augmented-reality (AR) framing guidance during filming, automate focus and zoom, and organize takes for review and rough-cut assembly. By consolidating stages that are often fragmented across separate mobile apps and ad hoc workflows, our system enables rapid on-location iteration with immediate playback. We demonstrate our system through a range of examples and two user studies.
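
As an illustration of what a machine-readable shot plan could look like, here is a hypothetical keyframe representation with linear interpolation for two of the time-varying parameters the abstract names (focus and zoom); the field names and sampling scheme are assumptions, not CineCraft's actual format.

```python
from dataclasses import dataclass
import bisect

@dataclass
class Keyframe:
    t: float        # seconds on the shot's shared timeline
    focus_m: float  # focus distance in meters
    zoom: float     # zoom factor

@dataclass
class ShotPlan:
    keyframes: list  # Keyframes sorted by t

    def sample(self, t: float) -> tuple:
        """Linearly interpolate (focus, zoom) at time t during capture."""
        ks = self.keyframes
        i = bisect.bisect_right([k.t for k in ks], t)
        if i == 0:
            return ks[0].focus_m, ks[0].zoom
        if i == len(ks):
            return ks[-1].focus_m, ks[-1].zoom
        a, b = ks[i - 1], ks[i]
        w = (t - a.t) / (b.t - a.t)
        return (a.focus_m + w * (b.focus_m - a.focus_m),
                a.zoom + w * (b.zoom - a.zoom))

# A 4-second rack focus from 1 m to 3 m while zooming in slightly.
plan = ShotPlan([Keyframe(0.0, 1.0, 1.0), Keyframe(4.0, 3.0, 1.5)])
print(plan.sample(2.0))   # -> (2.0, 1.25) halfway through the move
```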

Authors
Nhan Tran
Cornell University, Ithaca, New York, United States
Sam Belliveau
Cornell University, Ithaca, New York, United States
Zixin Xu
Cornell University, Ithaca, New York, United States
Abe Davis
Cornell University, New York, New York, United States
Continuous Measurement Methods for Transient Physiological Discomfort in VR Locomotion
Abstract

Motion sickness, beyond its persistent long-term effects, also exhibits short-term effects characterized as transient physiological discomfort, which changes rapidly with variations in locomotion. Such discomfort is, however, difficult to assess with current subjective scales and objective physiological measurements. To tackle this issue, this paper proposes continuous measurement methods designed specifically for evaluating transient physiological discomfort during VR locomotion. Through a user-elicitation study, three preferred measurement methods were identified: 'squeezing ball', 'sliding thumb', and 'rubbing thigh'. These techniques were then evaluated for reliability, validity, attention, presence, and workload, with 'sliding thumb' emerging as the most effective option. The paper extends traditional measurement methods to capture users' physiological experiences in VR interactions, offering practical choices for researchers in this field along with an in-depth discussion of design considerations, detailed implementation guidelines, and potential ways to optimize VR experiences using the measurement data.
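
As a rough sketch of how a continuous 'sliding thumb' report might be logged and smoothed into a discomfort signal; the sampling rate and smoothing constant below are assumptions, since the abstract does not describe the paper's own signal processing.

```python
import math

def ema_smooth(samples: list, dt: float, tau: float = 0.5) -> list:
    """Exponential moving average: damps jitter while keeping transients.

    samples: thumb position in [0, 1] (0 = no discomfort, 1 = severe),
    dt: sampling interval in seconds, tau: smoothing time constant.
    """
    alpha = 1.0 - math.exp(-dt / tau)
    out, y = [], samples[0]
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out

# 10 Hz trace: discomfort spikes during a sharp virtual turn, then decays.
trace = [0.1, 0.1, 0.2, 0.6, 0.9, 0.8, 0.5, 0.3, 0.2, 0.1]
smoothed = ema_smooth(trace, dt=0.1)
print([round(v, 2) for v in smoothed])
```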

Authors
Tianren Luo
Institute of Software, Chinese Academy of Sciences, Beijing, China
Pengxiang Wang
Information Engineering College, Capital Normal University, Beijing, China
Shuting Chang
Tongji University College of Design and Innovation, Shanghai, China
Ke Zhou
Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China
Nianlong Li
Institute of Software, Chinese Academy of Sciences, Beijing, China
Yulong Bian
Shandong University, Weihai, Shandong, China
Xiaohui Tan
Capital Normal University, Beijing, China
Qi Wang
Tongji University, Shanghai, China
Teng Han
Institute of Software, Chinese Academy of Sciences, Beijing, China
Feng Tian
Institute of Software, Chinese Academy of Sciences, Beijing, China