SceneScout: Towards AI-Driven Access to Street Level Imagery for Blind Users

Abstract

People who are blind or have low vision (BLV) may hesitate to travel independently in unfamiliar environments due to uncertainty about the physical landscape. While most tools focus on in-situ navigation assistance, those supporting pre-travel assistance typically provide only landmark information and turn-by-turn instructions, lacking detailed visual context. Street level imagery, which contains rich visual information and has the potential to reveal environmental details, remains inaccessible to BLV people. In this work, we present SceneScout, a multimodal large language model (MLLM)-driven prototype that enables accessible interactions with street level imagery. SceneScout supports two modes: (1) Route Preview, enabling users to familiarize themselves with visual details along a route, and (2) Virtual Exploration, enabling user-driven movement within street level imagery. Our user study demonstrates that SceneScout helps BLV users uncover visual information otherwise unavailable through existing means. An initial analysis of AI-generated descriptions suggests that the majority are accurate and describe stable visual elements even in older imagery, though occasional subtle and plausible errors make them difficult to verify without sight. We discuss future opportunities and challenges of street level imagery-based navigation experiences.

Authors
Gaurav Jain
Columbia University, New York, New York, United States
Leah Findlater
Apple, Cupertino, California, United States
Cole Gleason
Apple Inc., Seattle, Washington, United States

Conference: CHI 2026

ACM CHI Conference on Human Factors in Computing Systems

Session: Sound, Music, and Dance Accessibility

P1 - Room 120
7 presentations
2026-04-15, 20:15–21:45