2. Contextual Augmentations

Conference Name
UIST 2024
StreetNav: Leveraging Street Cameras to Support Precise Outdoor Navigation for Blind Pedestrians
Abstract

Blind and low-vision (BLV) people rely on GPS-based systems for outdoor navigation. GPS's inaccuracy, however, causes them to veer off track, run into obstacles, and struggle to reach precise destinations. While prior work has made precise navigation possible indoors via hardware installations, enabling this outdoors remains a challenge. Interestingly, many outdoor environments are already instrumented with hardware such as street cameras. In this work, we explore the idea of repurposing existing street cameras for outdoor navigation. Our community-driven approach considers both technical and sociotechnical concerns through engagements with various stakeholders: BLV users, residents, business owners, and Community Board leadership. The resulting system, StreetNav, processes a camera's video feed using computer vision and gives BLV pedestrians real-time navigation assistance. Our evaluations show that StreetNav guides users more precisely than GPS, but its technical performance is sensitive to environmental occlusions and distance from the camera. We discuss future implications for deploying such systems at scale.
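The abstract describes StreetNav converting a street camera's view of a pedestrian into real-time guidance. As a rough illustration of that idea only (not the paper's implementation), the sketch below turns an assumed ground-plane position, heading, and next waypoint (e.g., obtained from a camera-to-ground homography and a planned route) into a coarse spoken cue.

```python
# Illustrative sketch only, not StreetNav's implementation: given a pedestrian's
# position from a street camera and the next waypoint on a route, produce a simple
# "veer left / veer right / continue" cue. Coordinates are assumed to already be
# projected onto the ground plane (e.g., via a camera-to-ground homography), in meters.
import math

def guidance_cue(position, heading_deg, waypoint, corridor_m=1.0):
    """Return a coarse navigation cue for a pedestrian walking toward a waypoint.

    position    -- (x, y) pedestrian location on the ground plane, meters
    heading_deg -- pedestrian's current heading, degrees (0 = +x axis)
    waypoint    -- (x, y) next waypoint on the route, meters
    corridor_m  -- arrival radius around the waypoint, meters
    """
    dx, dy = waypoint[0] - position[0], waypoint[1] - position[1]
    distance = math.hypot(dx, dy)
    bearing_deg = math.degrees(math.atan2(dy, dx))
    # Signed angular error in (-180, 180]: positive means the waypoint is to the left.
    error = (bearing_deg - heading_deg + 180.0) % 360.0 - 180.0

    if distance < corridor_m:
        return "You have arrived at the waypoint."
    if abs(error) < 10.0:
        return f"Continue straight for {distance:.0f} meters."
    side = "left" if error > 0 else "right"
    return f"Veer {side} about {abs(error):.0f} degrees."

if __name__ == "__main__":
    # Pedestrian facing +y, waypoint slightly to the right.
    print(guidance_cue(position=(0.0, 0.0), heading_deg=90.0, waypoint=(3.0, 10.0)))
```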

Authors
Gaurav Jain
Columbia University, New York, New York, United States
Basel Hindi
Columbia University, New York, New York, United States
Zihao Zhang
Columbia University, New York, New York, United States
Koushik Srinivasula
Columbia University, New York, New York, United States
Mingyu Xie
Columbia University, New York, New York, United States
Mahshid Ghasemi
Columbia University, New York, New York, United States
Daniel Weiner
Lehman College, Bronx, New York, United States
Sophie Ana Paris
New York University, New York, New York, United States
Xin Yi Therese Xu
Pomona College, Claremont, California, United States
Michael C. Malcolm
New York City College of Technology, Brooklyn, New York, United States
Mehmet Kerem Turkcan
Columbia University, New York, New York, United States
Javad Ghaderi
Columbia University, New York, New York, United States
Zoran Kostic
Columbia University, New York, New York, United States
Gil Zussman
Columbia University, New York, New York, United States
Brian A. Smith
Columbia University, New York, New York, United States
Paper URL

https://doi.org/10.1145/3654777.3676333

Video
WorldScribe: Towards Context-Aware Live Visual Descriptions
Abstract

Automated live visual descriptions can aid blind people in understanding their surroundings with autonomy and independence. However, providing descriptions that are rich, contextual, and just-in-time has been a long-standing challenge in accessibility. In this work, we develop WorldScribe, a system that generates automated live real-world visual descriptions that are customizable and adaptive to users' contexts: (i) WorldScribe's descriptions are tailored to users' intents and prioritized based on semantic relevance. (ii) WorldScribe is adaptive to visual contexts, e.g., providing consecutively succinct descriptions for dynamic scenes, while presenting longer and detailed ones for stable settings. (iii) WorldScribe is adaptive to sound contexts, e.g., increasing volume in noisy environments, or pausing when conversations start. Powered by a suite of vision, language, and sound recognition models, WorldScribe introduces a description generation pipeline that balances the tradeoffs between their richness and latency to support real-time use. The design of WorldScribe is informed by prior work on providing visual descriptions and a formative study with blind participants. Our user study and subsequent pipeline evaluation show that WorldScribe can provide real-time and fairly accurate visual descriptions to facilitate environment understanding that is adaptive and customized to users' contexts. Finally, we discuss the implications and further steps toward making live visual descriptions more context-aware and humanized.
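WorldScribe's key idea is adapting description richness and delivery to the user's visual and sound context. The sketch below is a minimal illustration of that scheduling logic under assumed inputs (a scene-change score, ambient noise level, and a speech-detection flag); it is not WorldScribe's actual pipeline, and the description tiers and thresholds here are hypothetical.

```python
# Illustrative sketch only, not WorldScribe's pipeline: choose how verbose a live
# visual description should be from (hypothetical) scene-change and sound signals,
# mirroring the paper's idea of succinct descriptions for dynamic scenes, richer
# ones for stable scenes, and pausing when a conversation is detected.
from dataclasses import dataclass

@dataclass
class Context:
    scene_change: float    # 0.0 (static scene) .. 1.0 (rapidly changing)
    noise_db: float        # ambient sound level in dB
    speech_detected: bool  # whether a nearby conversation is happening

def plan_description(ctx: Context) -> dict:
    """Return a speech plan: which description tier to request and at what volume."""
    if ctx.speech_detected:
        # Defer to the user's conversation rather than talking over it.
        return {"tier": None, "action": "pause"}

    if ctx.scene_change > 0.6:
        tier = "object_labels"         # fastest, lowest-latency descriptions
    elif ctx.scene_change > 0.2:
        tier = "short_caption"         # one-sentence caption
    else:
        tier = "detailed_description"  # slower but richer, fine for a stable scene

    # Raise output volume in noisy environments (clamped to a sensible range).
    volume = min(1.0, max(0.3, (ctx.noise_db - 40.0) / 40.0))
    return {"tier": tier, "action": "speak", "volume": round(volume, 2)}

if __name__ == "__main__":
    print(plan_description(Context(scene_change=0.8, noise_db=72.0, speech_detected=False)))
    print(plan_description(Context(scene_change=0.1, noise_db=45.0, speech_detected=False)))
```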

Award
Best Paper
Authors
Ruei-Che Chang
University of Michigan, Ann Arbor, Michigan, United States
Yuxuan Liu
University of Michigan, Ann Arbor, Michigan, United States
Anhong Guo
University of Michigan, Ann Arbor, Michigan, United States
Paper URL

https://doi.org/10.1145/3654777.3676375

Video
CookAR: Affordance Augmentations in Wearable AR to Support Kitchen Tool Interactions for People with Low Vision
Abstract

Cooking is a central activity of daily living, supporting independence as well as mental and physical health. However, prior work has highlighted key barriers for people with low vision (LV) to cook, particularly around safely interacting with tools, such as sharp knives or hot pans. Drawing on recent advancements in computer vision (CV), we present CookAR, a head-mounted AR system with real-time object affordance augmentations to support safe and efficient interactions with kitchen tools. To design and implement CookAR, we collected and annotated the first egocentric dataset of kitchen tool affordances, fine-tuned an affordance segmentation model, and developed an AR system with a stereo camera to generate visual augmentations. To validate CookAR, we conducted a technical evaluation of our fine-tuned model as well as a qualitative lab study with 10 LV participants for suitable augmentation design. Our technical evaluation demonstrates that our model outperforms the baseline on our tool affordance dataset, while our user study indicates a preference for affordance augmentations over the traditional whole object augmentations.
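CookAR overlays affordance augmentations (e.g., graspable vs. hazardous regions of a kitchen tool) on a head-mounted AR display. The sketch below illustrates one small piece of that idea under stated assumptions: hypothetical affordance labels from a segmentation model are mapped to high-contrast overlay colors, with low-confidence regions filtered out. It is not CookAR's code, and the label set and colors are assumptions.

```python
# Illustrative sketch only (hypothetical label names, not CookAR's code): map affordance
# segmentation classes for a kitchen tool to high-contrast AR overlay styles, so that
# graspable regions (e.g., a knife handle) and hazardous regions (e.g., the blade) are
# visually distinguished for a low-vision user.
from typing import Dict, List, Tuple

RGBA = Tuple[int, int, int, int]

# Hypothetical affordance classes an affordance-segmentation model might output.
OVERLAY_STYLES: Dict[str, RGBA] = {
    "graspable": (0, 200, 0, 160),    # translucent green: safe to grab
    "hazardous": (230, 30, 30, 200),  # strong red: sharp or hot region
    "neutral":   (255, 255, 0, 90),   # faint yellow: rest of the tool
}

def style_for_affordance(label: str) -> RGBA:
    """Return the RGBA overlay color for a predicted affordance label."""
    return OVERLAY_STYLES.get(label, OVERLAY_STYLES["neutral"])

def render_plan(predictions: Dict[str, float],
                confidence_threshold: float = 0.5) -> List[Tuple[str, RGBA]]:
    """Build (label, color) overlays from {label: confidence} predictions,
    skipping low-confidence regions to avoid cluttering the user's view."""
    return [(label, style_for_affordance(label))
            for label, score in predictions.items()
            if score >= confidence_threshold]

if __name__ == "__main__":
    # Example: the model is confident about the blade and handle, unsure about the rest.
    print(render_plan({"hazardous": 0.91, "graspable": 0.84, "neutral": 0.32}))
```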

Authors
Jaewook Lee
University of Washington, Seattle, Washington, United States
Andrew D. Tjahjadi
University of Washington, Seattle, Washington, United States
Jiho Kim
University of Washington, Seattle, Washington, United States
Junpu Yu
University of Washington, Seattle, Washington, United States
Minji Park
Sungkyunkwan University, Suwon, Republic of Korea
Jiawen Zhang
University of Washington, Seattle, Washington, United States
Jon E. Froehlich
University of Washington, Seattle, Washington, United States
Yapeng Tian
University of Texas at Dallas, Richardson, Texas, United States
Yuhang Zhao
University of Wisconsin-Madison, Madison, Wisconsin, United States
Paper URL

https://doi.org/10.1145/3654777.3676449

Video
DesignChecker: Visual Design Support for Blind and Low Vision Web Developers
Abstract

Blind and low vision (BLV) developers create websites to share knowledge and showcase their work. A well-designed website can engage audiences and deliver information effectively, yet it remains challenging for BLV developers to review their web designs. We conducted interviews with BLV developers (N=9) and analyzed 20 websites created by BLV developers. BLV developers created highly accessible websites but wanted to assess the usability of their websites for sighted users and follow the design standards of other websites. They also encountered challenges using screen readers to identify illegible text, misaligned elements, and inharmonious colors. We present DesignChecker, a browser extension that helps BLV developers improve their web designs. With DesignChecker, users can assess their current design by comparing it to visual design guidelines, a reference website of their choice, or a set of similar websites. DesignChecker also identifies the specific HTML elements that violate design guidelines and suggests CSS changes for improvements. Our user study participants (N=8) recognized more visual design errors with DesignChecker than with their typical workflow and expressed enthusiasm about using DesignChecker in the future.
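DesignChecker checks a page against visual design guidelines and suggests CSS fixes. As one concrete, hedged example of such a rule (not DesignChecker's implementation), the sketch below computes the standard WCAG contrast ratio between a text color and its background and suggests a CSS change when an element falls below the AA threshold; the selector and colors are hypothetical.

```python
# Illustrative sketch only, not DesignChecker's implementation: one example of the kind
# of visual-design rule such a tool can check programmatically -- the WCAG text/background
# contrast ratio -- along with a simple CSS suggestion when an element falls short.
def _channel(c: int) -> float:
    """Linearize one sRGB channel (0-255) per the WCAG relative-luminance formula."""
    c = c / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb) -> float:
    r, g, b = (_channel(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg) -> float:
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

def check_element(selector: str, fg, bg, minimum: float = 4.5) -> str:
    """Flag a (hypothetical) element whose text contrast is below the WCAG AA minimum."""
    ratio = contrast_ratio(fg, bg)
    if ratio >= minimum:
        return f"{selector}: contrast {ratio:.2f}:1 passes."
    return (f"{selector}: contrast {ratio:.2f}:1 is below {minimum}:1; "
            f"consider a darker text color, e.g. `{selector} {{ color: #222; }}`.")

if __name__ == "__main__":
    # Light gray text on a white background fails the AA threshold for body text.
    print(check_element("p.caption", fg=(150, 150, 150), bg=(255, 255, 255)))
```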

Authors
Mina Huh
University of Texas at Austin, Austin, Texas, United States
Amy Pavel
University of Texas at Austin, Austin, Texas, United States
Paper URL

https://doi.org/10.1145/3654777.3676369

Video