Beyond Visual Perception: Insights from Smartphone Interaction of Visually Impaired Users with Large Multimodal Models

Abstract

Large multimodal models (LMMs) have enabled new AI-powered applications that help people with visual impairments (PVI) receive natural language descriptions of their surroundings through audible text. We investigated how this emerging paradigm of visual assistance transforms how PVI perform and manage their daily tasks. Moving beyond usability assessments, we examined both the capabilities and limitations of LMM-based tools in personal and social contexts, while exploring design implications for their future development. Through interviews with 14 visually impaired users of Be My AI (an LMM-based application) and analysis of its image descriptions from both study participants and social media platforms, we identified two key limitations. First, these systems' context awareness suffers from hallucinations and misinterpretations of social contexts, styles, and human identities. Second, their intent-oriented capabilities often fail to grasp and act on users' intentions. Based on these findings, we propose design strategies for improving both human-AI and AI-AI interactions, contributing to the development of more effective, interactive, and personalized assistive technologies.

Authors
Jingyi Xie
Pennsylvania State University, University Park, Pennsylvania, United States
Rui Yu
University of Louisville, Louisville, Kentucky, United States
He Zhang
Pennsylvania State University, State College, Pennsylvania, United States
Syed Masum Billah
Pennsylvania State University, University Park, Pennsylvania, United States
Sooyeon Lee
New Jersey Institute of Technology, Newark, New Jersey, United States
John M. Carroll
Pennsylvania State University, University Park, Pennsylvania, United States
DOI

10.1145/3706598.3714210

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714210

Conference: CHI 2025

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)

Session: Assistive Technologies

G314+G315
7 presentations
2025-05-01 01:20:00
2025-05-01 02:50:00