AI and UI Design

Conference Name
CHI 2024
SimUser: Generating Usability Feedback by Simulating Various Users Interacting with Mobile Applications
Abstract

The conflict between the rapid iteration demanded by prototyping and the time-consuming nature of user tests has led researchers to adopt AI methods to identify usability issues. However, these AI-driven methods concentrate on evaluating the feasibility of a system, while often overlooking the influence of specified user characteristics and usage contexts. Our work proposes a tool named SimUser, based on large language models (LLMs) with a Chain-of-Thought structure and a user modeling method. It generates usability feedback by simulating the interaction between users and applications, which is shaped by user characteristics and contextual factors. An empirical study (48 human users and 21 designers) validated that, in the context of a simple smartwatch interface, SimUser could generate heuristic usability feedback with similarity varying from 35.7% to 100% depending on the user group and usability category. Our work provides insights into simulating users with LLMs to improve future design activities.
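
As a rough illustration of the kind of persona-conditioned, chain-of-thought prompting the abstract describes, the Python sketch below simulates a single interaction step. The UserProfile fields, prompt wording, and the injected call_llm callable are assumptions for illustration, not SimUser's actual implementation.

from dataclasses import dataclass
from typing import Callable

@dataclass
class UserProfile:
    age: int
    tech_familiarity: str   # e.g. "low" or "high"
    context: str            # e.g. "walking outdoors in bright sunlight"

def simulate_step(profile: UserProfile, ui_description: str,
                  goal: str, call_llm: Callable[[str], str]) -> str:
    """Ask an LLM to role-play the user, reason step by step, then report issues."""
    prompt = (
        f"You are a {profile.age}-year-old user with {profile.tech_familiarity} "
        f"familiarity with technology, currently {profile.context}.\n"
        f"Goal: {goal}\n"
        f"Current screen: {ui_description}\n"
        "Think step by step about what you notice, what confuses you, and what you "
        "would tap next. Then list any usability problems you encountered."
    )
    return call_llm(prompt)

# Usage with a stand-in model (replace the lambda with a real LLM client):
feedback = simulate_step(
    UserProfile(age=67, tech_familiarity="low", context="walking outdoors"),
    ui_description="Smartwatch heart-rate screen with a small 'history' icon",
    goal="check yesterday's heart rate",
    call_llm=lambda prompt: "(model output here)",
)
print(feedback)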

Authors
Wei Xiang
Zhejiang University, Hangzhou, China
Hanfei Zhu
Zhejiang University, Hangzhou, China
Suqi Lou
Zhejiang University, Hangzhou, China
Xinli Chen
Zhejiang University, Hangzhou, China
Zhenghua Pan
Zhejiang University, Hangzhou, China
Yuping Jin
Zhejiang University, Hangzhou, China
Shi Chen
Zhejiang University, Hangzhou, China
Lingyun Sun
Zhejiang University, Hangzhou, China
Paper URL

doi.org/10.1145/3613904.3642481

Video
Generating Automatic Feedback on UI Mockups with Large Language Models
Abstract

Feedback on user interface (UI) mockups is crucial in design. However, human feedback is not always readily available. We explore the potential of using large language models for automatic feedback. Specifically, we focus on applying GPT-4 to automate heuristic evaluation, which currently entails a human expert assessing a UI’s compliance with a set of design guidelines. We implemented a Figma plugin that takes in a UI design and a set of written heuristics, and renders automatically generated feedback as constructive suggestions. We assessed performance on 51 UIs using three sets of guidelines, compared GPT-4-generated design suggestions with those from human experts, and conducted a study with 12 expert designers to understand fit with existing practice. We found that GPT-4-based feedback is useful for catching subtle errors, improving text, and considering UI semantics, but feedback also decreased in utility over iterations. Participants described several uses for this plugin despite its imperfect suggestions.
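
To make the described setup concrete, here is a minimal sketch of packaging a UI description and a set of written heuristics into a single evaluation prompt. The guideline texts, the JSON element tree, and the call_llm callable are placeholders, not the plugin's actual code.

from typing import Callable, List

def heuristic_feedback(ui_json: str, heuristics: List[str],
                       call_llm: Callable[[str], str]) -> str:
    """Build one prompt from the UI's element tree and the written guidelines."""
    numbered = "\n".join(f"{i + 1}. {h}" for i, h in enumerate(heuristics))
    prompt = (
        "You are a UI design expert performing a heuristic evaluation.\n"
        f"Guidelines:\n{numbered}\n\n"
        f"UI (element tree with text and layout):\n{ui_json}\n\n"
        "For each violated guideline, name the offending element and give a "
        "constructive suggestion for fixing it."
    )
    return call_llm(prompt)

# Usage with a stand-in model:
suggestions = heuristic_feedback(
    ui_json='{"type": "screen", "children": [{"type": "button", "text": "OK"}]}',
    heuristics=["Visibility of system status", "Consistency and standards"],
    call_llm=lambda prompt: "(model output here)",
)
print(suggestions)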

Authors
Peitong Duan
UC Berkeley, Berkeley, California, United States
Jeremy Warner
UC Berkeley, Berkeley, California, United States
Yang Li
Google Research, Mountain View, California, United States
Bjoern Hartmann
UC Berkeley, Berkeley, California, United States
Paper URL

doi.org/10.1145/3613904.3642782

Video
MUD: Towards a Large-Scale and Noise-Filtered UI Dataset for Modern Style UI Modeling
Abstract

The importance of computational modeling of mobile user interfaces (UIs) is undeniable. However, such modeling requires a high-quality UI dataset. Existing datasets are often outdated, having been collected years ago, and are frequently noisy, with mismatches in their visual representations. This presents challenges for modeling UI understanding in the wild. This paper introduces a novel approach to automatically mine UI data from Android apps, leveraging Large Language Models (LLMs) to mimic human-like exploration. To ensure dataset quality, we employ best practices in UI noise filtering and incorporate human annotation as a final validation step. Our results demonstrate the effectiveness of LLM-enhanced app exploration in mining more meaningful UIs, resulting in MUD, a large dataset of 18k human-annotated UIs from 3.3k apps. We highlight the usefulness of MUD in two common UI modeling tasks, element detection and UI retrieval, showcasing its potential to establish a foundation for future research into high-quality, modern UIs.
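
The following sketch illustrates, under stated assumptions, what LLM-guided app exploration with screen de-duplication could look like. The Device interface, prompt, and call_llm callable are hypothetical stand-ins rather than the authors' mining pipeline.

import hashlib
from typing import Callable, List, Protocol

class Device(Protocol):
    """Hypothetical handle to an instrumented Android device or emulator."""
    def dump_hierarchy(self) -> str: ...         # current screen's UI tree
    def list_clickables(self) -> List[str]: ...  # descriptions of tappable elements
    def tap(self, element: str) -> None: ...

def explore(device: Device, call_llm: Callable[[str], str],
            max_steps: int = 50) -> List[str]:
    """Collect unseen screens while letting the LLM act like a curious user."""
    seen, collected = set(), []
    for _ in range(max_steps):
        tree = device.dump_hierarchy()
        key = hashlib.sha256(tree.encode()).hexdigest()
        if key not in seen:                      # keep each distinct screen once
            seen.add(key)
            collected.append(tree)
        options = device.list_clickables()
        if not options:
            break
        choice = call_llm(
            "You are exploring a mobile app and want to reach new screens.\n"
            "Tappable elements:\n" + "\n".join(options) +
            "\nReply with exactly one element description to tap."
        )
        device.tap(choice if choice in options else options[0])
    return collected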

Authors
Sidong Feng
Monash University, Melbourne, Victoria, Australia
Suyu Ma
Monash University, Melbourne, Victoria, Australia
Han Wang
Monash University, Melbourne, Victoria, Australia
David Kong
Monash University, Melbourne, Victoria, Australia
Chunyang Chen
Monash University, Melbourne, Victoria, Australia
Paper URL

doi.org/10.1145/3613904.3642350

Video
Surveyor: Facilitating Discovery Within Video Games for Blind and Low Vision Players
Abstract

Video games are increasingly accessible to blind and low vision (BLV) players, yet many aspects remain inaccessible. One aspect is the joy players feel when they explore environments and make new discoveries, which is integral to many games. Sighted players experience discovery by surveying environments and identifying unexplored areas. Current accessibility tools, however, guide BLV players directly to items and places, robbing them of that experience. Thus, a crucial challenge is to develop navigation assistance tools that also foster exploration and discovery. To address this challenge, we propose the concept of exploration assistance in games and design Surveyor, an in-game exploration assistance tool that enhances discovery by tracking where BLV players look and highlighting unexplored areas. We designed Surveyor using insights from a formative study and compared Surveyor's effectiveness to approaches found in existing accessible games. Our findings reveal implications for facilitating richer play experiences for BLV users within games.
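
As a rough sketch of the core idea of tracking where a player has looked and surfacing unexplored areas, the grid-based tracker below marks viewed cells and returns the nearest unviewed one. The grid granularity and API are illustrative assumptions, not Surveyor's implementation.

from typing import Optional, Set, Tuple

class ExplorationTracker:
    """Divide the map into cells and remember which ones the player has viewed."""

    def __init__(self, width: float, height: float, cell: float = 5.0):
        self.cell = cell
        self.cols, self.rows = int(width // cell), int(height // cell)
        self.viewed: Set[Tuple[int, int]] = set()

    def mark_viewed(self, x: float, y: float) -> None:
        """Record that the player's view covered world position (x, y)."""
        self.viewed.add((int(x // self.cell), int(y // self.cell)))

    def nearest_unexplored(self, x: float, y: float) -> Optional[Tuple[float, float]]:
        """Return the centre of the closest cell not yet viewed, if any remain."""
        px, py = int(x // self.cell), int(y // self.cell)
        best, best_d = None, float("inf")
        for cx in range(self.cols):
            for cy in range(self.rows):
                if (cx, cy) in self.viewed:
                    continue
                d = (cx - px) ** 2 + (cy - py) ** 2
                if d < best_d:
                    best, best_d = (cx, cy), d
        if best is None:
            return None
        return ((best[0] + 0.5) * self.cell, (best[1] + 0.5) * self.cell)

# Usage: mark cells as the player's camera sweeps over them, then query.
tracker = ExplorationTracker(width=100, height=100)
tracker.mark_viewed(12.0, 7.5)
print(tracker.nearest_unexplored(12.0, 7.5))  # centre of a nearby unviewed cell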

Authors
Vishnu Nair
Columbia University, New York, New York, United States
Hanxiu 'Hazel' Zhu
Columbia University, New York, New York, United States
Peize Song
Columbia University, New York, New York, United States
Jizhong Wang
Columbia University, New York, New York, United States
Brian A. Smith
Columbia University, New York, New York, United States
Paper URL

doi.org/10.1145/3613904.3642615

Video
OmniActions: Predicting Digital Actions in Response to Real-World Multimodal Sensory Inputs with LLMs
Abstract

The progression to "Pervasive Augmented Reality" envisions continuous, easy access to multimodal information. However, in many everyday scenarios, users are occupied physically, cognitively, or socially. This may increase the friction of acting upon the multimodal information that users encounter in the world. To reduce such friction, future interactive interfaces should intelligently provide quick access to digital actions based on users' context. To explore the range of possible digital actions, we conducted a diary study that required participants to capture and share the media on which they intended to perform actions (e.g., images or audio), along with their desired actions and other contextual information. Using this data, we generated a holistic design space of digital follow-up actions that could be performed in response to different types of multimodal sensory inputs. We then designed OmniActions, a pipeline powered by large language models (LLMs) that processes multimodal sensory inputs and predicts follow-up actions on the target information, grounded in the derived design space. Using the empirical data collected in the diary study, we performed quantitative evaluations of three variations of LLM techniques (intent classification, in-context learning, and fine-tuning) and identified the most effective technique for our task. Additionally, as an instantiation of the pipeline, we developed an interactive prototype and reported preliminary user feedback about how people perceive and react to the action predictions and their errors.
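
As an illustration of the in-context learning variant mentioned above, the sketch below prepends a few example input-to-action pairs to the prompt. The example data, action labels, and call_llm callable are invented for illustration and are not the OmniActions pipeline.

from typing import Callable, List, Tuple

# Illustrative few-shot examples mapping an observation to a follow-up action.
FEW_SHOT: List[Tuple[str, str]] = [
    ("Photo of a concert poster taken while walking to work", "add event to calendar"),
    ("Overheard the name of a restaurant during a conversation", "save to notes"),
]

def predict_action(observation: str, call_llm: Callable[[str], str]) -> str:
    """Prepend the examples, then ask the model for the action on a new input."""
    examples = "\n".join(f"Input: {i}\nAction: {a}" for i, a in FEW_SHOT)
    prompt = (
        "Predict the digital follow-up action a user would want for each input.\n"
        f"{examples}\nInput: {observation}\nAction:"
    )
    return call_llm(prompt).strip()

# Usage with a stand-in model:
print(predict_action("Snapshot of a Wi-Fi password on a whiteboard",
                     call_llm=lambda prompt: "copy text and share"))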

Authors
Jiahao Nick Li
UCLA, Los Angeles, California, United States
Yan Xu
Meta, Redmond, Washington, United States
Tovi Grossman
University of Toronto, Toronto, Ontario, Canada
Stephanie Santosa
Facebook Reality Labs, Toronto, Ontario, Canada
Michelle Li
Reality Labs Research, Redmond, Washington, United States
Paper URL

doi.org/10.1145/3613904.3642068

Video