The conflict between the rapid iteration demand of prototyping and the time-consuming nature of user tests has led researchers to adopt AI methods to identify usability issues. However, these AI-driven methods concentrate on evaluating the feasibility of a system, while often overlooking the influence of specific user characteristics and usage contexts. Our work proposes a tool named SimUser, based on large language models (LLMs) with a Chain-of-Thought structure and a user modeling method. It generates usability feedback by simulating the interaction between users and applications, which is influenced by user characteristics and contextual factors. The empirical study (48 human users and 21 designers) validated that, in the context of a simple smartwatch interface, SimUser could generate heuristic usability feedback with similarity ranging from 35.7% to 100% depending on the user group and usability category. Our work provides insights into simulating users with LLMs to improve future design activities.
https://doi.org/10.1145/3613904.3642481
Feedback on user interface (UI) mockups is crucial in design. However, human feedback is not always readily available. We explore the potential of using large language models for automatic feedback. Specifically, we focus on applying GPT-4 to automate heuristic evaluation, which currently entails a human expert assessing a UI’s compliance with a set of design guidelines. We implemented a Figma plugin that takes in a UI design and a set of written heuristics, and renders automatically-generated feedback as constructive suggestions. We assessed performance on 51 UIs using three sets of guidelines, compared GPT-4-generated design suggestions with those from human experts, and conducted a study with 12 expert designers to understand fit with existing practice. We found that GPT-4-based feedback is useful for catching subtle errors, improving text, and considering UI semantics, but feedback also decreased in utility over iterations. Participants described several uses for this plugin despite its imperfect suggestions.
https://doi.org/10.1145/3613904.3642782
The importance of computational modeling of mobile user interfaces (UIs) is undeniable. However, such modeling requires a high-quality UI dataset. Existing datasets are often outdated, having been collected years ago, and are frequently noisy, with mismatches in their visual representation. This presents challenges in modeling UI understanding in the wild. This paper introduces a novel approach to automatically mine UI data from Android apps, leveraging Large Language Models (LLMs) to mimic human-like exploration. To ensure dataset quality, we employ the best practices in UI noise filtering and incorporate human annotation as a final validation step. Our results demonstrate the effectiveness of LLM-enhanced app exploration in mining more meaningful UIs, resulting in a large dataset, MUD, of 18k human-annotated UIs from 3.3k apps. We highlight the usefulness of MUD in two common UI modeling tasks, element detection and UI retrieval, showcasing its potential to establish a foundation for future research into high-quality, modern UIs.
https://doi.org/10.1145/3613904.3642350
Video games are increasingly accessible to blind and low vision (BLV) players, yet many aspects remain inaccessible. One aspect is the joy players feel when they explore environments and make new discoveries, which is integral to many games. Sighted players experience discovery by surveying environments and identifying unexplored areas. Current accessibility tools, however, guide BLV players directly to items and places, robbing them of that experience. Thus, a crucial challenge is to develop navigation assistance tools that also foster exploration and discovery. To address this challenge, we propose the concept of exploration assistance in games and design Surveyor, an in-game exploration assistance tool that enhances discovery by tracking where BLV players look and highlighting unexplored areas. We designed Surveyor using insights from a formative study and compared Surveyor's effectiveness to approaches found in existing accessible games. Our findings reveal implications for facilitating richer play experiences for BLV users within games.
https://doi.org/10.1145/3613904.3642615
The progression to "Pervasive Augmented Reality" envisions easy and continuous access to multimodal information. However, in many everyday scenarios, users are occupied physically, cognitively, or socially. This may increase the friction to act upon the multimodal information that users encounter in the world. To reduce such friction, future interactive interfaces should intelligently provide quick access to digital actions based on users' context. To explore the range of possible digital actions, we conducted a diary study that required participants to capture and share the media that they intended to perform actions on (e.g., images or audio), along with their desired actions and other contextual information. Using this data, we generated a holistic design space of digital follow-up actions that could be performed in response to different types of multimodal sensory inputs. We then designed \codename, a pipeline powered by large language models (LLMs) that processes multimodal sensory inputs and predicts follow-up actions on the target information grounded in the derived design space. Using the empirical data collected in the diary study, we performed quantitative evaluations on three variations of LLM techniques (intent classification, in-context learning, and finetuning) and identified the most effective technique for our task. Additionally, as an instantiation of the pipeline, we developed an interactive prototype and report preliminary user feedback about how people perceive and react to the action predictions and their errors.
https://doi.org/10.1145/3613904.3642068