Two-dimensional canvases are the core components of many digital productivity and creativity tools, with "artboards" containing objects rather than pixels. Unfortunately, the contents of artboards remain largely inaccessible to blind users relying on screen readers, yet the precise problems are not well understood. This study sought to understand how blind screen reader users interact with artboards. Specifically, we conducted contextual interviews, observations, and task-based usability studies with 15 blind participants to understand their experiences of artboards found in Microsoft PowerPoint, Apple Keynote, and Google Slides. Participants expressed that the inaccessibility of these artboards contributes to significant educational and professional barriers. We found three key problems: (1) high cognitive load from a lack of feedback about artboard contents and object state; (2) difficulty determining relationships among artboard objects; and (3) constant uncertainty about whether object manipulations were successful. We offer design remedies that improve feedback for object state, relationships, and manipulations.
https://doi.org/10.1145/3411764.3445242
The development of accurate machine learning models for sign languages like American Sign Language (ASL) has the potential to break down communication barriers for deaf signers. However, to date, no such models have been robust enough for real-world use. The primary barrier to enabling real-world applications is the lack of appropriate training data. Existing training sets suffer from several shortcomings: small size, limited signer diversity, lack of real-world settings, and missing or inaccurate labels. In this work, we present ASL Sea Battle, a sign language game designed to collect datasets that overcome these barriers, while also providing fun and education to users. We conduct a user study to explore the data quality that the game collects, and the user experience of playing the game. Our results suggest that ASL Sea Battle can reliably collect and label real-world sign language videos, and provides fun and education at the expense of data throughput.
https://doi.org/10.1145/3411764.3445416
Videos on sites like YouTube have become a primary source of information online. User-generated videos almost universally lack audio descriptions, making most videos inaccessible to blind and visually impaired (BVI) consumers. Our formative studies with BVI people revealed that they used a time-consuming trial-and-error approach when searching for videos: clicking on a video, watching a portion, leaving the video, and repeating the process until they found videos that were accessible, i.e., understandable without additional description of the visual content. BVI people also reported video accessibility heuristics that characterize accessible and inaccessible videos. We instantiate 7 of the identified heuristics (2 audio-related, 2 video-related, and 3 audio-visual) as automated metrics to assess video accessibility. Our automated video accessibility metrics correlate with BVI people's perception of video accessibility (adjusted R-squared = 0.642). We augment a video search interface with our video accessibility metrics and find that our system improves BVI people's efficiency in finding accessible videos: in our user study, participants found videos 40% faster and clicked on 54% fewer videos. By integrating video accessibility metrics, video hosting platforms could help people surface accessible videos and encourage content creators to author more accessible videos, improving video accessibility for all.
https://doi.org/10.1145/3411764.3445233
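To make the metric-fitting step concrete, here is a minimal sketch, with placeholder data, of scoring videos on 7 heuristic metrics and fitting a linear model against BVI ratings. The features and values are illustrative assumptions, not the paper's implementation.

```python
# A minimal sketch (not the authors' implementation) of combining
# heuristic accessibility metrics into a linear model of ratings from
# blind and visually impaired (BVI) viewers.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical per-video features: each row holds 7 heuristic scores
# (e.g., fraction of runtime containing speech, frequency of scene cuts).
X = np.random.rand(100, 7)   # placeholder heuristic scores in [0, 1]
y = np.random.rand(100)      # placeholder BVI accessibility ratings

model = LinearRegression().fit(X, y)

# Adjusted R^2 penalizes plain R^2 for the number of predictors,
# mirroring the "adjusted R-squared = 0.642" figure reported above.
n, p = X.shape
r2 = model.score(X, y)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(f"adjusted R^2 = {adj_r2:.3f}")
```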
Despite recent improvements in online accessibility, the Internet remains an inhospitable place for users with photosensitive epilepsy, a chronic condition in which certain light stimuli can trigger seizures and even lead to death. In this paper, we explore how current risk detection systems have allowed attackers to take advantage of design oversights and target vulnerable users with photosensitivity on popular social media platforms. Through interviews with photosensitive individuals and a critical review of existing systems, we construct design requirements for consumer-driven protective systems and develop a prototype browser extension for actively detecting and disarming potentially seizure-inducing GIFs. We validate our system with a comprehensive dataset of simulated and collected GIFs. Finally, we conduct a novel quantitative analysis of the prevalence of seizure-inducing GIFs across popular social media platforms and contribute recommendations for improving online accessibility for individuals with photosensitivity. All study materials are available at https://osf.io/5a3dy/.
https://doi.org/10.1145/3411764.3445510
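For illustration, a simplified sketch of the kind of flash analysis such an extension might perform. It uses only frame-average luminance and Pillow's GIF decoding; the full WCAG 2.3.1 definition also weighs the screen area and direction of each luminance transition.

```python
# A simplified sketch of flagging potentially seizure-inducing GIFs.
# It approximates the WCAG 2.3.1 "three flashes per second" general
# threshold using frame-average luminance only, and counts each large
# luminance swing rather than paired opposing transitions.
import numpy as np
from PIL import Image, ImageSequence

def flashes_per_second(path, delta=0.1):
    """Count luminance swings larger than `delta`, normalized by duration."""
    im = Image.open(path)
    lums, t = [], 0.0
    for frame in ImageSequence.Iterator(im):
        gray = np.asarray(frame.convert("L"), dtype=np.float64) / 255.0
        lums.append(gray.mean())
        t += frame.info.get("duration", 100) / 1000.0  # ms -> seconds
    flashes = sum(1 for a, b in zip(lums, lums[1:]) if abs(b - a) >= delta)
    return flashes / max(t, 1e-6)

def is_risky(path):
    return flashes_per_second(path) > 3.0  # WCAG general flash threshold

if __name__ == "__main__":
    print(is_risky("example.gif"))  # hypothetical file
```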
For the 15% of the world's population with disabilities, accessibility is arguably the most critical software quality attribute. The ever-growing reliance of users with disabilities on mobile apps further underscores the need for accessible software in this domain. Existing automated accessibility assessment techniques primarily aim to detect violations of predefined guidelines, thereby producing a massive number of accessibility warnings that often overlook how the software is actually used by users with disabilities. This paper presents a novel, high-fidelity form of accessibility testing for Android apps, called Latte, that automatically reuses tests written to evaluate an app's functional correctness to assess its accessibility as well. Latte first extracts the use case corresponding to each test, and then executes each use case in the way disabled users would, i.e., using assistive services. Our empirical evaluation on real-world Android apps demonstrates Latte's effectiveness in detecting substantially more useful defects than prior techniques.
https://doi.org/10.1145/3411764.3445455
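The following conceptual sketch, built around a hypothetical `device` driver, illustrates the difference between replaying a test by touch coordinates and executing it through assistive services: accessibility focus must be moved element by element, so elements unreachable by linear navigation surface as defects. This is not Latte's actual code, only an assumed shape of the idea.

```python
# A conceptual sketch of replaying one use-case step the way a
# screen-reader user would: move accessibility focus with "next"
# gestures until the target element is reached, then activate it.
# `device` stands in for a hypothetical driver over Android's
# accessibility APIs (e.g., backed by TalkBack or UIAutomator).

def perform_step(device, target_text, max_moves=50):
    """Return True if the target was focused and activated via
    directional navigation, False if it is unreachable that way."""
    for _ in range(max_moves):
        focused = device.focused_element()  # element under a11y focus
        if focused and focused.text == target_text:
            device.double_tap()             # screen-reader activation
            return True
        device.swipe_next()                 # move focus to next element
    # Unreachable by linear navigation: an accessibility failure that
    # a touch-coordinate test replay would never reveal.
    return False
```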
Presenters commonly use slides as visual aids for informative talks. When presenters fail to verbally describe the content on their slides, blind and visually impaired audience members lose access to necessary content, making the presentation difficult to follow. Our analysis of 90 existing presentation videos revealed that 72% of 610 visual elements (e.g., images, text) were insufficiently described. To help presenters create accessible presentations, we introduce Presentation A11y, a system that provides real-time and post-presentation accessibility feedback. Our system analyzes visual elements on the slide and the transcript of the verbal presentation to provide element-level feedback on what visual content needs to be further described or even removed. Presenters using our system with their own slide-based presentations described more of the content on their slides, and identified 3.26 times more accessibility problems to fix after the talk than when using a traditional slide-based presentation interface. Integrating accessibility feedback into content creation tools will improve the accessibility of informational content for all.
https://doi.org/10.1145/3411764.3445572
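A minimal sketch of the element-level check described above, under the assumption that each slide element carries keywords extracted from OCR text or image labels; the keyword-overlap matching rule is an illustrative simplification, not the system's published method.

```python
# Flag slide elements whose extracted keywords are barely mentioned
# in the spoken transcript (a stand-in for "insufficiently described").

def coverage(element_keywords, transcript):
    """Fraction of an element's keywords that are spoken aloud."""
    spoken = set(transcript.lower().split())
    hits = sum(1 for kw in element_keywords if kw.lower() in spoken)
    return hits / max(len(element_keywords), 1)

def undescribed_elements(elements, transcript, threshold=0.5):
    # `elements` maps element ids to keywords from OCR text or
    # (hypothetical) image labels.
    return [eid for eid, kws in elements.items()
            if coverage(kws, transcript) < threshold]

slide = {"chart-1": ["revenue", "quarterly", "growth"],
         "photo-2": ["team", "office"]}
talk = "Our quarterly revenue growth doubled this year."
print(undescribed_elements(slide, talk))  # ['photo-2']
```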
Video accessibility is essential for people with visual impairments. Audio descriptions describe what is happening on-screen, e.g., physical actions, facial expressions, and scene changes. Generating high-quality audio descriptions currently requires substantial manual effort. To address this accessibility obstacle, we built a system that analyzes the audiovisual contents of a video and generates audio descriptions. The system consists of three modules: AD insertion time prediction, AD generation, and AD optimization. We evaluated the quality of our system on five types of videos by conducting qualitative studies with 20 sighted users and 12 users who were blind or visually impaired. Our findings revealed how audio description preferences varied with user type and video type. Based on our analysis, we provide recommendations for the development of future audio description generation technologies.
https://doi.org/10.1145/3411764.3445347
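A skeleton of the three-module pipeline might look like the following. All three function bodies are hypothetical placeholders; the paper does not publish this interface, and real implementations would wrap trained models.

```python
# A skeleton sketch of the three-module audio description (AD) pipeline.

def predict_insertion_times(video_path):
    """Module 1: return (start, end) gaps in the dialogue where an AD
    could be inserted, e.g., via speech/silence detection."""
    return [(12.0, 15.5), (40.2, 43.0)]      # placeholder gaps (seconds)

def generate_description(video_path, start, end):
    """Module 2: describe on-screen content around a gap, e.g., with a
    video-captioning model."""
    return "A person walks into the room."   # placeholder text

def optimize_description(text, max_seconds, words_per_second=2.5):
    """Module 3: trim the description so its narration fits the gap."""
    budget = int(max_seconds * words_per_second)
    return " ".join(text.split()[:budget])

def generate_audio_descriptions(video_path):
    ads = []
    for start, end in predict_insertion_times(video_path):
        raw = generate_description(video_path, start, end)
        ads.append((start, optimize_description(raw, end - start)))
    return ads
```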
This paper presents a systematic literature review of 292 publications from 97 unique venues on touch-based graphics for people who are blind or have low vision, from 2010 to mid-2020. It is the first review of its kind on touch-based accessible graphics. It is timely because it allows us to assess the impact of new technologies such as commodity 3D printing and low-cost electronics on the production and presentation of accessible graphics. As expected, our review shows an increase in publications from 2014 that we can attribute to these developments. It also reveals the need to: broaden application areas, especially to the workplace; broaden end-user participation throughout the full design process; and conduct more in situ evaluation. This work is linked to an online living resource to be shared with the wider community.
https://doi.org/10.1145/3411764.3445207
Research has explored using Automatic Text Simplification for reading assistance, with prior work identifying benefits and interests from Deaf and Hard-of-Hearing (DHH) adults. While the evaluation of these technologies remains a crucial aspect of research in the area, researchers lack guidance in terms of how to evaluate text complexity with DHH readers. Thus, in this work we conduct methodological research to evaluate metrics identified from prior work (including reading speed, comprehension questions, and subjective judgements of understandability and readability) in terms of their effectiveness for evaluating texts modified to be at various complexity levels with DHH adults at different literacy levels. Subjective metrics and low-linguistic-complexity comprehension questions distinguished certain text complexity levels with participants with lower literacy. Among participants with higher literacy, only subjective judgements of text readability distinguished certain text complexity levels. For all metrics, participants with higher literacy scored higher or provided more positive subjective judgements overall.
https://doi.org/10.1145/3411764.3445038
Many accessibility features available on mobile platforms require applications (apps) to provide complete and accurate metadata describing user interface (UI) components. Unfortunately, many apps do not provide sufficient metadata for accessibility features to work as expected. In this paper, we explore inferring accessibility metadata for mobile apps from their pixels, as the visual interface often best reflects an app's full functionality. We trained a robust, fast, memory-efficient, on-device model to detect UI elements using a dataset of 77,637 screens (from 4,068 iPhone apps) that we collected and annotated. To further improve UI detections and add semantic information, we introduced heuristics (e.g., UI grouping and ordering) and additional models (e.g., to recognize UI content, state, and interactivity). We built Screen Recognition to generate accessibility metadata to augment iOS VoiceOver. In a study with 9 screen reader users, we validated that our approach improves the accessibility of existing mobile apps, enabling even previously inaccessible apps to be used.
https://doi.org/10.1145/3411764.3445186
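As a simplified illustration of the pixel-based approach, the sketch below orders detected UI elements into a plausible screen-reader navigation order. The detection model's output is stubbed with hand-written detections, and the row-snapping heuristic is an assumption for illustration.

```python
# Order detected UI elements top-to-bottom, then left-to-right within
# a row, approximating a screen reader's traversal order.
from dataclasses import dataclass

@dataclass
class Element:
    label: str   # e.g., "button", "text", "icon"
    x: float     # left edge, normalized to [0, 1]
    y: float     # top edge, normalized to [0, 1]

def reading_order(elements, row_tolerance=0.03):
    def key(e):
        # Snap y to coarse rows so slightly misaligned elements on the
        # same visual row are still read left to right.
        return (round(e.y / row_tolerance), e.x)
    return sorted(elements, key=key)

# Stubbed detections standing in for the on-device model's output.
detections = [Element("button", 0.70, 0.101),
              Element("text",   0.05, 0.100),
              Element("icon",   0.05, 0.300)]
for e in reading_order(detections):
    print(e.label)   # text, button, icon
```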
The accessibility and affordability of tangible electronic toolkits are significant barriers to their uptake by people with disabilities. We present the design and evaluation of TapeBlocks, a low-cost, low-fidelity toolkit intended to be accessible for people with intellectual disabilities while promoting creativity and engagement. We evaluated TapeBlocks by interviewing makers, special educational needs teachers, and support coaches. Analysis of these interviews informed the design of a series of maker workshops using TapeBlocks with young adults living with intellectual disabilities, led by support coaches with support from the research team. Participants were able to engage with TapeBlocks and making, eventually building their own TapeBlocks to make personal creations. Our evaluation reveals how TapeBlocks supports accessible making and playful discovery of electronics for people living with disabilities, and addresses a gap in existing toolkits by being tinkerable and affordable and by having a low threshold for engagement.
https://doi.org/10.1145/3411764.3445647