Fingerspelling is a critical part of American Sign Language (ASL) recognition and has become an accessible option for text entry for Deaf and Hard of Hearing (DHH) individuals. In this paper, we introduce SpellRing, a single smart ring worn on the thumb that recognizes words continuously fingerspelled in ASL. SpellRing uses active acoustic sensing (via a microphone and speaker) and an inertial measurement unit (IMU) to track handshape and movement, which are processed by a deep learning model trained with Connectionist Temporal Classification (CTC) loss. We evaluated the system with 20 ASL signers (13 fluent and 7 learners), using 1,164 words and 100 phrases drawn from the MacKenzie-Soukoreff phrase set. Offline evaluation yielded top-1 and top-5 word recognition accuracies of 82.45% (±9.67%) and 92.42% (±5.70%), respectively. In real time, the system achieved a word error rate (WER) of 0.099 (±0.039) on the phrases. Based on these results, we discuss key lessons and design implications for future minimally obtrusive ASL recognition wearables.
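As a rough illustration of the recognition pipeline described above, the sketch below trains a sequence encoder with PyTorch's CTC loss over fingerspelled letter labels. The GRU encoder, feature dimensions, and letter vocabulary are illustrative assumptions, not SpellRing's published architecture.

```python
# Minimal sketch of a CTC-trained fingerspelling recognizer:
# sensor features (acoustic + IMU) in, per-frame letter posteriors out.
# All sizes and the vocabulary are assumptions for illustration.
import torch
import torch.nn as nn

VOCAB = "abcdefghijklmnopqrstuvwxyz"     # fingerspelled letters
BLANK = 0                                # CTC blank index
NUM_CLASSES = len(VOCAB) + 1             # letters + blank

class FingerspellingCTC(nn.Module):
    def __init__(self, feat_dim=64, hidden=128):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hidden, num_layers=2,
                              batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, NUM_CLASSES)

    def forward(self, x):                # x: (batch, time, feat_dim)
        h, _ = self.encoder(x)
        return self.head(h).log_softmax(-1)   # (batch, time, classes)

model = FingerspellingCTC()
ctc_loss = nn.CTCLoss(blank=BLANK, zero_infinity=True)

# Dummy batch: 2 feature sequences and their letter-index targets (1..26).
feats = torch.randn(2, 100, 64)
targets = torch.randint(1, NUM_CLASSES, (2, 8))
log_probs = model(feats).transpose(0, 1)       # CTCLoss expects (T, N, C)
loss = ctc_loss(log_probs, targets,
                input_lengths=torch.full((2,), 100, dtype=torch.long),
                target_lengths=torch.full((2,), 8, dtype=torch.long))
loss.backward()
```

At inference time, greedy or beam-search decoding over the per-frame letter posteriors, optionally constrained by a word lexicon, would yield the word hypotheses scored in a top-1/top-5 evaluation.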
Generative AI (GenAI) tools promise to advance non-visual information access but introduce new challenges due to output errors, hallucinations, biases, and constantly changing capabilities. Through interviews with 20 blind screen reader users who use various GenAI applications for diverse tasks, we show how they approached information access with everyday uncertainty, or a mindset of skepticism and criticality towards both AI- and human-mediated assistance as well as information itself. Instead of expecting information to be 'correct' and 'complete', participants extracted cues from error-prone information sources; treated all information as tentative; acknowledged and explored information subjectivity; and constantly adjusted their expectations and strategies considering the politics around access. The concept of everyday uncertainty situates GenAI tools among the interconnected assistive applications, humans, and sociomaterial conditions that both enable and hinder the ongoing production of access. We discuss the implications of everyday uncertainty for future design and research.
Large multimodal models (LMMs) have enabled new AI-powered applications that help people with visual impairments (PVI) receive natural language descriptions of their surroundings through audible text. We investigated how this emerging paradigm of visual assistance transforms how PVI perform and manage their daily tasks. Moving beyond usability assessments, we examined both the capabilities and limitations of LMM-based tools in personal and social contexts, while exploring design implications for their future development. Through interviews with 14 visually impaired users of Be My AI (an LMM-based application) and analysis of its image descriptions from both study participants and social media platforms, we identified two key limitations. First, these systems' context awareness suffers from hallucinations and misinterpretations of social contexts, styles, and human identities. Second, their intent-oriented capabilities often fail to grasp and act on users' intentions. Based on these findings, we propose design strategies for improving both human-AI and AI-AI interactions, contributing to the development of more effective, interactive, and personalized assistive technologies.
This deaf-led work critically explores Deaf Tech, challenging conventional understandings of technologies 'for' deaf people as merely assistive and accessible, understandings that are predominantly embedded in medical and audist ideologies. Through participatory speculative workshops, deaf participants from different European countries envisioned technologies on Eyeth, a mythical planet inhabited by deaf people, centered on their perspectives and curiosities. The results present a series of alternative socio-technical narratives that illustrate qualitative aspects of the technologies deaf people desire. This study advocates for expanding the scope of deaf technological landscapes, emphasizing the need to establish deaf-centered HCI, including the development of methods and concepts that truly prioritize deaf experiences in the design of technologies intended for their use.
Despite the prevalence of autism spectrum disorder (ASD) and other developmental disabilities (DD) worldwide, children with ASD and DD face tremendous difficulties receiving support due to physical, financial, and psychological barriers to onsite health and education clinics. As a result, researchers and practitioners have designed software solutions aimed at providing accessible support to meet users’ needs. However, we have limited knowledge of whether these solutions indeed work in real-world settings. To address this gap, we conducted a case study on a cognitive training program called Dubupang, designed by Dubu Inc. From in-depth interviews with multiple stakeholders and field observations of children with ASD and DD, we identify Dubu Inc.’s internal development processes, the critical design issues that emerged through a series of field trials (e.g., instructional design and feedback), and the key implications (e.g., importance of caregivers’ strategic human interventions) for design that better supports both children with ASD and DD and their caregivers.
Video components are a central element of user interfaces that deliver content in a signed language (SL), but the potential of video components extends beyond content accessibility. SL videos may be designed as user interface elements: layered with interactive features to create navigation cues, page headings, and menu options. To be effective for signing users, novel SL video-rich interfaces require informed design choices across many parameters. To align with the specific needs and shared conventions of the Deaf community and other ASL-signers in this context, we present a user study involving deaf ASL-signers who interacted with an array of designs for SL video elements. Their responses offer insights into how the Deaf community perceives video elements and how it prefers them to be designed, positioned, and implemented to guide user experiences.
Through a qualitative analysis, we take initial steps toward understanding deaf ASL-signers’ perceptions of a set of emerging design principles, paving the way for future SL-centric user interfaces whose customized video elements and layouts prioritize signed language usage and requirements.
This paper explores a multimodal approach for translating emotional cues present in speech, designed with Deaf and Hard-of-Hearing (DHH) individuals in mind. Prior work has focused on visual cues applied to captions, successfully conveying whether a speaker's words have a negative or positive tone (valence), but with mixed results regarding the intensity (arousal) of these emotions. We propose a novel method using haptic feedback to communicate a speaker's arousal levels through vibrations on a wrist-worn device. In a formative study with 16 DHH participants, we tested six haptic patterns and found that participants preferred single per-word vibrations at 75 Hz to encode arousal. In a follow-up study with 27 DHH participants, this pattern was paired with visual cues, and narrative engagement with audio-visual content was measured. Results indicate that combining haptics with visuals significantly increased engagement compared to a conventional captioning baseline and a visuals-only affective captioning style.
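As a hedged sketch of the haptic encoding described above, the snippet below synthesizes one short 75 Hz vibration burst per caption word, with burst amplitude standing in for the word's arousal estimate. The sample rate, burst duration, and word timings are assumptions for illustration, not the study's actual stimulus parameters.

```python
# Sketch: one 75 Hz vibration burst per caption word, amplitude = arousal.
# SAMPLE_RATE and BURST_S are assumed values for a generic actuator driver.
import numpy as np

SAMPLE_RATE = 8000        # driver sample rate (assumed)
FREQ_HZ = 75              # carrier frequency preferred by participants
BURST_S = 0.12            # per-word burst length (assumed)

def word_burst(arousal: float) -> np.ndarray:
    """Return one 75 Hz sine burst whose amplitude encodes arousal (0..1)."""
    t = np.arange(int(SAMPLE_RATE * BURST_S)) / SAMPLE_RATE
    return arousal * np.sin(2 * np.pi * FREQ_HZ * t)

def render_track(words, total_s: float) -> np.ndarray:
    """Place one burst at each (onset_s, arousal) pair on a silent timeline."""
    track = np.zeros(int(SAMPLE_RATE * total_s))
    for onset_s, arousal in words:
        burst = word_burst(arousal)
        start = int(SAMPLE_RATE * onset_s)
        end = min(start + len(burst), len(track))
        track[start:end] += burst[:end - start]
    return np.clip(track, -1.0, 1.0)

# Example: three caption words with onsets (seconds) and arousal estimates.
track = render_track([(0.2, 0.4), (0.8, 0.9), (1.5, 0.6)], total_s=2.0)
```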