Towards AI-driven Sign Language Generation with Non-manual Markers

Abstract

Sign languages are essential for the Deaf and Hard-of-Hearing (DHH) community. Sign language generation systems have the potential to support communication by translating from written languages, such as English, into signed videos. However, current systems often fail to meet user needs due to poor translation of grammatical structures, the absence of facial cues and body language, and insufficient visual and motion fidelity. We address these challenges by building on recent advances in LLMs and video generation models to translate English sentences into videos of natural-looking AI ASL signers. The text component of our model extracts information for the manual and non-manual components of ASL, which is used to synthesize skeletal pose sequences and corresponding video frames. Our findings from a user study with 30 DHH participants and thorough technical evaluations demonstrate significant progress and identify critical areas necessary to meet user needs.

Award
Honorable Mention
Authors
Han Zhang
University of Washington, Seattle, Washington, United States
Rotem Shalev-Arkushin
Tel Aviv University, Tel Aviv, Israel
Vasileios Baltatzis
Apple, Cupertino, California, United States
Connor Gillis
Apple, Cupertino, California, United States
Gierad Laput
Apple Inc., Cupertino, California, United States
Raja Kushalnagar
Gallaudet University, Washington, District of Columbia, United States
Lorna C. Quandt
Gallaudet University, Washington, District of Columbia, United States
Leah Findlater
Apple, Cupertino, California, United States
Abdelkareem Bedri
Apple Inc., Cupertino, California, United States
Colin Lea
Apple, Cupertino, California, United States
DOI

10.1145/3706598.3713855

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3713855

Conference: CHI 2025

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)

Session: Design for Different User Needs

Room: G318+G319
7 presentations
2025-04-30 18:00:00 – 2025-04-30 19:30:00