“It feels like we’re not meeting the criteria”: Examining and Mitigating the Cascading Effects of Bias in Automatic Speech Recognition in Spoken Language Interfaces

Abstract

Researchers have demonstrated that Automatic Speech Recognition (ASR) systems perform differently across demographic groups (i.e., show bias), yet their downstream impact on spoken language interfaces remains unexplored. We examined this question in the context of a real-world AI-powered interface that provides tutors with feedback on the quality of their discourse. We found that the Whisper ASR had lower accuracy for Black vs. white tutors, likely due to differences in acoustic patterns of speech. The downstream automated discourse classifiers of tutor talk were correspondingly less accurate for Black tutors when presented with ASR input. As a result, although Black tutors demonstrated higher-quality discourse on human transcripts, this trend was not evident on ASR transcripts. We experimented with methods to reduce ASR bias, finding that fine-tuning the ASR on Black speech reduced, but did not eliminate, ASR bias and its downstream effects. We discuss implications for AI-based spoken language interfaces aimed at providing unbiased assessments to improve performance outcomes.
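
The abstract notes that fine-tuning Whisper on Black tutors' speech reduced (but did not eliminate) ASR bias. The paper's actual training setup is not reproduced here; as a rough illustration only, a minimal sketch of fine-tuning Whisper with Hugging Face Transformers might look like the following. The data directory `tutor_speech/`, its `transcription` column, and all hyperparameters are hypothetical, not the authors' configuration.

```python
# A minimal sketch (not the authors' pipeline) of fine-tuning Whisper
# with Hugging Face Transformers. The data directory "tutor_speech/",
# its "transcription" column, and all hyperparameters are hypothetical.
from dataclasses import dataclass
from typing import Any, Dict, List, Union

import torch
from datasets import Audio, load_dataset
from transformers import (
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    WhisperForConditionalGeneration,
    WhisperProcessor,
)

processor = WhisperProcessor.from_pretrained(
    "openai/whisper-small", language="English", task="transcribe"
)
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# "audiofolder" expects audio files plus a metadata.csv that maps each
# file to its reference transcript.
ds = load_dataset("audiofolder", data_dir="tutor_speech/")["train"]
ds = ds.cast_column("audio", Audio(sampling_rate=16_000))

def preprocess(batch):
    audio = batch["audio"]
    # Log-Mel spectrogram features for the encoder.
    batch["input_features"] = processor.feature_extractor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]
    # Tokenized reference transcript as decoder labels.
    batch["labels"] = processor.tokenizer(batch["transcription"]).input_ids
    return batch

ds = ds.map(preprocess, remove_columns=ds.column_names)

@dataclass
class DataCollatorSpeechSeq2SeqWithPadding:
    """Pads audio features and label sequences independently."""
    processor: Any
    decoder_start_token_id: int

    def __call__(
        self, features: List[Dict[str, Union[List[int], torch.Tensor]]]
    ) -> Dict[str, torch.Tensor]:
        input_features = [{"input_features": f["input_features"]} for f in features]
        batch = self.processor.feature_extractor.pad(input_features, return_tensors="pt")
        label_features = [{"input_ids": f["labels"]} for f in features]
        labels_batch = self.processor.tokenizer.pad(label_features, return_tensors="pt")
        # Replace padding with -100 so it is ignored by the loss.
        labels = labels_batch["input_ids"].masked_fill(
            labels_batch.attention_mask.ne(1), -100
        )
        # Drop the leading start token if present; the model prepends it itself.
        if (labels[:, 0] == self.decoder_start_token_id).all().cpu().item():
            labels = labels[:, 1:]
        batch["labels"] = labels
        return batch

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-finetuned-tutor-speech",
    per_device_train_batch_size=8,
    learning_rate=1e-5,
    num_train_epochs=3,
    fp16=torch.cuda.is_available(),
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=ds,
    data_collator=DataCollatorSpeechSeq2SeqWithPadding(
        processor, model.config.decoder_start_token_id
    ),
)
trainer.train()
```

As reported in the abstract, targeted fine-tuning of this kind narrows but does not close the accuracy gap, so downstream classifiers consuming the transcripts should still be audited for group-level differences.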

Award
Honorable Mention
Authors
Kelechi Ezema
University of Colorado Boulder, Boulder, Colorado, United States
Chelsea Chandler
University of Colorado Boulder, Boulder, Colorado, United States
Rosy Southwell
University of Colorado Boulder, Boulder, Colorado, United States
Niranjan Cholendiran
University of Colorado Boulder, Boulder, Colorado, United States
Sidney D'Mello
University of Colorado Boulder, Boulder, Colorado, United States
DOI

10.1145/3706598.3714059

Paper URL

https://dl.acm.org/doi/10.1145/3706598.3714059

Conference: CHI 2025

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2025.acm.org/)

Session: Bias and Identity

G314+G315
7 presentations
2025-04-29 20:10:00 – 21:40:00