Creating Inclusive Voices for the 21st Century: A Non-Binary Text-to-Speech for Conversational Assistants

Abstract

As voice assistant usage continues to grow, the homogeneity of their voices becomes increasingly problematic: the UNESCO report “I’d Blush if I Could” shows that designing only feminine voice assistants encourages negative behavior, both toward virtual assistants and toward real people [3]. While masculine text-to-speech (TTS) voices exist, voices that cover the full range of gender presentations, such as non-binary or gender-ambiguous voices, are largely missing. In this paper, we present a method for creating a non-binary TTS voice and an example voice, Sam, created with input from the non-binary and transgender communities. We have open-sourced the resulting voice, along with the process and data used to create it. Finally, we present results from a large-scale survey showing that non-binary individuals are more likely than cisgender individuals to prefer a non-binary voice assistant, and we discuss differences across age and gender.

Authors
Andreea Danielescu
Accenture Labs, San Francisco, California, United States
Sharone A. Horowit-Hendler
Institute of Reading Development, Boston, Massachusetts, United States
Alexandria Pabst
Accenture Labs, San Francisco, California, United States
Kenneth Michael Stewart
Accenture Labs, San Francisco, California, United States
Eric M. Gallo
Accenture, San Francisco, California, United States
Matthew Peter Aylett
CereProc Ltd., Edinburgh, United Kingdom
Paper URL

https://doi.org/10.1145/3544548.3581281

Conference: CHI 2023

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2023.acm.org/)

Session: Inclusive Futures

Hall G2
6 presentations
2023-04-24 23:30:00
2023-04-25 00:55:00