Creating Inclusive Voices for the 21st Century: A Non-Binary Text-to-Speech for Conversational Assistants

As the use of voice assistants continues to grow, their homogeneity becomes increasingly problematic: the UNESCO report “I’d Blush if I could” shows that designing only feminine voice assistants encourages negative behavior, both with virtual assistants and with real people [3]. While masculine text-to-speech (TTS) voices exist, voices that cover the full range of gender presentations, such as non-binary or gender-ambiguous voices, are largely missing. In this paper, we present a method for creating a non-binary TTS voice and an example voice, Sam, created with input from the non-binary and transgender communities. We have open-sourced the resulting voice, along with the process and data used to create it. Finally, we present results from a large-scale survey showing that non-binary individuals are more likely than cisgender individuals to prefer a non-binary voice assistant, and we discuss differences across age and gender.

Accenture Labs, San Francisco, California, United States

Institute of Reading Development, Boston, Massachusetts, United States

Accenture Labs, San Francisco, California, United States

Accenture, San Francisco, California, United States

CereProc Ltd., Edinburgh, United Kingdom

https://doi.org/10.1145/3544548.3581281

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2023.acm.org/)

Hall G2

6 presentations

Start time: 2023-04-24 23:30:00

End time: 2023-04-25 00:55:00
