As voice assistant usage continues to grow, the homogeneity of assistant voices becomes even more problematic: the UNESCO report “I’d Blush if I Could” shows that designing only feminine voice assistants encourages negative behavior, both toward virtual assistants and toward real people [3]. While masculine text-to-speech (TTS) voices exist, voices that cover the full range of gender presentations, such as non-binary or gender-ambiguous voices, are largely missing. In this paper, we present a method for creating a non-binary TTS voice and an example voice, Sam, created with input from the non-binary and transgender communities. We have open-sourced the resulting voice, along with the process and data used to create it. Finally, we present results from a large-scale survey showing that non-binary individuals are more likely than cisgender individuals to prefer a non-binary voice assistant, and we discuss differences across age and gender.
https://doi.org/10.1145/3544548.3581281
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2023.acm.org/)