A New Uncanny Valley? The Effects of Speech Fidelity and Human Listener Gender on Social Perceptions of a Virtual-Human Speaker

Abstract

Virtual humans can be used to deliver persuasive arguments; yet, those with synthetic text-to-speech (TTS) have been perceived less favorably than those with recorded human speech. In this paper, we investigate standard concatenative TTS and more advanced neural TTS. We conducted a 3x2 between-subjects experiment (n=79) to evaluate the effect of a virtual human’s speech fidelity at three levels (Standard TTS, Neural TTS, and Human speech) and the listener’s gender (male or female) on perceptions and persuasion. We found that listeners of both genders perceived the virtual human as significantly less trustworthy when it used neural TTS rather than human speech, while male listeners (but not female listeners) also perceived standard TTS as less trustworthy than human speech. Our findings indicate that neural TTS may not be an effective choice for persuasive virtual humans and that the gender of the listener plays a role in how virtual humans are perceived.

Authors
Tiffany D. Do
University of Central Florida, Orlando, Florida, United States
Ryan P. McMahan
University of Central Florida, Orlando, Florida, United States
Pamela J. Wisniewski
University of Central Florida, Orlando, Florida, United States
Paper URL

https://dl.acm.org/doi/abs/10.1145/3491102.3517564

Conference: CHI 2022

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2022.acm.org/)

Session: Virtual Agents and Environments

291
5 presentations
2022-05-02 20:00:00
2022-05-02 21:15:00