The need to generate convincing simulations of voices often arises in avatar therapy, a treatment approach for disorders such as schizophrenia. In this treatment, patients interact with a simulation of the entity they imagine to be responsible for the voices they hear, for which there is often no external reference available. However, little is known about how to design and reproduce these voices convincingly. Existing voice manipulation interfaces are often complex to use and highly limited in their ability to modify vocal characteristics beyond small adjustments. To address these challenges, we designed a framework that allows users to explore and select from a large set of voices, and then manipulate the selected voice(s) to converge on an effective match for the one they have in mind. We demonstrated both the usability of this system and its superior performance compared to existing voice manipulation interfaces.
https://dl.acm.org/doi/abs/10.1145/3491102.3501871
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2022.acm.org/)