Typical foreign language (L2) pronunciation training focuses mainly on individual sounds. Intonation, the patterns of pitch change across words or phrases is often neglected, despite its key role in word-level intelligibility and in the expression of attitudes and affect. This paper examines hand-controlled real-time vocal synthesis, known as Performative Vocal Synthesis (PVS), as an interaction technique for practicing L2 intonation in computer aided pronunciation training (CAPT). We evaluate a tablet-based interface where users gesturally control the pitch of a pre-recorded utterance by drawing curves on the touchscreen. 24 subjects (12 French learners, 12 British controls) imitated English phrases with their voice and the interface. Results of an acoustic analysis and expert perceptive evaluation showed that learners’ gestural imitations yielded more accurate results than vocal imitations of the fall-rise intonation pattern typically difficult for francophones, suggesting that PVS can help learners produce intonation patterns beyond the capabilities of their natural voice.
https://doi.org/10.1145/3544548.3581210
The ACM CHI Conference on Human Factors in Computing Systems (https://chi2023.acm.org/)