SEMOUR: Scripted EMOtional speech repository for URdu

Abstract

Designing reliable Speech Emotion Recognition systems is a complex task that inevitably requires sufficient data for training purposes. Such extensive datasets are currently available in only a few languages, including English, German, and Italian. In this paper, we present SEMOUR, the first scripted database of emotion-tagged speech in the Urdu language, to design an Urdu Speech Recognition System. Our gender-balanced dataset contains 15,040 unique instances recorded by eight professional actors eliciting a syntactically complex script. The dataset is phonetically balanced, and reliably exhibits a varied set of emotions as marked by the high agreement scores among human raters in experiments. We also provide various baseline speech emotion prediction scores on the database, which could be used for various applications like personalized robot assistants, diagnosis of psychological disorders, and getting feedback from a low-tech-enabled population, etc. On a random test sample, our model correctly predicts an emotion with a state-of-the-art 92% accuracy.

Authors
Nimra Zaheer
Information Technology University, Lahore, Punjab, Pakistan
Obaid Ullah Ahmad
Information Technology University, Lahore, Punjab, Pakistan
Ammar Ahmed
Information Technology University, Lahore, Pakistan
Muhammad Shehryar Khan
Information Technology University, Lahore, Punjab, Pakistan
Mudassir Shabbir
Information Technology University, Lahore, Punjab, Pakistan
DOI

10.1145/3411764.3445171

Paper URL

https://doi.org/10.1145/3411764.3445171

Conference: CHI 2021

The ACM CHI Conference on Human Factors in Computing Systems (https://chi2021.acm.org/)

Session: Computational Human-AI Conversation

[A] Paper Room 02, 2021-05-11 17:00:00~2021-05-11 19:00:00 / [B] Paper Room 02, 2021-05-12 01:00:00~2021-05-12 03:00:00 / [C] Paper Room 02, 2021-05-12 09:00:00~2021-05-12 11:00:00
14 presentations