DIS-NV Functions for the Recognition of Emotions in Spoken Dialogue
Divya Gupta1, Poonam Bansal2, Kavita Choudhary3

1Divya Gupta*, CSE Department, Jagan Nath University, Jaipur, India.
2Poonam Bansal, CSE Department, GGSIPU University, Delhi, India.
3Kavita Choudhary, Jagan Nath University, Jaipur, India.

Manuscript received on September 09, 2020. | Revised Manuscript received on September 12, 2020. | Manuscript published on October 10, 2020. | PP: 39-44 | Volume-9 Issue-12, October 2020 | Retrieval Number: 100.1/ijitee.L79111091220 | DOI: 10.35940/ijitee.L7911.1091220
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: We present our studies on the use of features that describe the occurrences of disfluency and nonverbal vocalization (DIS-NV) in spoken utterances for the recognition of emotions in spoken dialogue. We use the term speaker “turn” to denote the continuous speech produced by one speaker without interruption by the other speaker. Note that each speaker turn can contain one or more utterances, and consecutive utterances of a speaker may or may not belong to the same speaker turn. Here, our definition of speaker turn focuses on feeling and integrity in speech production, which differs from “turn” in the context of a turn-taking system, which focuses on the transition between different speakers. We carried out experiments on the spontaneous dialogue database AVEC2012 to study the effectiveness of the proposed features. Our results show that the DIS-NV features offer better performance than LLD or PMI features in predicting all emotional dimensions. The DIS-NV features are particularly predictive of the Expectancy emotional dimension, which is linked to the speaker’s uncertainty, and yield the best reported result on that dimension. The emotion recognition model using only the 5 DIS-NV features achieved overall performance comparable to the best reported result, which was obtained by a multimodal emotion recognition model using thousands of audiovisual and lexical features. These results confirm that the proposed DIS-NV features are predictive of emotions in spontaneous dialogue.
Keywords: DIS-NV, Spontaneous dialogue, Disfluency, Cross-correlation score.