Detecting Escalation Level from Speech with Transfer Learning and Acoustic-Linguistic Information Fusion

Publication, Conference
Zhou, Z; Xu, Y; Li, M
Published in: Communications in Computer and Information Science
January 1, 2023

Textual escalation detection is widely used in e-commerce customer service systems to flag and prevent potential conflicts before they develop. Acoustic-based escalation detection can likewise help protect passengers and maintain order in public areas such as airports and train stations, where impersonal conversations occur frequently. To this end, we introduce a multimodal system that uses acoustic-linguistic features to detect escalation levels from human speech. Voice Activity Detection (VAD) and label smoothing are adopted to further improve performance. Because data collection in open scenarios is difficult and costly, the datasets used for this task are severely low-resource. To address this, we apply transfer learning in a multi-corpus framework that draws on emotion detection datasets such as RAVDESS and CREMA-D to integrate emotion features into the representation learning of escalation signals. On the development set, the proposed system achieves 81.5% unweighted average recall (UAR), significantly outperforming the baseline of 72.2%.
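
The abstract reports results in unweighted average recall (UAR) and names label smoothing as one of the training aids. As a rough illustration only, and not the authors' implementation, the sketch below computes UAR as the plain mean of per-class recalls and builds a uniformly smoothed target vector. The function names, the integer class indices, the number of classes, and the smoothing factor of 0.1 are all assumptions made for this example; the abstract does not specify any of them.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so each class counts equally whatever its size."""
    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def smooth_labels(label, num_classes, epsilon=0.1):
    """Uniform label smoothing: (1 - epsilon) * one_hot + epsilon / num_classes.

    epsilon=0.1 is a common default and an assumption here, not a value from the paper.
    """
    base = epsilon / num_classes
    return [base + (1.0 - epsilon) if k == label else base
            for k in range(num_classes)]


# Hypothetical, imbalanced toy labels (0 and 1 are placeholder class indices).
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]

uar = unweighted_average_recall(y_true, y_pred)       # (3/4 + 1/2) / 2
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
target = smooth_labels(1, num_classes=3)
```

On this toy input the accuracy (4 of 6 correct) is higher than the UAR because the larger class dominates accuracy, which is why UAR is the more informative figure when classes are imbalanced, as is typical for low-resource data.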

Published In

Communications in Computer and Information Science

DOI

10.1007/978-981-99-2401-1_14

EISSN

1865-0937

ISSN

1865-0929

Publication Date

January 1, 2023

Volume

1765 CCIS

Start / End Page

149 / 161
 

Citation

APA: Zhou, Z., Xu, Y., & Li, M. (2023). Detecting Escalation Level from Speech with Transfer Learning and Acoustic-Linguistic Information Fusion. In Communications in Computer and Information Science (Vol. 1765 CCIS, pp. 149–161). https://doi.org/10.1007/978-981-99-2401-1_14
Chicago: Zhou, Z., Y. Xu, and M. Li. “Detecting Escalation Level from Speech with Transfer Learning and Acoustic-Linguistic Information Fusion.” In Communications in Computer and Information Science, 1765 CCIS:149–61, 2023. https://doi.org/10.1007/978-981-99-2401-1_14.
ICMJE: Zhou Z, Xu Y, Li M. Detecting Escalation Level from Speech with Transfer Learning and Acoustic-Linguistic Information Fusion. In: Communications in Computer and Information Science. 2023. p. 149–61.
MLA: Zhou, Z., et al. “Detecting Escalation Level from Speech with Transfer Learning and Acoustic-Linguistic Information Fusion.” Communications in Computer and Information Science, vol. 1765 CCIS, 2023, pp. 149–61. Scopus, doi:10.1007/978-981-99-2401-1_14.
NLM: Zhou Z, Xu Y, Li M. Detecting Escalation Level from Speech with Transfer Learning and Acoustic-Linguistic Information Fusion. Communications in Computer and Information Science. 2023. p. 149–161.