A New Time–Frequency Attention Tensor Network for Language Identification

Publication, Journal Article
Miao, X; McLoughlin, I; Yan, Y
Published in: Circuits Systems and Signal Processing
May 1, 2020

In this paper, we aim to improve traditional DNN x-vector language identification performance by employing wide residual networks (WRN) as a powerful feature extractor, which we combine with a novel frequency attention network. Unlike conventional time attention, our method learns discriminative weights for different frequency bands to generate weighted means and standard deviations for utterance-level classification. This mechanism enables the architecture to direct attention to important frequency bands rather than to important time frames, as traditional time attention methods do. We then introduce a cross-layer frequency attention tensor network (CLF-ATN), which exploits information from different layers to recapture frame-level language characteristics that have been dropped by aggressive frequency pooling in lower layers. This effectively restores fine-grained discriminative language details. Finally, we explore the joint fusion of frame-level and frequency-band attention in a time–frequency attention network. Experimental results show that, firstly, WRN can significantly outperform a traditional DNN x-vector implementation; secondly, the proposed frequency attention method is more effective than time attention; and thirdly, frequency–time score fusion can yield further improvement. In addition, extensive experiments on CLF-ATN demonstrate that it is able to improve discrimination by regaining dropped fine-grained frequency information, particularly for low-dimension frequency features.
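The frequency attention pooling described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the input layout (one feature vector per frequency band), the single-hidden-layer scoring function, and the names `frequency_attention_pooling`, `W`, `b` and `v` are all assumptions made for illustration. The sketch only shows the general idea of learning a weight per frequency band and using those weights to form a weighted mean and standard deviation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def frequency_attention_pooling(H, W, b, v, eps=1e-8):
    """Attention-weighted statistics over frequency bands.

    H: (F, D) array, one D-dim feature vector per frequency band
       (hypothetical layout; the paper's exact tensor shape is not given here).
    W, b, v: parameters of an assumed one-hidden-layer scoring network.
    Returns the band weights (F,) and the pooled vector [mean, std] (2*D,).
    """
    scores = np.tanh(H @ W + b) @ v          # one scalar score per band, (F,)
    alpha = softmax(scores)                  # band weights, sum to 1
    mean = alpha @ H                         # weighted mean, (D,)
    var = alpha @ (H ** 2) - mean ** 2       # weighted variance, (D,)
    std = np.sqrt(np.clip(var, eps, None))   # clip guards against tiny negatives
    return alpha, np.concatenate([mean, std])

# Toy usage with random values standing in for trained parameters.
rng = np.random.default_rng(0)
F, D, A = 8, 16, 4                           # bands, feature dim, attention dim
H = rng.standard_normal((F, D))
W = rng.standard_normal((D, A)) * 0.1
b = np.zeros(A)
v = rng.standard_normal(A)
alpha, pooled = frequency_attention_pooling(H, W, b, v)
print(alpha.shape, pooled.shape)
```

Conventional time attention would apply the same kind of weighting across time frames instead of frequency bands; under the assumptions above, that amounts to passing a matrix with one row per frame rather than one row per band.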

Published In: Circuits Systems and Signal Processing
DOI: 10.1007/s00034-019-01286-9
EISSN: 1531-5878
ISSN: 0278-081X
Publication Date: May 1, 2020
Volume: 39
Issue: 5
Start / End Page: 2744 / 2758

Related Subject Headings

  • Industrial Engineering & Automation
  • 4009 Electronics, sensors and digital hardware
  • 0906 Electrical and Electronic Engineering
  • 0102 Applied Mathematics
 

Citation

APA: Miao, X., McLoughlin, I., & Yan, Y. (2020). A New Time–Frequency Attention Tensor Network for Language Identification. Circuits Systems and Signal Processing, 39(5), 2744–2758. https://doi.org/10.1007/s00034-019-01286-9

Chicago: Miao, X., I. McLoughlin, and Y. Yan. “A New Time–Frequency Attention Tensor Network for Language Identification.” Circuits Systems and Signal Processing 39, no. 5 (May 1, 2020): 2744–58. https://doi.org/10.1007/s00034-019-01286-9.

ICMJE: Miao X, McLoughlin I, Yan Y. A New Time–Frequency Attention Tensor Network for Language Identification. Circuits Systems and Signal Processing. 2020 May 1;39(5):2744–58.

MLA: Miao, X., et al. “A New Time–Frequency Attention Tensor Network for Language Identification.” Circuits Systems and Signal Processing, vol. 39, no. 5, May 2020, pp. 2744–58. Scopus, doi:10.1007/s00034-019-01286-9.

NLM: Miao X, McLoughlin I, Yan Y. A New Time–Frequency Attention Tensor Network for Language Identification. Circuits Systems and Signal Processing. 2020 May 1;39(5):2744–2758.