Exploiting Attention-Consistency Loss For Spatial-Temporal Stream Action Recognition

Publication: Journal Article
Xu, H; Jin, X; Wang, Q; Hussain, A; Huang, K
Published in: ACM Transactions on Multimedia Computing Communications and Applications
October 6, 2022

Many current action recognition methods consider mainly the information from the spatial stream. We propose a new perspective, inspired by the human visual system, that combines the spatial and temporal streams and measures their attention consistency. Specifically, we develop a branch-independent convolutional neural network (CNN) based algorithm with a novel attention-consistency loss, enabling the temporal stream to concentrate on the same discriminative regions as the spatial stream over the same period. The consistency loss is combined with the cross-entropy loss to enhance visual attention consistency. We evaluate the proposed method for action recognition on two benchmark datasets: Kinetics400 and UCF101. Despite its apparent simplicity, our framework with attention consistency outperforms most two-stream networks, achieving 75.7% top-1 accuracy on Kinetics400 and 95.7% on UCF101, while reducing computational cost by 7.1% compared with our baseline. In particular, our method attains remarkable improvements on complex action classes, showing that the proposed network can serve as a potential benchmark for handling complicated scenarios in Industry 4.0 applications.
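To make the idea concrete, below is a minimal PyTorch-style sketch of how an attention-consistency term could be combined with cross-entropy for a two-stream model. The attention definition (channel-averaged spatial maps), the MSE consistency term, and the weighting factor `lam` are illustrative assumptions, not the paper's published implementation.

```python
# Hedged sketch: attention-consistency loss for a spatial-temporal two-stream
# model. Attention maps, the MSE consistency term, and the weight `lam` are
# assumptions for illustration only.
import torch
import torch.nn.functional as F


def attention_map(feat: torch.Tensor) -> torch.Tensor:
    """Collapse a conv feature map (N, C, H, W) into a normalized
    spatial attention map (N, H*W) by channel-wise averaging."""
    attn = feat.mean(dim=1)            # (N, H, W)
    attn = attn.flatten(start_dim=1)   # (N, H*W)
    return F.softmax(attn, dim=1)      # normalize over spatial locations


def two_stream_loss(spatial_feat, temporal_feat, logits, labels, lam=0.1):
    """Cross-entropy on the fused logits plus a consistency term that pulls
    the temporal stream's attention toward the spatial stream's attention."""
    ce = F.cross_entropy(logits, labels)
    consistency = F.mse_loss(attention_map(temporal_feat),
                             attention_map(spatial_feat))
    return ce + lam * consistency


if __name__ == "__main__":
    # Toy shapes: batch of 4 clips, 512-channel 7x7 feature maps, 400 classes.
    spatial_feat = torch.randn(4, 512, 7, 7)
    temporal_feat = torch.randn(4, 512, 7, 7)
    logits = torch.randn(4, 400)
    labels = torch.randint(0, 400, (4,))
    print(two_stream_loss(spatial_feat, temporal_feat, logits, labels))
```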

Published In

ACM Transactions on Multimedia Computing Communications and Applications

DOI

10.1145/3538749
EISSN

1551-6865

ISSN

1551-6857

Publication Date

October 6, 2022

Volume

18

Issue

2 S

Related Subject Headings

  • Artificial Intelligence & Image Processing
  • 4607 Graphics, augmented reality and games
  • 4606 Distributed computing and systems software
  • 4603 Computer vision and multimedia computation
  • 0806 Information Systems
  • 0805 Distributed Computing
  • 0803 Computer Software
 

Citation

APA
Xu, H., Jin, X., Wang, Q., Hussain, A., & Huang, K. (2022). Exploiting Attention-Consistency Loss For Spatial-Temporal Stream Action Recognition. ACM Transactions on Multimedia Computing Communications and Applications, 18(2 S). https://doi.org/10.1145/3538749

Chicago
Xu, H., X. Jin, Q. Wang, A. Hussain, and K. Huang. “Exploiting Attention-Consistency Loss For Spatial-Temporal Stream Action Recognition.” ACM Transactions on Multimedia Computing Communications and Applications 18, no. 2 S (October 6, 2022). https://doi.org/10.1145/3538749.

ICMJE
Xu H, Jin X, Wang Q, Hussain A, Huang K. Exploiting Attention-Consistency Loss For Spatial-Temporal Stream Action Recognition. ACM Transactions on Multimedia Computing Communications and Applications. 2022 Oct 6;18(2 S).

MLA
Xu, H., et al. “Exploiting Attention-Consistency Loss For Spatial-Temporal Stream Action Recognition.” ACM Transactions on Multimedia Computing Communications and Applications, vol. 18, no. 2 S, Oct. 2022. Scopus, doi:10.1145/3538749.

NLM
Xu H, Jin X, Wang Q, Hussain A, Huang K. Exploiting Attention-Consistency Loss For Spatial-Temporal Stream Action Recognition. ACM Transactions on Multimedia Computing Communications and Applications. 2022 Oct 6;18(2 S).
