Journal Article · Proceedings of the conference. Association for Computational Linguistics. Meeting · July 2025
Transformer-based models have achieved state-of-the-art performance in document classification but struggle with long-text processing due to the quadratic computational complexity in the self-attention module. Existing solutions, such as sparse attention, ...
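To illustrate the quadratic bottleneck the abstract refers to, below is a minimal sketch (not the paper's implementation) of single-head scaled dot-product attention: the score matrix Q·Kᵀ has shape (n, n), so compute and memory grow as O(n²) in sequence length n. All names and shapes here are illustrative assumptions.

```python
# Minimal sketch (assumed, not from the paper): why vanilla self-attention
# is quadratic in sequence length n -- the score matrix Q @ K^T is (n, n).
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over n tokens of width d."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v            # each (n, d)
    scores = q @ k.T / np.sqrt(k.shape[-1])        # (n, n) -- the quadratic bottleneck
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ v                             # (n, d)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 4096, 64                                # a "long" document of 4096 tokens
    x = rng.standard_normal((n, d))
    w = [rng.standard_normal((d, d)) for _ in range(3)]
    out = self_attention(x, *w)
    print(out.shape)  # (4096, 64); the score matrix alone held 4096 * 4096 floats
```

Sparse-attention variants, mentioned in the abstract as one existing solution, reduce this cost by restricting each token to attend to only a subset of positions rather than all n.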