Journal Article · Proc Conf Assoc Comput Linguist Meet · July 2025
Transformer-based models have achieved state-of-the-art performance in document classification but struggle with long texts because of the quadratic computational complexity of the self-attention module. Existing solutions, such as sparse attention, ...
Open Access