Scholars@Duke publication: SpanPredict: Extraction of Predictive Document Spans with Neural Attention

Publication, Conference

Subramanian, V; Engelhard, M; Berchuck, S; Chen, L; Henao, R; Carin, L

Published in: NAACL-HLT 2021: 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference

January 1, 2021

In many natural language processing applications, identifying predictive text can be as important as the predictions themselves. When predicting medical diagnoses, for example, identifying predictive content in clinical notes not only enhances interpretability, but also allows unknown, descriptive (i.e., text-based) risk factors to be identified. Here we formalize this problem as predictive extraction and address it using a simple mechanism based on linear attention. Our method preserves differentiability, allowing scalable inference via stochastic gradient descent. Further, the model decomposes predictions into a sum of contributions of distinct text spans. Importantly, we require only document labels, not ground-truth spans. Results show that our model identifies semantically cohesive spans and assigns them scores that agree with human ratings, while preserving classification performance.
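The additive decomposition mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-token scores, and the hard 0/1 span masks below are all hypothetical, chosen only to show how a document-level prediction can be written as a sum of per-span contributions. The abstract describes a differentiable attention mechanism, so the paper's span weights are presumably soft rather than hard masks.

```python
import math

def span_contribution(token_scores, attention):
    """Attention-weighted sum of per-token scores for one span."""
    return sum(a * s for a, s in zip(attention, token_scores))

def predict(token_scores, span_attentions, bias=0.0):
    """Return (probability, per-span contributions) for a binary label."""
    contributions = [span_contribution(token_scores, a) for a in span_attentions]
    # The logit is a plain sum, so each span's effect can be read off directly.
    logit = bias + sum(contributions)
    probability = 1.0 / (1.0 + math.exp(-logit))
    return probability, contributions

# Hypothetical per-token scores for an 8-token document (assumed values).
token_scores = [0.0, 0.5, 1.5, 1.0, 0.0, -0.5, -1.0, 0.0]

# Two hypothetical spans, written as attention weights over token positions.
# Hard 0/1 masks are used here for readability only.
span_attentions = [
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 0],
]

probability, contributions = predict(token_scores, span_attentions)
print(contributions)  # one additive score per span
print(probability)    # sigmoid of the summed contributions
```

Because the logit is a sum, each entry of `contributions` is the amount by which its span raises or lowers the prediction, which is the kind of per-span score that could be compared against human ratings.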

Duke Scholars

Author Samuel Berchuck Biostatistics & Bioinformatics, Division of Translational Bi ...

Author Matthew M. Engelhard Biostatistics & Bioinformatics, Division of Translational Bi ...

Author Ricardo Henao Biostatistics & Bioinformatics, Division of Translational Bi ...

Author Lawrence Carin Electrical and Computer Engineering

Published In

NAACL-HLT 2021: 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference

DOI

10.18653/v1/2021.naacl-main.413

Publication Date

January 1, 2021

Start / End Page

5234 / 5258

Citation

APA

Subramanian, V., Engelhard, M., Berchuck, S., Chen, L., Henao, R., & Carin, L. (2021). SpanPredict: Extraction of Predictive Document Spans with Neural Attention. In NAACL-HLT 2021: 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference (pp. 5234–5258). https://doi.org/10.18653/v1/2021.naacl-main.413

Chicago

Subramanian, V., M. Engelhard, S. Berchuck, L. Chen, R. Henao, and L. Carin. “SpanPredict: Extraction of Predictive Document Spans with Neural Attention.” In NAACL-HLT 2021: 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 5234–58, 2021. https://doi.org/10.18653/v1/2021.naacl-main.413.

ICMJE

Subramanian V, Engelhard M, Berchuck S, Chen L, Henao R, Carin L. SpanPredict: Extraction of Predictive Document Spans with Neural Attention. In: NAACL-HLT 2021: 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference. 2021. p. 5234–58.

MLA

Subramanian, V., et al. “SpanPredict: Extraction of Predictive Document Spans with Neural Attention.” NAACL-HLT 2021: 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 2021, pp. 5234–58. Scopus, doi:10.18653/v1/2021.naacl-main.413.

NLM

Subramanian V, Engelhard M, Berchuck S, Chen L, Henao R, Carin L. SpanPredict: Extraction of Predictive Document Spans with Neural Attention. NAACL-HLT 2021: 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference. 2021. p. 5234–5258.
