On Understanding Attention-Based In-Context Learning for Categorical Data

Publication, Conference
Wang, AT; Convertino, W; Cheng, X; Henao, R; Carin, L
Published in: Proceedings of Machine Learning Research
January 1, 2025

In-context learning based on attention models is examined for data with categorical outcomes, with inference in such models viewed from the perspective of functional gradient descent (GD). We develop a network composed of attention blocks, with each block employing a self-attention layer followed by a cross-attention layer, with associated skip connections. This model can exactly perform multi-step functional GD inference for in-context inference with categorical observations. We perform a theoretical analysis of this setup, generalizing many prior assumptions in this line of work, including the class of attention mechanisms for which it is appropriate. We demonstrate the framework empirically on synthetic data, image classification and language generation.
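The functional GD view in the abstract can be illustrated with a small sketch. The code below is not the authors' network: it assumes an RBF kernel in place of an attention kernel, a softmax cross-entropy loss, and hypothetical names (`rbf_kernel`, `functional_gd_predict`, `steps`, `lr`). The per-step updates at the context points and at the query may loosely correspond to the self-attention and cross-attention layers, respectively; that mapping is an assumption of this sketch, not a statement from the record.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rbf_kernel(a, b, bandwidth=1.0):
    """RBF kernel; an assumed stand-in for an attention kernel."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def functional_gd_predict(context_x, context_y, query_x, num_classes,
                          steps=50, lr=0.5):
    """Multi-step functional GD on cross-entropy, starting from f = 0.

    Each step applies f(x) += (lr / n) * sum_i k(x_i, x) * (onehot(y_i) - softmax(f(x_i))).
    """
    n = len(context_x)
    f_ctx = [[0.0] * num_classes for _ in range(n)]  # logits at context points
    f_qry = [0.0] * num_classes                      # logits at the query
    k_ctx = [[rbf_kernel(xi, xj) for xj in context_x] for xi in context_x]
    k_qry = [rbf_kernel(xi, query_x) for xi in context_x]

    for _ in range(steps):
        # Residuals: one-hot label minus current predicted probabilities.
        resid = []
        for i in range(n):
            p = softmax(f_ctx[i])
            resid.append([(1.0 if c == context_y[i] else 0.0) - p[c]
                          for c in range(num_classes)])
        # Query update: the query aggregates residuals from the context.
        for c in range(num_classes):
            f_qry[c] += lr / n * sum(k_qry[i] * resid[i][c] for i in range(n))
        # Context update: each context point aggregates from all context points.
        f_ctx = [[f_ctx[j][c] + lr / n * sum(k_ctx[i][j] * resid[i][c]
                                             for i in range(n))
                  for c in range(num_classes)]
                 for j in range(n)]

    return softmax(f_qry)


# Toy usage: two well-separated classes in 2-D.
context_x = [(0.0, 0.0), (0.2, 0.1), (3.0, 3.0), (3.1, 2.9)]
context_y = [0, 0, 1, 1]
probs_near_class0 = functional_gd_predict(context_x, context_y, (0.1, 0.0), 2)
probs_near_class1 = functional_gd_predict(context_x, context_y, (3.0, 3.1), 2)
```

In this toy setting the predicted class probabilities for a query are built up only from the labeled context examples, with no parameter training, which is the sense in which the inference is "in-context".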

Published In

Proceedings of Machine Learning Research

EISSN

2640-3498

Publication Date

January 1, 2025

Volume

267

Start / End Page

62701 / 62728
 

Citation

APA: Wang, A. T., Convertino, W., Cheng, X., Henao, R., & Carin, L. (2025). On Understanding Attention-Based In-Context Learning for Categorical Data. In Proceedings of Machine Learning Research (Vol. 267, pp. 62701–62728).

Chicago: Wang, A. T., W. Convertino, X. Cheng, R. Henao, and L. Carin. “On Understanding Attention-Based In-Context Learning for Categorical Data.” In Proceedings of Machine Learning Research, 267:62701–28, 2025.

ICMJE: Wang AT, Convertino W, Cheng X, Henao R, Carin L. On Understanding Attention-Based In-Context Learning for Categorical Data. In: Proceedings of Machine Learning Research. 2025. p. 62701–28.

MLA: Wang, A. T., et al. “On Understanding Attention-Based In-Context Learning for Categorical Data.” Proceedings of Machine Learning Research, vol. 267, 2025, pp. 62701–28.

NLM: Wang AT, Convertino W, Cheng X, Henao R, Carin L. On Understanding Attention-Based In-Context Learning for Categorical Data. Proceedings of Machine Learning Research. 2025. p. 62701–62728.
