Scholars@Duke publication: A comprehensive survey on pretrained foundation models: a history from BERT to ChatGPT

Publication, Journal Article

Zhou, C; Li, Q; Li, C; Yu, J; Liu, Y; Wang, G; Zhang, K; Ji, C; Yan, Q; He, L; Peng, H; Li, J; Wu, J; Liu, Z; Xie, P; Xiong, C; Pei, J; Yu, PS; Sun, L

Published in: International Journal of Machine Learning and Cybernetics

January 1, 2024

Pretrained Foundation Models (PFMs) are regarded as the foundation for various downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is trained on large-scale data, providing a solid parameter initialization for a wide range of downstream applications. In contrast to earlier methods that used convolutional and recurrent modules for feature extraction, BERT learns bidirectional encoder representations from Transformers, trained on large datasets as a contextual language model. Similarly, the Generative Pretrained Transformer (GPT) method employs Transformers as feature extractors and is trained on large datasets using an autoregressive paradigm. More recently, ChatGPT has demonstrated the success of large language models, applying an autoregressive language model with zero-shot or few-shot prompting. The remarkable success of PFMs has driven significant breakthroughs in AI and prompted numerous studies proposing new methods, datasets, and evaluation metrics, which increases the demand for an updated survey. This study provides a comprehensive review of recent research advancements, challenges, and opportunities for PFMs in text, image, graph, and other data modalities. It covers the basic components and existing pretraining methods used in natural language processing, computer vision, and graph learning, and it explores advanced PFMs for different data modalities as well as unified PFMs that address data quality and quantity. The review also discusses model efficiency, security, and privacy, and offers insights into future research directions and challenges for PFMs. Overall, this survey aims to shed light on PFM research into scalability, security, logical reasoning ability, cross-domain learning ability, and user-friendly interactive ability for artificial general intelligence.

Duke Scholars

Jian Pei, Author, Computer Science

Published In

International Journal of Machine Learning and Cybernetics

DOI

10.1007/s13042-024-02443-6

EISSN

1868-808X

ISSN

1868-8071

Publication Date

January 1, 2024

Related Subject Headings

46 Information and computing sciences
40 Engineering
0801 Artificial Intelligence and Image Processing

Citation

APA

Zhou, C., Li, Q., Li, C., Yu, J., Liu, Y., Wang, G., … Sun, L. (2024). A comprehensive survey on pretrained foundation models: a history from BERT to ChatGPT. International Journal of Machine Learning and Cybernetics. https://doi.org/10.1007/s13042-024-02443-6

Chicago

Zhou, C., Q. Li, C. Li, J. Yu, Y. Liu, G. Wang, K. Zhang, et al. “A comprehensive survey on pretrained foundation models: a history from BERT to ChatGPT.” International Journal of Machine Learning and Cybernetics, January 1, 2024. https://doi.org/10.1007/s13042-024-02443-6.

ICMJE

Zhou C, Li Q, Li C, Yu J, Liu Y, Wang G, et al. A comprehensive survey on pretrained foundation models: a history from BERT to ChatGPT. International Journal of Machine Learning and Cybernetics. 2024 Jan 1;

MLA

Zhou, C., et al. “A comprehensive survey on pretrained foundation models: a history from BERT to ChatGPT.” International Journal of Machine Learning and Cybernetics, Jan. 2024. Scopus, doi:10.1007/s13042-024-02443-6.

NLM

Zhou C, Li Q, Li C, Yu J, Liu Y, Wang G, Zhang K, Ji C, Yan Q, He L, Peng H, Li J, Wu J, Liu Z, Xie P, Xiong C, Pei J, Yu PS, Sun L. A comprehensive survey on pretrained foundation models: a history from BERT to ChatGPT. International Journal of Machine Learning and Cybernetics. 2024 Jan 1;
