
TOWARDS BUILDING THE FEDERATEDGPT: FEDERATED INSTRUCTION TUNING

Conference Publication
Zhang, J; Vahidian, S; Kuo, M; Li, C; Zhang, R; Yu, T; Wang, G; Chen, Y
Published in: ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings
January 1, 2024

While "instruction-tuned" generative large language models (LLMs) such as ChatGPT and GPT-4 have demonstrated an impressive ability to generalize to new tasks, their training phases rely heavily on large amounts of diverse and high-quality instruction data. Unfortunately, acquiring high-quality data, especially human-written data, can pose significant challenges in terms of both cost and accessibility. Moreover, privacy concerns can further limit access to such data, making the process of obtaining it a complex and nuanced undertaking. To tackle this issue, our study introduces a new approach called Federated Instruction Tuning (FedIT), which leverages federated learning (FL) as the learning framework for the instruction tuning of LLMs. This marks the first exploration of FL-based instruction tuning for LLMs. This is especially important since text data is predominantly generated by end users. For example, collecting extensive amounts of everyday user conversations can be a useful approach to improving the generalizability of LLMs, allowing them to generate authentic and natural responses. Therefore, it is imperative to design and adapt FL approaches to effectively leverage these users' diverse instructions stored on local devices while mitigating concerns related to data sensitivity and the cost of data transmission. In this study, we leverage extensive qualitative analysis, including the widely used GPT-4 auto-evaluation, to illustrate how our FedIT framework enhances the performance of LLMs. Utilizing diverse instruction sets on the client side, FedIT outperforms centralized training with only limited local instructions.
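The federated setup described above follows the standard FL pattern: clients fine-tune locally on their own instruction data, and a server aggregates the resulting parameter updates. A minimal sketch of the weighted aggregation step (FedAvg-style, with plain Python lists standing in for model parameters) is shown below; the function name and details are illustrative assumptions, not the paper's actual implementation, which the abstract does not specify.

```python
def fedavg(client_updates, client_sizes):
    """Weighted average of per-client parameter vectors.

    client_updates: list of parameter vectors (one per client).
    client_sizes: number of local instruction examples per client,
    used to weight each client's contribution (FedAvg-style).
    """
    total = sum(client_sizes)
    dim = len(client_updates[0])
    aggregated = [0.0] * dim
    for params, n in zip(client_updates, client_sizes):
        weight = n / total
        for i, p in enumerate(params):
            aggregated[i] += weight * p
    return aggregated


# Example: two clients with unequal instruction counts; the client
# holding more data (3 examples vs. 1) contributes more to the result.
global_params = fedavg([[1.0, 2.0], [3.0, 4.0]], client_sizes=[1, 3])
# weights 0.25 and 0.75 -> [2.5, 3.5]
```

In practice the aggregated object would be LoRA/adapter weights or full model deltas rather than raw lists, but the weighting logic is the same.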


Published In

ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings

DOI

10.1109/ICASSP48485.2024.10447454

ISSN

1520-6149

Publication Date

January 1, 2024

Start / End Page

6915 / 6919

Citation

APA: Zhang, J., Vahidian, S., Kuo, M., Li, C., Zhang, R., Yu, T., … Chen, Y. (2024). TOWARDS BUILDING THE FEDERATEDGPT: FEDERATED INSTRUCTION TUNING. In ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings (pp. 6915–6919). https://doi.org/10.1109/ICASSP48485.2024.10447454

Chicago: Zhang, J., S. Vahidian, M. Kuo, C. Li, R. Zhang, T. Yu, G. Wang, and Y. Chen. “TOWARDS BUILDING THE FEDERATEDGPT: FEDERATED INSTRUCTION TUNING.” In ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, 6915–19, 2024. https://doi.org/10.1109/ICASSP48485.2024.10447454.

ICMJE: Zhang J, Vahidian S, Kuo M, Li C, Zhang R, Yu T, et al. TOWARDS BUILDING THE FEDERATEDGPT: FEDERATED INSTRUCTION TUNING. In: ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings. 2024. p. 6915–9.

MLA: Zhang, J., et al. “TOWARDS BUILDING THE FEDERATEDGPT: FEDERATED INSTRUCTION TUNING.” ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, 2024, pp. 6915–19. Scopus, doi:10.1109/ICASSP48485.2024.10447454.

NLM: Zhang J, Vahidian S, Kuo M, Li C, Zhang R, Yu T, Wang G, Chen Y. TOWARDS BUILDING THE FEDERATEDGPT: FEDERATED INSTRUCTION TUNING. ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings. 2024. p. 6915–6919.
