Inference Serving System for Stable Diffusion as a Service

Publication, Conference
Ray, A; Dannull, L; Firouzi, F; Lafata, K; Chakrabarty, K
Published in: Proceeding - 2024 IEEE Cloud Summit, Cloud Summit 2024
January 1, 2024

We present a model-less, privacy-preserving, low-latency inference framework to satisfy user-defined System-Level Objectives (SLO) for Stable Diffusion as a Service (SDaaS). Developers of Stable Diffusion (SD) models register their trained models on our proposed system through a declarative API. Users, on the other hand, can specify SLOs in terms of the style of the generated image for their input text, the requested processing latency, and the minimum requested text-to-image similarity (CLIP score) for inference through the user API. Assuming black-box access to the registered models, we profile them on hardware accelerators to design an inference predictor module. It heuristically predicts the required number of inference steps for the user-requested text-to-image CLIP score and the requested latency, for a specific SD model over a hardware accelerator, to satisfy the SLO. In combination with the inference predictor module, we propose a shortest-job first algorithm for our inference framework. Compared to traditional Deep Neural Network (DNN) and Large Language Model (LLM) inference scheduling algorithms, our proposed method outperforms on average job completion time, and the average number of SLOs satisfied in a user-defined SLO scenario.
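The abstract describes two cooperating pieces: an inference predictor that picks a number of inference steps from profiling data, and a shortest-job-first scheduler that orders requests by predicted cost. The sketch below is one minimal way to illustrate that idea, not the authors' implementation. Everything in it is an assumption made for illustration: the `PROFILE` table and its numbers, the names `predict_steps` and `schedule_sjf`, the single-accelerator setting, all requests arriving at time zero, and the choice to count queueing time against the requested latency.

```python
import heapq

# Hypothetical profiling table for ONE model on ONE accelerator:
# (inference steps, measured latency in seconds, mean CLIP score).
# The numbers are invented for illustration only.
PROFILE = [
    (10, 1.0, 0.24),
    (20, 2.0, 0.28),
    (30, 3.0, 0.30),
    (50, 5.0, 0.31),
]


def predict_steps(profile, min_clip, max_latency):
    """Return (steps, latency) for the smallest step count that meets both
    the minimum CLIP score and the latency bound, or None if none does."""
    for steps, latency, clip in sorted(profile):
        if clip >= min_clip and latency <= max_latency:
            return steps, latency
    return None


def schedule_sjf(requests, profile):
    """Shortest-job-first over requests that all arrive at time zero.

    Each request is (request_id, min_clip, max_latency).
    Returns (results, rejected), where results holds
    (request_id, steps, completion_time, slo_met) in execution order.
    """
    heap, rejected = [], []
    for order, (req_id, min_clip, max_latency) in enumerate(requests):
        prediction = predict_steps(profile, min_clip, max_latency)
        if prediction is None:
            # No profiled configuration can satisfy this SLO.
            rejected.append(req_id)
            continue
        steps, latency = prediction
        # `order` breaks ties so equal-latency jobs keep arrival order.
        heapq.heappush(heap, (latency, order, req_id, steps, max_latency))

    clock, results = 0.0, []
    while heap:
        latency, _, req_id, steps, max_latency = heapq.heappop(heap)
        clock += latency  # job runs to completion on the single accelerator
        results.append((req_id, steps, clock, clock <= max_latency))
    return results, rejected


if __name__ == "__main__":
    demo = [("a", 0.30, 4.0), ("b", 0.24, 2.0), ("c", 0.31, 3.0)]
    print(schedule_sjf(demo, PROFILE))
```

In this toy run, request "b" is served first because its predicted job is shortest, and request "c" is rejected because no profiled step count reaches its CLIP target within its latency bound. A real system would also need per-model and per-accelerator profiles and a policy for requests that arrive over time, neither of which this page's abstract specifies.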

Published In

Proceeding - 2024 IEEE Cloud Summit, Cloud Summit 2024

DOI

10.1109/Cloud-Summit61220.2024.00009

Publication Date

January 1, 2024

Start / End Page

13 / 16
 

Citation

APA: Ray, A., Dannull, L., Firouzi, F., Lafata, K., & Chakrabarty, K. (2024). Inference Serving System for Stable Diffusion as a Service. In Proceeding - 2024 IEEE Cloud Summit, Cloud Summit 2024 (pp. 13–16). https://doi.org/10.1109/Cloud-Summit61220.2024.00009
Chicago: Ray, A., L. Dannull, F. Firouzi, K. Lafata, and K. Chakrabarty. “Inference Serving System for Stable Diffusion as a Service.” In Proceeding - 2024 IEEE Cloud Summit, Cloud Summit 2024, 13–16, 2024. https://doi.org/10.1109/Cloud-Summit61220.2024.00009.
ICMJE: Ray A, Dannull L, Firouzi F, Lafata K, Chakrabarty K. Inference Serving System for Stable Diffusion as a Service. In: Proceeding - 2024 IEEE Cloud Summit, Cloud Summit 2024. 2024. p. 13–6.
MLA: Ray, A., et al. “Inference Serving System for Stable Diffusion as a Service.” Proceeding - 2024 IEEE Cloud Summit, Cloud Summit 2024, 2024, pp. 13–16. Scopus, doi:10.1109/Cloud-Summit61220.2024.00009.
NLM: Ray A, Dannull L, Firouzi F, Lafata K, Chakrabarty K. Inference Serving System for Stable Diffusion as a Service. Proceeding - 2024 IEEE Cloud Summit, Cloud Summit 2024. 2024. p. 13–16.
