Database Native Model Selection: Harnessing Deep Neural Networks in Database Systems
The growing demand for advanced analytics beyond statistical aggregation calls for database systems that support effective model selection of deep neural networks (DNNs). However, existing model selection strategies are based on either training-based algorithms that deliver high-performing models at the expense of high computational cost, or training-free algorithms that enhance computational efficiency with reduced effectiveness. These strategies often disregard computational cost and response time Service-Level Objectives (SLOs), which are of concern to average or budgetconscious machine learning users. In addition, they lack a welldesigned integration of the model selection algorithms with DBMSs, which hinders efficient in-database model selection. This paper presents TRAILS, a resource-efficient and SLO-aware in-database model selection system. To leverage the strengths of both trainingfree and training-based model selection, we first characterize nine state-of-the-art training-free model evaluation metrics and propose a more effective one named JacFlow, and then, restructure the conventional model selection procedure into two phases: filtering and refinement. A novel coordinator is also introduced to strike a balance between the high efficiency of train-free algorithms and the high effectiveness of training-based algorithms, ensuring high-performing model selection while adhering to target SLOs. Moreover, we incorporate the proposed algorithm into PostgreSQL to develop TRAILS, thereby both enhancing resource efficiency and reducing model selection latency. This integration establishes a foundation for declarative model definition and selection within DBMSs. Empirical results demonstrate that our TRAILS reduces model selection time and computational expenses considerably by up to 24.38x and 29.32x respectively compared to existing model selection systems.
Duke Scholars
Published In
DOI
EISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- 4605 Data management and data science
- 0807 Library and Information Studies
- 0806 Information Systems
- 0802 Computation Theory and Mathematics
Citation
Published In
DOI
EISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- 4605 Data management and data science
- 0807 Library and Information Studies
- 0806 Information Systems
- 0802 Computation Theory and Mathematics