Data pricing in machine learning pipelines

Publication, Journal Article
Cong, Z; Luo, X; Pei, J; Zhu, F; Zhang, Y
Published in: Knowledge and Information Systems
June 1, 2022

Machine learning is disruptive. At the same time, machine learning can succeed only through collaboration among many parties across multiple steps, which naturally form pipelines in an ecosystem: collecting data for possible machine learning applications, collaboratively training models among multiple parties, and delivering machine learning services to end users. Data are critical and pervade the whole machine learning pipeline. Because machine learning pipelines involve many parties and, to be successful, must form a constructive and dynamic ecosystem, marketplaces and data pricing are fundamental in connecting and facilitating those parties. In this article, we survey the principles and the latest research developments of data pricing in machine learning pipelines. We start with a brief review of data marketplaces and pricing desiderata. Then, we focus on pricing in three important steps in machine learning pipelines. To understand pricing in the step of training data collection, we review the pricing of raw data sets and data labels. We also investigate pricing in the step of collaborative training of machine learning models, and give an overview of pricing machine learning models for end users in the step of machine learning deployment. Finally, we discuss a series of possible future directions.

Published In

Knowledge and Information Systems

DOI

10.1007/s10115-022-01679-4

EISSN

0219-3116

ISSN

0219-1377

Publication Date

June 1, 2022

Volume

64

Issue

6

Start / End Page

1417 / 1455

Related Subject Headings

  • Information Systems
  • 46 Information and computing sciences
  • 0806 Information Systems
  • 0801 Artificial Intelligence and Image Processing
 

Citation

APA: Cong, Z., Luo, X., Pei, J., Zhu, F., & Zhang, Y. (2022). Data pricing in machine learning pipelines. Knowledge and Information Systems, 64(6), 1417–1455. https://doi.org/10.1007/s10115-022-01679-4
Chicago: Cong, Z., X. Luo, J. Pei, F. Zhu, and Y. Zhang. “Data pricing in machine learning pipelines.” Knowledge and Information Systems 64, no. 6 (June 1, 2022): 1417–55. https://doi.org/10.1007/s10115-022-01679-4.
ICMJE: Cong Z, Luo X, Pei J, Zhu F, Zhang Y. Data pricing in machine learning pipelines. Knowledge and Information Systems. 2022 Jun 1;64(6):1417–55.
MLA: Cong, Z., et al. “Data pricing in machine learning pipelines.” Knowledge and Information Systems, vol. 64, no. 6, June 2022, pp. 1417–55. Scopus, doi:10.1007/s10115-022-01679-4.
NLM: Cong Z, Luo X, Pei J, Zhu F, Zhang Y. Data pricing in machine learning pipelines. Knowledge and Information Systems. 2022 Jun 1;64(6):1417–1455.