Scholars@Duke publication: Counterfactual Explanation of Shapley Value in Data Coalitions

Counterfactual Explanation of Shapley Value in Data Coalitions

Publication, Journal Article

Si, M; Pei, J

Published in: Proceedings of the VLDB Endowment

January 1, 2024

The Shapley value is widely used for data valuation in data markets. However, explaining the Shapley value of an owner in a data coalition is an unexplored and challenging task. To tackle this, we formulate the problem of finding the counterfactual explanation of Shapley value in data coalitions. Essentially, given two data owners A and B such that A has a higher Shapley value than B, a counterfactual explanation is a smallest subset of data entries in A such that transferring the subset from A to B makes the Shapley value of A less than that of B. We show that counterfactual explanations always exist, but finding an exact counterfactual explanation is NP-hard. Using Monte Carlo estimation to approximate counterfactual explanations directly according to the definition is still very costly, since we have to estimate the Shapley values of owners A and B after each possible subset shift. We develop a series of heuristic techniques to speed up computation by estimating differential Shapley values, computing the power of singular data entries, and shifting subsets greedily, culminating in the SV-Exp algorithm. Our experimental results on real datasets clearly demonstrate the efficiency of our method and the effectiveness of counterfactuals in interpreting the Shapley value of an owner.
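To make the definition in the abstract concrete, the following is a minimal sketch of the naive, definition-driven search on a toy example. It is not the SV-Exp algorithm and not the authors' code. Everything in it is an assumption made for illustration: the utility function (number of distinct entries a coalition holds), the owner names, the entry IDs, and the helper names `utility`, `shapley_values`, and `naive_counterfactual`. With only three owners it computes exact Shapley values by enumerating all orderings rather than using Monte Carlo estimation.

```python
from itertools import combinations, permutations


def utility(coalition, data):
    """Toy utility (an assumption): number of distinct entries the coalition holds."""
    covered = set()
    for owner in coalition:
        covered |= data[owner]
    return len(covered)


def shapley_values(data):
    """Exact Shapley value of each owner, averaged over all owner orderings."""
    owners = sorted(data)
    totals = {o: 0.0 for o in owners}
    orders = list(permutations(owners))
    for order in orders:
        coalition, prev = [], 0
        for o in order:
            coalition.append(o)
            cur = utility(coalition, data)
            totals[o] += cur - prev  # marginal contribution of o in this ordering
            prev = cur
    return {o: totals[o] / len(orders) for o in owners}


def naive_counterfactual(data, a, b):
    """Brute force: try subsets of a's entries by increasing size and return the
    first one whose transfer to b makes a's Shapley value lower than b's."""
    entries = sorted(data[a])
    for size in range(1, len(entries) + 1):
        for subset in combinations(entries, size):
            moved = set(subset)
            shifted = dict(data)
            shifted[a] = data[a] - moved
            shifted[b] = data[b] | moved
            sv = shapley_values(shifted)  # recomputed after every candidate shift
            if sv[a] < sv[b]:
                return moved
    return None


# Hypothetical owners and entry IDs.
data = {"A": {1, 2, 3, 4}, "B": {4, 5}, "C": {1, 6}}
print(shapley_values(data))                  # {'A': 3.0, 'B': 1.5, 'C': 1.5}
print(naive_counterfactual(data, "A", "B"))  # {2}
```

In this toy instance, moving entry 1 is not enough because owner C also holds it, while moving entry 2, which only A holds, flips the ranking. The search recomputes Shapley values after every candidate shift and may visit exponentially many subsets, which illustrates the cost that the abstract says the heuristic techniques are meant to reduce.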

Duke Scholars

Author: Jian Pei, Computer Science

Published In

Proceedings of the VLDB Endowment

DOI

10.14778/3681954.3682004

EISSN

2150-8097

Publication Date

January 1, 2024

Volume

17

Issue

11

Start / End Page

3332 / 3345

Related Subject Headings

4605 Data management and data science
0807 Library and Information Studies
0806 Information Systems
0802 Computation Theory and Mathematics

Citation

APA

Si, M., & Pei, J. (2024). Counterfactual Explanation of Shapley Value in Data Coalitions. Proceedings of the VLDB Endowment, 17(11), 3332–3345. https://doi.org/10.14778/3681954.3682004

Chicago

Si, M., and J. Pei. “Counterfactual Explanation of Shapley Value in Data Coalitions.” Proceedings of the VLDB Endowment 17, no. 11 (January 1, 2024): 3332–45. https://doi.org/10.14778/3681954.3682004.

ICMJE

Si M, Pei J. Counterfactual Explanation of Shapley Value in Data Coalitions. Proceedings of the VLDB Endowment. 2024 Jan 1;17(11):3332–45.

MLA

Si, M., and J. Pei. “Counterfactual Explanation of Shapley Value in Data Coalitions.” Proceedings of the VLDB Endowment, vol. 17, no. 11, Jan. 2024, pp. 3332–45. Scopus, doi:10.14778/3681954.3682004.

NLM

Si M, Pei J. Counterfactual Explanation of Shapley Value in Data Coalitions. Proceedings of the VLDB Endowment. 2024 Jan 1;17(11):3332–3345.
