Applications and Computation of the Shapley Value in Databases and Machine Learning
Recently, the Shapley value, a concept rooted in cooperative game theory, has found more and more applications in databases and machine learning. Due to its combinatoric nature, the computation of the Shapley value is #P-hard. To address this challenge, numerous studies are actively engaged in developing efficient computation methods or exploring alternative solutions in specific application contexts. Applications of the Shapley value in databases and machine learning as well as fast computation or approximation of the Shapley value in those applications are becoming a new research frontier in the database community. This tutorial presents a comprehensive and systematic overview of Shapley value applications and computation within both database and machine learning domains. We survey the existing methods from a unique perspective that diverges from the current literature. Unlike most reviews, which mainly focus on applications, our approach focuses on the underlying algorithmic mechanisms and application specific assumptions in these methods. This approach allows us to highlight the similarities and differences among the various Shapley value applications and computation techniques more effectively. Our tutorial categorizes these methods based on their intrinsic processes, cutting across different applications. The tutorial begins with an introduction to the Shapley value and its diverse applications in databases and machine learning. Subsequently, it delves into the computational challenges of the Shapley value, presents cutting-edge solutions for its efficient computation, and explores alternative solutions.