Offline Policy Evaluation in Large Action Spaces via Outcome-Oriented Action Grouping

Publication, Conference
Peng, J; Zou, H; Liu, J; Li, S; Jiang, Y; Pei, J; Cui, P
Published in: ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023
April 30, 2023

Offline policy evaluation (OPE) aims to accurately estimate the performance of a hypothetical policy using only historical data, and it has drawn increasing attention in a wide range of applications, including recommender systems and personalized medicine. With the rising granularity of consumer data, many industries have started exploring larger candidate action spaces to support more precise personalization. While inverse propensity score (IPS) is a standard OPE estimator, its variance becomes more severe as the action space grows. To address this issue, we theoretically prove that the estimation variance can be reduced by merging actions into groups, while differences among the effects of the merged actions on the outcome can induce extra bias. Motivated by these findings, we propose a novel IPS estimator with outcome-oriented action grouping (GroupIPS), which leverages a Lipschitz-regularized network to measure the distance between action effects in the embedding space and merges nearest action neighbors. This strategy enables more robust estimation by achieving smaller variance while inducing only minor additional bias. Empirically, extensive experiments on both synthetic and real-world datasets demonstrate the effectiveness of our proposed method.
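
The variance and bias trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the action grouping is already given as a fixed array (the hypothetical `group_of`), whereas the paper learns the grouping with a Lipschitz-regularized network. It also assumes that the grouped importance weight is the ratio of group probabilities, each taken as the sum of its members' action probabilities. The function names and array layouts are illustrative choices.

```python
import numpy as np


def ips_estimate(pi_e, pi_b, actions, rewards):
    """Vanilla IPS: mean of reward times pi_e(a|x) / pi_b(a|x).

    pi_e, pi_b: arrays of shape (n_samples, n_actions) holding the
    evaluation and behavior policy probabilities for each logged context.
    actions: integer array of logged actions; rewards: observed outcomes.
    """
    idx = np.arange(len(actions))
    weights = pi_e[idx, actions] / pi_b[idx, actions]
    return float(np.mean(weights * rewards))


def group_ips_estimate(pi_e, pi_b, actions, rewards, group_of):
    """IPS with importance weights computed over action groups.

    group_of: integer array of shape (n_actions,), a hypothetical fixed
    grouping supplied by the caller (assumption: not learned here).
    """
    n_groups = int(group_of.max()) + 1
    # One-hot membership matrix: member[a, g] = 1 if action a is in group g.
    member = np.eye(n_groups)[group_of]
    # Group probability = sum of the probabilities of its member actions.
    pi_e_g = pi_e @ member
    pi_b_g = pi_b @ member
    idx = np.arange(len(actions))
    g = group_of[actions]
    weights = pi_e_g[idx, g] / pi_b_g[idx, g]
    return float(np.mean(weights * rewards))
```

Under these assumptions, placing every action in its own group recovers vanilla IPS, and placing all actions in one group makes every weight equal to 1, so the estimate collapses to the mean logged reward. Coarser groups give less extreme weights and therefore lower variance, at the cost of bias whenever actions in the same group affect the outcome differently.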

Published In

ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023

DOI

10.1145/3543507.3583448

Publication Date

April 30, 2023

Start / End Page

1220 / 1230
 

Citation

APA: Peng, J., Zou, H., Liu, J., Li, S., Jiang, Y., Pei, J., & Cui, P. (2023). Offline Policy Evaluation in Large Action Spaces via Outcome-Oriented Action Grouping. In ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023 (pp. 1220–1230). https://doi.org/10.1145/3543507.3583448
Chicago: Peng, J., H. Zou, J. Liu, S. Li, Y. Jiang, J. Pei, and P. Cui. “Offline Policy Evaluation in Large Action Spaces via Outcome-Oriented Action Grouping.” In ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023, 1220–30, 2023. https://doi.org/10.1145/3543507.3583448.
ICMJE: Peng J, Zou H, Liu J, Li S, Jiang Y, Pei J, et al. Offline Policy Evaluation in Large Action Spaces via Outcome-Oriented Action Grouping. In: ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023. 2023. p. 1220–30.
MLA: Peng, J., et al. “Offline Policy Evaluation in Large Action Spaces via Outcome-Oriented Action Grouping.” ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023, 2023, pp. 1220–30. Scopus, doi:10.1145/3543507.3583448.
NLM: Peng J, Zou H, Liu J, Li S, Jiang Y, Pei J, Cui P. Offline Policy Evaluation in Large Action Spaces via Outcome-Oriented Action Grouping. ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023. 2023. p. 1220–1230.
