Scholars@Duke publication: High-dimensional robust mean estimation in nearly-linear time

High-dimensional robust mean estimation in nearly-linear time

Publication , Conference

Cheng, Y; Diakonikolas, I; Ge, R

Published in: Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms

January 1, 2019

We study the fundamental problem of high-dimensional mean estimation in a robust model where a constant fraction of the samples are adversarially corrupted. Recent work gave the first polynomial time algorithms for this problem with dimension-independent error guarantees for several families of structured distributions. In this work, we give the first nearly-linear time algorithms for high-dimensional robust mean estimation. Specifically, we focus on distributions with (i) known covariance and sub-gaussian tails, and (ii) unknown bounded covariance. Given N samples on R^d, an -fraction of which may be arbitrarily corrupted, our algorithms run in time O^e(Nd)/poly() and approximate the true mean within the information-theoretically optimal error, up to constant factors. Previous robust algorithms with comparable error guarantees have running times Ω(^eNd²), for = Ω(1). Our algorithms rely on a natural family of SDPs parameterized by our current guess ν for the unknown mean µ^?. We give a win-win analysis establishing the following: either a near-optimal solution to the primal SDP yields a good candidate for µ^?— independent of our current guess ν — or a near-optimal solution to the dual SDP yields a new guess ν⁰whose distance from µ^?is smaller by a constant factor. We exploit the special structure of the corresponding SDPs to show that they are approximately solvable in nearly-linear time. Our approach is quite general, and we believe it can also be applied to obtain nearly-linear time algorithms for other high-dimensional robust learning problems.

Duke Scholars

Author Rong Ge Computer Science

Published In

Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms

DOI

10.1137/1.9781611975482.171

Publication Date

January 1, 2019

Start / End Page

2755 / 2771

Citation

APA

Chicago

ICMJE

MLA

NLM

Cheng, Y., Diakonikolas, I., & Ge, R. (2019). High-dimensional robust mean estimation in nearly-linear time. In Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms (pp. 2755–2771). https://doi.org/10.1137/1.9781611975482.171

Cheng, Y., I. Diakonikolas, and R. Ge. “High-dimensional robust mean estimation in nearly-linear time.” In Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms, 2755–71, 2019. https://doi.org/10.1137/1.9781611975482.171.

Cheng Y, Diakonikolas I, Ge R. High-dimensional robust mean estimation in nearly-linear time. In: Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms. 2019. p. 2755–71.

Cheng, Y., et al. “High-dimensional robust mean estimation in nearly-linear time.” Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms, 2019, pp. 2755–71. Scopus, doi:10.1137/1.9781611975482.171.

Cheng Y, Diakonikolas I, Ge R. High-dimensional robust mean estimation in nearly-linear time. Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms. 2019. p. 2755–2771.

Published In

Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms

DOI

10.1137/1.9781611975482.171

Publication Date

January 1, 2019

Start / End Page

2755 / 2771