Skip to main content

High-dimensional robust mean estimation in nearly-linear time

Publication ,  Conference
Cheng, Y; Diakonikolas, I; Ge, R
Published in: Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms
January 1, 2019

We study the fundamental problem of high-dimensional mean estimation in a robust model where a constant fraction of the samples are adversarially corrupted. Recent work gave the first polynomial time algorithms for this problem with dimension-independent error guarantees for several families of structured distributions. In this work, we give the first nearly-linear time algorithms for high-dimensional robust mean estimation. Specifically, we focus on distributions with (i) known covariance and sub-gaussian tails, and (ii) unknown bounded covariance. Given N samples on Rd, an -fraction of which may be arbitrarily corrupted, our algorithms run in time Oe(Nd)/poly() and approximate the true mean within the information-theoretically optimal error, up to constant factors. Previous robust algorithms with comparable error guarantees have running times Ω(eNd2), for = Ω(1). Our algorithms rely on a natural family of SDPs parameterized by our current guess ν for the unknown mean µ?. We give a win-win analysis establishing the following: either a near-optimal solution to the primal SDP yields a good candidate for µ?— independent of our current guess ν — or a near-optimal solution to the dual SDP yields a new guess ν0whose distance from µ?is smaller by a constant factor. We exploit the special structure of the corresponding SDPs to show that they are approximately solvable in nearly-linear time. Our approach is quite general, and we believe it can also be applied to obtain nearly-linear time algorithms for other high-dimensional robust learning problems.

Duke Scholars

Published In

Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms

DOI

Publication Date

January 1, 2019

Start / End Page

2755 / 2771
 

Citation

APA
Chicago
ICMJE
MLA
NLM
Cheng, Y., Diakonikolas, I., & Ge, R. (2019). High-dimensional robust mean estimation in nearly-linear time. In Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms (pp. 2755–2771). https://doi.org/10.1137/1.9781611975482.171
Cheng, Y., I. Diakonikolas, and R. Ge. “High-dimensional robust mean estimation in nearly-linear time.” In Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms, 2755–71, 2019. https://doi.org/10.1137/1.9781611975482.171.
Cheng Y, Diakonikolas I, Ge R. High-dimensional robust mean estimation in nearly-linear time. In: Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms. 2019. p. 2755–71.
Cheng, Y., et al. “High-dimensional robust mean estimation in nearly-linear time.” Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms, 2019, pp. 2755–71. Scopus, doi:10.1137/1.9781611975482.171.
Cheng Y, Diakonikolas I, Ge R. High-dimensional robust mean estimation in nearly-linear time. Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms. 2019. p. 2755–2771.

Published In

Proceedings of the Annual ACM SIAM Symposium on Discrete Algorithms

DOI

Publication Date

January 1, 2019

Start / End Page

2755 / 2771