Journal Article · Journal of Machine Learning Research · January 1, 2025
When data are distributed across multiple sites or machines rather than centralized in one location, researchers face the challenge of extracting meaningful information without directly sharing individual data points. While there are many distributed metho ...
Journal Article · SIAM Journal on Scientific Computing · October 1, 2024
Conditional Monte Carlo or pre-integration is a powerful tool for reducing variance and improving the regularity of integrands when using Monte Carlo and quasi-Monte Carlo (QMC) methods. To select the variable to pre-integrate, one must consider both the v ...
Journal Article · SIAM Journal on Numerical Analysis · January 1, 2023
Preintegration is an extension of conditional Monte Carlo to quasi-Monte Carlo and randomized quasi-Monte Carlo. Conditioning can reduce but not increase the variance in Monte Carlo. For quasi-Monte Carlo it can bring about improved regularity of the integ ...
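The claim that conditioning cannot increase the Monte Carlo variance follows from the law of total variance. As a sketch, write the integrand as $f(X)$ and let $X_{-j}$ denote all inputs except the pre-integrated variable $X_j$ (this notation is an assumption for illustration and does not appear in the abstract):

```latex
\operatorname{Var}\bigl(f(X)\bigr)
  = \mathbb{E}\bigl[\operatorname{Var}\bigl(f(X) \mid X_{-j}\bigr)\bigr]
  + \operatorname{Var}\bigl(\mathbb{E}\bigl[f(X) \mid X_{-j}\bigr]\bigr)
  \;\ge\; \operatorname{Var}\bigl(\mathbb{E}\bigl[f(X) \mid X_{-j}\bigr]\bigr)
```

The inequality holds because the first term on the right is an expectation of a nonnegative quantity.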
Conference · Advances in Neural Information Processing Systems · January 1, 2023
Langevin Monte Carlo (LMC) and its stochastic gradient versions are powerful algorithms for sampling from complex high-dimensional distributions. To sample from a distribution with density π(θ) ∝ exp(−U(θ)), LMC iteratively generates the next sample by tak ...
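For orientation, the textbook unadjusted Langevin update can be sketched in a few lines. This is a minimal illustration, not necessarily the variant studied in the paper: the target is assumed to be a one-dimensional standard normal, so U(θ) = θ²/2, and the names `grad_U`, `lmc`, `step`, and `theta0` are hypothetical.

```python
import math
import random


def grad_U(theta):
    # Assumed target: standard normal, U(theta) = theta^2 / 2, so grad U = theta.
    return theta


def lmc(n_steps, step, theta0, rng):
    """Unadjusted Langevin iteration: gradient step on U plus Gaussian noise."""
    theta = theta0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        theta = theta - step * grad_U(theta) + math.sqrt(2.0 * step) * noise
        samples.append(theta)
    return samples


rng = random.Random(0)
samples = lmc(n_steps=50_000, step=0.05, theta0=3.0, rng=rng)

# Discard an initial burn-in segment before summarizing the chain.
kept = samples[5_000:]
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
```

Without a Metropolis correction the chain targets a slightly perturbed distribution. For this Gaussian example the stationary variance is 1 / (1 − step / 2) rather than exactly 1, so smaller steps reduce the bias at the cost of slower mixing.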
Journal Article · Annals of Statistics · October 1, 2022
In network applications, it has become increasingly common to obtain datasets in the form of multiple networks observed on the same set of subjects, where each network is obtained in a related but different experiment condition or application scenario. Suc ...
Journal Article · Statistical Science · May 1, 2022
Genomic surveillance of SARS-CoV-2 has been instrumental in tracking the spread and evolution of the virus during the pandemic. The availability of SARS-CoV-2 molecular sequences isolated from infected individuals, coupled with phylodynamic methods, has p ...
Journal Article · IEEE Transactions on Information Theory · December 1, 2021
In our 'big data' age, the size and complexity of data are steadily increasing. Methods for dimension reduction are ever more popular and useful. Two distinct types of dimension reduction are 'data-oblivious' methods such as random projections and sketching ...
Journal Article · Journal of Machine Learning Research · January 1, 2021
Many machine learning problems optimize an objective that must be measured with noise. The primary method is first-order stochastic gradient descent using one or more Monte Carlo (MC) samples at each step. There are settings where ill-conditioning makes ...
Conference · 8th International Conference on Learning Representations, ICLR 2020 · January 1, 2020
We study the following three fundamental problems about ridge regression: (1) what is the structure of the estimator? (2) how to correctly use cross-validation to choose the regularization parameter? and (3) how to accelerate computation without losing too ...
Conference · Advances in Neural Information Processing Systems · January 1, 2020
Random projections or sketching are widely used in many algorithmic and learning contexts. Here we study the performance of iterative Hessian sketch for least-squares problems. By leveraging and extending recent results from random matrix theory on the lim ...
Conference · Advances in Neural Information Processing Systems · January 1, 2019
We consider a least squares regression problem where the data has been generated from a linear model, and we are interested in learning the unknown regression parameters. We consider "sketch-and-solve" methods that randomly project the data first, and do regr ...
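The sketch-and-solve idea can be illustrated with a short NumPy example. This is a minimal sketch under stated assumptions, not the paper's experimental setup: the Gaussian sketching matrix, the problem sizes, and the names `sketch_and_solve`, `m`, and `beta_true` are all chosen here for illustration.

```python
import numpy as np


def sketch_and_solve(X, y, m, rng):
    """Project the n rows of (X, y) down to m rows, then solve least squares."""
    n = X.shape[0]
    # Assumed sketch: i.i.d. Gaussian entries, scaled so E[S.T @ S] = I.
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    beta, *_ = np.linalg.lstsq(S @ X, S @ y, rcond=None)
    return beta


rng = np.random.default_rng(0)
n, p, m = 2000, 10, 400

# Synthetic data from a linear model with small Gaussian noise.
beta_true = rng.standard_normal(p)
X = rng.standard_normal((n, p))
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_sketch = sketch_and_solve(X, y, m, rng)

# Residual norms on the full data; the full solution minimizes this quantity.
res_full = np.linalg.norm(y - X @ beta_full)
res_sketch = np.linalg.norm(y - X @ beta_sketch)
```

The sketched problem has only `m` rows instead of `n`, which is where the computational saving comes from. Comparing `res_sketch` with `res_full` shows the accuracy given up in exchange.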