Xiuyuan Cheng

Associate Professor of Mathematics
Mathematics
120 Science Drive, P.O. Box 90320, Durham, NC 27708
120 Science Drive, 293 Physics Building, Durham, NC 27708

Selected Publications


Neural Stein Critics with Staged L2-Regularization

Journal Article · IEEE Transactions on Information Theory · November 1, 2023
Learning to differentiate model distributions from observed data is a fundamental problem in statistics and machine learning, and high-dimensional data remains a challenging setting for such problems. Metrics that quantify the disparity in probability dist ...

Robust Inference of Manifold Density and Geometry by Doubly Stochastic Scaling

Journal Article · SIAM Journal on Mathematics of Data Science · September 30, 2023

Training Neural Networks for Sequential Change-Point Detection

Conference · ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings · January 1, 2023
Detecting an abrupt distributional shift of a data stream, known as change-point detection, is a fundamental problem in statistics and machine learning. We introduce a novel approach for online change-point detection using neural networks. To be specific, ...

Eigen-convergence of Gaussian kernelized graph Laplacian by manifold heat interpolation

Journal Article · Applied and Computational Harmonic Analysis · November 1, 2022
We study the spectral convergence of graph Laplacians to the Laplace-Beltrami operator when the kernelized graph affinity matrix is constructed from N random samples on a d-dimensional manifold in an ambient Euclidean space. By analyzing Dirichlet form con ...

Classification logit two-sample testing by neural networks for differentiating near manifold densities.

Journal Article · IEEE Transactions on Information Theory · October 2022
The recent success of generative adversarial networks and variational learning suggests that training a classification network may work well in addressing the classical two-sample problem, which asks to differentiate two densities given finite samples from ...

Statistical inference using GLEaM model with spatial heterogeneity and correlation between regions.

Journal Article · Scientific Reports · October 2022
A better understanding of various patterns in the coronavirus disease 2019 (COVID-19) spread in different parts of the world is crucial to its prevention and control. Motivated by the previously developed Global Epidemic and Mobility (GLEaM) model, this pa ...

Convergence of graph Laplacian with kNN self-tuned kernels

Journal Article · Information and Inference: A Journal of the IMA · September 8, 2022
Kernelized Gram matrix $W$ constructed from data points $\{x_i\}_{i=1}^N$ as $W_{ij} = k_0\big(\frac{\|x_i - x_j\|^2}{\sigma^2}\big)$ is widely used in graph-based geometric data analysis and unsuperv ...
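As a generic illustration of this kind of construction (a sketch in the style of Zelnik-Manor and Perona self-tuning, not necessarily the paper's exact kernel or normalization; the bandwidth rule and `k=7` are arbitrary choices here), one can build a kNN self-tuned Gram matrix and its random-walk graph Laplacian with NumPy:

```python
import numpy as np

def self_tuned_laplacian(X, k=7):
    """Illustrative sketch: self-tuned kernel Gram matrix
    W_ij = exp(-||x_i - x_j||^2 / (s_i * s_j)), where s_i is the distance
    from x_i to its k-th nearest neighbor, and the random-walk graph
    Laplacian L = I - D^{-1} W built from it."""
    # pairwise squared Euclidean distances
    sq = np.sum(X**2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0)
    # self-tuning bandwidth: distance to the k-th nearest neighbor
    # (column 0 of the sorted distances is the point itself)
    sigma = np.sqrt(np.sort(D2, axis=1)[:, k])
    W = np.exp(-D2 / (sigma[:, None] * sigma[None, :] + 1e-12))
    deg = W.sum(axis=1)
    L = np.eye(len(X)) - W / deg[:, None]  # random-walk normalization
    return W, L

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
W, L = self_tuned_laplacian(X)
# L annihilates constant vectors: every row of L sums to zero
print(np.abs(L.sum(axis=1)).max())
```

The self-tuned bandwidths make the affinity adapt to local sampling density, which is the motivation for kNN-based kernels over a single global bandwidth.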

Invertible Neural Networks for Graph Prediction

Journal Article · IEEE Journal on Selected Areas in Information Theory · September 1, 2022
Graph prediction problems prevail in data analysis and machine learning. The inverse prediction problem, namely to infer input data from given output labels, is of emerging interest in various applications. In this work, we develop invertible graph neural ...

Scaling-Translation-Equivariant Networks with Decomposed Convolutional Filters

Journal Article · Journal of Machine Learning Research · January 1, 2022
Encoding the scale information explicitly into the representation learned by a convolutional neural network (CNN) is beneficial for many computer vision tasks, especially when dealing with multiscale inputs. We study, in this paper, a scaling-translation-eq ...

Neural Spectral Marked Point Processes

Conference · ICLR 2022 - 10th International Conference on Learning Representations · January 1, 2022
Self- and mutually-exciting point processes are popular models in machine learning and statistics for dependent discrete event data. To date, most existing models assume stationary kernels (including the classical Hawkes processes) and simple parametric mo ...

SpecNet2: Orthogonalization-free Spectral Embedding by Neural Networks

Conference · Proceedings of Machine Learning Research · January 1, 2022
Spectral methods which represent data points by eigenvectors of kernel matrices or graph Laplacian matrices have been a primary tool in unsupervised data analysis. In many application scenarios, parametrizing the spectral embedding by a neural network that ...

Detection of differentially abundant cell subpopulations in scRNA-seq data.

Journal Article · Proceedings of the National Academy of Sciences of the United States of America · June 2021
Comprehensive and accurate comparisons of transcriptomic distributions of cells from samples taken from two different biological states, such as healthy versus diseased individuals, are an emerging challenge in single-cell RNA sequencing (scRNA-seq) analys ...

Convergence of Gaussian-smoothed optimal transport distance with sub-gamma distributions and dependent samples

Conference · Proceedings of Machine Learning Research · January 1, 2021
The Gaussian-smoothed optimal transport (GOT) framework, recently proposed by Goldfeld et al., scales to high dimensions in estimation and provides an alternative to entropy regularization. This paper provides convergence guarantees for estimating the GOT ...

Spatiotemporal Joint Filter Decomposition in 3D Convolutional Neural Networks

Conference · Advances in Neural Information Processing Systems · January 1, 2021
In this paper, we introduce spatiotemporal joint filter decomposition to decouple spatial and temporal learning, while preserving spatiotemporal dependency in a video. A 3D convolutional filter is now jointly decomposed over a set of spatial and temporal f ...

Neural Tangent Kernel Maximum Mean Discrepancy

Conference · Advances in Neural Information Processing Systems · January 1, 2021
We present a novel neural network Maximum Mean Discrepancy (MMD) statistic by identifying a new connection between neural tangent kernel (NTK) and MMD. This connection enables us to develop a computationally efficient and memory-efficient approach to compu ...

Graph Convolution with Low-rank Learnable Local Filters

Conference · ICLR 2021 - 9th International Conference on Learning Representations · January 1, 2021
Geometric variations like rotation, scaling, and viewpoint changes pose a significant challenge to visual understanding. One common solution is to directly model certain intrinsic structures, e.g., using landmarks. However, it then becomes non-trivial to b ...

Butterfly-net: Optimal function representation based on convolutional neural networks

Journal Article · Communications in Computational Physics · November 1, 2020 (open access)
Deep networks, especially convolutional neural networks (CNNs), have been successfully applied in various areas of machine learning as well as to challenging problems in other scientific and engineering fields. This paper introduces Butterfly-net, a low-co ...

A Witness Function Based Construction of Discriminative Models Using Hermite Polynomials

Journal Article · Frontiers in Applied Mathematics and Statistics · August 18, 2020
In machine learning, we are given a dataset of the form (Formula presented.), drawn as i.i.d. samples from an unknown probability distribution μ; the marginal distribution for the xj's being μ*, and the marginals of the kth class (Formula presented.) possi ...

On matrix rearrangement inequalities

Journal Article · Proceedings of the American Mathematical Society · January 1, 2020
Given two symmetric and positive semidefinite square matrices $A, B$, is it true that any matrix given as the product of $m$ copies of $A$ and $n$ copies of $B$ in a particular sequence must be dominated in the spectral norm by the ordered matrix product $A^m B^n$? For ex ...
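The question above can be probed numerically. The sketch below is illustrative only (`random_psd`, the matrix size, and the particular shuffled product are arbitrary choices, and a single random trial decides nothing either way); it compares the spectral norm of one shuffled product against the ordered product for m = n = 2:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_psd(n):
    """A random symmetric positive semidefinite matrix M M^T."""
    M = rng.normal(size=(n, n))
    return M @ M.T

A, B = random_psd(4), random_psd(4)
# spectral norm (largest singular value, ord=2) of the shuffled product ABAB
# versus the ordered product A^2 B^2
shuffled = np.linalg.norm(A @ B @ A @ B, 2)
ordered = np.linalg.norm(A @ A @ B @ B, 2)
print(shuffled, ordered)
```

Running this over many random instances gives empirical evidence for or against the rearrangement inequality in a given regime, but of course cannot replace a proof.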

Stochastic Conditional Generative Networks with Basis Decomposition

Conference · 8th International Conference on Learning Representations, ICLR 2020 · January 1, 2020
While generative adversarial networks (GANs) have revolutionized machine learning, a number of open questions remain to fully understand them and exploit their power. One of these questions is how to efficiently achieve proper diversity and sampling of the ...

Spectral Embedding Norm: Looking Deep into the Spectrum of the Graph Laplacian.

Journal Article · SIAM Journal on Imaging Sciences · January 2020
The extraction of clusters from a dataset which includes multiple clusters and a significant background component is a non-trivial task of practical importance. In image analysis this manifests for example in anomaly detection and target detection. The tra ...

Variational Diffusion Autoencoders with Random Walk Sampling

Conference · Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2020
Variational autoencoders (VAEs) and generative adversarial networks (GANs) enjoy an intuitive connection to manifold learning: in training, the decoder/generator is optimized to approximate a homeomorphism between the data distribution and the sampling spac ...

Butterfly-Net2: Simplified Butterfly-Net and Fourier Transform Initialization

Conference · Proceedings of Machine Learning Research · January 1, 2020
Structured CNN designed using the prior information of problems potentially improves efficiency over conventional CNNs in various tasks in solving PDEs and inverse problems in signal processing. This paper introduces BNet2, a simplified Butterfly-Net and i ...

Two-sample statistics based on anisotropic kernels

Journal Article · Information and Inference: A Journal of the IMA · December 10, 2019
The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely many multivariate samples. When the distributions ...
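For context, a plain kernel MMD estimator with an isotropic Gaussian kernel is a common baseline; the paper's contribution is an anisotropic kernel construction, which this sketch does not reproduce (the function name, bandwidth, and sample sizes below are arbitrary):

```python
import numpy as np

def mmd2_unbiased(X, Y, sigma=1.0):
    """Unbiased estimate of the squared MMD between samples X and Y
    under the Gaussian RBF kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    def gram(A, B):
        sq_a = np.sum(A**2, axis=1)[:, None]
        sq_b = np.sum(B**2, axis=1)[None, :]
        return np.exp(-(sq_a + sq_b - 2 * A @ B.T) / (2 * sigma**2))
    n, m = len(X), len(Y)
    Kxx, Kyy, Kxy = gram(X, X), gram(Y, Y), gram(X, Y)
    # drop diagonal terms of the within-sample Gram matrices for unbiasedness
    term_x = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return term_x + term_y - 2 * Kxy.mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
Y = rng.normal(loc=1.0, size=(300, 2))  # shifted distribution
# the statistic is near zero for matched samples, large for shifted ones
print(mmd2_unbiased(X, X[::-1].copy()), mmd2_unbiased(X, Y))
```

A single fixed bandwidth is exactly the limitation that anisotropic or adaptive kernels aim to remove: with one global sigma, the statistic can lose power when the two densities differ only along certain local directions.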

On the diffusion geometry of graph Laplacians and applications

Journal Article · Applied and Computational Harmonic Analysis · May 2019

Provable estimation of the number of blocks in block models

Conference · Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics (AISTATS'18) · April 9, 2018

The geometry of nodal sets and outlier detection

Journal Article · Journal of Number Theory · April 2018

A Graph Partitioning Approach to Simultaneous Angular Reconstitution

Journal Article · IEEE Transactions on Computational Imaging · September 2016

Marčenko–Pastur law for Tyler’s M-estimator

Journal Article · Journal of Multivariate Analysis · July 2016

Deep Haar scattering networks

Journal Article · Information and Inference · June 2016

A Deep Learning Approach to Unsupervised Ensemble Learning

Conference · Proceedings of The 33rd International Conference on Machine Learning · June 2016

Concentration of the Kirchhoff index for Erdős–Rényi graphs

Journal Article · Systems & Control Letters · December 2014

Unsupervised Deep Haar Scattering on Graphs.

Conference · Advances in Neural Information Processing Systems 27 · 2014

The Spectrum of Random Inner-product Kernel Matrices

Journal Article · Random Matrices: Theory and Applications · October 2013

Subcritical bifurcation in spatially extended systems

Journal Article · Nonlinearity · March 1, 2012

Nucleation of Ordered Phases in Block Copolymers

Journal Article · Physical Review Letters · April 9, 2010

A numerical method for the study of nucleation of ordered phases

Journal Article · Journal of Computational Physics · March 2010