
Xiaobai Sun

Professor of Computer Science
Computer Science
Box 90129, Durham, NC 27708-0129
D107 Lev Sci Res Ctr, Durham, NC 27708

Selected Presentations & Appearances


Space-Time Efficient Compression of all Resistance Distances on a Large Sparse Network - SIAM-PP 2026 · March 3, 2026 - March 6, 2026 International Meeting or Conference SIAM-PP committee, Berlin, Germany
Sparsify Latent Factor Matrix by Householder Transformations - Householder Symposium · June 8, 2025 - June 13, 2025 Invited Talk Householder Symposium Committee (international), Cornell University, Ithaca, NY

In 1958, A. S. Householder (1904-1993) introduced the reflection transformation in his highly influential paper, Unitary Triangularization of a Nonsymmetric Matrix, published in the Journal of the ACM. He presented the reflection as a special case of nonsingular transformation matrices that take the form of a rank-1 deviation from the identity matrix. In that same year, H. F. Kaiser (1927-1992) published the seminal paper, The Varimax Criterion for Analytic Rotation in Factor Analysis, in Psychometrika. Both papers have seen increasing citations in recent years, as will be demonstrated. This work introduces the use of Householder transformations for effective and efficient rotation and sparsification of latent factors. The approach has several advantages over state-of-the-art factor rotation methods. This appears to be the first connection between these two lines of research.
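As a small illustration of the "rank-1 deviation from the identity" mentioned above, the following sketch builds a Householder reflection H = I - 2 v v^T / (v^T v) with NumPy and applies it to a vector. The function name `householder` and the sample vector are hypothetical choices for illustration; this shows only the classical reflection, not the factor sparsification method of the talk.

```python
import numpy as np

def householder(v):
    """Return H = I - 2 v v^T / (v^T v), the reflection across
    the hyperplane orthogonal to v (a rank-1 update of I)."""
    v = np.asarray(v, dtype=float)
    return np.eye(v.size) - 2.0 * np.outer(v, v) / np.dot(v, v)

# Illustrative vector (an assumption, not taken from the talk).
x = np.array([3.0, 4.0, 0.0])

# Choose v so that H maps x onto a multiple of the first unit vector.
# The sign is matched to x[0] to avoid cancellation.
e1 = np.zeros_like(x)
e1[0] = 1.0
v = x + np.copysign(np.linalg.norm(x), x[0]) * e1

H = householder(v)
y = H @ x  # approximately [-5, 0, 0]: all entries below the first are zeroed
```

Here H is symmetric and orthogonal, and I - H has rank 1. Zeroing selected entries with one reflection is the basic mechanism that makes such transformations a natural candidate for sparsifying a matrix column by column.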

The Dominant Spectral Subspace for Nodal Decomposition of a Network Interconnecting Tight-Knit Communities - Workshop -- Hidden structures in dynamical systems, optimization, and machine learning · May 19, 2025 - May 23, 2025 Invited Talk Gran Sasso Science Institute, L’Aquila, Italy

We introduce a new connection established through both theoretical
analysis and empirical investigation between graph spectral analysis
and community detection in networks, i.e., graph clustering.

The new finding has three key components: (1) A pronounced, dominant
gap in the Laplacian spectrum of a graph is indicative of the presence
of tight-knit community clusters. (2) The underlying cluster
structure emerges when the graph is embedded into a low-dimensional
invariant subspace associated with the dominant gap. In this space,
the cluster subgraphs are identifiable with the nodal domains and can
be robustly recognized without supervision. (3) The dominant gap
and the dominant subspace are spectral characteristics of the graph
cluster structure, just as the Fiedler value and vector are of graph
connectivity. In comparison, a nodal partition by an eigenvector
associated with a Laplacian eigenvalue below (or above) the gap merges
(or splits) the clusters, which is a known resolution problem, or
worse, it blends the clusters into false groups.
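The idea of locating a dominant gap and embedding the graph in the associated subspace can be sketched on a toy example. The code below is a minimal sketch under stated assumptions, not the method of the talk: the test graph (a ring of cliques), the choice of the combinatorial Laplacian, the use of the eigenvectors below the largest gap as the embedding subspace, and the cosine-threshold grouping are all illustrative choices, and every function name is hypothetical.

```python
import numpy as np

def ring_of_cliques(k, m):
    """Adjacency matrix of k cliques of size m joined in a ring,
    with one edge between each pair of consecutive cliques."""
    n = k * m
    A = np.zeros((n, n))
    for c in range(k):
        s = c * m
        A[s:s + m, s:s + m] = 1.0       # dense block for clique c
        a = s                            # first node of clique c
        b = ((c + 1) % k) * m + 1        # second node of the next clique
        A[a, b] = A[b, a] = 1.0
    np.fill_diagonal(A, 0.0)
    return A

def dominant_gap_embedding(A):
    """Return the Laplacian eigenvalues, the eigenvectors below the
    largest spectral gap, and the number of those eigenvectors."""
    L = np.diag(A.sum(axis=1)) - A       # combinatorial Laplacian
    evals, evecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    g = int(np.argmax(np.diff(evals)))   # position of the largest gap
    return evals, evecs[:, :g + 1], g + 1

def group_by_direction(X, threshold=0.5):
    """Greedy grouping: nodes whose embedded rows point in nearly
    the same direction receive the same label."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    labels = -np.ones(len(Xn), dtype=int)
    next_label = 0
    for i in range(len(Xn)):
        if labels[i] >= 0:
            continue
        close = (Xn @ Xn[i] > threshold) & (labels < 0)
        labels[close] = next_label
        next_label += 1
    return labels

A = ring_of_cliques(k=4, m=10)
evals, X, n_below = dominant_gap_embedding(A)
labels = group_by_direction(X)
```

On this toy graph the largest gap separates four small eigenvalues from the rest, so the embedding dimension matches the number of cliques, and the grouping step assigns one label per clique. Real networks are far less clean, so the threshold and the gap selection would need more care.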

Vertex-to-vector encoding is an indispensable upstream task in machine
learning. Spectral graph embedding for vector encoding typically uses
a dimension-reduced spectral space at the lower end of the Laplacian
spectrum to preserve near-neighbor connectivity. Unfortunately, this
particular subspace selection poses challenges for the downstream task
of data clustering, which aims to separate adjacent neighbors into
distinct functional or structural units for scientific analysis or
group them by common attributes for recommendation systems.

Theoretical studies of spectral graph analysis have shown success with
regular graphs (both random and non-random), where the adjacency
matrix, combinatorial Laplacian, and normalized Laplacian share the
same eigenvectors and give the same nodal partitions. Yet,
perturbation theory is limited in extending such analysis to graphs
with varying degree distributions. To address this, we construct and
analyze two sets of ideally parameterized networks with homogeneous
clusters interconnected through simple and elementary topologies. The
graphs are topologically defined in one set and probabilistically
characterized in the other set. They are not regular, except in
degenerate cases. They serve as structural reference graphs for
studying a broader class of graphs with heterogeneous clusters that
fall within the scope of perturbation analysis. Together, their
spectral properties offer new insights into community structures in
real-world networks.
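The statement above about regular graphs can be checked numerically. For a d-regular graph, the combinatorial Laplacian is L = dI - A and the normalized Laplacian is I - A/d, so any eigenvector of A with eigenvalue w is also an eigenvector of both, with eigenvalues d - w and 1 - w/d. The sketch below verifies this on a small circulant graph; the graph and all variable names are illustrative assumptions.

```python
import numpy as np

# A 4-regular circulant graph: node i is joined to i±1 and i±2 (mod n).
n, offsets = 10, (1, 2)
A = np.zeros((n, n))
for i in range(n):
    for s in offsets:
        j = (i + s) % n
        A[i, j] = A[j, i] = 1.0

degrees = A.sum(axis=1)
d = degrees[0]
is_regular = bool(np.all(degrees == d))

L = d * np.eye(n) - A          # combinatorial Laplacian D - A with D = d I
L_norm = np.eye(n) - A / d     # normalized Laplacian I - D^{-1/2} A D^{-1/2}

w, V = np.linalg.eigh(A)       # eigenpairs of the adjacency matrix

# Each column of V is also an eigenvector of L and of L_norm.
shares_L = np.allclose(L @ V, V * (d - w))
shares_L_norm = np.allclose(L_norm @ V, V * (1.0 - w / d))
```

Because the three matrices share eigenvectors, a sign-based nodal partition taken from a given eigenvector is the same whichever matrix is used. Once the degrees vary, D is no longer a multiple of the identity and this equivalence generally breaks down.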

We present both theoretical and empirical results from our
investigations on synthetic graphs and real-world networks. We
conclude with comments on remaining questions.

Outreach & Engaged Scholarship


Bass Connections Faculty Team Member - Feature Extraction and Quantitative Analysis of Large Scientific Document Corpora · August 2015 - May 2016 Projects & Field Work United States of America
Bass Connections Faculty Team Member - Modeling and Simulation · August 2014 - July 2015 Projects & Field Work United States of America
Bass Connections Faculty Team Member - Modeling Tools for Energy Systems Analysis (MOTESA) · July 2014 - May 2015 Projects & Field Work United States of America
Bass Connections Faculty Team Member - Modeling Tools for Energy Systems Analysis (MOTESA) · August 2013 - May 2014 Projects & Field Work United States of America

Service to the Profession


Technical Committee Member - IEEE HPEC-2025 · 2025 - 2025 Committee Service MIT-LL, Boston, MA

HPEC is the largest computing conference in New England and is the premier conference in the world on the convergence of High Performance and Embedded Computing. We are passionate about performance. Our community is interested in computing hardware, software, systems and applications where performance matters. We welcome experts and people who are new to the field.

Service to Duke


Faculty Promotion (Department) · 2025 - 2025 Committee Service Computer Science, Duke CS

Academic & Administrative Activities


Design and develop new course material that bridges classical analysis methods and modern data analysis, AI, and ML