Skip to main content

Xiaobai Sun

Professor of Computer Science
Computer Science
Box 90129, Durham, NC 27708-0129
D107 Lev Sci Res Ctr, Durham, NC 27708

Selected Publications


Algebraic Vertex Ordering of a Sparse Graph for Adjacency Access Locality and Graph Compression

Conference 2024 IEEE High Performance Extreme Computing Conference Hpec 2024 · January 1, 2024 In this work, we establish theoretical and practical connections between vertex indexing for sparse graph/network compression and matrix ordering for sparse matrix-vector multiplication and variable elimination. We present a fundamental analysis of adjacen ... Full text Cite

Parallel Clustering with Resolution Variation

Conference 2023 IEEE High Performance Extreme Computing Conference Hpec 2023 · January 1, 2023 We introduce a novel approach for parallel data clustering with resolution variation. Conventional graph clustering is typically governed by a function defined over all possible cluster configurations but at a fixed value of the resolution hyperparameter d ... Full text Cite

Corrections to "iPhantom: A Framework for Automated Creation of Individualized Computational Phantoms and its Application to CT Organ Dosimetry".

Journal Article IEEE J Biomed Health Inform · January 2022 In [1], the dose estimation accuracy using the alternative baseline method under modulated tube current was not correctly calculated due to an unintentional simulation error. ... Full text Link to item Cite

Fast Graph Algorithms for Superpixel Segmentation

Conference 2022 IEEE High Performance Extreme Computing Conference Hpec 2022 · January 1, 2022 We introduce the novel graph-based algorithm SLAM (simultaneous local assortative mixing) for fast and high-quality superpixel segmentation of any large color image. Super-pixels are compact semantic image elements; superpixel segmen-tation is fundamental ... Full text Cite

iPhantom: A Framework for Automated Creation of Individualized Computational Phantoms and Its Application to CT Organ Dosimetry.

Journal Article IEEE J Biomed Health Inform · August 2021 OBJECTIVE: This study aims to develop and validate a novel framework, iPhantom, for automated creation of patient-specific phantoms or "digital-twins (DT)" using patient medical images. The framework is applied to assess radiation dose to radiosensitive or ... Full text Link to item Cite

Digraph Clustering by the BlueRed Method

Conference 2021 IEEE High Performance Extreme Computing Conference Hpec 2021 · January 1, 2021 We introduce a new method for vertex clustering or community detection on directed graphs (digraphs). The new method is an extension of the BlueRed method introduced initially for undirected graphs. Complementary to supervised or semisupervised classificat ... Full text Cite

Fast Graphlet Transform of Sparse Graphs

Journal Article 2020 IEEE High Performance Extreme Computing Conference Hpec 2020 · September 22, 2020 We introduce the computational problem of graphlet transform of a sparse graph. Graphlets are fundamental topology elements of all graphs/networks. They can be used as coding elements to encode graph-topological information at multiple granularity levels, ... Full text Cite

Using Graphlet Spectrograms for Temporal Pattern Analysis of Virus-Research Collaboration Networks.

Journal Article ArXiv · September 1, 2020 We introduce a new method for temporal pattern analysis of scientific collaboration networks. We investigate in particular virus research activities through five epidemic or pandemic outbreaks in the recent two decades and in the ongoing pandemic with COVI ... Link to item Cite

Spaceland embedding of sparse stochastic graphs

Journal Article 2019 IEEE High Performance Extreme Computing Conference Hpec 2019 · September 1, 2019 We introduce SG-t-SNE, a nonlinear method for embedding stochastic graphs/networks into d-dimensional spaces, d = 1, 2, 3, without requiring vertex features to reside in, or be transformed into, a metric space. Graphs/networks are relational data, prevalen ... Full text Cite

Auxiliary maximum likelihood estimation for noisy point cloud registration

Conference 2019 IEEE High Performance Extreme Computing Conference Hpec 2019 · September 1, 2019 We establish first a theoretical foundation for the use of Gromov-Hausdorff (GH) distance for point set registration with homeomorphic deformation maps perturbed by Gaussian noise. We then present a probabilistic, deformable registration framework. At the ... Full text Cite

Sparse Dual of the Density Peaks Algorithm for Cluster Analysis of High-dimensional Data

Conference 2018 IEEE High Performance Extreme Computing Conference Hpec 2018 · November 26, 2018 The density peaks (DP) algorithm for cluster analysis, introduced by Rodriguez and Laio in 2014, has proven empirically competitive or superior in multiple aspects to other contemporary clustering algorithms. Yet, it suffers from certain drawbacks and limi ... Full text Cite

Damping Effect on PageRank Distribution

Conference 2018 IEEE High Performance Extreme Computing Conference Hpec 2018 · November 26, 2018 We extend the personalized PageRank model invented by Brin and Page to a model family, endowing each model with a characteristic damping scheme. On social bio-physical networks of today, actions, reactions and counteractions vary or unfold more than ever. ... Full text Cite

Iterative inversion of deformation vector fields with feedback control.

Journal Article Med Phys · July 2018 PURPOSE: Often, the inverse deformation vector field (DVF) is needed together with the corresponding forward DVF in four-dimesional (4D) reconstruction and dose calculation, adaptive radiation therapy, and simultaneous deformable registration. This study a ... Full text Link to item Cite

Intrinsic structure study of whale vocalizations

Conference Oceans 2016 Mts IEEE Monterey Oce 2016 · November 28, 2016 Whale vocalizations can be modeled as polynomial-phase signals, which are widely used in radar and sonar applications. Such signals lie on a nonlinear manifold parameterized by polynomial phase coefficients. In this paper, we apply manifold learning method ... Full text Cite

RECFMM: Recursive Parallelization of the Adaptive Fast Multipole Method for Coulomb and Screened Coulomb Interactions

Journal Article Communications in Computational Physics · August 1, 2016 We present RECFMM, a program representation and implementation of a recursive scheme for parallelizing the adaptive fast multipole method (FMM) on shared-memory computers. It achieves remarkable high performance while maintaining mathematical clarity and f ... Full text Cite

Parallel AFMPB solver with automatic surface meshing for calculation of molecular solvation free energy

Journal Article Computer Physics Communications · May 1, 2015 We present PAFMPB, an updated and parallel version of the AFMPB software package for fast calculation of molecular solvation-free energy. The new version has the following new features: (1) The adaptive fast multipole method and the boundary element method ... Full text Cite

Recycled Error Bits: Energy-Efficient Architectural Support for Floating Point Accuracy

Conference International Conference for High Performance Computing Networking Storage and Analysis Sc · January 16, 2014 In this work, we provide energy-efficient architectural support for floating point accuracy. For each floating point addition performed, we "recycle" that operation's rounding error. We make this error architecturally visible such that it can be used, when ... Full text Cite

HDR Deghosting: How to deal with saturation?

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · November 15, 2013 We present a novel method for aligning images in an HDR (high-dynamic-range) image stack to produce a new exposure stack where all the images are aligned and appear as if they were taken simultaneously, even in the case of highly dynamic scenes. Our method ... Full text Cite

Big snapshot stitching with scarce overlap

Journal Article 2013 IEEE High Performance Extreme Computing Conference Hpec 2013 · January 1, 2013 We address certain properties that arise in gigapixel-scale image stitching for snapshot images captured with a novel micro-camera array system, AWARE-2. This system features a greatly extended field of view and high optical resolution, offering unique sen ... Full text Cite

Mathematical and numerical aspects of the adaptive fast multipole Poisson-Boltzmann solver

Journal Article Communications in Computational Physics · January 1, 2013 This paper summarizes the mathematical and numerical theories and computational elements of the adaptive fast multipole Poisson-Boltzmann (AFMPB) solver. We introduce and discuss the following components in order: the Poisson-Boltzmann model, boundary inte ... Full text Cite

Parallel search of k-nearest neighbors with synchronous operations

Journal Article 2012 IEEE Conference on High Performance Extreme Computing Hpec 2012 · December 1, 2012 We present a new study of parallel algorithms for locating k-nearest neighbors (kNN) of each single query in a high dimensional (feature) space on a many-core processor or accelerator that favors synchronous operations, such as on a graphics processing uni ... Full text Cite

Automatic parallel code generation for NuFFT data translation on multicores

Journal Article Journal of Circuits Systems and Computers · April 1, 2012 The nonuniform FFT (NuFFT) is widely used in many applications. Focusing on the most time-consuming part of the NuFFT computation, the data translation step, in this paper, we develop an automatic parallel code generation tool for data translation targetin ... Full text Cite

A Fourier-series-based kernel-independent fast multipole method

Journal Article Journal of Computational Physics · January 1, 2011 We present in this paper a new kernel-independent fast multipole method (FMM), named as FKI-FMM, for pairwise particle interactions with translation-invariant kernel functions. FKI-FMM creates, using numerical techniques, sufficiently accurate and compress ... Full text Cite

Revision of FMM–Yukawa: An adaptive fast multipole method for screened Coulomb interactions

Journal Article Computer Physics Communications · December 1, 2010 FMM–YUKAWA is a mathematical software package primarily for rapid evaluation of the screened Coulomb interactions of N particles in three dimensional space. Since its release, we have revised and re-organized the data structure, software architecture, and ... Full text Cite

Scalable parallelization strategies to accelerate NuFFT data translation on multicores

Journal Article Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics · November 19, 2010 The non-uniform FFT (NuFFT) has been widely used in many applications. In this paper, we propose two new scalable parallelization strategies to accelerate the data translation step of the NuFFT on multicore machines. Both schemes employ geometric tiling an ... Full text Cite

A special-purpose compiler for look-up table and code generation for function evaluation

Journal Article Proceedings Design Automation and Test in Europe Date · June 9, 2010 Elementary functions are extensively used in computer graphics, signal and image processing, and communication systems. This paper presents a special-purpose compiler that automatically generates customized look-up tables and implementations for elementary ... Cite

Exploring parallelization strategies for NUFFT data translation

Journal Article Embedded Systems Week 2009 Proceedings of the 7th ACM International Conference on Embedded Software Emsoft 09 · December 24, 2009 This paper introduces parallelization strategies for the Non-Uniform FFT (NUFFT) data translation on multicore architectures. The NUFFT enables the use of the celebrated FFT with un-equally spaced data in numerous situations in signal and image processing ... Full text Cite

An automated framework for accelerating numerical algorithms on reconfigurable platforms using algorithmic/architectural optimization

Journal Article IEEE Transactions on Computers · December 1, 2009 This paper describes TANOR, an automated framework for designing hardware accelerators for numerical computation on reconfigurable platforms. Applications utilizing numerical algorithms on large-size data sets require high-throughput computation platforms. ... Full text Cite

Automated optimization of look-up table implementation for function evaluation on FPGAs

Journal Article Proceedings of SPIE the International Society for Optical Engineering · November 11, 2009 This paper presents a systematic approach for automatic generation of look-up-table (LUT) for function evaluations and minimization in hardware resource on field programmable gate arrays (FPGAs). The class of functions supported by this approach includes s ... Full text Cite

Fast computation of local correlation coefficients on graphics processing units

Journal Article Proceedings of SPIE the International Society for Optical Engineering · November 11, 2009 This paper presents an acceleration method, using both algorithmic and architectural means, for fast calculation of local correlation coefficients, which is a basic image-based information processing step for template or pattern matching, image registratio ... Full text Cite

Design and characterization of thin multiple aperture infrared cameras.

Journal Article Applied optics · April 2009 We describe a multiple-aperture long-wave infrared camera built on an uncooled microbolometer array with the objective of decreasing camera thickness. The 5 mm thick optical system is an f/1.2 design with a 6.15 mm effective focal length. An integrated ima ... Full text Cite

Video rate spectral imaging using a coded aperture snapshot spectral imager.

Journal Article Optics express · April 2009 We have previously reported on coded aperture snapshot spectral imagers (CASSI) that can capture a full frame spectral image in a snapshot. Here we describe the use of CASSI for spectral imaging of a dynamic scene at video rate. We describe significant adv ... Full text Cite

Fast computation of local correlation coefficients

Journal Article Proceedings of SPIE the International Society for Optical Engineering · December 18, 2008 This paper presents an acceleration method, using both algorithmic and architectural means, for fast calculation of local correlation coefficients , which is a basic image-based information processing step for template or pattern matching, image registrati ... Full text Cite

Spectral image estimation for coded aperture snapshot spectral imagers

Journal Article Proceedings of SPIE the International Society for Optical Engineering · December 17, 2008 This paper describes numerical estimation techniques for coded aperture snapshot spectral imagers (CASSI). In a snapshot, a CASSI captures a two-dimensional (2D) array of measurements that is an encoded representation of both spectral information and 2D sp ... Full text Cite

Accelerating nonuniform fast Fourier transform via reduction in memory access latency

Journal Article Proceedings of SPIE the International Society for Optical Engineering · December 17, 2008 We address the discrepancy that existed between the low arithmetic complexity of nonuniform Fast Fourier Transform (NUFFT) algorithms and high latency in practical use of NUFFTs with large data sets, especially, in multi-dimensional domains. The execution ... Full text Cite

Solving non-negative linear inverse problems with the NeAREst method

Journal Article Proceedings of SPIE the International Society for Optical Engineering · December 17, 2008 This paper introduces the theoretical development of a numerical method, named NeAREst, for solving non-negative linear inverse problems, which arise often from physical or probabilistic models, especially, in image estimation with limited and indirect mea ... Full text Cite

Multichannel sampling schemes for optical imaging systems.

Journal Article Applied optics · April 2008 We introduce a framework of focal-plane coding schemes for multichannel sampling in optical systems. A particular objective is to develop an ultrathin imager without compromising image resolution. We present a complete f/2.1 optical system with a thickness ... Full text Cite

Decompositional analysis of Kronecker structured Markov chains

Journal Article Electronic Transactions on Numerical Analysis · January 1, 2008 This contribution proposes a decompositional iterative method with low memory requirements for the steadystate analysis ofKronecker structured Markov chains. The Markovian system is formed by a composition of subsystems using the Kronecker sum operator for ... Cite

TANOR: A tool for accelerating N-body simulations on reconfigurable platform

Journal Article Proceedings 2007 International Conference on Field Programmable Logic and Applications Fpl · December 1, 2007 Algorithm-architecture co-exploration is hindered by the lack of efficient tools. As a consequence, designers are currently able to explore only a limited set of points in the whole design space. There-fore, a tool that can allow fast exploration of algori ... Full text Cite

Efficient function evaluations with lookup tables for structured matrix operations

Journal Article IEEE Workshop on Signal Processing Systems Sips Design and Implementation · December 1, 2007 A hardware efficient approach is introduced for elementary function evaluations in certain structured matrix computations. It is a comprehensive approach that utilizes lookup tables for compactness, employs interpolations with adders and multipliers for th ... Full text Cite

The MONTAGE least gradient image reconstruction

Journal Article Optics InfoBase Conference Papers · January 1, 2007 We introduce an image reconstruction algorithm for the Compressive Optical MONTAGE Photography Initiative, for recovering the resolution of an image from a set of aliased subimages acquired by a lenslet array optical system. © 2007 Optical Society of Ameri ... Cite

Compressive measurements for video

Journal Article Optics InfoBase Conference Papers · January 1, 2007 Redundancies present in video streams could be used to implement compressive sampling to achieve low power video sensors. We explore the possibilities of using this in the design of compressive video sensors and corresponding algorithms. © 2007 Optical Soc ... Cite

Compressive measurements for video

Conference Optics Infobase Conference Papers · January 1, 2007 Redundancies present in video streams could be used to implement compressive sampling to achieve low power video sensors. We explore the possibilities of using this in the design of compressive video sensors and corresponding algorithms. © 2007 Optical Soc ... Full text Cite

The MONTAGE least gradient image reconstruction

Conference Optics Infobase Conference Papers · January 1, 2007 We introduce an image reconstruction algorithm for the Compressive Optical MONTAGE Photography Initiative, for recovering the resolution of an image from a set of aliased subimages acquired by a lenslet array optical system. © 2007 Optical Society of Ameri ... Full text Cite

Compressive imaging sensors

Journal Article Proceedings of SPIE the International Society for Optical Engineering · September 18, 2006 This paper describes a compressive sensing strategy developed under the Compressive Optical MONTAGE Photography Initiative. Multiplex and multi-channel measurements are generally necessary for compressive sensing. In a compressive imaging system described ... Full text Cite

Compressive sampling strategies for integrated microspectrometers

Journal Article Proceedings of SPIE - The International Society for Optical Engineering · 2006 We consider compressive sensing in the context of optical spectroscopy. With compressive sensing, the ratio between the number of measurements and the number of estimated values is less than one, without compromising the fidelity in estimation. A compressi ... Full text Link to item Cite

Geometric tiling for reducing power consumption in structured matrix operations

Journal Article 2006 IEEE International Systems on Chip Conference Soc · January 1, 2006 This work focuses on reducing power consumption while maintaining the efficiency and accuracy of matrix computations using both algorithmic and architectural means. We transform the algorithms, in adaptation to application specifics, to translate the matri ... Full text Cite

Sensor-layer image compression based on the quantized cosine transform

Journal Article Proceedings of SPIE the International Society for Optical Engineering · November 15, 2005 We introduce a novel approach for compressive coding at the sensor layer for an integrated imaging system. Compression at the physical layer reduces the measurements-to-pixels ratio and the data volume for storage and transmission, without confounding imag ... Full text Cite

A workload-based analysis of software aging, and rejuvenation

Journal Article IEEE Transactions on Reliability · September 1, 2005 We present a hierarchical model for the analysis of proactive fault management in the presence of system resource leaks. At the low level of the model hierarchy is a degradation model in which we use a nonhomogeneous Markov chain to establish an explicit c ... Full text Cite

The quantized cosine transform for sensor-layer image compression

Journal Article Optics Infobase Conference Papers · January 1, 2005 We introduce a compressive encoding at the sensor layer based on the quantized cosine transform. Compression at the physical layer of integrated imaging systems reduces the measurements-to-pixels ratio, the data volume and accelerates image estimation. © 2 ... Cite

The quantized cosine transform for sensor-layer image compression

Journal Article Optics InfoBase Conference Papers · January 1, 2005 We introduce a compressive encoding at the sensor layer based on the quantized cosine transform. Compression at the physical layer of integrated imaging systems reduces the measurements-to-pixels ratio, the data volume and accelerates image estimation. © 2 ... Cite

The quantized cosine transform for sensor-layer image compression

Conference Optics Infobase Conference Papers · January 1, 2005 We introduce a compressive encoding at the sensor layer based on the quantized cosine transform. Compression at the physical layer of integrated imaging systems reduces the measurements-to-pixels ratio, the data volume and accelerates image estimation. © 2 ... Full text Cite

Spectral division methods for block generalized Schur decompositions

Journal Article Mathematics of Computation · October 1, 2004 We provide a different perspective of the spectral division methods for block generalized Schur decompositions of matrix pairs. The new approach exposes more algebraic structures of the successive matrix pairs in the spectral division iterations and reveal ... Full text Cite

Reference structure tomography.

Journal Article Journal of the Optical Society of America. A, Optics, image science, and vision · July 2004 Reference structure tomography (RST) uses multidimensional modulations to encode mappings between radiating objects and measurements. RST may be used to image source-density distributions, estimate source parameters, or classify sources. The RST paradigm p ... Full text Cite

Adaptive Software Rejuvenation: Degradation Model and Rejuvenation Scheme

Journal Article Proceedings of the International Conference on Dependable Systems and Networks · December 1, 2003 We present a framework of adaptive estimation and rejuvenation of software system performance in the presence of aging sources. The framework specifies that a degradation model not only describe an aging process but also enable the adaptation of model-base ... Cite

A Kronecker product representation of the fast Gauss transform

Journal Article SIAM Journal on Matrix Analysis and Applications · January 1, 2003 We present a matrix representation for the fast Gauss transform (FGT) originally proposed by Greengard and Strain. With the matrix representation we reveal the matrix structures explored and exploited in the FGT, relate the multidimensional FGT to the one- ... Full text Cite

The generalized Newton iteration for the matrix sign function

Journal Article SIAM Journal on Scientific Computing · January 1, 2003 In this paper we present modified algorithms for computing deflating subspaces of matrix pairs using the matrix sign function. Our new algorithms achieve a considerable reduction of the computational cost of the generalized Newton iteration for the matrix ... Full text Cite

A methodology towards automatic implementation of N-body algorithms

Journal Article Applied Numerical Mathematics · January 1, 2002 We propose a methodology aimed at automating the software development of fast discrete transforms for N-body problems. The methodology starts with a representation of the transform matrix in compact form. Then, two translation phases are applied. One trans ... Full text Cite

FASTSLIM: Prefetch-Safe Trace Reduction for I/O Cache Simulation

Journal Article ACM Transactions on Modeling and Computer Simulation · April 1, 2001 Trace-driven simulation is a valuable tool for evaluating I/O systems. This article presents a new algorithm, called FASTSLIM, that reduces the size of I/O traces and improves simulation performance without compromising simulation accuracy. FASTSLIM is mor ... Full text Cite

A matrix version of the fast multipole method

Journal Article SIAM Review · January 1, 2001 We present a matrix interpretation of the three-dimensional fast multipole method (FMM). The FMM is for efficient computation of gravitational/electrostatic potentials and fields. It has found various applications and inspired the design of many efficient ... Full text Cite

A note on parallel matrix inversion

Journal Article SIAM Journal on Scientific Computing · January 1, 2001 We present one-sweep parallel algorithms for the inversion of general and symmetric positive definite matrices. The algorithms feature simple programming and performance optimization while maintaining the same arithmetic cost and numerical properties of co ... Full text Cite

Parallel spectral division using the matrix sign function for the generalized eigenproblem

Journal Article International Journal of High Speed Computing · December 1, 2000 In this paper we demonstrate the parallelism of the spectral division using the matrix sign function for the generalized nonsymmetric eigenproblem. We employ the so-called generalized Newton iterative scheme in order to compute the sign function of the mat ... Full text Cite

Structured matrix representations of two-parameter Hankel transforms in adaptive optics

Journal Article Linear Algebra and Its Applications · September 1, 2000 We derive efficient approaches for two-parameter Hankel transforms. Such transforms arise, for example, in covariance matrix computations for performance modeling and evaluation of adaptive optics (AO) systems. Fast transforms are highly desirable since th ... Full text Cite

A mathematical framework for the linear reconstructor problem in adaptive optics

Journal Article Linear Algebra and Its Applications · September 1, 2000 The wave front field aberrations induced by atmospheric turbulence can severely degrade the performance of an optical imaging system. Adaptive optics refers to the process of removing unwanted wave front distortions in real time, i.e., before the image is ... Full text Cite

Explicit generation of unitary transformations in a single atom or molecule

Journal Article Physical Review A Atomic Molecular and Optical Physics · January 1, 2000 A constructive procedure for generating a prescribed unitary transform via the optically driven evolution of a multilevel atom is described. Assuming a clean separation of the coupled levels, the procedure employs the rotating wave approximation together w ... Full text Cite

A framework for symmetric band reduction

Journal Article ACM Transactions on Mathematical Software · January 1, 2000 We develop an algorithmic framework for reducing the bandwidth of symmetric matrices via orthogonal similarity transformations. This framework includes the reduction of full matrices to banded or tridiagonal form and the reduction of banded matrices to nar ... Full text Cite

Algorithm 807: The SBR toolbox - Software for successive band reduction

Journal Article ACM Transactions on Mathematical Software · January 1, 2000 We present a software toolbox for symmetric band reduction via orthogonal transformations, together with a testing and timing program. The toolbox contains drivers and computational routines for the reduction of full symmetric matrices to banded form and t ... Full text Cite

Explicit generation of unitary transformations in a single atom or molecule

Journal Article Physical Review A Atomic Molecular and Optical Physics · January 1, 2000 A constructive procedure for generating a prescribed unitary transform via the optically driven evolution of a multilevel atom is described. Assuming a clean separation of the coupled levels, the procedure employs the rotating wave approximation together w ... Cite

Performance modeling of adaptive-optics imaging systems using fast Hankel transforms

Journal Article Proceedings of SPIE the International Society for Optical Engineering · December 1, 1998 Real-time adaptive-optics is a means for enhancing the resolution of ground based, optical telescopes beyond the limits previously imposed by the turbulent atmosphere. One approach for linear performance modeling of closed-loop adaptive-optics systems invo ... Full text Cite

A BLAS-3 version of the QR factorization with column pivoting

Journal Article SIAM Journal on Scientific Computing · January 1, 1998 The QR factorization with column pivoting (QRP), originally suggested by Golub [Numer. Math., 7 (1965), 206-216], is a popular approach to computing rank-revealing factorizations. Using Level 1 BLAS, it was implemented in LINPACK, and, using Level 2 BLAS, ... Full text Cite

On tridiagonalizing and diagonalizing symmetric matrices with repeated eigenvalues

Journal Article SIAM Journal on Matrix Analysis and Applications · January 1, 1996 We describe a divide-and-conquer tridiagonalization approach for matrices with repeated eigenvalues. Our algorithm hinges on the fact that, under easily constructively verifiable conditions, a symmetric matrix with band width b and k distinct eigenvalues m ... Full text Cite

Parallel performance of a symmetric eigensolver based on the invariant subspace decomposition approach

Journal Article Proceedings of the Scalable High Performance Computing Conference · December 1, 1994 In this paper, we discuss work in progress on a complete eigensolver based on the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA). We describe a recently developed acceleration technique that substantially reduces the overa ... Cite

Parallel tridiagonalization through two-step band reduction

Journal Article Proceedings of the Scalable High Performance Computing Conference · December 1, 1994 We present a two-step variant of the `successive band reduction' paradigm for the tridiagonalization of symmetric matrices. Here we first reduce a full matrix to narrow-banded form, and from there to tridiagonal form. The first step allows easy exploitatio ... Cite

The PRISM project: Infrastructure and algorithms for parallel eigensolvers

Conference Proceedings of Scalable Parallel Libraries Conference Splc 1993 · January 1, 1993 The goal of the PRISM project is the development of infrastructure and algorithms for the parallel solution of eigenvalue problems. We are currently investigating a complete eigensolver based on the invariant Subspace Decomposition Algorithm for dense symm ... Full text Cite