Carlo Tomasi

Iris Einheuser Distinguished Professor
Computer Science
Box 90129, Durham, NC 27708-0129
D213 LSRC, Durham, NC

Selected Publications


Cross-Attention Transformer for Video Interpolation

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2023 We propose TAIN (Transformers and Attention for video INterpolation), a residual neural network for video interpolation, which aims to interpolate an intermediate frame given two consecutive image frames around it. We first present a novel vision transform ... Full text Cite

SemARFlow: Injecting Semantics into Unsupervised Optical Flow Estimation for Autonomous Driving

Conference Proceedings of the IEEE International Conference on Computer Vision · January 1, 2023 Unsupervised optical flow estimation is especially hard near occlusions and motion boundaries and in low-texture regions. We show that additional information such as semantics and domain knowledge can help better constrain this problem. We introduce SemARF ... Full text Cite

Optical Flow Training Under Limited Label Budget via Active Learning

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2022 Supervised training of optical flow predictors generally yields better accuracy than unsupervised training. However, the improved performance comes at an often high annotation cost. Semi-supervised training trades off accuracy against annotation cost. We u ... Full text Cite

Unsupervised Flow Refinement near Motion Boundaries

Conference BMVC 2022 - 33rd British Machine Vision Conference Proceedings · January 1, 2022 Unsupervised optical flow estimators based on deep learning have attracted increasing attention due to the cost and difficulty of annotating for ground truth. Although performance measured by average End-Point Error (EPE) has improved over the years, flow ... Cite

Joint Detection of Motion Boundaries and Occlusions

Conference 32nd British Machine Vision Conference, BMVC 2021 · January 1, 2021 We propose MONet, a convolutional neural network that jointly detects motion boundaries (MBs) and occlusion regions (Occs) in video both forward and backward in time. Detection is difficult because optical flow is discontinuous along MBs and undefined in O ... Cite

Applying machine learning to investigate long-term insect-plant interactions preserved on digitized herbarium specimens.

Journal Article Applications in plant sciences · June 2020 Premise: Despite the economic significance of insect damage to plants (i.e., herbivory), long-term data documenting changes in herbivory are limited. Millions of pressed plant specimens are now available online and can be used to collect big data on ... Full text Open Access Cite

Person re-identification from gait using an autocorrelation network

Conference IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops · June 1, 2019 We propose a new biometric feature based on autocorrelation using an end-to-end trained network to capture human gait from different viewpoints. Our method condenses an unbounded image stream into a fixed size descriptor, and capitalizes on the periodic na ... Full text Cite

Features for Multi-target Multi-camera Tracking and Re-identification

Conference Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · December 14, 2018 Multi-Target Multi-Camera Tracking (MTMCT) tracks many people through video taken from several cameras. Person Re-Identification (Re-ID) retrieves from a gallery images of people similar to a person query image. We learn good features for both MTMCT and Re ... Full text Cite

Tracking social groups within and across cameras

Journal Article IEEE Transactions on Circuits and Systems for Video Technology · March 1, 2017 We propose a method for tracking groups from single and multiple cameras with disjointed fields of view. Our formulation follows the tracking-by-detection paradigm in which groups are the atomic entities and are linked over time to form long and consistent ... Full text Cite

Using an Image Fusion Methodology to Improve Efficiency and Traceability of Posterior Pole Vessel Analysis by ROPtool.

Journal Article Open Ophthalmol J · 2017 BACKGROUND: The diagnosis of plus disease in retinopathy of prematurity (ROP) largely determines the need for treatment; however, this diagnosis is subjective. To make the diagnosis of plus disease more objective, semi-automated computer programs (e.g. ROP ... Full text Link to item Cite

Deformable Graph Model for Tracking Epithelial Cell Sheets in Fluorescence Microscopy.

Journal Article IEEE transactions on medical imaging · July 2016 We propose a novel method for tracking cells that are connected through a visible network of membrane junctions. Tissues of this form are common in epithelial cell sheets and resemble planar graphs where each face corresponds to a cell. We leverage this st ... Full text Cite

Single-Frame Indexing for 3D Hand Pose Estimation

Conference Proceedings of the IEEE International Conference on Computer Vision · February 11, 2016 Hand pose estimation from 3D sensor data matches a point cloud to a hand model, and has broad applications from gestural interfaces to scene understanding. We propose a novel scheme to index into a database of precomputed hand poses to initialize the match ... Full text Cite

Performance measures and a data set for multi-target, multi-camera tracking

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2016 To help accelerate progress in multi-target, multi-camera tracking systems, we present (i) a new pair of precision-recall measures of performance that treats errors of all types uniformly and emphasizes correct identification over sources of error; (ii) th ... Full text Cite

Distance minimization for reward learning from scored trajectories

Conference 30th AAAI Conference on Artificial Intelligence, AAAI 2016 · January 1, 2016 Many planning methods rely on the use of an immediate reward function as a portable and succinct representation of desired behavior. Rewards are often inferred from demonstrated behavior that is assumed to be near-optimal. We examine a framework, Distance ... Cite

Retinal Artery-Vein Classification via Topology Estimation.

Journal Article IEEE Trans Med Imaging · December 2015 We propose a novel, graph-theoretic framework for distinguishing arteries from veins in a fundus image. We make use of the underlying vessel topology to better classify small and midsized vessels. We extend our previously proposed tree topology estimation ... Full text Link to item Cite

Tree topology estimation

Journal Article IEEE Transactions on Pattern Analysis and Machine Intelligence · August 1, 2015 Tree-like structures are fundamental in nature, and it is often useful to reconstruct the topology of a tree (what connects to what) from a two-dimensional image of it. However, the projected branches often cross in the image: the tree projects to a planar g ... Full text Cite

Tracking multiple people online and in real time

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2015 We cast the problem of tracking several people as a graph partitioning problem that takes the form of an NP-hard binary integer program. We propose a tractable, approximate, online solution through the combination of a multi-stage cascade and a sliding tem ... Full text Cite

A linear system form solution to compute the local space average color

Journal Article Machine Vision and Applications · October 1, 2013 In this document, we present an alternative to the method introduced by Ebner (Pattern Recognit 60-67, 2003; J Parallel Distrib Comput 64(1):79-88, 2004; Color constancy using local color shifts, pp 276-287, 2004; Color Constancy, 2007; Mach Vis Appl 20(5) ... Full text Cite

Automated non-rigid registration and mosaicing for robust imaging of distinct retinal capillary beds using speckle variance optical coherence tomography.

Journal Article Biomedical optics express · June 2013 Variance processing methods in Fourier domain optical coherence tomography (FD-OCT) have enabled depth-resolved visualization of the capillary beds in the retina due to the development of imaging systems capable of acquiring A-scan data in the 100 kHz regi ... Full text Cite

Video motion for every visible point

Journal Article Proceedings of the IEEE International Conference on Computer Vision · January 1, 2013 Dense motion of image points over many video frames can provide important information about the world. However, occlusions and drift make it impossible to compute long motion paths by merely concatenating optical flow vectors between consecutive frames. In ... Full text Cite

Fast tiered labeling with topological priors

Journal Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · October 30, 2012 We consider labeling an image with multiple tiers. Tiers, one on top of another, enforce a strict vertical order among objects (e.g. sky is above the ground). Two new ideas are explored: First, under a simplification of the general tiered labeling framewor ... Full text Cite

Simultaneous compaction and factorization of sparse image motion matrices

Journal Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · October 30, 2012 Matrices that collect the image coordinates of point features tracked through video - one column per feature - have often low rank, either exactly or approximately. This observation has led to many matrix factorization methods for 3D reconstruction, motion ... Full text Cite

Nested pictorial structures

Journal Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · October 30, 2012 We propose a theoretical construct coined nested pictorial structure to represent an object by parts that are recursively nested. Three innovative ideas are proposed: First, the nested pictorial structure finds a part configuration that is allowed to be de ... Full text Cite

Oscillation regularization

Journal Article ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings · October 23, 2012 We measure the degree of oscillation of a sampled function f by the number of its local extrema. The greater this number, the more oscillatory and complex f becomes. In signal denoising, we want a restored function g that is simple and fits the data f well ... Full text Cite

Topological persistence on a Jordan curve

Journal Article ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings · October 23, 2012 Topological persistence measures the resilience of extrema of a function to perturbations, and has received increasing attention in computer graphics, visualization and computer vision. While the notion of topological persistence for piece-wise linear func ... Full text Cite

Shape from point features

Journal Article ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings · October 23, 2012 We present a nonparametric and efficient method for shape localization that improves on the traditional sub-window search in capturing the fine geometry of an object from a small number of feature points. Our method implies that the discrete set of feature ... Full text Cite

Dense Lagrangian motion estimation with occlusions

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · October 1, 2012 We couple occlusion modeling and multi-frame motion estimation to compute dense, temporally extended point trajectories in video with significant occlusions. Our approach combines robust spatial regularization with spatially and temporally global occlusion ... Full text Cite

Twisted window search for efficient shape localization

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · October 1, 2012 Many computer vision systems approximate targets' shape with rectangular bounding boxes. This choice trades localization accuracy for efficient computation. We propose twisted window search, a strict generalization over rectangular window search, for the g ... Full text Cite

Exploratory Dijkstra forest based automatic vessel segmentation: applications in video indirect ophthalmoscopy (VIO).

Journal Article Biomed Opt Express · February 1, 2012 We present a methodology for extracting the vascular network in the human retina using Dijkstra's shortest-path algorithm. Our method preserves vessel thickness, requires no manual intervention, and follows vessel branching naturally and efficiently. To te ... Full text Link to item Cite

Detecting motion synchrony by video tubes

Journal Article MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops · December 29, 2011 Motion synchrony, i.e., the coordinated motion of a group of individuals, is an interesting phenomenon in nature or daily life. Fish swim in schools, birds fly in flocks, soldiers march in platoons, etc. Our goal is to detect motion synchrony that may be p ... Full text Cite

Detailed reconstruction of 3D plant root shape

Journal Article Proceedings of the IEEE International Conference on Computer Vision · December 1, 2011 We study the 3D reconstruction of plant roots from multiple 2D images. To meet the challenge caused by the delicate nature of thin branches, we make three innovations to cope with the sensitivity to image quality and calibration. First, we model the backgr ... Full text Cite

Linear time offline tracking and lower envelope algorithms

Journal Article Proceedings of the IEEE International Conference on Computer Vision · December 1, 2011 Offline tracking of visual objects is particularly helpful in the presence of significant occlusions, when a frame-by-frame, causal tracker is likely to lose sight of the target. In addition, the trajectories found by offline tracking are typically smoothe ... Full text Cite

Enhanced video indirect ophthalmoscopy (VIO) via robust mosaicing.

Journal Article Biomed Opt Express · October 1, 2011 Indirect ophthalmoscopy (IO) is the standard of care for evaluation of the neonatal retina. When recorded on video from a head-mounted camera, IO images have low quality and narrow Field of View (FOV). We present an image fusion methodology for converting ... Full text Link to item Cite

Technical perspective: visual reconstruction

Journal Article Communications of the ACM · October 1, 2011 Full text Cite

People detection using color and depth images

Journal Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · July 14, 2011 We present a strategy that combines color and depth images to detect people in indoor environments. Similarity of image appearance and closeness in 3D position over time yield weights on the edges of a directed graph that we partition greedily into trackle ... Full text Cite

Efficient visual object tracking with online nearest neighbor classifier

Journal Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · March 16, 2011 A tracking-by-detection framework is proposed that combines nearest-neighbor classification of bags of features, efficient subwindow search, and a novel feature selection and pruning method to achieve stability and plasticity in tracking targets of changin ... Full text Cite

Branch and track

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 2011 We present a new paradigm for tracking objects in video in the presence of other similar objects. This branch-and-track paradigm is also useful in the absence of motion, for the discovery of repetitive patterns in images. The object of interest is the lead ... Full text Cite

Fingerspelling recognition through classification of letter-to-letter transitions

Journal Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · December 29, 2010 We propose a new principle for recognizing fingerspelling sequences from American Sign Language (ASL). Instead of training a system to recognize the static posture for each letter from an isolated frame, we recognize the dynamic gestures corresponding to tr ... Full text Cite

Critical nets and beta-stable features for image matching

Journal Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2010 We propose new ideas and efficient algorithms towards bridging the gap between bag-of-features and constellation descriptors for image matching. Specifically, we show how to compute connections between local image features in the form of a critical net who ... Full text Cite

Semi-Supervised Fisher Linear Discriminant (SFLD)

Journal Article ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings · January 1, 2010 Supervised learning uses a training set of labeled examples to compute a classifier which is a mapping from feature vectors to class labels. The success of a learning algorithm is evaluated by its ability to generalize, i.e., to extend this mapping accurat ... Full text Cite

Manuscript bleed-through removal via hysteresis thresholding

Journal Article Proceedings of the International Conference on Document Analysis and Recognition, ICDAR · December 10, 2009 Many types of degradation can render ancient manuscripts very hard to read. In bleed-through, the text from the reverse, or verso, side of a page seeps through into the front, or recto. In this paper, we propose hysteresis thresholding to greatly reduce bl ... Full text Cite

Phase diffusion for the synchronization of heterogenous sensor streams

Journal Article ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings · September 23, 2009 The analysis of complex human activity typically requires multiple sensors: cameras that take videos from different directions and in different areas, microphones, proximity sensors, range finders, and more. Scenarios where it is not possible to associate ... Full text Cite

International Journal of Computer Vision: Editorial

Journal Article International Journal of Computer Vision · February 1, 2008 Full text Cite

Robust shape normalization based on implicit representations

Journal Article Proceedings - International Conference on Pattern Recognition · January 1, 2008 We introduce a new shape normalization method based on implicit shape representations. The proposed method is robust with respect to deformations and invariant to similarity transformations (translation, isotropic scaling and rotation). The new method has ... Full text Cite

Outlier robust ICP for minimizing fractional RMSD

Journal Article 3DIM 2007 - Proceedings 6th International Conference on 3-D Digital Imaging and Modeling · December 1, 2007 We describe a variation of the iterative closest point (ICP) algorithm for aligning two point sets under a set of transformations. Our algorithm is superior to previous algorithms because (1) in determining the optimal alignment, it identifies and discards ... Full text Cite

Finite-element level-set curve particles

Journal Article Proceedings of the IEEE International Conference on Computer Vision · December 1, 2007 Particle filters encode a time-evolving probability density by maintaining a random sample from it. Level sets represent closed curves as zero crossings of functions of two variables. The combination of level sets and particle filters presents many concept ... Full text Cite

Correspondence as energy-based segmentation

Journal Article Image and Vision Computing · August 1, 2007 We pose the correspondence problem as one of energy-based segmentation. In this framework, correspondence assigns each pixel in an image to exactly one of several non-overlapping regions, and it also computes a displacement function for each region. The fr ... Full text Cite

How to dispatch observers to track an evolving boundary

Journal Article 2007 1st ACM/IEEE International Conference on Distributed Smart Cameras, ICDSC · 2007 Some distributed-sensing applications make it necessary to dispatch a limited number of observers (ships, vehicles, or airplanes with cameras; field workers with chemical kits; high-flying balloons with atmospheric sensors) to track the evolving boundary o ... Full text Cite

How to dispatch observers to track an evolving boundary

Conference 2007 First ACM/IEEE International Conference on Distributed Smart Cameras · 2007 Cite

Level-set curve particles

Journal Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2006 In many applications it is necessary to track a moving and deforming boundary on the plane from infrequent, sparse measurements. For instance, each of a set of mobile observers may be able to tell the position of a point on the boundary. Often boundary com ... Full text Cite

Mean shift is a bound optimization.

Journal Article IEEE transactions on pattern analysis and machine intelligence · March 2005 We build on the current understanding of mean shift as an optimization procedure. We demonstrate that, in the case of piecewise constant kernels, mean shift is equivalent to Newton's method. Further, we prove that, for all kernels, the mean shift procedure ... Full text Cite

Proceedings of the 2005 Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05): Preface

Journal Article Proceedings - 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005 · January 1, 2005 Full text Cite

3D head tracking based on recognition and interpolation using a time-of-flight depth sensor

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · October 19, 2004 This paper describes a head-tracking algorithm that is based on recognition and correlation-based weighted interpolation. The input is a sequence of 3D depth images generated by a novel time-of-flight depth sensor. These are processed to segment the backgr ... Cite

Surfaces with occlusions from layered stereo.

Journal Article IEEE transactions on pattern analysis and machine intelligence · August 2004 We propose a new binocular stereo algorithm that estimates scene structure as a collection of smooth surface patches. The disparities within each patch are modeled by a continuous-valued spline, while the extent of each patch is represented via a pixelwise ... Full text Cite

Preface

Journal Article IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops · January 1, 2004 Full text Cite

Preface

Chapter · 2004 Full text Cite

Image similarity using mutual information of regions

Journal Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2004 Mutual information (MI) has emerged in recent years as an effective similarity measure for comparing images. One drawback of MI, however, is that it is calculated on a pixel by pixel basis, meaning that it takes into account only the relationships between ... Full text Cite

Typing in thin air: the Canesta projection keyboard - A new method of interaction with electronic devices

Journal Article Conference on Human Factors in Computing Systems - Proceedings · December 1, 2003 Canesta Keyboard™ is a novel interface to electronic devices that consists of a projection system and a sensor module instead of the mechanical switches of a traditional keyboard. Users input text by pressing keys on a projected image of a keyboard. This ... Full text Cite

Surfaces with occlusions from layered stereo

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · September 1, 2003 Although steady progress has been made in recent stereo algorithms, producing accurate results in the neighborhood of depth discontinuities remains a challenge. Moreover, among the techniques that best localize depth discontinuities, it is common to work o ... Cite

Full-size projection keyboard for handheld devices

Journal Article Communications of the ACM · July 1, 2003 The various features of a full-size projection keyboard designed to improve the functionality of handheld devices are discussed. The efficacy of these keyboards is high, as they might replace laptops and mechanical keyboards, making the machines thinner ... Full text Cite

3D tracking = classification + interpolation

Journal Article Proceedings of the IEEE International Conference on Computer Vision · January 1, 2003 Hand gestures are examples of fast and complex motions. Computers fail to track these in fast video, but sleight of hand fools humans as well: what happens too quickly we just cannot see. We show a 3D tracker for these types of motions that relies on the r ... Full text Cite

Edge displacement field-based classification for improved detection of polyps in CT colonography.

Journal Article IEEE transactions on medical imaging · December 2002 Colorectal cancer can easily be prevented provided that the precursors to tumors, small colonic polyps, are detected and removed. Currently, the only definitive examination of the colon is fiber-optic colonoscopy, which is invasive and expensive. Computed ... Full text Cite

Model-based face tracking for view-independent facial expression recognition

Journal Article Proceedings - 5th IEEE International Conference on Automatic Face Gesture Recognition, FGR 2002 · January 1, 2002 Facial expression recognition is necessary for designing any realistic human-machine interface. Previously published facial expression recognition systems achieve good recognition rates, but most of them perform well only when the user faces the camera and ... Full text Cite

On the consistency of instantaneous rigid motion estimation

Journal Article International Journal of Computer Vision · January 1, 2002 Instantaneous camera motion estimation is an important research topic in computer vision. Although in theory more than five points uniquely determine the solution in an ideal situation, in practice one can usually obtain better estimates by using more ... Full text Cite

A statistical 3-D pattern processing method for computer-aided detection of polyps in CT colonography.

Journal Article IEEE transactions on medical imaging · December 2001 Adenomatous polyps in the colon are believed to be the precursor to colorectal carcinoma, the second leading cause of cancer deaths in the United States. In this paper, we propose a new method for computer-aided detection of polyps in computed tomography (CT) ... Full text Cite

Assessment of an optical flow field-based polyp detector for CT colonography

Journal Article Annual Reports of the Research Reactor Institute, Kyoto University · December 1, 2001 Most current computer-aided detection (CAD) algorithms for the fully automatic detection of colonic polyps from 3D CT data suffer from high false positive rates. We developed and evaluated a post-processing algorithm to decrease the false positive rate of ... Cite

A new 3-D pattern recognition technique with application to computer aided colonoscopy

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · December 1, 2001 To utilize CT or MRI images for computer aided diagnosis applications, robust features that represent 3-D image data need to be constructed and subsequently used by a classification method. In this paper, we present a computer aided diagnosis system for ea ... Cite

Medical image compression based on region of interest, with application to colon CT images

Journal Article Annual Reports of the Research Reactor Institute, Kyoto University · December 1, 2001 CT or MRI medical imaging produces pictures of the human body in digital form. Since these imaging techniques produce prohibitive amounts of data, compression is necessary for storage and communication purposes. Many current compression schemes provide a very high ... Cite

A new 3-D volume processing method for polyp detection

Journal Article Annual Reports of the Research Reactor Institute, Kyoto University · December 1, 2001 Early diagnosis and removal of colonic polyps is effective in the elimination of subsequent carcinoma. This paper presents a new approach for computer-aided detection of polyps. The approach mimics the way the radiologists view CT abdomen images and utiliz ... Cite

Edge, junction, and corner detection using color distributions

Journal Article IEEE Transactions on Pattern Analysis and Machine Intelligence · November 1, 2001 For over 30 years researchers in computer vision have been proposing new methods for performing low-level vision tasks such as detecting edges and corners. One key element shared by most methods is that they represent local image neighborhoods as constant ... Full text Cite

A learning method for automated polyp detection

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2001 Adenomatous polyps in the colon have a high probability of developing into subsequent colorectal carcinoma, the second leading cause of cancer deaths in the United States. In this paper, we propose a new method for computer-aided diagnosis of polyps. Initial w ... Full text Cite

Using optical flow fields for Polyp detection in virtual colonoscopy

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 2001 Since the introduction of Computed Tomographic Colonography (CTC), research has mainly focused on visualization and navigation techniques. Recently, efforts have shifted towards computer aided detection (CAD) of polyps. We propose a new approach to CAD in ... Full text Cite

Empirical evaluation of dissimilarity measures for color and texture

Journal Article Computer Vision and Image Understanding · January 1, 2001 This paper empirically compares nine families of image dissimilarity measures that are based on distributions of color and texture features summarizing over 1000 CPU hours of computational experiments. Ground truth is collected via a novel random sampling ... Full text Cite

Earth mover's distance as a metric for image retrieval

Journal Article International Journal of Computer Vision · November 1, 2000 We investigate the properties of a metric between two distributions, the Earth Mover's Distance (EMD), for content-based image retrieval. The EMD is based on the minimal cost that must be paid to transform one distribution into the other, in a precise sens ... Full text Cite

Alpha estimation in natural images

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 2000 Many boundaries between objects in the world project onto curves in an image. However, boundaries involving natural objects (e.g., trees, hair, water, smoke) are often unworkable under this model because many pixels receive light from more than one object. ... Cite

How to rotate a camera

Journal Article Proceedings - International Conference on Image Analysis and Processing, ICIAP 1999 · December 1, 1999 A procedure is proposed that, given any rotating device to support a camera, places the camera's center of projection to within a tenth of a millimeter from the axis of the rotating device, even with wide angle lenses with severe distortion. Results are ex ... Full text Cite

Distinctiveness maps for image matching

Journal Article Proceedings - International Conference on Image Analysis and Processing, ICIAP 1999 · December 1, 1999 Stereo correspondence is hard because different image features can look alike. We propose a measure for the ambiguity of image points that allows matching of distinctive points first and breaks down the matching task into smaller and separate subproblems. ... Full text Cite

Depth discontinuities by pixel-to-pixel stereo

Journal Article International Journal of Computer Vision · December 1, 1999 An algorithm to detect depth discontinuities from a stereo pair of images is presented. The algorithm matches individual pixels in corresponding scanline pairs, while allowing occluded pixels to remain unmatched, then propagates the information between sca ... Full text Cite

Autonomous observer: a tool for remote experimentation in robotics

Journal Article Proceedings of SPIE - The International Society for Optical Engineering · December 1, 1999 This paper describes a robotics technology - the Autonomous Observer (AO) - developed to facilitate experimentation over the Internet. The AO is a mobile robot equipped with visual sensors. It applies visual tracking and motion planning techniques to track ... Cite

Multiway cut for stereo and motion with slanted surfaces

Journal Article Proceedings of the IEEE International Conference on Computer Vision · January 1, 1999 Slanted surfaces pose a problem for correspondence algorithms utilizing search because of the greatly increased number of possibilities, when compared with fronto-parallel surfaces. In this paper we propose an algorithm to compute correspondence between st ... Full text Cite

Texture-based image retrieval without segmentation

Journal Article Proceedings of the IEEE International Conference on Computer Vision · January 1, 1999 Image segmentation is not only hard and unnecessary for texture-based image retrieval, but can even be harmful. Images of either individual or multiple textures are best described by distributions of spatial frequency descriptors, rather than single descri ... Full text Cite

Representation issues in the ML estimation of camera motion

Journal Article Proceedings of the IEEE International Conference on Computer Vision · January 1, 1999 The computation of camera motion from image measurements is a parameter estimation problem. We show that for the analysis of the problem's sensitivity, the parametrization must enjoy the property of fairness, which makes sensitivity results invariant to ch ... Full text Cite

Empirical evaluation of dissimilarity measures for color and texture

Journal Article Proceedings of the IEEE International Conference on Computer Vision · January 1, 1999 This paper empirically compares nine image dissimilarity measures that are based on distributions of color and texture features summarizing over 1,000 CPU hours of computational experiments. Ground truth is collected via a novel random sampling scheme for ... Cite

Corner detection in textured color images

Journal Article Proceedings of the IEEE International Conference on Computer Vision · January 1, 1999 Corner models in the literature have lagged behind edge models with respect to color and shading. We use both a region model, based on distributions of pixel colors, and an edge model, which removes false positives, to perform corner detection on color ima ... Cite

Color edge detection with the compass operator

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 1999 The compass operator detects step edges without assuming that the regions on either side have constant color. Using distributions of pixel colors rather than the mean, the operator finds the orientation of a diameter that maximizes the difference between t ... Cite

Fast, robust, and consistent camera motion estimation

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 1999 Previous algorithms that recover camera motion from image velocities suffer from both bias and excessive variance in the results. We propose a robust estimator of camera motion that is statistically consistent when image noise is isotropic. Consistency mea ... Cite

Bilateral filtering for gray and color images

Journal Article Proceedings of the IEEE International Conference on Computer Vision · December 1, 1998 Bilateral filtering smooths images while preserving edges, by means of a nonlinear combination of nearby image values. The method is noniterative, local, and simple. It combines gray levels or colors based on both their geometric closeness and their photom ... Cite

Depth discontinuities by pixel-to-pixel stereo

Journal Article Proceedings of the IEEE International Conference on Computer Vision · December 1, 1998 An algorithm to detect depth discontinuities from a stereo pair of images is presented. The algorithm matches individual pixels in corresponding scanline pairs while allowing occluded pixels to remain unmatched, then propagates the information between scan ... Cite

Metric for distributions with applications to image databases

Journal Article Proceedings of the IEEE International Conference on Computer Vision · December 1, 1998 We introduce a new distance between two distributions that we call the Earth Mover's Distance (EMD), which reflects the minimal amount of work that must be performed to transform one distribution into the other by moving "distribution mass" around. This is ... Cite

Texture metrics

Journal Article Proceedings of the IEEE International Conference on Systems, Man and Cybernetics · December 1, 1998 We introduce a class of metric perceptual distances between textures. The first metric is sensitive to both rotation and scale differences, and provides a basis for two other metrics, one invariant to rotation, and the other invariant to both rotation and ... Cite

A pixel dissimilarity measure that is insensitive to image sampling

Journal Article IEEE Transactions on Pattern Analysis and Machine Intelligence · December 1, 1998 Because of image sampling, traditional measures of pixel dissimilarity can assign a large value to two corresponding pixels in a stereo pair, even in the absence of noise and other degrading effects. We propose a measure of dissimilarity that is provably i ... Full text Cite

Stereo matching as a nearest-neighbor problem

Journal Article IEEE Transactions on Pattern Analysis and Machine Intelligence · December 1, 1998 We propose a representation of images, called intrinsic curves, that transforms stereo matching from a search problem into a nearest-neighbor problem. Intrinsic curves are the paths that a set of local image descriptors trace as an image scanline is traver ... Full text Cite

Visual routines for mobile robots: Experimental results

Journal Article Expert Systems with Applications · January 1, 1998 In this paper, we present a set of visual related routines. Our objective is to provide a mobile robot with the visual capabilities necessary to acquire information to execute a given command. We study a generic problem that we call the pickup and delivery ... Full text Cite

Mobile robot obstacle avoidance via depth from focus

Journal Article Robotics and Autonomous Systems · November 20, 1997 A critical challenge in the creation of autonomous mobile robots is the reliable detection of moving and static obstacles. In this paper, we present a passive vision system that recovers coarse depth information reliably and efficiently. This system is bas ... Full text Cite

Adaptive color-image embeddings for database navigation

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 1997 We present a novel approach to the problem of navigating through a database of color images for the purpose of image retrieval. We endow the database with a metric for the color distributions of the images. We then use multi-dimensional scaling techniques ... Full text Cite

Stereo without search

Conference Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) · January 1, 1996 In its traditional formulation, stereo correspondence involves both searching and selecting. Given a feature in one scanline, the corresponding scanline in the other image is searched for the positions of similar features. Often more than one candidate is ... Full text Cite

Comparison of approaches to egomotion computation

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 1996 We evaluated six algorithms for computing egomotion from image velocities. We established benchmarks for quantifying bias and sensitivity to noise, and for quantifying the convergence properties of those algorithms that require numerical search. Our simula ... Full text Cite

Image deformations are better than optical flow

Journal Article Mathematical and Computer Modelling · January 1, 1996 In many computer vision applications it is necessary to compute the direction of heading of a moving camera from the images it produces. Traditionally, this computation has been based on the optical flow, that is, on the motion of point features in the fie ... Full text Cite

Fixed-window image descriptors for image retrieval

Journal Article Proceedings of SPIE - The International Society for Optical Engineering · December 1, 1995 We work towards a content-based image retrieval system, where queries can be image-like objects. At entry time, each image is processed to yield a large number of indices into its windows. A window is a square in a fixed quad-tree decomposition of the imag ... Cite

Linear and Incremental Acquisition of Invariant Shape Models from Image Sequences

Journal Article IEEE Transactions on Pattern Analysis and Machine Intelligence · January 1, 1995 We show how to automatically acquire Euclidian shape representations of objects from noisy image sequences under weak perspective. The proposed method is linear and incremental, requiring no more than pseudoinverse. A nonlinear, but numerically sound prepr ... Full text Cite

Pictures and trails: a new framework for the computation of shape and motion from perspective image sequences

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 1994 This paper presents a new framework for the computation of shape and motion from a sequence of images taken under perspective projection. The framework is based on two abstractions, the picture and trail loci, that represent respectively the set of all pic ... Full text Cite

Good features to track

Journal Article Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition · January 1, 1994 No feature-based vision system can work unless good features can be identified and tracked from frame to frame. Although tracking itself is by and large a solved problem, selecting features that can be tracked well and correspond to physical points in the ... Cite

Direction of heading from image deformations

Journal Article IEEE Computer Vision and Pattern Recognition · December 1, 1993 We propose a method to compute the direction of heading from the differential changes in the angles between the projection rays of pairs of point features. These angles, the image deformations, do not depend on viewer rotation, so the key problem of separa ... Cite

Shape and motion from image streams: A factorization method

Journal Article Proceedings of the National Academy of Sciences of the United States of America · November 1, 1993 Inferring scene geometry and camera motion from a stream of images is possible in principle, but it is an ill-conditioned problem when the objects are distant with respect to their size. We have developed a factorization method that can overcome this diffi ... Full text Cite

Linear and incremental acquisition of invariant shape models from image sequences

Journal Article 1993 IEEE 4th International Conference on Computer Vision · January 1, 1993 We show how to automatically acquire similarity-invariant shape representations of objects from noisy image sequences under weak perspective. The proposed method is linear and incremental, requiring no more than pseudo-inverse. It is based on the observati ... Cite

Shape and motion from image streams under orthography: a factorization method

Journal Article International Journal of Computer Vision · January 1, 1992 Inferring scene geometry and camera motion from a stream of images is possible in principle, but is an ill-conditioned problem when the objects are distant with respect to their size. We have developed a factorization method that can overcome this difficul ... Full text Cite

Factoring image sequences into shape and motion

Journal Article Proceedings of the IEEE Workshop on Visual Motion · December 1, 1991 Recovering scene geometry and camera motion from a sequence of images is an important problem in computer vision. If the scene geometry is specified by depth measurements, that is, by specifying distances between the camera and feature points in the scene, ... Cite

Shape and motion without depth

Journal Article · December 1, 1990 Inferring the depth and shape of remote objects and the camera motion from a sequence of images is possible in principle, but is an ill-conditioned problem when the objects are distant with respect to their size. This problem is overcome by inferring shape ... Cite

A Constraint on the Zeros of Ternary Polynomials

Journal Article IEEE Transactions on Circuits and Systems · January 1, 1987 Information on the location of the zeros of polynomials with coefficients in {0,1,–1} is useful in the study of some structures for the VLSI implementation of FIR digital filters. This letter shows that those zeros are confined to a ring-like region of the ... Full text Cite

CONSTRAINT ON THE ZEROS OF TERNARY POLYNOMIALS.

Journal Article IEEE transactions on circuits and systems · 1987 Information on the location of the zeros of polynomials with coefficients in {0,1,–1} is useful in the study of some structures for the VLSI implementation of FIR digital filters. It is shown that those zeros are confined to a ri ... Cite

Spectral Analysis of Line Regenerator Time Jitter

Journal Article IEEE Transactions on Communications · January 1, 1984 A closed form expression for the spectral density of the time jitter produced in a line regenerator in the presence of a general polynomial nonlinear circuit, band-limited baseband pulses, and an arbitrary tuned filter is developed. The method applies to a ... Full text Cite

SPECTRAL ANALYSIS OF LINE REGENERATOR TIME JITTER.

Journal Article IEEE Transactions on Communications · 1984 A closed form expression for the spectral density of the time jitter produced in a line regenerator in the presence of a general polynomial nonlinear circuit, band-limited baseband pulses, and an arbitrary tuned filter is developed. The method applies to a ... Cite

MOMENTS OF THE WEIGHTS OF PSEUDO-NOISE SUBSEQUENCES.

Journal Article · December 1, 1982 Cite

ON THE EVALUATION OF THE POWER SPECTRAL DENSITY OF TIME JITTER PRODUCED IN A LINE REGENERATOR.

Journal Article Proceedings - Annual Allerton Conference on Communication, Control, and Computing · December 1, 1982 Cite

TWO SIMPLE ALGORITHMS FOR THE GENERATION OF PARTITIONS OF AN INTEGER.

Journal Article Alta frequenza · January 1, 1982 A very simple recursive technique to compute the partitions of an integer k into p parts is described, starting from those of k-1 into p parts or, alternatively, from those of k into p-1 parts. This result is then used to solve some problems encountered in the ... Cite