Subtrajectory clustering: Models and algorithms
We propose a model for subtrajectory clustering, the clustering of subsequences of trajectories; each cluster of subtrajectories is represented as a pathlet, a sequence of points that is not necessarily a subsequence of an input trajectory. Given a set of trajectories, our clustering model attempts to capture the shared portions between them by assuming each trajectory is a concatenation of a small set of pathlets, with possible gaps in between. We present a single objective function for finding the optimal collection of pathlets that best represents the trajectories, taking into account noise and other artifacts of the data. We show that the subtrajectory clustering problem is NP-hard and present fast approximation algorithms for subtrajectory clustering. We further improve the running time of our algorithm if the input trajectories are "well-behaved." Finally, we present experimental results on both real and synthetic data sets. We show via visualization and quantitative analysis that the algorithm indeed handles the desiderata of being robust to variations, being efficient and accurate, and being data-driven.
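To make the model in the abstract concrete, the following is a minimal Python sketch of one plausible form of a single objective that trades off the number of pathlets, the number of pathlet-to-trajectory assignments, and the portion of each trajectory left uncovered (the gaps). The abstract does not state the objective, so the functional form, the weights `c1`, `c2`, `c3`, and the names `uncovered_fraction` and `clustering_cost` are all illustrative assumptions rather than the paper's definitions.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation (an assumption, not from the paper):
# a trajectory with n points has point indices 0..n-1, and an assignment
# maps a pathlet id to the inclusive (start, end) index range it covers.
Assignment = Dict[str, Tuple[int, int]]


def uncovered_fraction(n_points: int, assignment: Assignment) -> float:
    """Fraction of a trajectory's points not covered by any assigned pathlet."""
    covered: Set[int] = set()
    for start, end in assignment.values():
        covered.update(range(start, end + 1))
    return 1.0 - len(covered) / n_points


def clustering_cost(
    lengths: List[int],
    assignments: List[Assignment],
    c1: float = 1.0,  # assumed weight on the number of distinct pathlets
    c2: float = 1.0,  # assumed weight on pathlet-to-trajectory assignments
    c3: float = 1.0,  # assumed weight on uncovered portions (gaps)
) -> float:
    """Toy objective: fewer pathlets, fewer assignments, smaller gaps."""
    pathlets = {p for a in assignments for p in a}
    n_assignments = sum(len(a) for a in assignments)
    gaps = sum(uncovered_fraction(n, a) for n, a in zip(lengths, assignments))
    return c1 * len(pathlets) + c2 * n_assignments + c3 * gaps


# Two 10-point trajectories sharing pathlet "A"; the second also uses "B".
lengths = [10, 10]
assignments = [{"A": (0, 4)}, {"A": (0, 4), "B": (7, 9)}]
cost = clustering_cost(lengths, assignments)
```

In this toy instance there are 2 distinct pathlets and 3 assignments, and the uncovered fractions are 0.5 and 0.2, so with unit weights the cost is approximately 5.7. Raising `c3` relative to `c1` and `c2` would favor covering more of each trajectory at the price of using more pathlets.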
Agarwal, PK; Fox, K; Munagala, K; Nath, A; Pan, J; Taylor, E
Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems