Subtrajectory clustering: Models and algorithms

Conference Paper

We propose a model for subtrajectory clustering, that is, the clustering of subsequences of trajectories; each cluster of subtrajectories is represented as a pathlet, a sequence of points that is not necessarily a subsequence of an input trajectory. Given a set of trajectories, our clustering model attempts to capture the shared portions between them by assuming each trajectory is a concatenation of a small set of pathlets, with possible gaps in between. We present a single objective function for finding the optimal collection of pathlets that best represents the trajectories, taking into account noise and other artifacts of the data. We show that the subtrajectory clustering problem is NP-Hard and present fast approximation algorithms for subtrajectory clustering. We further improve the running time of our algorithm if the input trajectories are “well-behaved.” Finally, we present experimental results on both real and synthetic data sets. We show via visualization and quantitative analysis that the algorithm indeed handles the desiderata of being robust to variations, being efficient and accurate, and being data-driven.

Cited Authors

  • Agarwal, PK; Fox, K; Munagala, K; Nath, A; Pan, J; Taylor, E

Published Date

  • May 27, 2018

Published In

  • Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems

Start / End Page

  • 75 - 87

International Standard Book Number 13 (ISBN-13)

  • 9781450347068

Digital Object Identifier (DOI)

  • 10.1145/3196959.3196972

Citation Source

  • Scopus