Finding Patterns with a Rotten Core: Data Mining for Crime Series with Cores

Published

Journal Article

© Tong Wang et al. 2015; Published by Mary Ann Liebert, Inc.

One of the most challenging problems facing crime analysts is identifying crime series, which are sets of crimes committed by the same individual or group. Detecting crime series can be an important step in predictive policing, as knowledge of a pattern can be of paramount importance in finding the offenders or stopping the pattern. Currently, crime analysts detect crime series manually; our goal is to assist them by providing automated tools for discovering crime series within a database of crimes. Our approach relies on a key hypothesis: each crime series possesses at least one core of crimes that are very similar to each other, and this core can be used to characterize the modus operandi (M.O.) of the criminal. Under this assumption, once we find all of the cores in the database, we have found a piece of each crime series. We propose a subspace clustering method, where the subspace is the M.O. of the series. The method has three steps: first, we construct a similarity graph to link crimes that are generally similar; second, we find cores of crime using an integer linear programming approach; and third, we construct the rest of each crime series by merging cores. To judge whether a set of crimes is indeed a core, we consider both pattern-general similarity, which can be learned from past crime series, and pattern-specific similarity, which is specific to the M.O. of the series and cannot be learned. Our method can be used for general pattern detection beyond crime series detection, as cores exist for patterns in many domains.
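The three steps named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the attribute-matching `similarity` function, the threshold, the fixed core size, and every function name below are assumptions made for illustration. In particular, the core-finding step here is a brute-force search for fully linked groups, swapped in for the integer linear programming approach the abstract describes, and the pattern-general versus pattern-specific distinction is not modeled.

```python
from itertools import combinations


def similarity(a, b):
    """Toy measure (assumption): fraction of shared attributes that agree."""
    keys = a.keys() & b.keys()
    return sum(a[k] == b[k] for k in keys) / len(keys)


def build_graph(crimes, threshold):
    """Step 1: link pairs of crimes whose similarity meets the threshold."""
    edges = set()
    for i, j in combinations(range(len(crimes)), 2):
        if similarity(crimes[i], crimes[j]) >= threshold:
            edges.add((i, j))
    return edges


def find_cores(n, edges, size):
    """Step 2 (stand-in): treat every fully linked group of `size` crimes
    as a core. The paper uses integer linear programming instead."""
    return [
        set(group)
        for group in combinations(range(n), size)
        if all(pair in edges for pair in combinations(group, 2))
    ]


def merge_cores(cores):
    """Step 3 (assumed rule): merge cores that share at least one crime."""
    series = []
    for core in cores:
        merged = set(core)
        for existing in [s for s in series if s & merged]:
            merged |= existing
            series.remove(existing)
        series.append(merged)
    return series


def detect_series(crimes, threshold=0.6, size=3):
    """Run the three steps in order and return sets of crime indices."""
    edges = build_graph(crimes, threshold)
    cores = find_cores(len(crimes), edges, size)
    return merge_cores(cores)


# Hypothetical records: four similar burglaries and one unrelated crime.
crimes = [
    {"entry": "window", "time": "night", "premise": "house"},
    {"entry": "window", "time": "night", "premise": "house"},
    {"entry": "window", "time": "night", "premise": "apartment"},
    {"entry": "window", "time": "night", "premise": "house"},
    {"entry": "door", "time": "day", "premise": "shop"},
]
series = detect_series(crimes)
```

On this made-up data the four window burglaries end up in one series, and the unrelated fifth record joins no core. The brute-force search grows combinatorially with the number of crimes, which is one reason an optimization formulation is a more natural fit for real databases.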

Cited Authors

  • Wang, T; Rudin, C; Wagner, D; Sevieri, R

Published Date

  • March 1, 2015

Published In

Volume / Issue

  • 3 / 1

Start / End Page

  • 3 - 21

Electronic International Standard Serial Number (EISSN)

  • 2167-647X

International Standard Serial Number (ISSN)

  • 2167-6461

Digital Object Identifier (DOI)

  • 10.1089/big.2014.0021

Citation Source

  • Scopus