Interpreting self-organizing maps through space-time data models

Publication, Journal Article
Sang, H; Gelfand, AE; Lennard, C; Hegerl, G; Hewitson, B
Published in: Annals of Applied Statistics
December 1, 2008

Self-organizing maps (SOMs) are a technique that has been used with high-dimensional data vectors to develop an archetypal set of states (nodes) that span, in some sense, the high-dimensional space. Noteworthy applications include weather states as described by weather variables over a region and speech patterns as characterized by frequencies in time. The SOM approach is essentially a neural network model that implements a nonlinear projection from a high-dimensional input space to a low-dimensional array of neurons. In the process, it also becomes a clustering technique, assigning to any vector in the high-dimensional data space the node (neuron) to which it is closest (using, say, Euclidean distance) in the data space. The number of nodes is thus equal to the number of clusters. However, the primary use for the SOM is as a representation technique, that is, finding a set of nodes which representatively span the high-dimensional space. These nodes are typically displayed using maps to enable visualization of the continuum of the data space. The technique does not appear to have been discussed in the statistics literature so it is our intent here to bring it to the attention of the community. The technique is implemented algorithmically through a training set of vectors. However, through the introduction of stochasticity in the form of a space-time process model, we seek to illuminate and interpret its performance in the context of application to daily data collection. That is, the observed daily state vectors are viewed as a time series of multivariate process realizations which we try to understand under the dimension reduction achieved by the SOM procedure. The application we focus on here is to synoptic climatology where the goal is to develop an array of atmospheric states to capture a collection of distinct circulation patterns. 
In particular, we have daily weather data observed in the form of 11 variables measured for each of 77 grid cells yielding an 847×1 vector for each day. We have such daily vectors for a period of 31 years (11,315 days). Twelve SOM nodes have been obtained by the meteorologists to represent the space of these data vectors. Again, we try to enhance our understanding of dynamic SOM node behavior arising from this dataset. © Institute of Mathematical Statistics.
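The clustering step described in the abstract, assigning each daily vector to its closest node, can be illustrated with a minimal sketch. It assumes Euclidean distance (which the abstract offers only as one possibility) and uses synthetic random vectors in place of the weather data and the trained nodes. The names `nearest_node` and `assign_days` are hypothetical. The sketch covers neither SOM training nor the authors' space-time model.

```python
import math
import random

# Dimensions stated in the abstract.
N_VARIABLES = 11
N_GRID_CELLS = 77
DIM = N_VARIABLES * N_GRID_CELLS   # 847 entries per daily vector
N_NODES = 12                       # number of SOM nodes (= number of clusters)
# 31 * 365 matches the stated 11,315 days; counting each year as
# 365 days is an inference, not something the abstract states.
N_DAYS = 31 * 365


def nearest_node(vector, nodes):
    """Return the index of the node closest to `vector` (Euclidean distance)."""
    return min(range(len(nodes)), key=lambda k: math.dist(vector, nodes[k]))


def assign_days(days, nodes):
    """Map each daily state vector to the index of its nearest node."""
    return [nearest_node(day, nodes) for day in days]


# Synthetic placeholders: NOT the paper's data or its trained nodes.
rng = random.Random(0)
nodes = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(N_NODES)]
days = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(50)]

# One node index per day: a time series of node labels.
labels = assign_days(days, nodes)
```

Under these assumptions, `labels` is a sequence of node indices over time, which is the kind of node-level time series whose dynamic behavior the abstract proposes to interpret.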

Published In

Annals of Applied Statistics

DOI

10.1214/08-AOAS174

EISSN

1941-7330

ISSN

1932-6157

Publication Date

December 1, 2008

Volume

2

Issue

4

Start / End Page

1194 / 1216

Related Subject Headings

  • Statistics & Probability
  • 4905 Statistics
  • 1403 Econometrics
  • 0104 Statistics
 

Citation

APA: Sang, H., Gelfand, A. E., Lennard, C., Hegerl, G., & Hewitson, B. (2008). Interpreting self-organizing maps through space-time data models. Annals of Applied Statistics, 2(4), 1194–1216. https://doi.org/10.1214/08-AOAS174
Chicago: Sang, H., A. E. Gelfand, C. Lennard, G. Hegerl, and B. Hewitson. “Interpreting self-organizing maps through space-time data models.” Annals of Applied Statistics 2, no. 4 (December 1, 2008): 1194–1216. https://doi.org/10.1214/08-AOAS174.
ICMJE: Sang H, Gelfand AE, Lennard C, Hegerl G, Hewitson B. Interpreting self-organizing maps through space-time data models. Annals of Applied Statistics. 2008 Dec 1;2(4):1194–216.
MLA: Sang, H., et al. “Interpreting self-organizing maps through space-time data models.” Annals of Applied Statistics, vol. 2, no. 4, Dec. 2008, pp. 1194–216. Scopus, doi:10.1214/08-AOAS174.
NLM: Sang H, Gelfand AE, Lennard C, Hegerl G, Hewitson B. Interpreting self-organizing maps through space-time data models. Annals of Applied Statistics. 2008 Dec 1;2(4):1194–1216.
