Linear Latent Structure Analysis: Modeling High-Dimensional Survey Data
This paper discusses the formulation of the recently developed Linear Latent Structure (LLS) analysis, its statistical properties, the parameter estimation algorithm and its implementation, simulation studies, and an application of the LLS model to data from the National Long Term Care Survey (NLTCS). The results are compared with predictions of the Latent Class Model (LCM) and Grade of Membership (GoM) analyses. LLS analysis assumes that the mutual correlations observed among survey variables reflect a hidden property of the subjects that can be described by a low-dimensional random vector. The basic steps of LLS analysis are (i) determining the dimensionality of the explanatory vector, (ii) identifying the linear subspace over which the explanatory vector ranges, (iii) choosing a basis in that subspace using methods of cluster analysis and/or prior knowledge of the phenomenon of interest, (iv) calculating empirical distributions of the LLS scores, which represent individual responses in the linear subspace, and (v) investigating properties of the LLS score distributions to capture population and individual effects (e.g., heterogeneity). Applying the LLS model to the 1994 and 1999 NLTCS datasets (more than 5,000 individuals), with responses to over 200 questions on behavioral factors, functional status, and comorbidities, identified a population structure whose basis represents “pure-type individuals”, such as healthy, highly disabled, or having chronic diseases. The components of the vectors of individual LLS scores are used to predict individual lifespans.
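Steps (i) through (v) can be illustrated with a small numerical sketch. It is not the LLS estimator itself: it assumes noise-free response probabilities generated as mixtures of three hypothetical pure types, uses a plain singular value decomposition as a stand-in for the dimensionality and subspace steps, and takes the basis as known in advance (the "prior knowledge" option in step iii). The function names, the tolerance `rel_tol`, and the simulated data are all assumptions made for illustration. With real survey responses the trailing singular values do not vanish, so the cutoff in step (i) would have to be chosen with care.

```python
import numpy as np


def estimate_dimension(data, rel_tol=1e-8):
    """Step (i): count singular values above rel_tol times the largest one."""
    singular_values = np.linalg.svd(data, compute_uv=False)
    return int(np.sum(singular_values > rel_tol * singular_values[0]))


def find_subspace(data, k):
    """Step (ii): orthonormal rows spanning the top-k right singular subspace."""
    _, _, vt = np.linalg.svd(data, full_matrices=False)
    return vt[:k]


def lls_like_scores(data, subspace, basis):
    """Steps (iii)-(iv): coordinates of each row of data in the supplied basis."""
    projector = subspace.T @ subspace            # orthogonal projector onto the subspace
    projected = data @ projector                 # individual vectors in the subspace
    basis_in_subspace = basis @ projector        # basis vectors in the subspace
    # Solve scores @ basis_in_subspace ~ projected in the least-squares sense.
    scores_t, *_ = np.linalg.lstsq(basis_in_subspace.T, projected.T, rcond=None)
    return scores_t.T


# Hypothetical data: 500 subjects, 40 items, 3 pure types (all values assumed).
rng = np.random.default_rng(0)
pure_types = rng.uniform(0.05, 0.95, size=(3, 40))   # P(yes) per item for each pure type
weights = rng.dirichlet(np.ones(3), size=500)        # latent mixing weights per subject
probs = weights @ pure_types                         # noise-free response probabilities

k = estimate_dimension(probs)
subspace = find_subspace(probs, k)
scores = lls_like_scores(probs, subspace, pure_types)

# Step (v): simple population summaries of the score distribution.
print("estimated dimension:", k)
print("mean scores:", scores.mean(axis=0))
print("score covariance:\n", np.cov(scores, rowvar=False))
```

In this idealized setting the recovered scores coincide with the simulated mixing weights, which makes the sketch easy to check. The spread of the scores across subjects is the kind of heterogeneity that step (v) is meant to examine.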