Since 1995, more significant books have been published on latent class (LC) and finite mixture models than on any other class of statistical models. The recent increase in interest in latent class models is due to two developments: extended algorithms that allow today's computers to perform LC analyses on data containing more than just a few variables, and the realization that such models can yield powerful improvements over traditional approaches to segmentation, as well as to cluster, factor, regression and other kinds of analysis.
Today’s fast computers together with the efficient algorithms used in Latent GOLD make it possible to estimate LC models with many cases, many observed responses (indicators), and many explanatory variables. Extensions and variants of the basic model have been developed to include:
- response variables of mixed scale types, such as nominal, ordinal, (censored/truncated) continuous, and (truncated) counts
- several ordered categorical latent variables called discrete factors (DFactors)
- discrete and continuous covariates predicting class membership
- predictors of a repeatedly observed response variable
- provisions to relax the local independence assumption
- tools for dealing with sparse tables (bootstrap p values), boundary solutions (Bayes constants), local maxima (multiple start sets), and other problems.
The Advanced version of Latent GOLD 4.5 implements the following additional extensions:
- the option to specify LC models that contain one or more continuous latent variables called continuous factors (CFactors)
- multilevel extensions of the LC model, which involve the inclusion of group-level latent classes (GClasses) and/or group-level continuous factors (GCFactors)
- the option to take into account a complex sampling design
What are latent classes or latent segments?
Latent classes are unobservable (latent) subgroups or segments. Cases within the same latent class are homogeneous on certain criteria, while cases in different latent classes are dissimilar from each other in certain important ways. Formally, latent classes are represented by K distinct categories of a nominal latent variable X.
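As a sketch of what this definition implies, the basic LC model is usually written as a mixture over the K classes. The notation below is standard textbook notation rather than anything taken from the Latent GOLD documentation: the symbols $y_{ij}$ (the response of case $i$ on indicator $j$) and $J$ (the number of indicators) are assumptions introduced here, and the product form relies on the local independence assumption.

```latex
P(\mathbf{y}_i) = \sum_{k=1}^{K} P(X = k) \prod_{j=1}^{J} P(y_{ij} \mid X = k)
```

Here $P(X = k)$ is the size of latent class $k$, and $P(y_{ij} \mid X = k)$ describes how members of that class respond to indicator $j$.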
How do latent class models differ from other latent variable models?
Since the latent variable is categorical, LC modeling differs from more traditional latent variable approaches such as factor analysis, structural equation models, and random-effects regression models, which are based on continuous latent variables.
Why is latent class modeling important?
Latent class (LC) modeling, also known as finite mixture modeling, provides a powerful way of identifying latent segments (types) for which parameters in a specified model differ. Latent GOLD®, the most Windows-friendly program for latent class modeling, focuses on the three most important kinds of statistical models used in practice: cluster, factor, and regression.
How does LC analysis, as implemented by Latent GOLD® 4.5, compare with traditional procedures for cluster or regression?
CLUSTER - Traditional clustering procedures (K-Means, hierarchical clustering) are not model-based and therefore quite limited. LC clustering consistently recovers true structural groups where the traditional algorithms fail. See articles:
Latent Class Cluster Analysis
Comparison of SPSS 2-Step Cluster with Latent GOLD
1. Adequate Assumptions
K-Means makes assumptions, such as local independence or equal within-class variance, that often conflict with the real world.
Latent GOLD® 4.5 can be used to test these assumptions and to relax them if they are found to be invalid. In practice, this typically yields a segmentation that is simpler (fewer segments) and easier to interpret.
See Articles:
Latent Class Modeling as a Probabilistic Extension of K-Means Clustering (2002) Quirk's Marketing Research Review, March 2002, 20 & 77-80.
Latent Class Models for Clustering: a Comparison with K-means (2002) Canadian Journal of Marketing Research, 20, 36-43.
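One way to see the probabilistic nature of LC clustering, in contrast to the hard assignment made by K-Means, is to compute the posterior class membership probabilities for a single case. The following is a minimal sketch, assuming dichotomous indicators and local independence; the function name `posterior_membership` and all parameter values are hypothetical and are not Latent GOLD output.

```python
from math import prod


def posterior_membership(class_sizes, cond_probs, responses):
    """Return P(X = k | responses) for each latent class k.

    class_sizes: list of P(X = k), summing to 1.
    cond_probs:  cond_probs[k][j] = P(indicator j = 1 | X = k).
    responses:   list of 0/1 answers, one per indicator.
    """
    joint = []
    for size, probs in zip(class_sizes, cond_probs):
        # Local independence: multiply the per-indicator probabilities.
        likelihood = prod(p if y == 1 else 1.0 - p
                          for p, y in zip(probs, responses))
        joint.append(size * likelihood)
    total = sum(joint)
    # Bayes' rule: normalize the joint probabilities over the classes.
    return [value / total for value in joint]


# Hypothetical two-class model with three dichotomous indicators.
sizes = [0.6, 0.4]
probs = [[0.9, 0.8, 0.7],   # class 1: mostly "yes" answers
         [0.2, 0.3, 0.1]]   # class 2: mostly "no" answers
posterior = posterior_membership(sizes, probs, [1, 1, 0])
print(posterior)
```

Each case receives a probability of belonging to every class rather than a single cluster label, so the uncertainty of the classification stays visible.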

2. Different Scale Types
Latent GOLD® 4.5 allows for variables to be nominal, ordinal, continuous, count or any mixture of these, any of which may contain missing values. Different scale types are handled by automatically specifying the appropriate distribution.
Moreover, additional scale types such as ranks, partial ranks, and discrete choice data can be analyzed using the Latent GOLD Choice 4.5 add-on program.

3. Covariate-Based Profiling
After a traditional clustering, discriminant analysis or cross-tabs are often used to describe the resulting clusters, an approach confounded by misclassification and other errors. Latent GOLD® 4.5 allows the inclusion of covariates, so that parameter estimation (based on indicators) and descriptive profiling (based on covariates) are carried out simultaneously. Covariate-based prediction/classification is now available, so that new cases for which indicators are not present may be classified based solely on the covariates. Covariates can be continuous as well as categorical.
In addition, SI-CHAID 4.0 links directly to Latent GOLD 4.5 for improved profiling capabilities.
See article:
An Extension of the CHAID Tree-based Segmentation Algorithm to Multiple Dependent Variables (2005) Forthcoming in: C. Weihs & W. Gaul, Classification: The Ubiquitous Challenge, Heidelberg: Springer
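The covariate-based classification described in this section can be illustrated with a small sketch. It assumes that class membership is related to the covariates through a multinomial logit model; the text above does not state the functional form, so this is an assumption, and the function name and coefficient values are hypothetical.

```python
from math import exp


def class_probs_from_covariates(intercepts, slopes, covariates):
    """Multinomial logit sketch: P(X = k | covariates) for each class k.

    intercepts: one intercept per class.
    slopes:     slopes[k][m] = effect of covariate m on class k.
    covariates: covariate values for one new case.
    """
    scores = [a + sum(b * z for b, z in zip(bs, covariates))
              for a, bs in zip(intercepts, slopes)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical two-class model with a single covariate.
new_case = class_probs_from_covariates(
    intercepts=[0.0, 0.0], slopes=[[0.0], [1.0]], covariates=[2.0])
print(new_case)
```

A new case with no indicator data can then be assigned to the class with the highest covariate-based probability.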

4. Optimal determination of number of clusters
In traditional clustering procedures, rules of thumb and ad hoc guesswork are used to determine the number of clusters. Since LC is based on a statistical model, statistics are available to help determine the number of clusters.
Latent GOLD 4.5 includes formal statistical assessment of the improvement resulting from an additional latent class. See Tutorial 1: Using Latent GOLD® 4.5 to Estimate LC Cluster Models
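As an illustration of how a model-based statistic can guide the choice of the number of clusters, the sketch below compares models by the Bayesian information criterion (BIC). The text above does not name a specific statistic, so the use of BIC here is an assumption, and the log-likelihoods and parameter counts are invented numbers, not results from any real analysis.

```python
from math import log


def bic(log_likelihood, n_params, n_cases):
    """Bayesian information criterion; lower values are preferred."""
    return -2.0 * log_likelihood + n_params * log(n_cases)


def best_by_bic(fits, n_cases):
    """fits maps number of classes -> (log_likelihood, n_params)."""
    scores = {k: bic(ll, p, n_cases) for k, (ll, p) in fits.items()}
    return min(scores, key=scores.get), scores


# Hypothetical fit results for 1- to 4-class models on 1,000 cases.
fits = {1: (-5200.0, 6), 2: (-4950.0, 13), 3: (-4900.0, 20), 4: (-4895.0, 27)}
chosen, scores = best_by_bic(fits, n_cases=1000)
print(chosen, scores)
```

In this made-up example the fourth class improves the log-likelihood only slightly, so the penalty for its extra parameters outweighs the gain and the three-class model has the lowest BIC.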

REGRESSION - Traditional regression assumes homogeneity across an entire population, which does not allow for the existence of different segments. LC or mixture regression involves estimating a regression model under the assumption that the regression coefficients differ across unobserved (latent) segments, yielding improved predictions.
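The idea of regression coefficients that differ across latent segments can be sketched with a small EM routine for a mixture of simple linear regressions. This is an illustrative implementation under stated assumptions (one predictor, normal errors, fixed starting values, simulated data); it is not the estimation algorithm used by Latent GOLD, and all names are hypothetical.

```python
import math
import random


def weighted_line(xs, ys, ws):
    """Weighted least squares fit of y = a + b*x; returns (a, b, sigma)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    a = my - b * mx
    var = sum(w * (y - a - b * x) ** 2 for w, x, y in zip(ws, xs, ys)) / sw
    return a, b, max(math.sqrt(var), 1e-3)  # floor keeps sigma positive


def normal_pdf(y, mean, sigma):
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_mixture_regression(xs, ys, start, n_iter=100):
    """EM for a K-class mixture of simple linear regressions.

    start: list of (intercept, slope, sigma) starting values, one per class.
    Returns (class_sizes, params) after n_iter EM iterations.
    """
    params = list(start)
    sizes = [1.0 / len(start)] * len(start)
    for _ in range(n_iter):
        # E-step: posterior class membership for every case.
        post = []
        for x, y in zip(xs, ys):
            dens = [p * normal_pdf(y, a + b * x, s)
                    for p, (a, b, s) in zip(sizes, params)]
            total = sum(dens) or 1e-300
            post.append([d / total for d in dens])
        # M-step: update class sizes and fit one weighted regression per class.
        for k in range(len(params)):
            ws = [row[k] for row in post]
            sizes[k] = sum(ws) / len(xs)
            params[k] = weighted_line(xs, ys, ws)
    return sizes, params


# Simulated data: two hidden segments with slopes +2 and -2.
random.seed(0)
xs = [random.uniform(-3, 3) for _ in range(200)]
ys = [(2.0 if i % 2 == 0 else -2.0) * x + random.gauss(0, 0.3)
      for i, x in enumerate(xs)]
sizes, params = fit_mixture_regression(xs, ys, start=[(0, 1, 1), (0, -1, 1)])
print(sizes, params)
```

A single pooled regression on these data would average the two opposing slopes, whereas the mixture recovers a separate slope for each segment. Like any EM procedure, this sketch can converge to a local maximum, which is why multiple start sets are useful in practice.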
1. Accounts for Heterogeneity
Traditional regression programs assume that the model holds true for the entire population. Latent GOLD® explores whether model heterogeneity can be explained by unobserved latent segments.
Latent GOLD Advanced also allows for continuous heterogeneity (CFactors). See Article:
Application of Latent Class Models to Food Product Development: a Case Study

2. Allows differing dependent variable scale types
Latent GOLD®'s mixture regression module is built within the generalized linear modeling (GLM) framework. It allows for dependent variables that are dichotomous, nominal, ordinal, continuous or count. Just select the scale type and the appropriate model is used (logit, multinomial logit, ordinal logit, normal, Poisson or binomial count).

3. Repeated measures structure
Repeated measures structure allows for latent class growth models, latent class conjoint models, Rasch type IRT models, survival models, and many other repeated measure type applications.
Latent GOLD® 4.5 uses a non-parametric random-coefficient model: the random effects are not assumed to come from a multivariate normal distribution. Besides making less restrictive assumptions, the LC regression model has the advantage of being extremely fast compared to parametric random-coefficient models when the outcome variable is non-normal. There are several special outputs for this: the mean and standard deviation of the coefficients, as well as individual effects.
Latent GOLD® 4.5 Advanced also allows the estimation of parametric random coefficient models.
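The mean and standard deviation of the coefficients mentioned above can be illustrated with a short sketch. It assumes these summaries are the class-size-weighted mean and standard deviation of the class-specific coefficients, which is a common way to summarize a discrete random-effects distribution; the text does not give the formula, and the class sizes and coefficients below are hypothetical.

```python
from math import sqrt


def coefficient_mean_sd(class_sizes, coefficients):
    """Mean and SD of a class-specific coefficient across latent classes."""
    mean = sum(p * b for p, b in zip(class_sizes, coefficients))
    var = sum(p * (b - mean) ** 2 for p, b in zip(class_sizes, coefficients))
    return mean, sqrt(var)


# Hypothetical three-class solution for one regression coefficient.
mean, sd = coefficient_mean_sd([0.5, 0.3, 0.2], [1.0, 2.0, 4.0])
print(mean, sd)
```

A standard deviation close to zero would suggest that the coefficient hardly varies across segments, while a large value points to substantial heterogeneity.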

4. Complex Sample Design
Latent GOLD® 4.5 Advanced allows the use of sampling weights, strata, etc., and estimates the design effect.

5. Multilevel Regression
For global segmentation, Latent GOLD Advanced provides for a simultaneous segmentation at both the individual and the country level.
See articles:
- Bijmolt, T.H., Paas, L.J., Vermunt, J.K. (2004). Country and Consumer Segmentation: Multi-level Latent Class Analysis of Financial Product Ownership. International Journal of Research in Marketing, 21, 323-340.
- Vermunt, J.K., and Magidson, J. (2005). Hierarchical mixture models for nested data structures. In C. Weihs and W. Gaul (eds), Classification: The Ubiquitous Challenge. Heidelberg: Springer.

How was latent class modeling developed?
Latent class (LC) analysis was originally introduced by Lazarsfeld (1950) as a way of explaining respondent heterogeneity in survey response patterns involving dichotomous items. During the 1970s, LC methodology was formalized and extended to nominal variables by Goodman (1974a, 1974b) who also developed the maximum likelihood algorithm that serves as the basis for the Latent GOLD program.
Are latent class models the same as finite mixture models?
Over the same period that latent class models evolved, the related field of finite mixture (FM) models for multivariate normal distributions began to emerge, through the work of Day (1969), Wolfe (1965, 1967, 1970) and others. FM models seek to separate out or ‘un-mix’ data that are assumed to arise as a mixture from a finite number of distinctly different populations. In recent years, the fields of LC and FM modeling have come together, and the terms LC model and FM model have become interchangeable. An LC model now refers to any statistical model in which some of the parameters differ across unobserved subgroups (Vermunt and Magidson, 2003a).