Understanding Latent Class Analysis (LCA)

 
An Introduction to Latent Class Analysis (LCA)
 
 
Slides courtesy of The Methodology Center at Penn State
 
Outline
 
Conceptual introduction to latent class analysis (LCA)
 
An example: Latent classes of adolescent drinking behavior
 
Types of research questions LCA can address
 
Types of data that can be used with LCA
 
Parameters estimated in LCA and the LCA mathematical model
 
Abbreviations
 
LCA = latent class analysis
Categorical latent variable measured with categorical items
 
LPA = latent profile analysis
Categorical latent variable measured with continuous items
 
Conceptual introduction to latent class analysis (LCA)
 
 
Latent variable frameworks
 
The basic ideas underlying LCA
 
Individuals can be divided into subgroups based on an unobservable
construct
The construct of interest is the latent variable
Subgroups are called latent classes
 
True class membership is unknown
Unknown due to measurement error
Measurement of the construct is based on several categorical indicators
 
Latent classes are mutually exclusive & exhaustive
 
Graphical representation
 
[Figure: the latent class variable C and its observed indicators Y1, Y2, …, Yj]
 
An example: Latent classes of adolescent drinking behavior
 
 
Drinking in 12th grade
 
Data from 2004 cohort of Monitoring the Future public release
 
n = 2490 high school seniors who answered at least one question
about alcohol use (48% boys, 52% girls)
 
Goals of the Lanza, Collins, Lemmon, & Schafer (2007) study:
Investigate alcohol use behavior among U.S. 12th graders
Examine gender differences in measurement and behavior
Predict class membership from skipping school and grades
 
Drinking in 12th grade
 
Here, we will…
 
Review the results of the first research question addressed by Lanza,
Collins, Lemmon, & Schafer (2007):
 
Identify and describe underlying latent classes of drinking behavior in
U.S. 12th grade students
 
The 5-class model
 
The 5-class model
 
Types of research questions LCA can address
 
 
Weight control strategies 
(Lanza, Savage, & Birch, 2010)
 
What types of weight-loss strategies are used by women?
 
Identified classes:
No Weight Loss Strategy (10%)
Dietary Guidelines (27%)
Guidelines + Macronutrients (39%)
Guidelines + Macronutrients + Restrictive (24%)
 
Substance use behaviors 
(Lanza, Patrick, & Maggs, 2010)
 
What are the substance use behavior profiles among first-year college
students?
 
Identified classes:
Non-users (58%)
Cigarette Smokers (5%)
Binge Drinkers (29%)
Bingers + Marijuana Users (8%)
 
Risky sexual behavior 
(Lanza & Collins, 2008)
 
What are the profiles of dating and sexual risk-taking behaviors
among adolescents and young adults?
 
Identified classes:
Non-Daters (19%)
Daters (29%)
Monogamous (12%)
Multi-partner Safe Sex (23%)
Multi-Partner STI-Exposed Sex (18%)
 
Ecological risk profiles 
(Lanza & Rhoades, 2013)
 
What are the patterns of ecological risk factors experienced by
adolescents that may help explain
differential response to intervention?
 
Identified classes:
Low Risk (31%)
Peer Risk (28%)
Economic Risk (20%)
Household + Peer Risk (12%)
Multi-context Risk (8%)
 
Social network roles 
(Smith & Lanza, 2011)
 
Do people's social connections fall into types of social capital that
represent theorized network roles relevant
for HIV intervention in Namibia?
 
Identified classes:
Single-Group Members (59%)
Connectors (24%)
Single-Group Loyalists (15%)
Selective Connectors (2%)
 
Types of data that can be used with LCA
 
 
Individuals’ responses to multiple items…
 
Using all categorical indicators is usually called LCA
Interested in latent class prevalences and item-response probabilities
 
Using all continuous indicators is usually called LPA
Interested in latent profile prevalences and item-response means (and
variances)
 
How many indicators can be used?
 
When many indicators with many response options are used, it can
be difficult to reliably identify the maximum likelihood estimates
 
When few indicators with few response options are used, only a very
small number of latent classes can be identified
 
Practically speaking, it is often a good idea to start with 5-12 binary
indicators
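 
One way to see the trade-off is to count response patterns against free parameters. The sketch below assumes an LCA without covariates in which every indicator has the same number of response options; the function name lca_counts is hypothetical. A nonnegative number of degrees of freedom is a necessary condition for identification, not a sufficient one.

```python
def lca_counts(n_items, n_classes, n_options=2):
    """Count response patterns, free parameters, and degrees of freedom.

    Assumes an LCA without covariates where every indicator has
    n_options response categories (an assumption of this sketch).
    """
    n_patterns = n_options ** n_items                       # cells in the contingency table
    n_prevalences = n_classes - 1                           # prevalences sum to 1
    n_item_response = n_classes * n_items * (n_options - 1)
    n_params = n_prevalences + n_item_response
    df = n_patterns - 1 - n_params
    return n_patterns, n_params, df


for n_items, n_classes in [(3, 2), (5, 5), (7, 5), (12, 5)]:
    patterns, params, df = lca_counts(n_items, n_classes)
    print(f"{n_items:2d} binary items, {n_classes} classes: "
          f"{patterns} patterns, {params} parameters, df = {df}")
```

With 3 binary items even a 2-class model uses up all degrees of freedom, while 12 binary items produce 4096 possible patterns, more than the 2490 respondents in the drinking example, so many cells of the table are necessarily empty.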
 
A note on missing data
 
Most LCA software can handle missing data
 
Missing data mechanisms:
MAR (missing at random)
Missingness is completely random, or related to observed items
MNAR (missing not at random)
Missingness is related to unobserved items
 
Software assumes data are MAR
 
Parameters estimated in LCA and the LCA mathematical model
 
 
Estimated parameters
 
Latent class prevalences
e.g., probability of membership in EXPERIMENTERS latent class
 
Item-response probabilities
e.g., probability of reporting PAST-YEAR ALCOHOL USE given membership in
EXPERIMENTERS latent class
 
Latent class notation
 
Y represents the vector of all possible response patterns
y represents a particular response pattern
Example response pattern for the 7 items from the example of drinking in 12th grade:
y = (Y, Y, N, N, N, N, N)
 
X represents the vector of all covariates of interest
x represents a particular covariate
 
 
Latent class notation
 
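The latent class model can be expressed as

\[ P(Y_i = y \mid X_i = x_i) = \sum_{c=1}^{K} \gamma_c(x_i) \prod_{m=1}^{M} \prod_{r_m=1}^{R_m} \rho_{m r_m \mid c}^{I(y_m = r_m)} \]

where

\[ \gamma_c(x_i) = P(C_i = c \mid X_i = x_i) = \frac{\exp(\beta_{0c} + \beta_{1c} x_{i1} + \dots + \beta_{pc} x_{ip})}{1 + \sum_{c'=1}^{K-1} \exp(\beta_{0c'} + \beta_{1c'} x_{i1} + \dots + \beta_{pc'} x_{ip})} \]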
 
Latent class notation
 
…with (c = 1, 2, …, K) latent classes and (m = 1, 2, …, M) indicators, each
with (r_m = 1, 2, …, R_m) response options.
 
γ_c = probability of membership in latent class c
(latent class membership probabilities)
 
ρ_{m r_m | c} = probability of response r_m to indicator m, conditional on
membership in latent class c
(item-response probabilities)
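 
When covariates are included, the class membership probabilities are written as a baseline-category logistic function of the covariates, with class K as the reference class. The following is a minimal sketch of that calculation; the function name and the coefficient values are hypothetical and chosen only for illustration, not taken from any fitted model.

```python
from math import exp


def class_membership_probs(x, betas):
    """Return [gamma_1(x), ..., gamma_K(x)] for one individual.

    x     : list of p covariate values
    betas : K-1 lists [b0, b1, ..., bp], one per non-reference class;
            class K is the reference class (all coefficients zero).
    """
    for b in betas:
        if len(b) != len(x) + 1:
            raise ValueError("each class needs an intercept plus one slope per covariate")
    linear = [b[0] + sum(bj * xj for bj, xj in zip(b[1:], x)) for b in betas]
    denom = 1.0 + sum(exp(v) for v in linear)
    probs = [exp(v) / denom for v in linear]
    probs.append(1.0 / denom)  # reference class K
    return probs


# Hypothetical coefficients for K = 3 classes and p = 2 covariates.
BETAS = [[0.5, 1.2, -0.3], [-1.0, 0.4, 0.8]]
gammas = class_membership_probs([1.0, 2.0], BETAS)
print([round(g, 3) for g in gammas], "sum =", round(sum(gammas), 6))
```

Whatever coefficients are supplied, the K probabilities sum to 1, which matches the requirement that latent classes be mutually exclusive and exhaustive.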
 
Item-response probabilities
 
ρ parameters express the relation between…
the discrete latent variable in an LCA
and
the observed indicator variables
 
Similar conceptually to factor loadings
Basis for interpretation of latent classes
 
Are probabilities (between 0 and 1)
 
Item-response probabilities
 
ρ parameters are analogous to factor loadings; both…
express the relation between manifest and latent variables
form the basis for interpreting latent structure
 
But…
Factor loadings are β-weights
ρ parameters are probabilities
 
Exercise 1
 
Fitting a latent class model
 
Interpreting the parameters
 
References
 
Lanza, S. T., & Collins, L. M. (2008). A new SAS procedure for latent transition analysis: Transitions in dating and sexual risk behavior. Developmental Psychology, 44(2), 446.
Lanza, S. T., Collins, L. M., Lemmon, D. R., & Schafer, J. L. (2007). PROC LCA: A SAS procedure for latent class analysis. Structural Equation Modeling, 14(4), 671-694.
Lanza, S. T., Patrick, M. E., & Maggs, J. L. (2010). Latent transition analysis: Benefits of a latent variable approach to modeling transitions in substance use. Journal of Drug Issues, 40(1), 93-120.
Lanza, S. T., & Rhoades, B. L. (2013). Latent class analysis: An alternative perspective on subgroup analysis in prevention and treatment. Prevention Science, 14(2), 157-168.
Lanza, S. T., Savage, J. S., & Birch, L. L. (2010). Identification and prediction of latent classes of weight-loss strategies among women. Obesity, 18(4), 833-840.
Smith, R. A., & Lanza, S. T. (2011). Testing theoretical network classes and HIV-related correlates with latent class analysis. AIDS Care, 23(10), 1274-1281.
Slide Note

SBM 4/11/2012

Learn. Apply. Innovate. www.methodswork.com

Handouts provided by Methods Work, LLC


Latent Class Analysis (LCA) is a powerful statistical method for identifying subgroups within a population based on unobservable constructs. This method helps in addressing various research questions and can be applied to different types of data. Learn about the basic ideas, models, and applications of LCA from this comprehensive slide presentation.


Uploaded on Jul 25, 2024 | 1 Views



Presentation Transcript


  1. An Introduction to Latent Class Analysis (LCA) Slides courtesy of The Methodology Center at Penn State methodology.psu.edu

  2. Outline Conceptual introduction to latent class analysis (LCA) An example: Latent classes of adolescent drinking behavior Types of research questions LCA can address Types of data that can be used with LCA Parameters estimated in LCA and the LCA mathematical model

  3. Abbreviations LCA = latent class analysis Categorical latent variable measured with categorical items LPA = latent profile analysis Categorical latent variable measured with continuous items

  4. Conceptual introduction to latent class analysis (LCA)

  5. Latent variable frameworks
                                      Categorical latent variable     Continuous latent variable
     Categorical observed variables   Latent Class Analysis (LCA)     Latent Trait Analysis or Item Response Theory
     Continuous observed variables    Latent Profile Analysis (LPA)   Factor Analysis or Structural Equation Modeling

  6. The basic ideas underlying LCA Individuals can be divided into subgroups based on an unobservable construct The construct of interest is the latent variable Subgroups are called latent classes True class membership is unknown Unknown due to measurement error Measurement of the construct is based on several categorical indicators Latent classes are mutually exclusive & exhaustive

  7. Graphical representation [Figure: C; Y1, Y2, …, Yj]

  8. Graphical representation [Figure: latent class variable C; observed indicators Y1, Y2, …, Yj]

  9. An example: Latent classes of adolescent drinking behavior

  10. Drinking in 12th grade Data from 2004 cohort of Monitoring the Future public release n = 2490 high school seniors who answered at least one question about alcohol use (48% boys, 52% girls) Goals of the Lanza, Collins, Lemmon, & Schafer, 2007 study: Investigate alcohol use behavior among U.S. 12th graders Examine gender differences in measurement and behavior Predict class membership from skipping school and grades

  11. Drinking in 12th grade
      Seven items about drinking behavior    Proportion who answered Yes
      Lifetime alcohol use                   82%
      Past-year alcohol use                  73%
      Past-month alcohol use                 50%
      Lifetime drunkenness                   57%
      Past-year drunkenness                  49%
      Past-month drunkenness                 29%
      5+ drinks in past 2 weeks              26%

  12. Here, we will Review the results of the first research question addressed by Lanza, Collins, Lemmon, & Schafer, 2007: Identify and describe underlying latent classes of drinking behavior in U.S. 12th grade students

  13. The 5-class model
      Probability of Yes response
      Item                   Class 1 (18%)  Class 2 (22%)  Class 3 (9%)  Class 4 (17%)  Class 5 (34%)
      Lifetime alcohol use        .00           1.00           1.00          1.00           1.00
      Past-year alcohol           .00            .61           1.00          1.00           1.00
      Past-month alcohol          .00            .00           1.00           .39           1.00
      Lifetime drunk              .00            .24            .29          1.00           1.00
      Past-year drunk             .00            .00            .00          1.00           1.00
      Past-month drunk            .00            .00            .00           .00            .92
      5+ drinks past 2 wk         .00            .00            .16           .00            .73

  14. The 5-class model
      Probability of Yes response
      Item                   Non-Drinkers  Experimenters  Light Drinkers  Past Partiers  Heavy Drinkers
      Lifetime alcohol use        .00           1.00           1.00            1.00           1.00
      Past-year alcohol           .00            .61           1.00            1.00           1.00
      Past-month alcohol          .00            .00           1.00             .39           1.00
      Lifetime drunk              .00            .24            .29            1.00           1.00
      Past-year drunk             .00            .00            .00            1.00           1.00
      Past-month drunk            .00            .00            .00             .00            .92
      5+ drinks past 2 wk         .00            .00            .16             .00            .73
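
As a rough check on how the two sets of parameters fit together, the proportion answering Yes to each item that the 5-class model implies is the prevalence-weighted average of the item-response probabilities. The sketch below does that arithmetic with the rounded values from the table above; the variable and function names are hypothetical.

```python
ITEMS = [
    "Lifetime alcohol use", "Past-year alcohol use", "Past-month alcohol use",
    "Lifetime drunkenness", "Past-year drunkenness", "Past-month drunkenness",
    "5+ drinks in past 2 weeks",
]

# Latent class prevalences (rounded values from the slide).
GAMMA = {
    "Non-Drinkers": 0.18, "Experimenters": 0.22, "Light Drinkers": 0.09,
    "Past Partiers": 0.17, "Heavy Drinkers": 0.34,
}

# P(Yes | class) for the seven items, in the order of ITEMS.
RHO_YES = {
    "Non-Drinkers":   [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
    "Experimenters":  [1.00, 0.61, 0.00, 0.24, 0.00, 0.00, 0.00],
    "Light Drinkers": [1.00, 1.00, 1.00, 0.29, 0.00, 0.00, 0.16],
    "Past Partiers":  [1.00, 1.00, 0.39, 1.00, 1.00, 0.00, 0.00],
    "Heavy Drinkers": [1.00, 1.00, 1.00, 1.00, 1.00, 0.92, 0.73],
}


def implied_yes_proportions(gamma, rho_yes):
    """Model-implied P(Yes) per item: sum over classes of gamma_c * rho_c."""
    n_items = len(next(iter(rho_yes.values())))
    return [sum(gamma[c] * rho_yes[c][m] for c in gamma) for m in range(n_items)]


implied = implied_yes_proportions(GAMMA, RHO_YES)
for name, p in zip(ITEMS, implied):
    print(f"{name:27s} {p:.2f}")
```

With these rounded inputs the implied proportions come out at .82, .73, .50, .59, .51, .31 and .26, close to the observed 82%, 73%, 50%, 57%, 49%, 29% and 26% reported for the seven items.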

  15. Types of research questions LCA can address

  16. Weight control strategies (Lanza, Savage, & Birch, 2010) What types of weight-loss strategies are used by women? Identified classes: No Weight Loss Strategy (10%) Dietary Guidelines (27%) Guidelines + Macronutrients (39%) Guidelines + Macronutrients + Restrictive (24%)

  17. Substance use behaviors (Lanza, Patrick, & Maggs, 2010) What are the substance use behavior profiles among first-year college students? Identified classes: Non-users (58%) Cigarette Smokers (5%) Binge Drinkers (29%) Bingers + Marijuana Users (8%)

  18. Risky sexual behavior (Lanza & Collins, 2008) What are the profiles of dating and sexual risk-taking behaviors among adolescents and young adults? Identified classes: Non-Daters (19%) Daters (29%) Monogamous (12%) Multi-partner Safe Sex (23%) Multi-Partner STI-Exposed Sex (18%)

  19. Ecological risk profiles (Lanza & Rhoades, 2013) What are the patterns of ecological risk factors experienced by adolescents that may help explain differential response to intervention? Identified classes: Low Risk (31%) Peer Risk (28%) Economic Risk (20%) Household + Peer Risk (12%) Multi-context Risk (8%)

  20. Social network roles (Smith & Lanza, 2011) Do people's social connections fall into types of social capital that represent theorized network roles relevant for HIV intervention in Namibia? Identified classes: Single-Group Members (59%) Connectors (24%) Single-Group Loyalists (15%) Selective Connectors (2%)

  21. Types of data that can be used with LCA

  22. Individuals' responses to multiple items… Using all categorical indicators is usually called LCA Interested in latent class prevalences and item-response probabilities Using all continuous indicators is usually called LPA Interested in latent profile prevalences and item-response means (and variances)

  23. How many indicators can be used? When many indicators with many response options are used, it can be difficult to identify reliably the maximum likelihood estimates When few indicators with few response options are used, only a very small number of latent classes can be identified Practically speaking, it is often a good idea to start with 5-12 binary indicators

  24. A note on missing data Most LCA software can handle missing data Missing data mechanisms: MAR (missing at random) Missingness is completely random, or related to observed items MNAR (missing not at random) Missingness is related to unobserved items Software assumes data are MAR

  25. Parameters estimated in LCA and the LCA mathematical model

  26. Estimated parameters Latent class prevalences e.g., probability of membership in EXPERIMENTERS latent class Item-response probabilities e.g., probability of reporting PAST-YEAR ALCOHOL USE given membership in EXPERIMENTERS latent class

  27. Latent class notation Y represents the vector of all possible response patterns y represents a particular response pattern Example response pattern for the 7 items from the example of drinking in 12th grade: y = (Y, Y, N, N, N, N, N) X represents the vector of all covariates of interest x represents a particular covariate

  28. Latent class notation
      The latent class model can be expressed as

      \[ P(Y_i = y \mid X_i = x_i) = \sum_{c=1}^{K} \gamma_c(x_i) \prod_{m=1}^{M} \prod_{r_m=1}^{R_m} \rho_{m r_m \mid c}^{I(y_m = r_m)} \]

      where

      \[ \gamma_c(x_i) = P(C_i = c \mid X_i = x_i) = \frac{\exp(\beta_{0c} + \beta_{1c} x_{i1} + \dots + \beta_{pc} x_{ip})}{1 + \sum_{c'=1}^{K-1} \exp(\beta_{0c'} + \beta_{1c'} x_{i1} + \dots + \beta_{pc'} x_{ip})} \]

  29. Latent class notation …with (c = 1, 2, …, K) latent classes and (m = 1, 2, …, M) indicators, each with (r_m = 1, 2, …, R_m) response options. γ_c = probability of membership in latent class c (latent class membership probabilities) ρ_{m r_m | c} = probability of response r_m to indicator m, conditional on membership in latent class c (item-response probabilities)
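
To make the notation concrete, the sketch below evaluates the model without covariates for the example response pattern y = (Y, Y, N, N, N, N, N), using the rounded prevalences and item-response probabilities from the 5-class model. It also applies Bayes' rule to obtain posterior class probabilities, which is a standard follow-on step rather than something shown on the slides; all function and variable names are hypothetical.

```python
from math import prod

CLASSES = ["Non-Drinkers", "Experimenters", "Light Drinkers",
           "Past Partiers", "Heavy Drinkers"]
GAMMA = [0.18, 0.22, 0.09, 0.17, 0.34]   # latent class prevalences
RHO_YES = [                              # P(Yes | class) for the 7 items
    [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
    [1.00, 0.61, 0.00, 0.24, 0.00, 0.00, 0.00],
    [1.00, 1.00, 1.00, 0.29, 0.00, 0.00, 0.16],
    [1.00, 1.00, 0.39, 1.00, 1.00, 0.00, 0.00],
    [1.00, 1.00, 1.00, 1.00, 1.00, 0.92, 0.73],
]


def joint_probs(y, gamma, rho_yes):
    """gamma_c * prod_m P(y_m | class c), one value per class.

    y is a sequence of "Y"/"N" responses; items are treated as
    independent within a class.
    """
    out = []
    for g, rho in zip(gamma, rho_yes):
        cond = prod(p if resp == "Y" else 1.0 - p for p, resp in zip(rho, y))
        out.append(g * cond)
    return out


def posterior_probs(y, gamma, rho_yes):
    """P(class | y) by Bayes' rule."""
    joint = joint_probs(y, gamma, rho_yes)
    total = sum(joint)
    if total == 0.0:
        raise ValueError("pattern has probability 0 under these parameters")
    return [j / total for j in joint]


y = ("Y", "Y", "N", "N", "N", "N", "N")
p_y = sum(joint_probs(y, GAMMA, RHO_YES))   # marginal probability of the pattern
post = posterior_probs(y, GAMMA, RHO_YES)
print(f"P(y) = {p_y:.4f}")
for name, p in zip(CLASSES, post):
    print(f"{name:15s} {p:.3f}")
```

Because the table values are rounded to .00 and 1.00 in many cells, only the Experimenters class gives this pattern a nonzero probability here; with unrounded estimates the other classes would receive small but nonzero posterior probabilities.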

  30. Item-response probabilities ρ parameters express the relation between the discrete latent variable in an LCA and the observed indicator variables Similar conceptually to factor loadings Basis for interpretation of latent classes Are probabilities (between 0 and 1)

  31. Item-response probabilities ρ parameters are analogous to factor loadings; both express the relation between manifest and latent variables form the basis for interpreting latent structure But… Factor loadings are β-weights ρ parameters are probabilities

  32. Exercise 1 Fitting a latent class model Interpreting the parameters

  33. References Lanza, S. T., & Collins, L. M. (2008). A new SAS procedure for latent transition analysis: Transitions in dating and sexual risk behavior. Developmental Psychology, 44(2), 446. Lanza, S. T., Collins, L. M., Lemmon, D. R., & Schafer, J. L. (2007). PROC LCA: A SAS procedure for latent class analysis. Structural Equation Modeling, 14(4), 671-694. Lanza, S. T., Patrick, M. E., & Maggs, J. L. (2010). Latent transition analysis: benefits of a latent variable approach to modeling transitions in substance use. Journal of Drug Issues, 40(1), 93-120. Lanza, S. T., & Rhoades, B. L. (2013). Latent class analysis: An alternative perspective on subgroup analysis in prevention and treatment. Prevention Science, 14(2), 157-168. Lanza, S. T., Savage, J. S., & Birch, L. L. (2010). Identification and prediction of latent classes of weight loss strategies among women. Obesity, 18(4), 833-840. Smith, R. A., & Lanza, S. T. (2011). Testing theoretical network classes and HIV-related correlates with latent class analysis. AIDS Care, 23(10), 1274-1281.
