Hitherto, latent variable modelling has hovered on the fringes of the statistical mainstream but if the purpose of statistics is to deal with real problems, there is every reason for it to move closer to center stage. In the social sciences especially, latent variables are common, and if they are to be handled in a truly scientific manner, statistical theory must be developed to include them.



From the Preface of: Bartholomew, D.J., and Knott, M. (1999) Latent Variable Models and Factor Analysis, 2nd. ed., Kendall's Library of Statistics



What is a latent variable?

A variable which is not observable or not directly measurable

Examples: liberalism, quality of life, self-esteem, socioeconomic status, unhealthy dieting, math ability, parenting skill, satisfaction, social support, sexual maturity, speech difficulties, asthma severity, self-restraint problems, etc.


Two fundamental uses of latent variable modeling

  1. Reduce the dimensionality of data - explain the interrelations among observed variables using a smaller number of latent variables; create scales
  2. Provide a theoretical framework for modeling relations among latent variables and between observed variables and latent variables



  • Latent Trait Models: categorical observed - metrical latent

    1. Comes from educational testing, where latent variables are labeled as traits; also known as Item Response Theory (IRT), which has a large literature

    2. Answer (0,1) to a series of p questions, thus there are 2^p possible response patterns (dichotomous data). Answer (1,2,...,c) to a series of p questions, thus there are c^p possible response patterns (polytomous data).

    3. Response patterns occur with very unequal frequencies

    4. Questions to answer:
      1. How much of the differences in these responses can be explained by supposing all items depend on one or more continuous latent variables?
      2. How many underlying variables are there?
      3. Which observed variables help discriminate individuals the best?
      4. What is the best way to combine the observed variables in order to create a scale or score for each individual?
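
The counting argument in items 2 and 3 above can be sketched in a few lines of Python. This is only an illustration: the function names and the eight made-up respondents are assumptions, not part of the course materials. It enumerates all c^p possible patterns and tallies how often each one is actually observed.

```python
from collections import Counter
from itertools import product


def all_patterns(p, c=2):
    """Enumerate every possible response pattern for p items with c categories."""
    # Dichotomous items are coded 0/1; polytomous items are coded 1..c.
    categories = (0, 1) if c == 2 else tuple(range(1, c + 1))
    return list(product(categories, repeat=p))


def pattern_frequencies(responses):
    """Count how often each observed response pattern occurs."""
    return Counter(tuple(r) for r in responses)


# Hypothetical answers of 8 respondents to p = 3 dichotomous items.
responses = [
    (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 0),
    (1, 1, 0), (1, 0, 0), (0, 0, 0), (0, 0, 0),
]

patterns = all_patterns(p=3)
counts = pattern_frequencies(responses)

print(len(patterns))  # number of possible patterns, 2**3
for pattern in patterns:
    # Patterns that nobody produced get a frequency of zero.
    print(pattern, counts.get(pattern, 0))
```

Even in this tiny made-up data set several of the possible patterns never occur, which is the kind of unequal frequency a latent trait model tries to explain.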
    A web site (still under construction) devoted to Item Response Theory (another name for Latent trait models) is:

    http://www.education.umd.edu/Depts/EDMS/tutorials/frontpage.html


  • Latent Class Analysis: categorical observed - categorical latent

    1. Credit is usually given to Paul Lazarsfeld as the originator of LCA. The foundational book is Lazarsfeld, P.F. and Henry, N.W. (1968) Latent Structure Analysis. Houghton Mifflin. This book actually includes techniques for all four areas of latent variable analysis.
    2. A statistical method for finding subtypes of related cases from multivariate categorical data.
    3. Questions to answer
      1. How many underlying classes are there?
      2. What is the prevalence in each of the latent classes?
      3. What is the probability that a particular individual will be in a particular class?
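
Question 3 above can be illustrated with Bayes' rule. The sketch below assumes the class prevalences and the item response probabilities are already known and that items are independent within a class (the usual local independence assumption). All numbers and names are hypothetical, and estimating these quantities from data is a separate step not shown here.

```python
def class_posteriors(pattern, prevalences, item_probs):
    """Posterior probability of each latent class given one response pattern.

    prevalences[k]   -- assumed prevalence of class k (values sum to 1)
    item_probs[k][j] -- assumed P(item j = 1 | class k)
    Items are treated as independent within a class (local independence).
    """
    joint = []
    for prev, probs in zip(prevalences, item_probs):
        likelihood = 1.0
        for answer, prob in zip(pattern, probs):
            likelihood *= prob if answer == 1 else 1.0 - prob
        joint.append(prev * likelihood)
    total = sum(joint)
    # Normalize so the posterior probabilities sum to one.
    return [j / total for j in joint]


# Hypothetical two-class model for three dichotomous items.
prevalences = [0.6, 0.4]
item_probs = [
    [0.9, 0.8, 0.7],  # class 0: likely to answer 1 on every item
    [0.2, 0.3, 0.1],  # class 1: unlikely to answer 1
]

posterior = class_posteriors((1, 1, 0), prevalences, item_probs)
print(posterior)
```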
    A good web site, including a FAQ about latent class models, is found at

    http://members.xoom.com/jsuebersax/index.html
    or if that doesn't work try
    http://ourworld.compuserve.com/homepages/jsuebersax/index.htm

    Another web site containing materials related to the Sage book, Latent Class Scaling Analysis, in the Series: Quantitative Methods in the Social Sciences (# 126) is

    http://www.education.umd.edu/EDMS/Latent/Dayton.html


  • Latent Profile Analysis: metrical observed - categorical latent

    1. This area is more often referred to as cluster analysis or finite mixture models
    2. The family of analysis tools is a large one - Classification and regression trees (CART), projection pursuit, nearest neighbor techniques, maximum-likelihood for mixtures
    3. Several PROCs in SAS do cluster analysis
    4. Questions
      1. How many underlying classes (i.e., clusters) are there, if this is not already known?
      2. Which class does each observation belong to, and with what probability?
      3. What features distinguish the different classes?
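
As a rough illustration of question 2, the sketch below computes class membership probabilities for a finite mixture of two normal distributions on a single metrical variable. The mixture weights, means, and standard deviations are hypothetical and taken as given. In practice they would be estimated, for example by maximum likelihood, which is not shown.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def membership_probabilities(x, weights, means, sds):
    """Posterior probability that observation x belongs to each mixture component."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical two-class mixture on one metrical observed variable.
weights = [0.5, 0.5]
means = [0.0, 4.0]
sds = [1.0, 1.0]

for x in (0.5, 2.0, 3.5):
    print(x, membership_probabilities(x, weights, means, sds))
```

An observation halfway between the two class means is assigned to each class with equal probability, while observations near one mean are assigned to that class almost with certainty.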
    General web site for cluster analysis is

    http://www.statsoftinc.com/textbook/stcluan.html
    WE WILL NOT SPEND TIME ON THIS SUBJECT THIS SEMESTER


  • Exploratory Factor Analysis and Structural Equation Modeling: metrical observed - metrical latent

    1. Exploratory Factor Analysis (and similarly Principal Component Analysis) is used to determine the number of latent variables underlying a set of observed variables. The nature of the relationship between the observed variables and the latent variables is also estimated.
    2. Structural Equation Modeling
      • path analysis - no latent variables involved; multiple regression, simultaneous regression; separates direct effects from indirect effects; recursive and non-recursive models; ARROWS DO NOT CONFIRM CAUSALITY
      • confirmatory factor analysis - theory-driven measurement model, usually simple structure
      • structural equation model - Kline calls these hybrid models, i.e., path analysis in which the variables are latent, so each latent variable has its own confirmatory factor analysis.
    3. Questions
      1. Does the hypothesized model fit the data?
      2. What is the significance of the paths between variables?
      3. What are the effects of one variable as it is related to another?
      4. What could be changed about the hypothesized model in order to fit the data better?
      5. Are there differences in the model across subgroups of the data?
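
The separation of direct from indirect effects mentioned under path analysis can be sketched for the simplest recursive case: one predictor X, one mediator M, and one outcome Y. The variable names and coefficient values below are hypothetical. The sketch assumes the path coefficients have already been estimated and only shows the arithmetic of the decomposition.

```python
def decompose_effects(a, b, c):
    """Split the effect of X on Y in the path model X -> M -> Y plus X -> Y.

    a -- path coefficient from X to M
    b -- path coefficient from M to Y
    c -- direct path coefficient from X to Y
    """
    indirect = a * b      # effect transmitted through the mediator M
    total = c + indirect  # direct effect plus indirect effect
    return {"direct": c, "indirect": indirect, "total": total}


# Hypothetical standardized path coefficients.
effects = decompose_effects(a=0.5, b=0.4, c=0.3)
print(effects)
```

The decomposition describes the fitted model only. As noted above, the arrows do not confirm causality.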

    There is a VERY active listserv devoted to structural equation modeling called SEMNET. To join the listserv, go to
    http://www.gsu.edu/~mkteer/semnet.html

    EXAMPLES

    The first 4 examples come directly from the AMOS software example directory (i.e., Examples 4, 7, 8, and 5). The last example comes from the PROJECT EAT study by Dianne Neumark-Sztainer in the Division of Epidemiology.

    The Web of Science web page is a great resource for finding articles, in particular articles that use structural equation modeling. Here are two randomly chosen examples that I found when I searched on "structural equation modeling".

    EXAMPLE 1 from Web of Science
    The ex ante function of the criminal law
    Darley JM, Carlsmith KM, Robinson PH
    LAW & SOCIETY REVIEW
    35 (1): 165-189 2001


    Abstract:
    Criminal legal codes draw clear lines between permissible and illegal conduct, and the criminal justice system counts on people knowing these lines and governing their conduct accordingly. This is the "ex ante" function of the law; lines are drawn, and because citizens fear punishments or believe in the moral validity of the legal codes they do not cross these lines. But do people in fact know the lines that legal codes draw? The fact that several states have adopted laws that deviate from other state laws enables a field experiment to address this question. Residents (N = 203) of states (Wisconsin, Texas, North Dakota, and South Dakota) that had adopted a minority position on some aspect of criminal law reported the relevant law of their state to be no different than did citizens of "majoritarian" states. Path analyses using structural equation modeling suggest that people make guesses about what their state law holds by extrapolating from their personal view of whether or not the act in question ought to be criminalized.

    KeyWords Plus:
    SOCIAL-PERCEPTION, EGOCENTRIC BIAS, CONSENSUS, PUNISHMENT, CRIME, FIT

    Addresses:
    Darley JM, Princeton Univ, Princeton, NJ 08544 USA
    Princeton Univ, Princeton, NJ 08544 USA
    Northwestern Univ, Evanston, IL 60208 USA

    Publisher:
    LAW SOC ASSOC, AMHERST

    IDS Number:
    489BH

    ISSN:
    0023-9216


    EXAMPLE 2 from Web of Science

    Using structural equation modeling to examine factors that influence sunburn frequency and severity among adults living in Canada
    Shoveller JA, Ratner PA, Johnson JL
    CANCER DETECTION AND PREVENTION
    25 (5): 486-495 2001


    Abstract:
    This study uses structural equation modeling to examine hypothesized relationships between sunburn and physical characteristics and potentially modifiable behavior. The analysis was based on self-reported data collected from a randomly selected national sample of Canadian adults. An initial model was tested with 50% of the cases (n = 1,408); the remaining cases (n = 1,298) were reserved for confirmatory testing. After the initial model failed, theoretically plausible effects were added incrementally to improve overall model fit. The initial model yielded: chi-square (68 d.f.) = 3199.41 (P < .001) and the AGFI = .56. With 32 added effects, a fit model resulted in: chi-square (36 d.f.) = 394.35 (P < .001), AGFI = 0.87, and IFI = 0.91 (the Critical-N was 210). Model fit was confirmed. Suntanning, failure to wear protective clothing, and sun exposure were associated with the frequency of severity-adjusted sunburns. Sunscreen use was not associated with sunburn frequency-severity.

    Author Keywords:
    sunburn, behavior, suntan, skin cancer, awareness

    KeyWords Plus:
    NONMELANOCYTIC SKIN-CANCER, SUN EXPOSURE, PIGMENTATION FACTORS, SUNLIGHT EXPOSURE, CELL CARCINOMA, BASAL-CELL, POPULATION, MELANOMA, PROTECTION, BEHAVIORS

    Addresses:
    Shoveller JA, Univ British Columbia, Dept Hlth Care & Epidemiol, 5804 Fairview Ave, Vancouver, BC V6T 1Z3, Canada
    Univ British Columbia, Dept Hlth Care & Epidemiol, Vancouver, BC V6T 1Z3, Canada
    Univ British Columbia, Sch Nursing, Ctr Community Hlth & Hlth Evaluat Res, Vancouver, BC V6T 1Z3, Canada
    Univ British Columbia, Inst Hlth Promot Res, Vancouver, BC V6T 1Z3, Canada

    Publisher:
    JONES AND BARTLETT PUBLISHERS, SUDBURY

    IDS Number:
    486MN

    ISSN:
    0361-090X


    Causality


    Melanie Wall
    Tue Sep 5 17:26:32 CDT 2000
    last updated 9/2/02