Updated 9-24-02

FACTOR ANALYSIS MODEL
A collection of factor analysis examples can be found at http://www.psych.yorku.ca/lab/psy6140/ex/factor.htm

\begin{displaymath}\underline{X} = \underline{\mu} + \underline{\Lambda} \underline{f}
+ \underline{\epsilon}
\end{displaymath}

$\underline{X}$:
p - dimensional observed vector
$\underline{f}$:
q - dimensional vector of underlying factors. The elements of $\underline{f}$ are often called ``common factors''. Assume for now $E (\underline{f}) = \underline{0}$ and $Var (\underline{f}) = {\bf I}_{q \times q}$.
$\underline{\epsilon}$:
p - dimensional random error. The elements of $\underline{\epsilon}$ are often called ``unique factors'' or ``specific factors''. Assume $E (\underline{\epsilon}) =
\underline{0}$ and $Var (\underline{\epsilon}) = \underline{\Psi}$, where $\underline{\Psi}$ is a diagonal matrix.
$\underline{\Lambda}$:
$p \times q$ matrix of scalars called ``factor loadings''
$\underline{\mu}$:
$p \times 1$ vector of scalar intercepts. Often ignored if we are only interested in interrelations. Most software assumes $\underline{\mu}= \underline{0}$ by default and centers the $\underline{X}$ variables, i.e. analyzes $\underline{X} - \underline{\bar{X}}$.


KEY ASSUMPTIONS

1.
$COV(\underline{f},\underline{\epsilon}) = 0_{q \times p}$.
2.
$\underline{\Psi}$ is a diagonal matrix.



Focus on Covariance structure

\begin{displaymath}Var(\underline{X}) = \underline{\Sigma} = \underline{\Lambda}
\underline{\Lambda}^\prime + \underline{\Psi}
\end{displaymath}
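
The covariance structure above can be checked by simulation. The following is a minimal sketch, not part of the original notes: the loading matrix, unique variances, intercepts, sample size, and the normality of $\underline{f}$ and $\underline{\epsilon}$ are all assumed values chosen for illustration (the model itself only specifies the first two moments).

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

p, q, n = 5, 2, 200_000  # assumed dimensions and sample size

# Hypothetical parameter values, chosen only for illustration.
Lambda = np.array([[0.9, 0.0],
                   [0.8, 0.2],
                   [0.5, 0.5],
                   [0.2, 0.8],
                   [0.0, 0.9]])                 # p x q factor loadings
psi = np.array([0.3, 0.4, 0.5, 0.4, 0.3])       # diagonal of Psi
mu = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # intercepts

# Draw f with E(f) = 0, Var(f) = I and epsilon with Var(epsilon) = Psi,
# independently of each other, so that COV(f, epsilon) = 0.
f = rng.standard_normal((n, q))
eps = rng.standard_normal((n, p)) * np.sqrt(psi)

# Each row of X is one draw of X = mu + Lambda f + epsilon.
X = mu + f @ Lambda.T + eps

Sigma = Lambda @ Lambda.T + np.diag(psi)   # model-implied covariance
S = np.cov(X, rowvar=False)                # sample covariance of the draws

max_abs_diff = np.max(np.abs(S - Sigma))
print(max_abs_diff)
```

With a large sample the printed difference should be small, reflecting only sampling error. Note that $\underline{\mu}$ does not enter the covariance at all, which is why it can be ignored when only interrelations are of interest.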




OR Standardize
$\underline{X}$ to get $\underline{Z}$, i.e. $\underline{Z} = \left( \begin{array}{c} \frac{x_1 -
\bar{x}_1}{s_1} \\ \frac{x_2 - \bar{x}_2}{s_2} \\ \vdots \\
\frac{x_p - \bar{x}_p}{s_p} \end{array} \right)$


\begin{displaymath}Var(\underline{Z}) = \underline{\rho} = \underline{\Lambda}^s
{\underline{\Lambda}^s}^\prime + \underline{\Psi}^s
\end{displaymath}




NOTE: There are several different estimation procedures (e.g. 1. principal factor method, 2. normal theory maximum likelihood method, and others). For the maximum likelihood method, the
$\underline{\Lambda}^s$ obtained by analyzing the correlation matrix equals the $\underline{\Lambda}$ obtained by analyzing the covariance matrix after rescaling its rows by the observed standard deviations, i.e. $\underline{\Lambda}^s =
(diag({\bf S}))^{-\frac{1}{2}} \underline{\Lambda}$. (For a discussion, see section 3.17 in Bartholomew and Knott 1998.) THIS IS NOT TRUE WHEN THE PRINCIPAL FACTOR METHOD IS USED.
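
The rescaling formula can be illustrated at the level of the model itself. The sketch below is an assumption-laden illustration, not a demonstration of how any estimator behaves: it uses a hypothetical population $\underline{\Sigma}$ in place of ${\bf S}$, with made-up loadings and unique variances, and only checks the algebraic identity that rescaling $\underline{\Lambda}$ and $\underline{\Psi}$ by the standard deviations reproduces the correlation matrix.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
Lambda = np.array([[0.9, 0.0],
                   [0.8, 0.2],
                   [0.5, 0.5],
                   [0.2, 0.8],
                   [0.0, 0.9]])
psi = np.array([0.3, 0.4, 0.5, 0.4, 0.3])

Sigma = Lambda @ Lambda.T + np.diag(psi)   # covariance structure

# Standard deviations and the implied correlation matrix.
sd = np.sqrt(np.diag(Sigma))
rho = Sigma / np.outer(sd, sd)

# Rescale: Lambda^s = (diag Sigma)^{-1/2} Lambda, Psi^s = (diag Sigma)^{-1} Psi.
Lambda_s = Lambda / sd[:, None]
psi_s = psi / sd**2

rho_from_model = Lambda_s @ Lambda_s.T + np.diag(psi_s)
print(np.max(np.abs(rho - rho_from_model)))
```

The two matrices agree up to floating-point rounding, so the standardized loadings are just the original loadings divided row by row by the standard deviations.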


COMMUNALITIES


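\begin{eqnarray*}x_1 &=& \mu_1 + \lambda_{11} f_1 + \lambda_{12} f_2 + \epsilon_1 \\
x_2 &=& \mu_2 + \lambda_{21} f_1 + \lambda_{22} f_2 + \epsilon_2 \\
x_3 &=& \mu_3 + \lambda_{31} f_1 + \lambda_{32} f_2 + \epsilon_3 \\
x_4 &=& \mu_4 + \lambda_{41} f_1 + \lambda_{42} f_2 + \epsilon_4 \\
x_5 &=& \mu_5 + \lambda_{51} f_1 + \lambda_{52} f_2 + \epsilon_5 \\
\end{eqnarray*}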



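\begin{eqnarray*}Var(x_1) &=& \lambda_{11}^2 + \lambda_{12}^2 + \psi_1 \\
Var(x_2) &=& \lambda_{21}^2 + \lambda_{22}^2 + \psi_2 \\
Var(x_3) &=& \lambda_{31}^2 + \lambda_{32}^2 + \psi_3 \\
Var(x_4) &=& \lambda_{41}^2 + \lambda_{42}^2 + \psi_4 \\
Var(x_5) &=& \lambda_{51}^2 + \lambda_{52}^2 + \psi_5 \\
\end{eqnarray*}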



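EXAMPLE: The communality for $x_2$ is $\lambda_{21}^2 +
\lambda_{22}^2$, the part of $Var(x_2)$ explained by the common factors. The remainder, $\psi_2$, is the unique variance.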
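
In matrix terms the communalities are the row sums of squares of $\underline{\Lambda}$. A minimal sketch follows, using hypothetical loadings and unique variances that are assumptions made for illustration only:

```python
import numpy as np

# Hypothetical 5 x 2 loading matrix and unique variances (illustration only).
Lambda = np.array([[0.9, 0.0],
                   [0.8, 0.2],
                   [0.5, 0.5],
                   [0.2, 0.8],
                   [0.0, 0.9]])
psi = np.array([0.3, 0.4, 0.5, 0.4, 0.3])

# Communality of x_i: sum over factors of the squared loadings in row i.
communality = np.sum(Lambda**2, axis=1)

# Var(x_i) = communality_i + psi_i.
total_variance = communality + psi

print(communality)
print(total_variance)
```

Row 2 of the result corresponds to the example above: it is the square of the first loading of $x_2$ plus the square of its second loading.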


PARAMETER ESTIMATION


We want to analyze the part of the covariance structure that is due to the common factors, i.e. $\underline{\Lambda}
{\underline{\Lambda}}^\prime$. To do this we need to estimate $\underline{\Lambda}$.




TWO estimation procedures will be discussed (only one will be discussed in the 2002 school year)

1.
Principal factor method (won't discuss details of this method)
2.
Normal theory Maximum likelihood method


SKIP in 2002.....PARAMETER ESTIMATION - VIA Principal factor method




Some NOTES

Since we don't know $\Psi$, we will start off with a guess and then iterate.....


SKIP in 2002......Need a first guess for $\Psi$


SKIP in 2002..... ITERATIVE PROCEDURE

1.
Extract the q eigenvectors with the largest eigenvalues from ${\bf R} -
\hat{\underline{\Psi}}_{(t)}^s$
2.
Scale each of these q eigenvectors by the square root of its eigenvalue and call the resulting $p \times q$ matrix $\hat{\underline{\Lambda}}^s_{(t)}$
3.
Re-estimate $\hat{\underline{\Psi}}^s$ by taking $diag({\bf I} - diag(\hat{\underline{\Lambda}}^s_{(t)}
{\hat{\underline{\Lambda}}^s_{(t)}}^\prime))$
4.
Label this new estimate of $\hat{\underline{\Psi}}^s$ as $\hat{\underline{\Psi}}_{(t+1)}^s$
5.
Go back to step 1 and continue iterating until convergence
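
The iteration above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the notes' own implementation: the function name `principal_factor`, the stopping rule, and the iteration cap are hypothetical, and the starting guess for $\underline{\Psi}^s$ (one minus the squared multiple correlation of each variable with the others) is a common choice assumed here because the notes do not specify one. The loadings are taken as the leading eigenvectors scaled by the square roots of their eigenvalues, which is the usual principal factor convention.

```python
import numpy as np

def principal_factor(R, q, max_iter=500, tol=1e-10):
    """Iterated principal factor sketch for a correlation matrix R.

    Returns (Lam, psi): a p x q loading matrix and a length-p vector
    of unique variances. Names and defaults are illustrative.
    """
    R = np.asarray(R, dtype=float)
    # Assumed first guess: psi_i = 1 - SMC_i = 1 / (R^{-1})_{ii}.
    psi = 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        # Step 1: eigen-decompose the reduced correlation matrix R - Psi.
        eigvals, eigvecs = np.linalg.eigh(R - np.diag(psi))
        top = np.argsort(eigvals)[::-1][:q]  # indices of the q largest
        # Step 2: loadings = eigenvectors scaled by sqrt(eigenvalue);
        # negative eigenvalues are clipped to zero before the square root.
        Lam = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))
        # Steps 3-4: new unique variances are 1 minus the communalities.
        psi_new = 1.0 - np.sum(Lam**2, axis=1)
        converged = np.max(np.abs(psi_new - psi)) < tol
        psi = psi_new
        # Step 5: stop once psi no longer changes, otherwise iterate.
        if converged:
            break
    return Lam, psi

# Usage on a correlation matrix built from hypothetical two-factor loadings.
Lambda = np.array([[0.9, 0.0], [0.8, 0.2], [0.5, 0.5], [0.2, 0.8], [0.0, 0.9]])
Sigma = Lambda @ Lambda.T + np.diag([0.3, 0.4, 0.5, 0.4, 0.3])
sd = np.sqrt(np.diag(Sigma))
R = Sigma / np.outer(sd, sd)

Lam_hat, psi_hat = principal_factor(R, q=2)
residual = R - (Lam_hat @ Lam_hat.T + np.diag(psi_hat))
print(np.round(residual, 4))
```

The estimated loadings are determined only up to sign and rotation, so they need not match the loadings used to build `R`. The quantity to inspect is the residual matrix. The procedure can also produce negative unique variances in some data sets, which this sketch does not guard against.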


SKIP in 2002......INTERESTING NOTE about Principal factor method




Bartholomew and Knott (sections 3.10, 3.11) show that the principal factor method is identical to solving for
$\Lambda$ and $\Psi$ such that

\begin{displaymath}\sum_{i=1}^{p} \sum_{u=1}^{p} (s_{iu} - \sigma_{iu})^2
\end{displaymath}

is as small as possible.




This is the same thing as solving for
$\Lambda$ and $\Psi$ such that

\begin{displaymath}trace ({\bf S} - \underline{\Sigma})^2
\end{displaymath}

is as small as possible. This is the Ordinary Least Squares discrepancy function.
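
The two forms of the criterion agree because ${\bf S} - \underline{\Sigma}$ is symmetric, so the trace of its square equals the sum of its squared elements. A minimal numerical check follows. The function name `ols_discrepancy` and the two small matrices are hypothetical, chosen only for illustration:

```python
import numpy as np

def ols_discrepancy(S, Sigma):
    """Ordinary least squares discrepancy: trace of (S - Sigma) squared."""
    D = np.asarray(S, dtype=float) - np.asarray(Sigma, dtype=float)
    return np.trace(D @ D)

# Hypothetical sample and model-implied matrices (both symmetric).
S = np.array([[1.0, 0.5],
              [0.5, 1.0]])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])

elementwise = np.sum((S - Sigma) ** 2)   # double sum of (s_iu - sigma_iu)^2
via_trace = ols_discrepancy(S, Sigma)    # trace form

print(elementwise, via_trace)
```

Each off-diagonal difference is counted twice in both forms, once for the element above the diagonal and once for its mirror image below.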




Note that this minimization criterion weights every element of ${\bf S}$ equally and does not take into account the correlations between the elements of
${\bf S}$. Hence we are ignoring information.