The EM Algorithm SPH 7460 notes.emalgorithm
The EM algorithm was the subject of a famous paper in 1977 by A. P. Dempster,
Nan Laird, and Donald Rubin (J. Royal Statist Soc Series B, vol 39 pp 1-38.)
Many of the ideas in this paper had been discovered several times prior to 1977, but
DLR stated them in more generality. The algorithm works because of a basic theorem
called Jensen's Inequality. Basically, if X is a random variable, Jensen's Inequality
states that
log(Mean(X)) > Mean(log(X)).
This result in turn is a consequence of the fact that the log function is a
concave function.
A reasonably good exposition of the EM algorithm is given in a paper by Jeff
Bilmes which will be handed out in class. Bilmes illustrates the method by
applying it to a 'mixture' model, i.e., a distribution which is a mixture of
different normal distributions.
The following is a program that implements the EM algorithm for a simple mixture
problem in which the distribution is a mixture of two normals, with means mu1
and mu2 and standard deviations sigma1 and sigma2, and a given probability alpha
that an observation is from the first normal distribution and probability 1 - alpha of
being from the second normal distribution. Thus the objective is to estimate
mu1, mu2, sigma1, sigma2, and alpha.
==============================================================================
options linesize = 200 ;
footnote "~john-c/5421/emalgorithm.sas &sysdate &systime" ;
data curve ;
mu1 = 1 ; mu2 = 3 ;
sigma1 = 1 ; sigma2 = 1 ;
alpha = .5 ;
n = 1000 ;
do i = 1 to n ;
group = 1 ;
r = ranuni(-1) ;
if r > alpha then group = 2 ;
if group = 1 then do ;
x = mu1 + sigma1 * rannor(-1) ;
end ;
if group = 2 then do ;
x = mu2 + sigma2 * rannor(-1) ;
end ;
output ;
end ;
run ;
proc iml ;
use curve ;
read all var {x} into x ;
N = nrow(x) ;
itlim = 400 ;
pi = 4 * atan(1) ;
mu1 = 4.0 ; mu2 = 4.0 ;
sigma1 = .8 ; sigma2 = 1.3 ;
alpha = .6 ;
pi = 4 * atan(1) ;
eps = 1e-8 ;
diff = 9999 ;
do iter = 1 to itlim while (diff > eps) ;
sumnumermu1 = 0 ;
sumnumermu2 = 0 ;
sumz1 = 0 ;
sumz2 = 0 ;
sumloglike = 0 ;
sumq = 0 ;
do i = 1 to n ;
xi = x[i] ;
prob1 = (1 / sqrt(2 * pi))*(1 / sigma1) * exp(-(xi - mu1)**2 /(2 * sigma1**2)) ;
prob2 = (1 / sqrt(2 * pi))*(1 / sigma2) * exp(-(xi - mu2)**2 /(2 * sigma2**2)) ;
t1 = prob1 * alpha ;
t2 = prob2 * (1 - alpha) ;
z1i = t1 / (t1 + t2) ;
z2i = t2 / (t1 + t2) ;
firstpart1 = log(alpha) * z1i ;
secondpart1 = log(prob1) * z1i ;
firstpart2 = log(1 - alpha) * z2i ;
secondpart2 = log(prob2) * z2i ;
sumq = sumq + firstpart1 + secondpart1
+ firstpart2 + secondpart2 ;
sumloglike = sumloglike + log(alpha * prob1 + (1 - alpha) * prob2) ;
sumz1 = sumz1 + z1i ;
sumz2 = sumz2 + z2i ;
numermu1i = xi * z1i ;
numermu2i = xi * z2i ;
sumnumermu1 = sumnumermu1 + numermu1i ;
sumnumermu2 = sumnumermu2 + numermu2i ;
end ;
newalpha = sumz1 / n ;
newmu1 = sumnumermu1 / sumz1 ;
newmu2 = sumnumermu2 / sumz2 ;
sumz1 = 0 ;
sumz2 = 0 ;
sumsig1 = 0 ;
sumsig2 = 0 ;
do i = 1 to N ;
xi = x[i] ;
prob1 = (1 / sqrt(2 * pi))*(1 / sigma1) * exp(-(xi - mu1)**2 /(2 * sigma1**2)) ;
prob2 = (1 / sqrt(2 * pi))*(1 / sigma2) * exp(-(xi - mu2)**2 /(2 * sigma2**2)) ;
t1 = prob1 * newalpha ;
t2 = prob2 * (1 - newalpha) ;
z1i = t1 / (t1 + t2) ;
z2i = t2 / (t1 + t2) ;
sigma1numeri = z1i * (xi - newmu1)**2 * prob1 ;
sigma2numeri = z2i * (xi - newmu2)**2 * prob2 ;
sigma1numeri = z1i * (xi - newmu1)**2 ;
sigma2numeri = z2i * (xi - newmu2)**2 ;
sumsig1 = sumsig1 + sigma1numeri ;
sumsig2 = sumsig2 + sigma2numeri ;
sumz1 = sumz1 + z1i ;
sumz2 = sumz2 + z2i ;
end ;
newsigma1 = sqrt(sumsig1 / sumz1) ;
newsigma2 = sqrt(sumsig2 / sumz2) ;
diff = abs(newalpha - alpha) + abs(newmu1 - mu1) + abs(newmu2 - mu2)
+ abs(newsigma1 - sigma1) + abs(newsigma2 - sigma2) ;
print "iter = " iter sumz1 sumz2 mu1 sigma1 mu2 sigma2 alpha sumq sumloglike diff ;
print "iter = " iter newmu1 newsigma1 newmu2 newsigma2 newalpha ;
alpha = newalpha ;
mu1 = newmu1 ;
mu2 = newmu2 ;
sigma1 = newsigma1 ;
sigma2 = newsigma2 ;
end ;
sumloglike = 0 ;
quit ;
endsas ;
proc univariate data = curve plot normal ;
var x ;
run ;
==============================================================================
Output from the program:
The SAS System 16:22 Sunday, December 26, 2010 1
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 1 719.54853 280.45147 2 0.7 3 0.7 0.6 -2152.243 -1793.204 0.9972825
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 2 660.92035 339.07965 1.4337345 0.9543408 3.0307973 0.6340151 0.679894 -1782.227 -1477.166 0.2893968
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 3 640.55015 359.44985 1.3781324 0.8853073 3.0791927 0.5307658 0.6667775 -1692.87 -1449.702 0.2480738
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 4 620.67737 379.32263 1.3074685 0.816019 3.1141961 0.4772704 0.6471547 -1622.068 -1430.543 0.1781684
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 5 601.12775 398.87225 1.2419169 0.7507156 3.1215423 0.4584936 0.6259643 -1569.984 -1417.438 0.1479576
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 6 582.91113 417.08887 1.1841992 0.6950813 3.110618 0.4556881 0.6050886 -1530.234 -1406.546 0.1395247
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 7 567.03573 432.96427 1.1342649 0.6482143 3.0914626 0.4599533 0.5857858 -1499.344 -1397.045 0.1260924
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 8 553.95495 446.04505 1.0925576 0.6091167 3.0705377 0.4675821 0.5690521 -1476.382 -1389.354 0.1066973
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 9 543.6629 456.3371 1.0594165 0.5778772 3.0509371 0.4765872 0.5553411 -1460.727 -1383.806 0.0854031
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 10 535.84928 464.15072 1.0343432 0.5543082 3.0340083 0.4856915 0.5446134 -1451.038 -1380.238 0.065482
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 11 530.06401 469.93599 1.0160751 0.5373789 3.020179 0.4940401 0.5365067 -1445.547 -1378.145 0.0487448
~john-c/5421/emalgorithm.sas 26DEC10 16:22
The SAS System 16:22 Sunday, December 26, 2010 2
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 12 525.84735 474.15265 1.0030705 0.5255894 3.0093354 0.5011643 0.5305239 -1442.658 -1376.991 0.0356309
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 13 522.80095 477.19905 0.9939139 0.5174905 3.0010775 0.5069302 0.5261722 -1441.239 -1376.378 0.0257828
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 14 520.60968 479.39032 0.987485 0.5119367 2.9949125 0.5114249 0.5230319 -1440.596 -1376.059 0.0185617
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 15 519.03649 480.96351 0.982964 0.5081101 2.9903698 0.5148391 0.5207746 -1440.338 -1375.893 0.0133326
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 16 517.90777 482.09223 0.9797739 0.5054547 2.9870508 0.5173871 0.5191545 -1440.261 -1375.808 0.0095685
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 17 517.09798 482.90202 0.9775145 0.5035983 2.9846392 0.5192661 0.5179924 -1440.263 -1375.765 0.006866
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 18 516.5169 483.4831 0.9759089 0.5022921 2.9828933 0.5206408 0.5171587 -1440.293 -1375.742 0.0049273
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 19 516.09983 483.90017 0.9747648 0.5013682 2.9816324 0.5216409 0.5165605 -1440.33 -1375.73 0.0035367
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 20 515.80042 484.19958 0.9739477 0.500712 2.9807231 0.5223658 0.5161311 -1440.365 -1375.725 0.0025391
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 21 515.58542 484.41458 0.9733633 0.5002445 2.9800682 0.5228898 0.5158229 -1440.393 -1375.721 0.0018233
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 22 515.43102 484.56898 0.9729448 0.4999106 2.9795968 0.523268 0.5156016 -1440.416 -1375.72 0.0013094
~john-c/5421/emalgorithm.sas 26DEC10 16:22
The SAS System 16:22 Sunday, December 26, 2010 3
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 23 515.32013 484.67987 0.9726448 0.4996718 2.9792577 0.5235406 0.5154426 -1440.434 -1375.719 0.0009405
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 24 515.24047 484.75953 0.9724296 0.4995008 2.9790139 0.5237368 0.5153285 -1440.447 -1375.719 0.0006756
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 25 515.18324 484.81676 0.9722752 0.4993782 2.9788386 0.5238781 0.5152465 -1440.456 -1375.718 0.0004853
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 26 515.14213 484.85787 0.9721643 0.4992903 2.9787125 0.5239797 0.5151875 -1440.463 -1375.718 0.0003486
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 27 515.1126 484.8874 0.9720847 0.4992272 2.978622 0.5240527 0.5151452 -1440.468 -1375.718 0.0002505
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 28 515.09138 484.90862 0.9720276 0.4991819 2.9785569 0.5241052 0.5151148 -1440.472 -1375.718 0.00018
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 29 515.07614 484.92386 0.9719865 0.4991494 2.9785101 0.524143 0.515093 -1440.475 -1375.718 0.0001293
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 30 515.06518 484.93482 0.971957 0.499126 2.9784765 0.5241701 0.5150773 -1440.477 -1375.718 0.0000929
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 31 515.05731 484.94269 0.9719359 0.4991092 2.9784523 0.5241896 0.515066 -1440.478 -1375.718 0.0000667
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 32 515.05166 484.94834 0.9719206 0.4990972 2.978435 0.5242036 0.5150579 -1440.479 -1375.718 0.0000479
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 33 515.0476 484.9524 0.9719097 0.4990885 2.9784225 0.5242137 0.5150521 -1440.48 -1375.718 0.0000345
~john-c/5421/emalgorithm.sas 26DEC10 16:22
The SAS System 16:22 Sunday, December 26, 2010 4
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 34 515.04468 484.95532 0.9719018 0.4990823 2.9784136 0.5242209 0.5150479 -1440.481 -1375.718 0.0000248
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 35 515.04258 484.95742 0.9718962 0.4990778 2.9784071 0.5242261 0.5150449 -1440.481 -1375.718 0.0000178
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 36 515.04108 484.95892 0.9718921 0.4990746 2.9784025 0.5242299 0.5150427 -1440.481 -1375.718 0.0000128
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 37 515.03999 484.96001 0.9718892 0.4990723 2.9783992 0.5242325 0.5150412 -1440.481 -1375.718 9.1807E-6
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 38 515.03922 484.96078 0.9718871 0.4990706 2.9783968 0.5242345 0.5150401 -1440.482 -1375.718 6.5962E-6
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 39 515.03866 484.96134 0.9718856 0.4990695 2.9783951 0.5242359 0.5150393 -1440.482 -1375.718 4.7393E-6
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 40 515.03825 484.96175 0.9718846 0.4990686 2.9783938 0.5242368 0.5150387 -1440.482 -1375.718 3.4052E-6
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 41 515.03797 484.96203 0.9718838 0.499068 2.9783929 0.5242376 0.5150383 -1440.482 -1375.718 2.4466E-6
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 42 515.03776 484.96224 0.9718832 0.4990675 2.9783923 0.5242381 0.515038 -1440.482 -1375.718 1.7578E-6
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 43 515.03761 484.96239 0.9718828 0.4990672 2.9783919 0.5242384 0.5150378 -1440.482 -1375.718 1.263E-6
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 44 515.0375 484.9625 0.9718825 0.499067 2.9783915 0.5242387 0.5150376 -1440.482 -1375.718 9.0744E-7
~john-c/5421/emalgorithm.sas 26DEC10 16:22
The SAS System 16:22 Sunday, December 26, 2010 5
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 45 515.03743 484.96257 0.9718823 0.4990668 2.9783913 0.5242389 0.5150375 -1440.482 -1375.718 6.5199E-7
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 46 515.03737 484.96263 0.9718822 0.4990667 2.9783911 0.524239 0.5150374 -1440.482 -1375.718 4.6845E-7
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 47 515.03733 484.96267 0.9718821 0.4990666 2.978391 0.5242391 0.5150374 -1440.482 -1375.718 3.3658E-7
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 48 515.0373 484.9627 0.971882 0.4990666 2.9783909 0.5242392 0.5150373 -1440.482 -1375.718 2.4183E-7
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 49 515.03728 484.96272 0.9718819 0.4990665 2.9783908 0.5242393 0.5150373 -1440.482 -1375.718 1.7375E-7
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 50 515.03727 484.96273 0.9718819 0.4990665 2.9783908 0.5242393 0.5150373 -1440.482 -1375.718 1.2484E-7
ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF
iter = 51 515.03726 484.96274 0.9718819 0.4990665 2.9783908 0.5242393 0.5150373 -1440.482 -1375.718 8.9695E-8
~john-c/5421/emalgorithm.sas 26DEC10 16:22
The SAS System 16:22 Sunday, December 26, 2010 6
The UNIVARIATE Procedure
Variable: x
Moments
N 1000 Sum Weights 1000
Mean 1.94496391 Sum Observations 1944.96391
Std Deviation 1.12624922 Variance 1.2684373
Skewness 0.07244942 Kurtosis -1.2447552
Uncorrected SS 5050.05346 Corrected SS 1267.16886
Coeff Variation 57.9059187 Std Error Mean 0.03561513
Basic Statistical Measures
Location Variability
Mean 1.944964 Std Deviation 1.12625
Median 1.820389 Variance 1.26844
Mode . Range 5.02183
Interquartile Range 2.00123
Tests for Location: Mu0=0
Test -Statistic- -----p Value------
Student's t t 54.61061 Pr > |t| <.0001
Sign M 485 Pr >= |M| <.0001
Signed Rank S 249842 Pr >= |S| <.0001
Tests for Normality
Test --Statistic--- -----p Value------
Shapiro-Wilk W 0.950076 Pr < W <0.0001
Kolmogorov-Smirnov D 0.101295 Pr > D <0.0100
Cramer-von Mises W-Sq 3.720664 Pr > W-Sq <0.0050
Anderson-Darling A-Sq 20.30881 Pr > A-Sq <0.0050
Quantiles (Definition 5)
Quantile Estimate
100% Max 4.5148019
99% 4.0532249
95% 3.6613783
90% 3.4152895
75% Q3 2.9383861
50% Median 1.8203894
25% Q1 0.9371594
10% 0.5344829
5% 0.3544992
1% -0.0651857
~john-c/5421/emalgorithm.sas 26DEC10 16:22
The SAS System 16:22 Sunday, December 26, 2010 7
The UNIVARIATE Procedure
Variable: x
Quantiles (Definition 5)
Quantile Estimate
0% Min -0.5070234
Extreme Observations
------Lowest------ -----Highest-----
Value Obs Value Obs
-0.507023 83 4.34448 462
-0.505849 497 4.34469 327
-0.401534 383 4.35188 679
-0.263233 992 4.39133 972
-0.253069 76 4.51480 781
Histogram # Boxplot Normal Probability Plot
4.5+* 1 | 4.5+ ++ *
.*** 5 | | + *
.***** 9 | | ++ ****
.******* 14 | | ++****
.***************** 34 | | *****
3.5+******************** 39 | 3.5+ ***
.*************************** 54 | | ****
.*********************************** 69 | | ***+
.***************************************** 81 +-----+ | ****+
.*********************************** 69 | | | ***++
2.5+************************* 50 | | 2.5+ ** +
.**************** 31 | | | **++
.************** 28 | | | **+
.********* 18 *--+--* | +*
.*************** 30 | | | ++**
1.5+****************************** 60 | | 1.5+ + **
.****************************** 59 | | | ++***
.**************************************** 80 | | | ++***
.************************************ 71 +-----+ | ++***
.***************************************** 81 | | ****
0.5+***************************** 57 | 0.5+ ****
.*************** 30 | | ****++
.******** 15 | | **** ++
.**** 8 | | *** ++
.** 4 | |* +
-0.5+** 3 | -0.5+* ++
----+----+----+----+----+----+----+----+- +----+----+----+----+----+----+----+----+----+----+
* may represent up to 2 counts -2 -1 0 +1 +2
~john-c/5421/emalgorithm.sas 26DEC10 16:22
==========================================================================
Several things to note in the program:
1. The notation basically follows that in the paper by Jeff Bilmes referred to above.
2. Note that the log likelihood does indeed increase from one iteration to the next.
3. The algorithm in this case converged after 51 iterations. Here are the statistics:
Parameters True Values Starting Values Final Estimates
---------- ----------- --------------- ---------------
mu1 1.0000 2.0000 0.9719
mu2 3.0000 3.0000 2.9784
sigma1 0.5000 0.7000 0.4991
sigma2 0.5000 0.7000 0.5242
alpha 0.5000 0.6000 0.5150
----------------------------------------------------------------------------------------
3. Note that, in the printout, the PROC UNIVARIATE plot clearly indicates that this
distribution is bimodal.
4. Note that the EM algorithm does not automatically produce a covariance matrix for the
parameter estimates (unlike maximum likelihood estimation, where variances and covariances
of the parameter estimates are computed from the Fisher Information Matrix). However,
it is not hard to find estimates of the variances and covariances of the parameter estimates
using the delta method. See notes.029.
~john-c/5421/notes.emalgorithm Revised Dec. 26, 2010.