The EM Algorithm SPH 7460 notes.emalgorithm The EM algorithm was the subject of a famous paper in 1977 by A. P. Dempster, Nan Laird, and Donald Rubin (J. Royal Statist Soc Series B, vol 39 pp 1-38.) Many of the ideas in this paper had been discovered several times prior to 1977, but DLR stated them in more generality. The algorithm works because of a basic theorem called Jensen's Inequality. Basically, if X is a random variable, Jensen's Inequality states that log(Mean(X)) > Mean(log(X)). This result in turn is a consequence of the fact that the log function is a concave function. A reasonably good exposition of the EM algorithm is given in a paper by Jeff Bilmes which will be handed out in class. Bilmes illustrates the method by applying it to a 'mixture' model, i.e., a distribution which is a mixture of different normal distributions. The following is a program that implements the EM algorithm for a simple mixture problem in which the distribution is a mixture of two normals, with means mu1 and mu2 and standard deviations sigma1 and sigma2, and a given probability alpha that an observation is from the first normal distribution and probability 1 - alpha of being from the second normal distribution. Thus the objective is to estimate mu1, mu2, sigma1, sigma2, and alpha. ============================================================================== options linesize = 200 ; footnote "~john-c/5421/emalgorithm.sas &sysdate &systime" ; data curve ; mu1 = 1 ; mu2 = 3 ; sigma1 = 1 ; sigma2 = 1 ; alpha = .5 ; n = 1000 ; do i = 1 to n ; group = 1 ; r = ranuni(-1) ; if r > alpha then group = 2 ; if group = 1 then do ; x = mu1 + sigma1 * rannor(-1) ; end ; if group = 2 then do ; x = mu2 + sigma2 * rannor(-1) ; end ; output ; end ; run ; proc iml ; use curve ; read all var {x} into x ; N = nrow(x) ; itlim = 400 ; pi = 4 * atan(1) ; mu1 = 4.0 ; mu2 = 4.0 ; sigma1 = .8 ; sigma2 = 1.3 ; alpha = .6 ; pi = 4 * atan(1) ; eps = 1e-8 ; diff = 9999 ; do iter = 1 to itlim while (diff > eps) ; sumnumermu1 = 0 ; sumnumermu2 = 0 ; sumz1 = 0 ; sumz2 = 0 ; sumloglike = 0 ; sumq = 0 ; do i = 1 to n ; xi = x[i] ; prob1 = (1 / sqrt(2 * pi))*(1 / sigma1) * exp(-(xi - mu1)**2 /(2 * sigma1**2)) ; prob2 = (1 / sqrt(2 * pi))*(1 / sigma2) * exp(-(xi - mu2)**2 /(2 * sigma2**2)) ; t1 = prob1 * alpha ; t2 = prob2 * (1 - alpha) ; z1i = t1 / (t1 + t2) ; z2i = t2 / (t1 + t2) ; firstpart1 = log(alpha) * z1i ; secondpart1 = log(prob1) * z1i ; firstpart2 = log(1 - alpha) * z2i ; secondpart2 = log(prob2) * z2i ; sumq = sumq + firstpart1 + secondpart1 + firstpart2 + secondpart2 ; sumloglike = sumloglike + log(alpha * prob1 + (1 - alpha) * prob2) ; sumz1 = sumz1 + z1i ; sumz2 = sumz2 + z2i ; numermu1i = xi * z1i ; numermu2i = xi * z2i ; sumnumermu1 = sumnumermu1 + numermu1i ; sumnumermu2 = sumnumermu2 + numermu2i ; end ; newalpha = sumz1 / n ; newmu1 = sumnumermu1 / sumz1 ; newmu2 = sumnumermu2 / sumz2 ; sumz1 = 0 ; sumz2 = 0 ; sumsig1 = 0 ; sumsig2 = 0 ; do i = 1 to N ; xi = x[i] ; prob1 = (1 / sqrt(2 * pi))*(1 / sigma1) * exp(-(xi - mu1)**2 /(2 * sigma1**2)) ; prob2 = (1 / sqrt(2 * pi))*(1 / sigma2) * exp(-(xi - mu2)**2 /(2 * sigma2**2)) ; t1 = prob1 * newalpha ; t2 = prob2 * (1 - newalpha) ; z1i = t1 / (t1 + t2) ; z2i = t2 / (t1 + t2) ; sigma1numeri = z1i * (xi - newmu1)**2 * prob1 ; sigma2numeri = z2i * (xi - newmu2)**2 * prob2 ; sigma1numeri = z1i * (xi - newmu1)**2 ; sigma2numeri = z2i * (xi - newmu2)**2 ; sumsig1 = sumsig1 + sigma1numeri ; sumsig2 = sumsig2 + sigma2numeri ; sumz1 = sumz1 + z1i ; sumz2 = sumz2 + z2i ; end ; newsigma1 = sqrt(sumsig1 / sumz1) ; newsigma2 = sqrt(sumsig2 / sumz2) ; diff = abs(newalpha - alpha) + abs(newmu1 - mu1) + abs(newmu2 - mu2) + abs(newsigma1 - sigma1) + abs(newsigma2 - sigma2) ; print "iter = " iter sumz1 sumz2 mu1 sigma1 mu2 sigma2 alpha sumq sumloglike diff ; print "iter = " iter newmu1 newsigma1 newmu2 newsigma2 newalpha ; alpha = newalpha ; mu1 = newmu1 ; mu2 = newmu2 ; sigma1 = newsigma1 ; sigma2 = newsigma2 ; end ; sumloglike = 0 ; quit ; endsas ; proc univariate data = curve plot normal ; var x ; run ; ============================================================================== Output from the program: The SAS System 16:22 Sunday, December 26, 2010 1 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 1 719.54853 280.45147 2 0.7 3 0.7 0.6 -2152.243 -1793.204 0.9972825 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 2 660.92035 339.07965 1.4337345 0.9543408 3.0307973 0.6340151 0.679894 -1782.227 -1477.166 0.2893968 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 3 640.55015 359.44985 1.3781324 0.8853073 3.0791927 0.5307658 0.6667775 -1692.87 -1449.702 0.2480738 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 4 620.67737 379.32263 1.3074685 0.816019 3.1141961 0.4772704 0.6471547 -1622.068 -1430.543 0.1781684 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 5 601.12775 398.87225 1.2419169 0.7507156 3.1215423 0.4584936 0.6259643 -1569.984 -1417.438 0.1479576 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 6 582.91113 417.08887 1.1841992 0.6950813 3.110618 0.4556881 0.6050886 -1530.234 -1406.546 0.1395247 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 7 567.03573 432.96427 1.1342649 0.6482143 3.0914626 0.4599533 0.5857858 -1499.344 -1397.045 0.1260924 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 8 553.95495 446.04505 1.0925576 0.6091167 3.0705377 0.4675821 0.5690521 -1476.382 -1389.354 0.1066973 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 9 543.6629 456.3371 1.0594165 0.5778772 3.0509371 0.4765872 0.5553411 -1460.727 -1383.806 0.0854031 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 10 535.84928 464.15072 1.0343432 0.5543082 3.0340083 0.4856915 0.5446134 -1451.038 -1380.238 0.065482 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 11 530.06401 469.93599 1.0160751 0.5373789 3.020179 0.4940401 0.5365067 -1445.547 -1378.145 0.0487448 ~john-c/5421/emalgorithm.sas 26DEC10 16:22 The SAS System 16:22 Sunday, December 26, 2010 2 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 12 525.84735 474.15265 1.0030705 0.5255894 3.0093354 0.5011643 0.5305239 -1442.658 -1376.991 0.0356309 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 13 522.80095 477.19905 0.9939139 0.5174905 3.0010775 0.5069302 0.5261722 -1441.239 -1376.378 0.0257828 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 14 520.60968 479.39032 0.987485 0.5119367 2.9949125 0.5114249 0.5230319 -1440.596 -1376.059 0.0185617 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 15 519.03649 480.96351 0.982964 0.5081101 2.9903698 0.5148391 0.5207746 -1440.338 -1375.893 0.0133326 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 16 517.90777 482.09223 0.9797739 0.5054547 2.9870508 0.5173871 0.5191545 -1440.261 -1375.808 0.0095685 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 17 517.09798 482.90202 0.9775145 0.5035983 2.9846392 0.5192661 0.5179924 -1440.263 -1375.765 0.006866 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 18 516.5169 483.4831 0.9759089 0.5022921 2.9828933 0.5206408 0.5171587 -1440.293 -1375.742 0.0049273 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 19 516.09983 483.90017 0.9747648 0.5013682 2.9816324 0.5216409 0.5165605 -1440.33 -1375.73 0.0035367 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 20 515.80042 484.19958 0.9739477 0.500712 2.9807231 0.5223658 0.5161311 -1440.365 -1375.725 0.0025391 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 21 515.58542 484.41458 0.9733633 0.5002445 2.9800682 0.5228898 0.5158229 -1440.393 -1375.721 0.0018233 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 22 515.43102 484.56898 0.9729448 0.4999106 2.9795968 0.523268 0.5156016 -1440.416 -1375.72 0.0013094 ~john-c/5421/emalgorithm.sas 26DEC10 16:22 The SAS System 16:22 Sunday, December 26, 2010 3 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 23 515.32013 484.67987 0.9726448 0.4996718 2.9792577 0.5235406 0.5154426 -1440.434 -1375.719 0.0009405 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 24 515.24047 484.75953 0.9724296 0.4995008 2.9790139 0.5237368 0.5153285 -1440.447 -1375.719 0.0006756 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 25 515.18324 484.81676 0.9722752 0.4993782 2.9788386 0.5238781 0.5152465 -1440.456 -1375.718 0.0004853 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 26 515.14213 484.85787 0.9721643 0.4992903 2.9787125 0.5239797 0.5151875 -1440.463 -1375.718 0.0003486 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 27 515.1126 484.8874 0.9720847 0.4992272 2.978622 0.5240527 0.5151452 -1440.468 -1375.718 0.0002505 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 28 515.09138 484.90862 0.9720276 0.4991819 2.9785569 0.5241052 0.5151148 -1440.472 -1375.718 0.00018 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 29 515.07614 484.92386 0.9719865 0.4991494 2.9785101 0.524143 0.515093 -1440.475 -1375.718 0.0001293 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 30 515.06518 484.93482 0.971957 0.499126 2.9784765 0.5241701 0.5150773 -1440.477 -1375.718 0.0000929 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 31 515.05731 484.94269 0.9719359 0.4991092 2.9784523 0.5241896 0.515066 -1440.478 -1375.718 0.0000667 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 32 515.05166 484.94834 0.9719206 0.4990972 2.978435 0.5242036 0.5150579 -1440.479 -1375.718 0.0000479 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 33 515.0476 484.9524 0.9719097 0.4990885 2.9784225 0.5242137 0.5150521 -1440.48 -1375.718 0.0000345 ~john-c/5421/emalgorithm.sas 26DEC10 16:22 The SAS System 16:22 Sunday, December 26, 2010 4 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 34 515.04468 484.95532 0.9719018 0.4990823 2.9784136 0.5242209 0.5150479 -1440.481 -1375.718 0.0000248 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 35 515.04258 484.95742 0.9718962 0.4990778 2.9784071 0.5242261 0.5150449 -1440.481 -1375.718 0.0000178 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 36 515.04108 484.95892 0.9718921 0.4990746 2.9784025 0.5242299 0.5150427 -1440.481 -1375.718 0.0000128 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 37 515.03999 484.96001 0.9718892 0.4990723 2.9783992 0.5242325 0.5150412 -1440.481 -1375.718 9.1807E-6 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 38 515.03922 484.96078 0.9718871 0.4990706 2.9783968 0.5242345 0.5150401 -1440.482 -1375.718 6.5962E-6 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 39 515.03866 484.96134 0.9718856 0.4990695 2.9783951 0.5242359 0.5150393 -1440.482 -1375.718 4.7393E-6 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 40 515.03825 484.96175 0.9718846 0.4990686 2.9783938 0.5242368 0.5150387 -1440.482 -1375.718 3.4052E-6 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 41 515.03797 484.96203 0.9718838 0.499068 2.9783929 0.5242376 0.5150383 -1440.482 -1375.718 2.4466E-6 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 42 515.03776 484.96224 0.9718832 0.4990675 2.9783923 0.5242381 0.515038 -1440.482 -1375.718 1.7578E-6 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 43 515.03761 484.96239 0.9718828 0.4990672 2.9783919 0.5242384 0.5150378 -1440.482 -1375.718 1.263E-6 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 44 515.0375 484.9625 0.9718825 0.499067 2.9783915 0.5242387 0.5150376 -1440.482 -1375.718 9.0744E-7 ~john-c/5421/emalgorithm.sas 26DEC10 16:22 The SAS System 16:22 Sunday, December 26, 2010 5 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 45 515.03743 484.96257 0.9718823 0.4990668 2.9783913 0.5242389 0.5150375 -1440.482 -1375.718 6.5199E-7 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 46 515.03737 484.96263 0.9718822 0.4990667 2.9783911 0.524239 0.5150374 -1440.482 -1375.718 4.6845E-7 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 47 515.03733 484.96267 0.9718821 0.4990666 2.978391 0.5242391 0.5150374 -1440.482 -1375.718 3.3658E-7 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 48 515.0373 484.9627 0.971882 0.4990666 2.9783909 0.5242392 0.5150373 -1440.482 -1375.718 2.4183E-7 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 49 515.03728 484.96272 0.9718819 0.4990665 2.9783908 0.5242393 0.5150373 -1440.482 -1375.718 1.7375E-7 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 50 515.03727 484.96273 0.9718819 0.4990665 2.9783908 0.5242393 0.5150373 -1440.482 -1375.718 1.2484E-7 ITER SUMZ1 SUMZ2 MU1 SIGMA1 MU2 SIGMA2 ALPHA SUMQ SUMLOGLIKE DIFF iter = 51 515.03726 484.96274 0.9718819 0.4990665 2.9783908 0.5242393 0.5150373 -1440.482 -1375.718 8.9695E-8 ~john-c/5421/emalgorithm.sas 26DEC10 16:22 The SAS System 16:22 Sunday, December 26, 2010 6 The UNIVARIATE Procedure Variable: x Moments N 1000 Sum Weights 1000 Mean 1.94496391 Sum Observations 1944.96391 Std Deviation 1.12624922 Variance 1.2684373 Skewness 0.07244942 Kurtosis -1.2447552 Uncorrected SS 5050.05346 Corrected SS 1267.16886 Coeff Variation 57.9059187 Std Error Mean 0.03561513 Basic Statistical Measures Location Variability Mean 1.944964 Std Deviation 1.12625 Median 1.820389 Variance 1.26844 Mode . Range 5.02183 Interquartile Range 2.00123 Tests for Location: Mu0=0 Test -Statistic- -----p Value------ Student's t t 54.61061 Pr > |t| <.0001 Sign M 485 Pr >= |M| <.0001 Signed Rank S 249842 Pr >= |S| <.0001 Tests for Normality Test --Statistic--- -----p Value------ Shapiro-Wilk W 0.950076 Pr < W <0.0001 Kolmogorov-Smirnov D 0.101295 Pr > D <0.0100 Cramer-von Mises W-Sq 3.720664 Pr > W-Sq <0.0050 Anderson-Darling A-Sq 20.30881 Pr > A-Sq <0.0050 Quantiles (Definition 5) Quantile Estimate 100% Max 4.5148019 99% 4.0532249 95% 3.6613783 90% 3.4152895 75% Q3 2.9383861 50% Median 1.8203894 25% Q1 0.9371594 10% 0.5344829 5% 0.3544992 1% -0.0651857 ~john-c/5421/emalgorithm.sas 26DEC10 16:22 The SAS System 16:22 Sunday, December 26, 2010 7 The UNIVARIATE Procedure Variable: x Quantiles (Definition 5) Quantile Estimate 0% Min -0.5070234 Extreme Observations ------Lowest------ -----Highest----- Value Obs Value Obs -0.507023 83 4.34448 462 -0.505849 497 4.34469 327 -0.401534 383 4.35188 679 -0.263233 992 4.39133 972 -0.253069 76 4.51480 781 Histogram # Boxplot Normal Probability Plot 4.5+* 1 | 4.5+ ++ * .*** 5 | | + * .***** 9 | | ++ **** .******* 14 | | ++**** .***************** 34 | | ***** 3.5+******************** 39 | 3.5+ *** .*************************** 54 | | **** .*********************************** 69 | | ***+ .***************************************** 81 +-----+ | ****+ .*********************************** 69 | | | ***++ 2.5+************************* 50 | | 2.5+ ** + .**************** 31 | | | **++ .************** 28 | | | **+ .********* 18 *--+--* | +* .*************** 30 | | | ++** 1.5+****************************** 60 | | 1.5+ + ** .****************************** 59 | | | ++*** .**************************************** 80 | | | ++*** .************************************ 71 +-----+ | ++*** .***************************************** 81 | | **** 0.5+***************************** 57 | 0.5+ **** .*************** 30 | | ****++ .******** 15 | | **** ++ .**** 8 | | *** ++ .** 4 | |* + -0.5+** 3 | -0.5+* ++ ----+----+----+----+----+----+----+----+- +----+----+----+----+----+----+----+----+----+----+ * may represent up to 2 counts -2 -1 0 +1 +2 ~john-c/5421/emalgorithm.sas 26DEC10 16:22 ========================================================================== Several things to note in the program: 1. The notation basically follows that in the paper by Jeff Bilmes referred to above. 2. Note that the log likelihood does indeed increase from one iteration to the next. 3. The algorithm in this case converged after 51 iterations. Here are the statistics: Parameters True Values Starting Values Final Estimates ---------- ----------- --------------- --------------- mu1 1.0000 2.0000 0.9719 mu2 3.0000 3.0000 2.9784 sigma1 0.5000 0.7000 0.4991 sigma2 0.5000 0.7000 0.5242 alpha 0.5000 0.6000 0.5150 ---------------------------------------------------------------------------------------- 3. Note that, in the printout, the PROC UNIVARIATE plot clearly indicates that this distribution is bimodal. 4. Note that the EM algorithm does not automatically produce a covariance matrix for the parameter estimates (unlike maximum likelihood estimation, where variances and covariances of the parameter estimates are computed from the Fisher Information Matrix). However, it is not hard to find estimates of the variances and covariances of the parameter estimates using the delta method. See notes.029. ~john-c/5421/notes.emalgorithm Revised Dec. 26, 2010.