The EM Algorithm                                     SPH 7460 notes.emalgorithm
The EM algorithm was the subject of a famous paper in 1977 by A. P. Dempster,
Nan Laird, and Donald Rubin (J. Royal Statist Soc Series B, vol 39 pp 1-38.)
Many of the ideas in this paper had been discovered several times prior to 1977, but
DLR stated them in more generality.  The algorithm works because of a basic theorem
called Jensen's Inequality.  Basically, if X is a random variable, Jensen's Inequality
states that
       log(Mean(X)) > Mean(log(X)).
This result in turn is a consequence of the fact that the log function is a
concave function.
A reasonably good exposition of the EM algorithm is given in a paper by Jeff
Bilmes which will be handed out in class.  Bilmes illustrates the method by
applying it to a 'mixture' model, i.e., a distribution which is a mixture of
different normal distributions.
The following is a program that implements the EM algorithm for a simple mixture
problem in which the distribution is a mixture of two normals, with means mu1
and mu2 and standard deviations sigma1 and sigma2, and a given probability alpha
that an observation is from the first normal distribution and  probability 1 - alpha of
being from the second normal distribution.  Thus the objective is to estimate
mu1, mu2, sigma1, sigma2, and alpha.
==============================================================================
options linesize = 200 ;
footnote "~john-c/5421/emalgorithm.sas &sysdate &systime" ;
data curve ;
     mu1 = 1 ; mu2 = 3 ;
     sigma1 = 1 ; sigma2 = 1 ;
     alpha = .5 ;
     n = 1000 ;
     do i = 1 to n ;
        group = 1 ;
        r = ranuni(-1) ;
        if r > alpha then group = 2 ;
        if group = 1 then do ;
           x = mu1 + sigma1 * rannor(-1) ;
        end ;
        if group = 2 then do ;
           x = mu2 + sigma2 * rannor(-1) ;
        end ;
        output ;
      end ;
run ;
proc iml ;
     use curve ;
     read all var {x} into x ;
     N = nrow(x) ;
     itlim = 400 ;
     pi = 4 * atan(1) ;
     mu1 = 4.0 ; mu2 = 4.0 ;
     sigma1 = .8 ; sigma2 = 1.3 ;
     alpha = .6 ;
     pi = 4 * atan(1) ;
     eps = 1e-8 ;
     diff = 9999 ;
     do iter = 1 to itlim while (diff > eps) ;
        sumnumermu1 = 0 ;
        sumnumermu2 = 0 ;
        sumz1 = 0 ;
        sumz2 = 0 ;
        sumloglike = 0 ;
        sumq = 0 ;
        do i = 1 to n ;
           xi = x[i] ;
           prob1 = (1 / sqrt(2 * pi))*(1 / sigma1) * exp(-(xi - mu1)**2 /(2 * sigma1**2)) ;
           prob2 = (1 / sqrt(2 * pi))*(1 / sigma2) * exp(-(xi - mu2)**2 /(2 * sigma2**2)) ;
           t1 = prob1 * alpha ;
           t2 = prob2 * (1 - alpha) ;
           z1i = t1 / (t1 + t2) ;
           z2i = t2 / (t1 + t2) ;
           firstpart1 = log(alpha) * z1i ;
           secondpart1 = log(prob1) * z1i ;
           firstpart2 = log(1 - alpha) * z2i ;
           secondpart2 = log(prob2) * z2i ;
           sumq =   sumq + firstpart1 + secondpart1
                  + firstpart2 + secondpart2 ;
           sumloglike = sumloglike + log(alpha * prob1 + (1 - alpha) * prob2) ;
           sumz1 = sumz1 + z1i ;
           sumz2 = sumz2 + z2i ;
           numermu1i = xi * z1i ;
           numermu2i = xi * z2i ;
           sumnumermu1 = sumnumermu1 + numermu1i ;
           sumnumermu2 = sumnumermu2 + numermu2i ;
        end ;
        newalpha = sumz1 / n ;
        newmu1 = sumnumermu1 / sumz1 ;
        newmu2 = sumnumermu2 / sumz2 ;
        sumz1 = 0 ;
        sumz2 = 0 ;
        sumsig1 = 0 ;
        sumsig2 = 0 ;
        do i = 1 to N ;
           xi = x[i] ;
           prob1 = (1 / sqrt(2 * pi))*(1 / sigma1) * exp(-(xi - mu1)**2 /(2 * sigma1**2)) ;
           prob2 = (1 / sqrt(2 * pi))*(1 / sigma2) * exp(-(xi - mu2)**2 /(2 * sigma2**2)) ;
           t1 = prob1 * newalpha ;
           t2 = prob2 * (1 - newalpha) ;
           z1i = t1 / (t1 + t2) ;
           z2i = t2 / (t1 + t2) ;
           sigma1numeri = z1i * (xi - newmu1)**2 * prob1 ;
           sigma2numeri = z2i * (xi - newmu2)**2 * prob2 ;
           sigma1numeri = z1i * (xi - newmu1)**2 ;
           sigma2numeri = z2i * (xi - newmu2)**2 ;
           sumsig1 = sumsig1 + sigma1numeri ;
           sumsig2 = sumsig2 + sigma2numeri ;
           sumz1 = sumz1 + z1i ;
           sumz2 = sumz2 + z2i ;
        end ;
        newsigma1 = sqrt(sumsig1 / sumz1) ;
        newsigma2 = sqrt(sumsig2 / sumz2) ;
        diff =   abs(newalpha - alpha) + abs(newmu1 - mu1) + abs(newmu2 - mu2)
               + abs(newsigma1 - sigma1) + abs(newsigma2 - sigma2) ;
        print "iter = " iter sumz1 sumz2 mu1    sigma1    mu2    sigma2    alpha sumq sumloglike diff ;
        print "iter = " iter newmu1 newsigma1 newmu2 newsigma2 newalpha ;
        alpha = newalpha ;
        mu1 = newmu1 ;
        mu2 = newmu2 ;
        sigma1 = newsigma1 ;
        sigma2 = newsigma2 ;
     end ;
     sumloglike = 0 ;
quit ;
endsas ;
proc univariate data = curve plot normal ;
     var x ;
run ;
==============================================================================
Output from the program:
                                                                    The SAS System                                 16:22 Sunday, December 26, 2010   1
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =          1 719.54853 280.45147         2       0.7         3       0.7       0.6 -2152.243  -1793.204 0.9972825
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =          2 660.92035 339.07965 1.4337345 0.9543408 3.0307973 0.6340151  0.679894 -1782.227  -1477.166 0.2893968
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =          3 640.55015 359.44985 1.3781324 0.8853073 3.0791927 0.5307658 0.6667775  -1692.87  -1449.702 0.2480738
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =          4 620.67737 379.32263 1.3074685  0.816019 3.1141961 0.4772704 0.6471547 -1622.068  -1430.543 0.1781684
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =          5 601.12775 398.87225 1.2419169 0.7507156 3.1215423 0.4584936 0.6259643 -1569.984  -1417.438 0.1479576
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =          6 582.91113 417.08887 1.1841992 0.6950813  3.110618 0.4556881 0.6050886 -1530.234  -1406.546 0.1395247
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =          7 567.03573 432.96427 1.1342649 0.6482143 3.0914626 0.4599533 0.5857858 -1499.344  -1397.045 0.1260924
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =          8 553.95495 446.04505 1.0925576 0.6091167 3.0705377 0.4675821 0.5690521 -1476.382  -1389.354 0.1066973
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =          9  543.6629  456.3371 1.0594165 0.5778772 3.0509371 0.4765872 0.5553411 -1460.727  -1383.806 0.0854031
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         10 535.84928 464.15072 1.0343432 0.5543082 3.0340083 0.4856915 0.5446134 -1451.038  -1380.238  0.065482
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         11 530.06401 469.93599 1.0160751 0.5373789  3.020179 0.4940401 0.5365067 -1445.547  -1378.145 0.0487448
 
 
 
 
                                                      ~john-c/5421/emalgorithm.sas 26DEC10 16:22
                                                                    The SAS System                                 16:22 Sunday, December 26, 2010   2
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         12 525.84735 474.15265 1.0030705 0.5255894 3.0093354 0.5011643 0.5305239 -1442.658  -1376.991 0.0356309
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         13 522.80095 477.19905 0.9939139 0.5174905 3.0010775 0.5069302 0.5261722 -1441.239  -1376.378 0.0257828
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         14 520.60968 479.39032  0.987485 0.5119367 2.9949125 0.5114249 0.5230319 -1440.596  -1376.059 0.0185617
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         15 519.03649 480.96351  0.982964 0.5081101 2.9903698 0.5148391 0.5207746 -1440.338  -1375.893 0.0133326
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         16 517.90777 482.09223 0.9797739 0.5054547 2.9870508 0.5173871 0.5191545 -1440.261  -1375.808 0.0095685
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         17 517.09798 482.90202 0.9775145 0.5035983 2.9846392 0.5192661 0.5179924 -1440.263  -1375.765  0.006866
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         18  516.5169  483.4831 0.9759089 0.5022921 2.9828933 0.5206408 0.5171587 -1440.293  -1375.742 0.0049273
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         19 516.09983 483.90017 0.9747648 0.5013682 2.9816324 0.5216409 0.5165605  -1440.33   -1375.73 0.0035367
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         20 515.80042 484.19958 0.9739477  0.500712 2.9807231 0.5223658 0.5161311 -1440.365  -1375.725 0.0025391
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         21 515.58542 484.41458 0.9733633 0.5002445 2.9800682 0.5228898 0.5158229 -1440.393  -1375.721 0.0018233
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         22 515.43102 484.56898 0.9729448 0.4999106 2.9795968  0.523268 0.5156016 -1440.416   -1375.72 0.0013094
 
 
 
 
                                                      ~john-c/5421/emalgorithm.sas 26DEC10 16:22
                                                                    The SAS System                                 16:22 Sunday, December 26, 2010   3
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         23 515.32013 484.67987 0.9726448 0.4996718 2.9792577 0.5235406 0.5154426 -1440.434  -1375.719 0.0009405
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         24 515.24047 484.75953 0.9724296 0.4995008 2.9790139 0.5237368 0.5153285 -1440.447  -1375.719 0.0006756
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         25 515.18324 484.81676 0.9722752 0.4993782 2.9788386 0.5238781 0.5152465 -1440.456  -1375.718 0.0004853
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         26 515.14213 484.85787 0.9721643 0.4992903 2.9787125 0.5239797 0.5151875 -1440.463  -1375.718 0.0003486
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         27  515.1126  484.8874 0.9720847 0.4992272  2.978622 0.5240527 0.5151452 -1440.468  -1375.718 0.0002505
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         28 515.09138 484.90862 0.9720276 0.4991819 2.9785569 0.5241052 0.5151148 -1440.472  -1375.718   0.00018
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         29 515.07614 484.92386 0.9719865 0.4991494 2.9785101  0.524143  0.515093 -1440.475  -1375.718 0.0001293
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         30 515.06518 484.93482  0.971957  0.499126 2.9784765 0.5241701 0.5150773 -1440.477  -1375.718 0.0000929
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         31 515.05731 484.94269 0.9719359 0.4991092 2.9784523 0.5241896  0.515066 -1440.478  -1375.718 0.0000667
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         32 515.05166 484.94834 0.9719206 0.4990972  2.978435 0.5242036 0.5150579 -1440.479  -1375.718 0.0000479
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         33  515.0476  484.9524 0.9719097 0.4990885 2.9784225 0.5242137 0.5150521  -1440.48  -1375.718 0.0000345
 
 
 
 
                                                      ~john-c/5421/emalgorithm.sas 26DEC10 16:22
                                                                    The SAS System                                 16:22 Sunday, December 26, 2010   4
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         34 515.04468 484.95532 0.9719018 0.4990823 2.9784136 0.5242209 0.5150479 -1440.481  -1375.718 0.0000248
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         35 515.04258 484.95742 0.9718962 0.4990778 2.9784071 0.5242261 0.5150449 -1440.481  -1375.718 0.0000178
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         36 515.04108 484.95892 0.9718921 0.4990746 2.9784025 0.5242299 0.5150427 -1440.481  -1375.718 0.0000128
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         37 515.03999 484.96001 0.9718892 0.4990723 2.9783992 0.5242325 0.5150412 -1440.481  -1375.718 9.1807E-6
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         38 515.03922 484.96078 0.9718871 0.4990706 2.9783968 0.5242345 0.5150401 -1440.482  -1375.718 6.5962E-6
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         39 515.03866 484.96134 0.9718856 0.4990695 2.9783951 0.5242359 0.5150393 -1440.482  -1375.718 4.7393E-6
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         40 515.03825 484.96175 0.9718846 0.4990686 2.9783938 0.5242368 0.5150387 -1440.482  -1375.718 3.4052E-6
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         41 515.03797 484.96203 0.9718838  0.499068 2.9783929 0.5242376 0.5150383 -1440.482  -1375.718 2.4466E-6
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         42 515.03776 484.96224 0.9718832 0.4990675 2.9783923 0.5242381  0.515038 -1440.482  -1375.718 1.7578E-6
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         43 515.03761 484.96239 0.9718828 0.4990672 2.9783919 0.5242384 0.5150378 -1440.482  -1375.718  1.263E-6
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         44  515.0375  484.9625 0.9718825  0.499067 2.9783915 0.5242387 0.5150376 -1440.482  -1375.718 9.0744E-7
 
 
 
 
                                                      ~john-c/5421/emalgorithm.sas 26DEC10 16:22
                                                                    The SAS System                                 16:22 Sunday, December 26, 2010   5
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         45 515.03743 484.96257 0.9718823 0.4990668 2.9783913 0.5242389 0.5150375 -1440.482  -1375.718 6.5199E-7
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         46 515.03737 484.96263 0.9718822 0.4990667 2.9783911  0.524239 0.5150374 -1440.482  -1375.718 4.6845E-7
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         47 515.03733 484.96267 0.9718821 0.4990666  2.978391 0.5242391 0.5150374 -1440.482  -1375.718 3.3658E-7
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         48  515.0373  484.9627  0.971882 0.4990666 2.9783909 0.5242392 0.5150373 -1440.482  -1375.718 2.4183E-7
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         49 515.03728 484.96272 0.9718819 0.4990665 2.9783908 0.5242393 0.5150373 -1440.482  -1375.718 1.7375E-7
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         50 515.03727 484.96273 0.9718819 0.4990665 2.9783908 0.5242393 0.5150373 -1440.482  -1375.718 1.2484E-7
                             ITER     SUMZ1     SUMZ2       MU1    SIGMA1       MU2    SIGMA2     ALPHA      SUMQ SUMLOGLIKE      DIFF
                iter =         51 515.03726 484.96274 0.9718819 0.4990665 2.9783908 0.5242393 0.5150373 -1440.482  -1375.718 8.9695E-8
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
                                                      ~john-c/5421/emalgorithm.sas 26DEC10 16:22
                                                                    The SAS System                                 16:22 Sunday, December 26, 2010   6
                                                               The UNIVARIATE Procedure
                                                                     Variable:  x
                                                                       Moments
                                           N                        1000    Sum Weights               1000
                                           Mean               1.94496391    Sum Observations    1944.96391
                                           Std Deviation      1.12624922    Variance             1.2684373
                                           Skewness           0.07244942    Kurtosis            -1.2447552
                                           Uncorrected SS     5050.05346    Corrected SS        1267.16886
                                           Coeff Variation    57.9059187    Std Error Mean      0.03561513
                                                              Basic Statistical Measures
 
                                                    Location                    Variability
                                                Mean     1.944964     Std Deviation            1.12625
                                                Median   1.820389     Variance                 1.26844
                                                Mode      .           Range                    5.02183
                                                                      Interquartile Range      2.00123
                                                              Tests for Location: Mu0=0
 
                                                   Test           -Statistic-    -----p Value------
                                                   Student's t    t  54.61061    Pr > |t|    <.0001
                                                   Sign           M       485    Pr >= |M|   <.0001
                                                   Signed Rank    S    249842    Pr >= |S|   <.0001
                                                                 Tests for Normality
 
                                              Test                  --Statistic---    -----p Value------
                                              Shapiro-Wilk          W     0.950076    Pr < W     <0.0001
                                              Kolmogorov-Smirnov    D     0.101295    Pr > D     <0.0100
                                              Cramer-von Mises      W-Sq  3.720664    Pr > W-Sq  <0.0050
                                              Anderson-Darling      A-Sq  20.30881    Pr > A-Sq  <0.0050
                                                               Quantiles (Definition 5)
 
                                                               Quantile        Estimate
                                                               100% Max       4.5148019
                                                               99%            4.0532249
                                                               95%            3.6613783
                                                               90%            3.4152895
                                                               75% Q3         2.9383861
                                                               50% Median     1.8203894
                                                               25% Q1         0.9371594
                                                               10%            0.5344829
                                                               5%             0.3544992
                                                               1%            -0.0651857
 
                                                      ~john-c/5421/emalgorithm.sas 26DEC10 16:22
                                                                    The SAS System                                 16:22 Sunday, December 26, 2010   7
                                                               The UNIVARIATE Procedure
                                                                     Variable:  x
                                                               Quantiles (Definition 5)
 
                                                               Quantile        Estimate
                                                               0% Min        -0.5070234
                                                                 Extreme Observations
 
                                                     ------Lowest------        -----Highest-----
 
                                                         Value      Obs           Value      Obs
                                                     -0.507023       83         4.34448      462
                                                     -0.505849      497         4.34469      327
                                                     -0.401534      383         4.35188      679
                                                     -0.263233      992         4.39133      972
                                                     -0.253069       76         4.51480      781
                        Histogram                      #             Boxplot                                   Normal Probability Plot
     4.5+*                                             1                |                    4.5+                                               ++ *
        .***                                           5                |                       |                                              +   *
        .*****                                         9                |                       |                                            ++ ****
        .*******                                      14                |                       |                                          ++****
        .*****************                            34                |                       |                                        *****
     3.5+********************                         39                |                    3.5+                                      ***
        .***************************                  54                |                       |                                   ****
        .***********************************          69                |                       |                                 ***+
        .*****************************************    81             +-----+                    |                              ****+
        .***********************************          69             |     |                    |                            ***++
     2.5+*************************                    50             |     |                 2.5+                           ** +
        .****************                             31             |     |                    |                          **++
        .**************                               28             |     |                    |                         **+
        .*********                                    18             *--+--*                    |                        +*
        .***************                              30             |     |                    |                      ++**
     1.5+******************************               60             |     |                 1.5+                     + **
        .******************************               59             |     |                    |                   ++***
        .****************************************     80             |     |                    |                 ++***
        .************************************         71             +-----+                    |               ++***
        .*****************************************    81                |                       |             ****
     0.5+*****************************                57                |                    0.5+          ****
        .***************                              30                |                       |      ****++
        .********                                     15                |                       |   **** ++
        .****                                          8                |                       | ***  ++
        .**                                            4                |                       |*    +
    -0.5+**                                            3                |                   -0.5+*  ++
         ----+----+----+----+----+----+----+----+-                                               +----+----+----+----+----+----+----+----+----+----+
         * may represent up to 2 counts                                                              -2        -1         0        +1        +2
 
 
 
                                                      ~john-c/5421/emalgorithm.sas 26DEC10 16:22
==========================================================================
Several things to note in the program:
1.  The notation basically follows that in the paper by Jeff Bilmes referred to above.
2.  Note that the log likelihood does indeed increase from one iteration to the next.
3.  The algorithm in this case converged after 51 iterations.  Here are the statistics:
    Parameters             True Values             Starting Values           Final Estimates
    ----------             -----------             ---------------           ---------------
       mu1                   1.0000                    2.0000                    0.9719
       mu2                   3.0000                    3.0000                    2.9784
       sigma1                0.5000                    0.7000                    0.4991
       sigma2                0.5000                    0.7000                    0.5242
       alpha                 0.5000                    0.6000                    0.5150
----------------------------------------------------------------------------------------
  3.  Note that, in the printout, the PROC UNIVARIATE plot clearly indicates that this
      distribution is bimodal.
  4.  Note that the EM algorithm does not automatically produce a covariance matrix for the
      parameter estimates (unlike maximum likelihood estimation, where variances and covariances
      of the parameter estimates are computed from the Fisher Information Matrix).  However,
      it is not hard to find estimates of the variances and covariances of the parameter estimates
      using the delta method.  See notes.029.
~john-c/5421/notes.emalgorithm  Revised Dec. 26, 2010.