VARIANCE ESTIMATION, Continued SPH 5421 notes.029
THE 'DELTA' METHOD
The 'delta' method, also known as 'propagation of errors', is a useful
way to estimate variances of functions of random variables. The basic problem
is as follows:
Assume A and B are random variables with means MUA and MUB, and variances
SIG2A and SIG2B respectively, and covariance COVAB. Suppose f(A, B) is a
real-valued function of these two random variables. What is the variance of
f(A, B), and how may it be used for construction of confidence intervals?
An answer to this is shown below. Before giving the answer, I must point
out that it is not always explained very well what conditions must be met for
the answer to be valid. Essentially the conditions are that X and Y must
behave like maximum likelihood estimates of parameters. More precisely,
the multivariate distribution of A and B must be such that
sqrt(N) * [(A, B)` - (MUA, MUB}']
converges in distribution (as sample size N gets large) to a multivariate normal
distribution with mean (MUA, MUB)` and covariance matrix SIGMA, where
| SIG2A COVAB |
SIGMA = | |
| COVAB SIG2B |.
When these conditions are met, the approximate variance of f(A, B) is
given by
a
var(f(A, B)) = (df/dA)2 * var(A) + 2 * (df/dA)*(df/dB)*cov(A, B) +
(df/dB)2 * var(B).
This can be written in matrix terms as:
a
var(f(A, B)) = (df/dA df/dB) * SIGMA * (df/dA df/dB)`.
This form of the delta method lends itself to computation in PROC IML.
The 'delta method' is based on the first-degree Taylor's expansion of
the function f(A, B), and requires that the function be at least differentiable.
It generalizes to higher dimensions in a straightforward way.
CONFIDENCE REGIONS
Many statistical routines which compute maximum likelihood
estimates also produce estimated covariance matrices for those
estimates. If these are diagonal matrices, the corresponding
confidence regions are ellipsoids whose axes are parallel to the
coordinate axes in the parameter space. One can compute confidence
regions for the estimates by describing an ellipsoid with a specified
volume. In two dimensions, the area of an ellipse is the product
pi * a * b,
where a and b are the lengths of the major and minor axes of the
ellipse. There is a more complicated expression for higher
dimensions.
In general the covariance matrix is not diagonal. Confidence
regions are ellipses, but they do not have their axes parallel to
the coordinate axes. To find their volumes (or areas) one needs to
(1) find their major and minor axes, and (2) compute their volumes
using the lengths of these axes. As explained in notes.021 the
axes can be computed by finding the eigenvectors of the covariance
matrix.
==================================================================================
PROBLEM 31
Refer to Problem 30, notes.028. Use a PROC IML implementation of the
method just described to compute the standard error of the maximum
likelihood estimate of
1 / (1 + exp(-b0 - b1 * 2)),
where b0 and b1 are the maximum likelihood estimates of the true values
B0 and B1.
Use this to compute 95% confidence limits of the true value of
the above expression.
==================================================================================
PROBLEM 31a:
Assume (b1, b2) are maximum likelihood estimates of
parameters (B1, B2), and that the estimated covariance matrix for
(b1, b2) is
| .02 -.01 |
| |
|-.01 .03 |.
Find a 95% confidence region for the true parameters (B1, B2).
==================================================================================
/home/walleye/john-c/5421/notes.029 Last update: December 8, 2003.