SAS PROC NLIN SPH 5421 notes.026
PROC NLIN is a procedure to find least-squares estimates of
coefficients for nonlinear models. The underlying model is of the form
Yi = f(Xi; B) + ei,
where the ei are assumed to be normally distributed, ei ~ N(0, sig^2), B
is a vector of unknown coefficients, and Xi is a vector of covariates.
In general the function f is nonlinear in the coefficients B. The
randomness in the data is all contained in the ei.
In general, we will assume that Xi is a k-vector. The number of
unknown parameters may not equal k in general; we shall assume that B is a
p x 1 vector.
Here is an example:
[1] Yi = a * exp(-b * ti) + ei
In this case, Xi = ti, and B = (a, b)`. The unknowns to be found are
a, b, and sig^2.
Note that if this problem were changed just slightly, as follows,
[2] Yi = a * exp(-b + ti) + ei,
it would no longer be a nonlinear problem, because it could be written as
[3] Yi = a + c * ui + ei,
where c = exp(-b) and ui = exp(ti). It is thus just an ordinary linear
regression problem. You would estimate b by first estimating c and then
noting that
b = -log(c).
Hopefully your estimate of c is a positive number, since otherwise b will be
undefined.
In general if you can transform a nonlinear problem into a linear
one, you should do so and not bother with using PROC NLIN.
But the original model [1] is not linear. It cannot be converted to
be linear without changing the error structure.
Here is how PROC NLIN works. Consider the sum of squares
SS = sum_i(Yi - f(Xi; B))^2.
The objective is to find coefficients B which minimize this sum of
squares. This is equivalent to finding maximum likelihood estimates of
the coefficients.
You proceed as follows: take the first derivative of SS with respect
to each of the components of B = (b1, b2, b3, ... , bp)`, and set these
derivatives equal to zero:
[4] dSS/dbj = -2 * sum_i(Yi - f(Xi; B)) * df/dbj = 0.
Note here that j is the index of the j-th parameter, while i is the
index of the i-th observation. [4] is in fact a system of p equations.
In general these equations are not linear.
PROC NLIN can solve this system of equations by the use of several
algorithms, most of which are variants of the Newton-Raphson algorithm.
You need to specify the function f(X; B) and its first derivatives, and
which algorithm you want PROC NLIN to use.
The program below illustrates the use of several methods for PROC NLIN.
In the older versions of SAS (through Version 6.12), it was necessary to
specify the derivatives of the model with respect to the parameters. This is
done in some of the versions listed below, for example:
------------------------------------------------------------------------------
proc nlin method = marquardt ;
parms a .50
b .50 ;
der.a = 1 - exp(-b * (x - 8)) ;
der.b = - (x - 8) * (.49 - a) * exp(-b * (x - 8)) ;
model y = a + (.49 - a) * exp(-b * (x - 8)) ;
title 'PROC NLIN: Data from Draper and Smith: Method = MARQUARDT ' ;
run ;
------------------------------------------------------------------------------
However, later versions of SAS replaced the requirement that you supply
the derivative expressions with a method of automatic analytic computation of
the derivatives. This means that you can omit the statments like 'der.a = ... ' .
Note that this is used in the 'method = newton' example below.
It is also possible to have SAS perform an initial grid search for a
good starting point for the parameters. This is also incorporated into
the 'metho = newton' example:
------------------------------------------------------------------------------
proc nlin method = newton ;
parms a .2 to .6 by .05
b .01 to .5 by .05 ;
* der.a = 1 - exp(-b * (x - 8)) ;
* der.b = - (x - 8) * (.49 - a) * exp(-b * (x - 8)) ;
model y = a + (.49 - a) * exp(-b * (x - 8)) ;
title 'PROC NLIN: Data from Draper and Smith: Method = NEWTON with Initial Search ' ;
run ;
------------------------------------------------------------------------------
Regarding the DUD method: This method uses numerical estimation of the
derivatives. As you can see from the printout, this method fails to work even
when you incorporate an initial search. It is not recommended.
The dataset below is from the textbook, Applied Regression Analysis,
by N. Draper and H. Smith. The nonlinear regression problem involves
the amount of available chlorine in a product which is used for washing
clothes. The amount of available chlorine deteriorates with time. In
the following table, X is the time in weeks of storage of the product,
and Y is the amount of available chlorine:
-------------------------------------------------------------------------------
X = time Y = chlorine
-------- ------------
8 0.49
8 0.49
10 0.48
10 0.47
10 0.48
10 0.47
12 0.46
12 0.46
12 0.45
12 0.43
14 0.45
14 0.43
14 0.43
16 0.44
16 0.43
16 0.43
18 0.46
18 0.45
20 0.42
20 0.42
20 0.43
22 0.41
22 0.41
22 0.40
24 0.42
24 0.40
24 0.40
26 0.41
26 0.40
26 0.41
28 0.41
28 0.40
30 0.40
30 0.40
30 0.38
32 0.41
32 0.40
34 0.40
36 0.41
36 0.38
38 0.40
38 0.40
40 0.39
42 0.39
---------------------------------------------------------------------------------
The assumed model is:
Y = a + (.49 - a) * exp(-b * (X - 8)) + e,
where e is the error term, assumed N(0, sig2). Coefficients a and b
are unknown, and the problem is to estimate them. Note that the
coefficient a occurs in two places in the expression on the right.
The goal here is to find estimates of a and b and the variance of the
error term, sig2.
The following is a PROC NLIN program which estimates the coefficients
a and b. Note that in the printout, only the results of the procedure
with method = marquardt are shown.
==================================================================================
/* Program to illustrate the use of PROC NLIN for nonlinear least */
/* squares problems. The data here are from Draper & Smith, */
/* Applied Regression Analysis, Wiley & Sons, 1966 (1st edition).*/
/* Program to illustrate the use of PROC NLIN for nonlinear least */
/* squares problems. The data here are from Draper & Smith, */
/* Applied Regression Analysis, Wiley & Sons, 1966 (1st edition).*/
/* Program to illustrate the use of PROC NLIN for nonlinear least */
/* squares problems. The data here are from Draper & Smith, */
/* Applied Regression Analysis, Wiley & Sons, 1966 (1st edition).*/
options linesize = 80 ;
footnote "prog: /home/walleye/john-c/5421/draper.sas &sysdate &systime";
libname sasplus '' ;
FILENAME GRAPH 'gsas.grf' ;
LIBNAME loc '' ;
OPTIONS LINESIZE = 80 MPRINT ;
GOPTIONS
RESET = GLOBAL
ROTATE = PORTRAIT
FTEXT = SWISSB
DEVICE = PS300
GACCESS = SASGASTD
GSFNAME = GRAPH
GSFMODE = REPLACE
GUNIT = PCT BORDER
CBACK = WHITE
HTITLE = 2 HTEXT = 1 ;
*===================================================================== ;
data draper ;
infile 'draper.dat' ;
input x y ;
time = x ;
chlorine = y ;
run ;
data sasplus.drape ;
set draper ;
time = x ;
chlorine = y ;
run ;
symbol v = 'o' w = 2 h = 1.5 c = black ;
proc gplot data = draper ;
plot chlorine * time ;
title1 h = 2 'Draper data: Chlorine Concentration vs Storage Time' ;
run ;
proc nlin method = marquardt ;
parms a .50
b .50 ;
der.a = 1 - exp(-b * (x - 8)) ;
der.b = - (x - 8) * (.49 - a) * exp(-b * (x - 8)) ;
model y = a + (.49 - a) * exp(-b * (x - 8)) ;
title 'PROC NLIN: Data from Draper and Smith: Method = MARQUARDT ' ;
run ;
proc nlin method = gradient maxiter = 1000 ;
parms a .50
b .50 ;
der.a = 1 - exp(-b * (x - 8)) ;
der.b = - (x - 8) * (.49 - a) * exp(-b * (x - 8)) ;
model y = a + (.49 - a) * exp(-b * (x - 8)) ;
title 'PROC NLIN: Data from Draper and Smith: Method = GRADIENT ' ;
run ;
proc nlin method = gauss ;
parms a .50
b .50 ;
der.a = 1 - exp(-b * (x - 8)) ;
der.b = - (x - 8) * (.49 - a) * exp(-b * (x - 8)) ;
model y = a + (.49 - a) * exp(-b * (x - 8)) ;
title 'PROC NLIN: Data from Draper and Smith: Method = GAUSS ' ;
run ;
proc nlin method = newton ;
parms a .2 to .6 by .05
b .01 to .5 by .05 ;
* der.a = 1 - exp(-b * (x - 8)) ;
* der.b = - (x - 8) * (.49 - a) * exp(-b * (x - 8)) ;
model y = a + (.49 - a) * exp(-b * (x - 8)) ;
title 'PROC NLIN: Data from Draper and Smith: Method = NEWTON with Initial Search ' ;
run ;
title ' ... no title ...' ;
proc nlin method = dud ;
parms a .2 to .6 by .05
b .01 to .5 by .05 ;
model y = a + (.49 - a) * exp(-b * (x - 8)) ;
title 'PROC NLIN: Data from Draper and Smith: Method = DUD with Initial Search ' ;
run ;
proc iml ;
use draper ;
read all var {x} into x ;
read all var {y} into y ;
npoint = 0 ;
a = j(2500, 1, 0) ;
b = j(2500, 1, 0) ;
obsn = j(2500, 1, 0) ;
sumsq = j(2500, 1, 0) ;
do i = 1 to 50 ;
do j = 1 to 50 ;
npoint = npoint + 1 ;
obsn[npoint] = npoint ;
a[npoint] = .385 + .01 * i / 50 ;
b[npoint] = .095 + .01 * j / 50 ;
ai = a[npoint] ; bj = b[npoint] ;
sumsq[npoint] = 0 ;
do k = 1 to 44 ;
predy = ai + (.49 - ai) * exp(-bj * (x[k] - 8)) ;
obsdy = y[k] ;
sumsq[npoint] = sumsq[npoint] + 10000 * (obsdy - predy)**2 ;
end ;
end ;
end ;
V = obsn || a || b || sumsq ;
varnames = {'npoint' 'a' 'b' 'sumsq'} ;
create likesurf from V [colname = varnames] ;
append from V ;
quit ;
proc print data = likesurf ;
where npoint le 2500 ;
run ;
goptions gsfmode = append ;
proc gcontour data = likesurf ;
plot a * b = sumsq / levels = 50 to 56 by .10 ;
title H = 2 f = swissb 'Sum-of-Squares Surface: Draper-Smith Chlorine Data' ;
run ;
endsas ;
PROC NLIN: Data from Draper and Smith: Method = MARQUARDT 1
16:44 Wednesday, December 7, 2011
The NLIN Procedure
Dependent Variable Y
Method: Marquardt
Iterative Phase
Sum of
Iter a b Squares
0 0.5000 0.5000 0.2853
1 0.4084 6.2850 0.0383
2 0.4084 4.1490 0.0383
3 0.4084 3.8510 0.0382
4 0.4085 2.2122 0.0377
5 0.4087 1.6061 0.0362
6 0.4105 0.0983 0.0150
7 0.4006 0.1199 0.00588
8 0.3929 0.1072 0.00505
9 0.3904 0.1019 0.00500
10 0.3901 0.1016 0.00500
11 0.3901 0.1016 0.00500
NOTE: Convergence criterion met.
Estimation Summary
Method Marquardt
Iterations 11
Subiterations 14
Average Subiterations 1.272727
R 1.818E-6
PPC(b) 1.549E-6
RPC(b) 0.000067
Object 7.22E-9
Objective 0.005002
Observations Read 44
Observations Used 44
Observations Missing 0
NOTE: An intercept was not specified for this model.
Sum of Mean Approx
Source DF Squares Square F Value Pr > F
Regression 2 7.9820 3.9910 33513.1 <.0001
Residual 42 0.00500 0.000119
Uncorrected Total 44 7.9870
Corrected Total 43 0.0395
PROC NLIN: Data from Draper and Smith: Method = MARQUARDT 2
16:44 Wednesday, December 7, 2011
The NLIN Procedure
Approx Approximate 95% Confidence
Parameter Estimate Std Error Limits
a 0.3901 0.00504 0.3800 0.4003
b 0.1016 0.0134 0.0747 0.1286
Approximate Correlation Matrix
a b
a 1.0000000 0.8878597
b 0.8878597 1.0000000
PROC NLIN: Data from Draper and Smith: Method = GRADIENT 3
16:44 Wednesday, December 7, 2011
The NLIN Procedure
Dependent Variable Y
Method: Gradient
Iterative Phase
Sum of
Iter a b Squares
0 0.5000 0.5000 0.2853
1 0.4005 0.4999 0.0290
2 0.4239 0.4990 0.0213
3 0.4163 0.4988 0.0192
4 0.4170 0.4971 0.0192
5 0.4159 0.4965 0.0192
. . . .
. . . .
. . . .
520 0.3901 0.1016 0.00500
521 0.3901 0.1016 0.00500
522 0.3901 0.1016 0.00500
523 0.3901 0.1016 0.00500
524 0.3901 0.1016 0.00500
525 0.3901 0.1016 0.00500
526 0.3901 0.1016 0.00500
527 0.3901 0.1016 0.00500
PROC NLIN: Data from Draper and Smith: Method = GRADIENT 14
16:44 Wednesday, December 7, 2011
The NLIN Procedure
Dependent Variable Y
Method: Gradient
NOTE: Convergence criterion met.
Estimation Summary
Method Gradient
Iterations 527
Subiterations 1037
Average Subiterations 1.967742
R 8.576E-6
PPC(b) 5.396E-6
RPC(b) 2.161E-6
Object 7.25E-12
Objective 0.005002
Observations Read 44
Observations Used 44
Observations Missing 0
NOTE: An intercept was not specified for this model.
Sum of Mean Approx
Source DF Squares Square F Value Pr > F
Regression 2 7.9820 3.9910 33513.1 <.0001
Residual 42 0.00500 0.000119
Uncorrected Total 44 7.9870
Corrected Total 43 0.0395
Approx Approximate 95% Confidence
Parameter Estimate Std Error Limits
a 0.3901 0.00504 0.3800 0.4003
b 0.1016 0.0134 0.0747 0.1286
Approximate Correlation Matrix
a b
a 1.0000000 0.8878586
b 0.8878586 1.0000000
PROC NLIN: Data from Draper and Smith: Method = GAUSS 15
16:44 Wednesday, December 7, 2011
The NLIN Procedure
Dependent Variable Y
Method: Gauss-Newton
Iterative Phase
Sum of
Iter a b Squares
0 0.5000 0.5000 0.2853
1 0.4084 6.2850 0.0383
2 0.4084 3.1255 0.0382
3 0.4084 3.0969 0.0382
4 0.4084 2.9622 0.0381
5 0.4085 2.4488 0.0379
6 0.4086 1.5364 0.0361
7 0.4092 0.8175 0.0290
8 0.4104 0.1457 0.00877
9 0.3944 0.0948 0.00599
10 0.3903 0.1017 0.00500
11 0.3901 0.1016 0.00500
12 0.3901 0.1016 0.00500
NOTE: Convergence criterion met.
Estimation Summary
Method Gauss-Newton
Iterations 12
Subiterations 15
Average Subiterations 1.25
R 3.608E-7
PPC(b) 3.073E-7
RPC(b) 0.000013
Object 6.34E-10
Objective 0.005002
Observations Read 44
Observations Used 44
Observations Missing 0
NOTE: An intercept was not specified for this model.
Sum of Mean Approx
Source DF Squares Square F Value Pr > F
Regression 2 7.9820 3.9910 33513.1 <.0001
Residual 42 0.00500 0.000119
Uncorrected Total 44 7.9870
Corrected Total 43 0.0395
PROC NLIN: Data from Draper and Smith: Method = GAUSS 16
16:44 Wednesday, December 7, 2011
The NLIN Procedure
Approx Approximate 95% Confidence
Parameter Estimate Std Error Limits
a 0.3901 0.00504 0.3800 0.4003
b 0.1016 0.0134 0.0747 0.1286
Approximate Correlation Matrix
a b
a 1.0000000 0.8878600
b 0.8878600 1.0000000
PROC NLIN: Data from Draper and Smith: Method = NEWTON with Initial Search 17
16:44 Wednesday, December 7, 2011
The NLIN Procedure
Dependent Variable Y
Grid Search
Sum of
a b Squares
0.2000 0.0100 0.0427
0.2500 0.0100 0.0618
0.3000 0.0100 0.0861
0.3500 0.0100 0.1156
0.4000 0.0100 0.1502
0.4500 0.0100 0.1899
0.5000 0.0100 0.2348
. . .
. . .
. . .
0.4000 0.4600 0.0277
0.4500 0.4600 0.0622
0.5000 0.4600 0.2851
0.5500 0.4600 0.6963
0.6000 0.4600 1.2960
PROC NLIN: Data from Draper and Smith: Method = NEWTON with Initial Search 19
16:44 Wednesday, December 7, 2011
The NLIN Procedure
Dependent Variable Y
Method: Newton
Iterative Phase
Sum of
Iter a b Squares
0 0.4000 0.1100 0.00628
1 0.3925 0.1090 0.00504
2 0.3901 0.1011 0.00500
3 0.3901 0.1016 0.00500
4 0.3901 0.1016 0.00500
NOTE: Convergence criterion met.
Estimation Summary
Method Newton
Iterations 4
R 1.655E-7
PPC(b) 8.322E-8
RPC(b) 0.000239
Object 7.735E-8
Objective 0.005002
Observations Read 44
Observations Used 44
Observations Missing 0
NOTE: An intercept was not specified for this model.
Sum of Mean Approx
Source DF Squares Square F Value Pr > F
Regression 2 7.9820 3.9910 33513.1 <.0001
Residual 42 0.00500 0.000119
Uncorrected Total 44 7.9870
Corrected Total 43 0.0395
Approx Approximate 95% Confidence
Parameter Estimate Std Error Limits
a 0.3901 0.00509 0.3799 0.4004
b 0.1016 0.0135 0.0744 0.1289
Approximate Correlation Matrix
a b
a 1.0000000 0.8899899
b 0.8899899 1.0000000
PROC NLIN: Data from Draper and Smith: Method = DUD with Initial Search 20
16:44 Wednesday, December 7, 2011
The NLIN Procedure
Dependent Variable Y
Grid Search
Sum of
a b Squares
0.2000 0.0100 0.0427
0.2500 0.0100 0.0618
0.3000 0.0100 0.0861
0.3500 0.0100 0.1156
0.4000 0.0100 0.1502
0.4500 0.0100 0.1899
0.5000 0.0100 0.2348
. . .
. . .
. . .
0.4000 0.4600 0.0277
0.4500 0.4600 0.0622
0.5000 0.4600 0.2851
0.5500 0.4600 0.6963
0.6000 0.4600 1.2960
PROC NLIN: Data from Draper and Smith: Method = DUD with Initial Search 22
16:44 Wednesday, December 7, 2011
The NLIN Procedure
Dependent Variable Y
DUD Initialization
Sum of
DUD a b Squares
-3 0.4000 0.1100 0.00628
-2 0.4400 0.1100 0.0571
-1 0.4000 0.1210 0.00571
Iterative Phase
Sum of
Iter a b Squares
0 0.4000 0.1210 0.00571
1 0.3906 0.1010 0.00501
2 0.3902 0.1020 0.00500
3 0.3902 0.1018 0.00500
4 0.3902 0.1017 0.00500
5 0.3901 0.1016 0.00500
6 0.3901 0.1016 0.00500
NOTE: Convergence criterion met.
Estimation Summary
Method DUD
Iterations 6
Object 1.035E-9
Objective 0.005002
Observations Read 44
Observations Used 44
Observations Missing 0
NOTE: An intercept was not specified for this model.
Sum of Mean Approx
Source DF Squares Square F Value Pr > F
Regression 2 7.9820 3.9910 33513.1 <.0001
Residual 42 0.00500 0.000119
Uncorrected Total 44 7.9870
Corrected Total 43 0.0395
PROC NLIN: Data from Draper and Smith: Method = DUD with Initial Search 23
16:44 Wednesday, December 7, 2011
The NLIN Procedure
Approx Approximate 95% Confidence
Parameter Estimate Std Error Limits
a 0.3901 0.00504 0.3800 0.4003
b 0.1016 0.0134 0.0747 0.1286
Approximate Correlation Matrix
a b
a 1.0000000 0.8878553
b 0.8878553 1.0000000
PROC NLIN: Data from Draper and Smith: Method = DUD with Initial Search 24
16:44 Wednesday, December 7, 2011
Obs npoint a b sumsq
1 1 0.3852 0.0952 51.7570
2 2 0.3852 0.0954 51.8282
3 3 0.3852 0.0956 51.9023
4 4 0.3852 0.0958 51.9794
5 5 0.3852 0.0960 52.0594
. . . . .
. . . . .
. . . . .
2497 2497 0.395 0.1044 53.5922
2498 2498 0.395 0.1046 53.4921
2499 2499 0.395 0.1048 53.3945
2500 2500 0.395 0.1050 53.2993
------------------------------------------------------------------------
PROBLEM 28
Given the dataset below of 50 observations, assume the model
Yi = a * cos(c + b * ti) + err,
where err ~ N(0, sig2).
1. Use PROC NLIN with method = gauss to find estimates of a, b, and
c. Also provide an estimate of sig2.
You will probably need to graph the data to decide what the
starting values for a, b, and c should be.
2. Using a test based on -2 * log likelihood, test the hypothesis
that c = .9.
Dataset:
------------------------
Obs ti yi
1 0.2 0.20427
2 0.4 -0.07374
3 0.6 -0.02483
4 0.8 -0.16959
5 1.0 -0.43232
6 1.2 -0.53239
7 1.4 -0.97995
8 1.6 -0.66267
9 1.8 -1.12233
10 2.0 -0.94835
11 2.2 -0.78410
12 2.4 -0.79513
13 2.6 -0.54265
14 2.8 -0.59817
15 3.0 -0.71626
16 3.2 -0.50005
17 3.4 -0.11546
18 3.6 -0.09870
19 3.8 -0.29911
20 4.0 0.38076
21 4.2 0.40349
22 4.4 1.07876
23 4.6 0.84918
24 4.8 0.60890
25 5.0 0.66975
26 5.2 0.53559
27 5.4 1.32738
28 5.6 1.01013
29 5.8 0.97087
30 6.0 0.99035
31 6.2 0.72526
32 6.4 0.55154
33 6.6 0.35486
34 6.8 0.36963
35 7.0 0.04066
36 7.2 -0.30633
37 7.4 -0.53084
38 7.6 -0.88813
39 7.8 -0.81896
40 8.0 -0.72137
41 8.2 -0.85303
42 8.4 -0.83439
43 8.6 -1.05811
44 8.8 -1.02931
45 9.0 -0.94405
46 9.2 -0.38597
47 9.4 -0.72289
48 9.6 -0.56342
49 9.8 -0.72862
50 10.0 0.04955
------------------------------------------------------------------------
SPLUS MINIMIZATION FUNCTIONS
Splus (and R) also have procedures which can minimize functions. The
simplest of these is nlmin. As with the Simplex method, one need
only define the function to be minimized; derivatives need not be
specified. An example of nlmin is shown below. Note that nlmin
does not produce estimates of derivatives of the function being
minimized or estimated standard errors.
The Splus documentation for nlmin says:
"The optimizer is based on a quasi-Newton method using the
double dogleg step with the BFGS update to the Hessian."
References are given in the Splus Reference Manual.
------------------------------------------------------------------------
# This Splus program plots chlorine vs. time from the
# Draper dataset (generated by SAS program sastos.draper.sas),
# and does least-squares regression using the S+ function nlmin.
# To run this program in Splus, say: source("draper.s").
# Then to see the results, say: m
postscript("draper.ps", horizontal = F, onefile = T)
par(mfrow = c(2, 1), oma = c(2, 2, 2, 2))
rm(sasdat)
sasdat <- sas.get(library = "/home/gnome/john-c/5421", mem = "draper")
time <- sasdat[, "time"]
chlorine <- sasdat[, "chlorine"]
hist(chlorine, nclass = 6, main = "SPH 5421 Draper data: chlorine.")
par(new = F)
plot(time, chlorine)
b <- c(.5, .5)
f1 <- function(v)
{
w <- exp(-v[2] * (time - 8))
f <- v[1] + (.49 - v[1]) * w
ssq <- sum((chlorine - f)^2)
ssq
}
m <- nlmin(f1, b, max.iter = 1000)
# rm(time, chlorine, sasdat, f, ssq, b, w)
dev.off()
------------------------------------------------------------------------
PROBLEM 28b:
Use nlmin in R to find parameter estimates for the data and model
specified in Problem 28.
/home/walleye/john-c/5421/notes.026 Last update: December 7, 2011