APPLICATIONS OF SAS IML                                    SPH 5421 notes.017

                         SOLVING NONLINEAR EQUATIONS

A general nonlinear equation is of the form G(X) = C, where X and C are
vectors, not necessarily of the same dimension, and G is a function. The
constant vector C is assumed to be known, and the problem is to find the
vector X that satisfies the equation. The equation can be re-written in the
form G(X) - C = 0, where 0 represents a vector of all zeroes.

The simplest case is that in which X and C are assumed to be one-dimensional.
An example of a nonlinear equation in this case is

     G(x) = x - cos(x) = 0.

The most common way of solving such an equation is Newton's method. The
basic idea is as follows:

  1) Make an initial guess at the solution. Call it x0.
  2) Approximate G(x) by a linear function L(x) by using the first-order
     Taylor expansion of G around x = x0.
  3) Solve L(x) = 0. This is easy to do because L is linear. Call the
     solution x1.
  4) Go back to step 2), using x1 instead of x0.
  5) Continue like this until two successive approximate solutions xi and
     xi+1 are close to each other - close to within some preset tolerance,
     like 10e-8.
  6) The approximate solution is xi+1.

How do you find L(x)? As above, assume G(x) = x - cos(x). Let your initial
guess at a solution be x0. The Taylor expansion of G(x) around x0 is

[1]  L(x) = G(x0) + G'(x0) * (x - x0).

Here we think of x0 as fixed, and x as variable. Note that the graph of L(x)
is a line which is tangent to the curve G(x) at the point x = x0. In the
case of G(x) = x - cos(x), G'(x) = 1 + sin(x), and

     L(x) = (x0 - cos(x0)) + (1 + sin(x0)) * (x - x0).

The solution to L(x) = 0 is obtained by setting [1] equal to 0 and solving
for x. In general, this yields the solution x1, where

[2]  x1 = x0 - G(x0) / G'(x0).

Equation [2] is really the heart of Newton's method for one-dimensional
nonlinear equations. The next step produces x2 = x1 - G(x1) / G'(x1), and
in general

     xi+1 = xi - G(xi) / G'(xi).
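The iteration xi+1 = xi - G(xi) / G'(xi) can be sketched in a few lines of
Python. This is an illustration only, separate from the SAS programs in
these notes; the function and variable names here are my own.

```python
from math import cos, sin

def newton(g, gprime, x0, diff=1e-8, iterlim=50):
    """Newton's method: repeat x1 = x0 - g(x0)/g'(x0) until two
    successive estimates are within `diff` of each other."""
    for _ in range(iterlim):
        x1 = x0 - g(x0) / gprime(x0)
        if abs(x1 - x0) < diff:
            return x1
        x0 = x1
    raise RuntimeError("no convergence within iterlim iterations")

# g(x) = x - cos(x), with exact derivative g'(x) = 1 + sin(x)
root = newton(lambda x: x - cos(x), lambda x: 1 + sin(x), x0=1.0)
print(round(root, 4))   # 0.7391
```

From the starting value x0 = 1, only a handful of iterations are needed,
which matches the rapid convergence seen in the SAS printout below.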
You stop this iterative process when abs(xi+1 - xi) < diff, where diff is a
small number that you set in advance, like 10e-8. In practice, this process
often converges within 10 iterations to a precise answer. The following
program shows how this works for the function G(x) = x - cos(x).

==================================================================================

* Program to illustrate Newton's method solution of a nonlinear equation ;
* in one dimension.                                                      ;
* The nonlinear equation is:  G(x) = x - cos(x).                         ;

options linesize = 80 ;
footnote "program: /home/walleye/john-c/5421/newton1 &sysdate &systime" ;

data newton ;

     x0 = 60 ;
     diff = 10e-8 ;
     abdiff = 10 ;
     iter = 0 ;
     iterlim = 50 ;

     do while (abdiff gt diff) ;
        x1 = x0 - (x0 - cos(x0)) / (1 + sin(x0)) ;
        abdiff = abs(x1 - x0) ;
        eval = x1 - cos(x1) ;
        iter = iter + 1 ;
        output ;
        x0 = x1 ;
     end ;

run ;

proc print ;
     title1 "Iterations to solve G(x) = x - cos(x) = 0 by Newton's method ..." ;

==================================================================================

     Iterations to solve G(x) = x - cos(x) = 0 by Newton's method ...
                                       1       21:30 Tuesday, February 8, 2000

  OBS        X0      DIFF     ABDIFF    ITER    ITERLIM        X1       EVAL

    1    60.0000   .000000   87.6774      1       50      -27.6774   -26.8503
    2   -27.6774   .000000   61.3148      2       50       33.6374    34.2431
    3    33.6374   .000000   19.0699      3       50       14.5675    14.9847
    4    14.5675   .000000    7.8503      4       50        6.7173     5.8100
    5     6.7173   .000000    4.0899      5       50        2.6273     3.4980
    6     2.6273   .000000    2.3447      6       50        0.2827    -0.6777
    7     0.2827   .000000    0.5299      7       50        0.8125     0.1249
    8     0.8125   .000000    0.0723      8       50        0.7402     0.0018
    9     0.7402   .000000    0.0011      9       50        0.7391     0.0000
   10     0.7391   .000000    0.0000     10       50        0.7391     0.0000
   11     0.7391   .000000    0.0000     11       50        0.7391     0.0000

           program: /home/walleye/john-c/5421/newton1  08FEB00 21:30

==================================================================================

The printout indicates that the pre-set tolerance limit, DIFF, is zero.
That is because the actual value is 10e-8, and the default print format does
not print enough digits to show that it is nonzero.

In this case the approximate solution was found in 11 iterations. Notice
that, in the last iteration, the variable EVAL represents the value of the
function at the final value of x1. The computer sees this value as not
distinguishable from 0.

You should try running this program several times with different values of
the initial estimate x0. You may be surprised by what you get if you
initialize x0 to 1000. If the Newton algorithm does not converge within 50
iterations, something probably is not working right.

*****  It is important to put a limit on the number of iterations, because,
*****  in 'do while' or 'do until' loops, the computer could go on forever
*****  without stopping and you would have wasted a lot of computer time.
*****  This was not done in the program above; it would have been better if
*****  the 'do' loop were started as follows:
*****
*****       do iter = 1 to 50 while (abdiff gt diff) ;
*****
*****  If this change is made, the later statement
*****
*****       iter = iter + 1 ;
*****
*****  should be removed.
*****

--------------------------------------------------------------------------------
                                  PROBLEM 15

1.  Find the solution of G(x) = x - cos(x) = 0 using a numerical
    approximation to the derivative G'(x0). The numerical approximation is
    defined by

         G'(x0) = (G(x0 + h) - G(x0)) / h,

    where h is a very small number; for example, h = 1e-6.

2.  Use Newton's method to solve the equation  x**3 - exp(cos(x)) = 0.

--------------------------------------------------------------------------------

The program above made no use of PROC IML. That is because the function G(x)
was a real-valued function of a real argument; that is, G: R1 ---> R1. All
the computations involved just scalar arithmetic; no vectors or matrices
were necessary.

But suppose G(X) is a nonlinear function from R2 to R2.
For example,

       | x |     | x - cos(y) |
     G |   |  =  |            |
       | y |     | y - exp(x) |

It would be best to think of G as having two components,

       | x |     | G1 |
     G |   |  =  |    |
       | y |     | G2 |

                 | x |                           | x |
where here    G1 |   | = x - cos(y)   and     G2 |   | = y - exp(x).
                 | y |                           | y |

The goal here is to find a pair (x, y) such that G(x, y)' = (0, 0)'. This is
the same as simultaneously solving the two nonlinear equations

     G1(x, y) = x - cos(y) = 0
     G2(x, y) = y - exp(x) = 0.

Newton's algorithm can be used here with matrices, with only a slightly
different appearance.

                                        | x0 |
Let X0 represent the column vector      |    | .
                                        | y0 |

The analogue of Newton's equation [2] above is written

[3]  X1 = X0 - inv(G'(X0)) * G(X0),

where G'(X0) is the 2 x 2 matrix of partial derivatives of G,

                 | dG1/dx   dG1/dy |
     G'(X0)  =   |                 | .
                 | dG2/dx   dG2/dy |

Now you can see how PROC IML might be used for this kind of problem.

As in the one-dimensional case, the derivation of equation [3] depends on
approximating the function G by the use of Taylor's theorem. The first-order
Taylor expansion L of G around X0 is given by:

       | x |     | G1(x0, y0) |     | dG1/dx   dG1/dy |   | x - x0 |
     L |   |  =  |            |  +  |                 | * |        | .
       | y |     | G2(x0, y0) |     | dG2/dx   dG2/dy |   | y - y0 |

To derive [3], just set this equation equal to the zero vector and solve for

     | x |
     |   | .
     | y |

The equation is:

     | dG1/dx   dG1/dy |   | x - x0 |        | G1(x0, y0) |
     |                 | * |        |  =  -  |            | ,   or
     | dG2/dx   dG2/dy |   | y - y0 |        | G2(x0, y0) |

     | x |     | x0 |        / | dG1/dx   dG1/dy | \     | G1(x0, y0) |
     |   |  =  |    | - inv |  |                 |  | *  |            | .
     | y |     | y0 |        \ | dG2/dx   dG2/dy | /     | G2(x0, y0) |

Again, this can be abbreviated to

     X = X0 - inv[dG(X0)] * G(X0),

where X and X0 are thought of as column vectors, and dG is a 2 x 2 matrix.

The following is a SAS IML program to solve a nonlinear vector equation of
the form G(B) = 0. Printout from the program follows the program.
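As an aside, the vector update [3] can be tried out in plain Python. The
sketch below is not part of the course programs; it solves the same
two-equation system used in the IML example below (g1 = a*a - exp(a+b+a*b),
g2 = a+b - cos(a+b)), with the 2 x 2 inverse written out by hand.

```python
from math import cos, sin, exp, sqrt

def funct(a, b):
    """G and its 2 x 2 matrix of partial derivatives at (a, b)."""
    g1 = a * a - exp(a + b + a * b)
    g2 = a + b - cos(a + b)
    dg = [[2 * a - (1 + b) * exp(a + b + a * b),
           -(1 + a) * exp(a + b + a * b)],
          [1 + sin(a + b),
           1 + sin(a + b)]]
    return g1, g2, dg

a, b = 1.2, -1.0                  # initial estimates, as in the IML program
for it in range(30):
    g1, g2, dg = funct(a, b)
    if sqrt(g1 * g1 + g2 * g2) < 1e-8:
        break
    # The key Newton step X = X0 - inv(dG) * G, using the explicit
    # formula for the inverse of a 2 x 2 matrix
    det = dg[0][0] * dg[1][1] - dg[0][1] * dg[1][0]
    a -= ( dg[1][1] * g1 - dg[0][1] * g2) / det
    b -= (-dg[1][0] * g1 + dg[0][0] * g2) / det

print(a, b)   # converges to about (1.1460623, -0.406977)
```

The result agrees with the final V vector in the IML printout below, which
is a useful cross-check on the hand-coded derivatives.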
========================================================================

*  Example of an IML program that uses Newton's method to solve a  ;
*  nonlinear vector equation of the form                           ;
*                                                                  ;
*       G(B) = 0.                                                  ;
*                                                                  ;

options linesize = 80 ;
footnote "program: /home/walleye/john-c/5421/newton3.sas" ;

proc iml ;

start funct(a, b, g1, g2, dg1a, dg1b, dg2a, dg2b) ;

     g1 = a * a - exp(a + b + a * b) ;
     g2 = a + b - cos(a + b) ;

     dg1a = 2 * a - (1 + b) * exp(a + b + a * b) ;
     dg1b = -(1 + a) * exp(a + b + a * b) ;
     dg2a = 1 + sin(a + b) ;
     dg2b = 1 + sin(a + b) ;

finish funct ;

     a = 1.2 ;                              * initial estimates ... ;
     b = -1 ;                               *                       ;

     e1 = {1, 0} ;   e2 = {0, 1} ;          * elementary vectors    ;

     s1 = {1 0, 0 0} ;   s2 = {0 1, 0 0} ;  * elementary matrices   ;
     s3 = {0 0, 1 0} ;   s4 = {0 0, 0 1} ;  *                       ;

     v = a * e1 + b * e2 ;                  * initializations ...   ;
     g1 = a * a - exp(a + b + a * b) ;      *                 ...   ;
     g2 = a + b - cos(a + b) ;              *                 ...   ;
     g = g1 * e1 + g2 * e2 ;                *                 ...   ;
     absg = 1 ;                             *                 ...   ;
     eps = 1e-8 ;                           * tolerance level       ;

     do iter = 1 to 30 while (absg > eps) ;

        va = v` * e1 ;
        vb = v` * e2 ;

*       ;
*       Evaluate the function and its derivatives ...
;
*       ;

        run funct(va, vb, g1, g2, dg1a, dg1b, dg2a, dg2b) ;

        g = g1 * e1 + g2 * e2 ;
        dg = dg1a * s1 + dg1b * s2 + dg2a * s3 + dg2b * s4 ;
        invdg = inv(dg) ;

        v = v - invdg * g ;                 * The key Newton step ... ;

        gtg = g` * g ;
        absg = sqrt(gtg) ;

        print iter v g dg absg ;

     end ;

========================================================================

                              The SAS System
                                       1        16:14 Tuesday, March 6, 2001

            ITER         V         G        DG                  ABSG

               1 1.0300508 1.0721206       2.4 -0.809335   1.3258757
                 -0.179274 -0.780067 1.1986693 1.1986693

            ITER         V         G        DG                  ABSG

               2 1.1328844 -0.885658 0.4624246 -3.951823   0.9060987
                 -0.391354  0.191378 1.7517931 1.7517931

            ITER         V         G        DG                  ABSG

               3  1.146067 -0.063965  1.445684 -2.873832   0.0640963
                 -0.406981 0.0040943 1.6754171 1.6754171

            ITER         V         G        DG                  ABSG

               4 1.1460623 0.0000166 1.5132308 -2.818758   0.0000167
                 -0.406977 2.2036E-6  1.673613  1.673613

            ITER         V         G        DG                  ABSG

               5 1.1460623 2.964E-11 1.5132135 -2.818764   2.965E-11
                 -0.406977 6.407E-13  1.673612  1.673612

           program: /home/walleye/john-c/5421/newton3.sas

========================================================================

---------------------------------------------------------------------------------
                                  PROBLEM 16

                            | x |     | 0 |
1.  Solve the equation    G |   |  =  |   |   using PROC IML, where
                            | y |     | 0 |

    G = (G1, G2) is as defined above:

         G1(x, y) = x - cos(y),
         G2(x, y) = y - exp(x).

2.  Solve the same equation, but, instead of using exact partial derivatives
    for dG1/dx, etc., use numerical approximations to the partial
    derivatives. These are computed as

         dG1/dx  ~=  [G1(x0 + h, y0) - G1(x0, y0)] / h,   etc.,

    for h a very small number (e.g., h = 1e-6).

    An advantage of using numerical derivatives is that you need not have an
    explicit expression for the derivatives of the function. A disadvantage
    is that the numerical approximation may not be sufficiently accurate to
    yield a reliable answer.

---------------------------------------------------------------------------------

/home/walleye/john-c/5421/notes.017

Last update: October 29, 2012.
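A note on the forward-difference formula used in Problems 15 and 16: it can
be checked in a few lines of Python. This sketch is illustrative only;
`numderiv` is my own name, not a SAS or IML function.

```python
from math import cos, sin

def numderiv(g, x, h=1e-6):
    """Forward-difference approximation (G(x + h) - G(x)) / h to G'(x)."""
    return (g(x + h) - g(x)) / h

g = lambda x: x - cos(x)       # the function from Problem 15
approx = numderiv(g, 1.0)
exact = 1 + sin(1.0)           # exact derivative, for comparison
print(abs(approx - exact))     # error on the order of 1e-7
```

With h = 1e-6 the approximation agrees with the exact derivative to about
six decimal places, which is usually adequate for Newton's method but
illustrates the accuracy trade-off mentioned in Problem 16.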