52  Numerical methods

When the exact answer isn’t the point

The heat equation through an irregular fin. The trajectory of a satellite perturbed by the moon’s gravity. The implied volatility of an options contract given its market price. Three equations you can write down in seconds. None of them have closed-form solutions.

This is not unusual. Most equations that describe something real cannot be solved exactly. Numerical methods are what you use instead — systematic procedures that produce answers you can trust, with errors you can bound.

Three families of problems appear everywhere:

  • Root-finding: solve \(f(x) = 0\) for \(x\).
  • Interpolation: estimate a function between known data points.
  • Numerical integration: compute \(\int_a^b f(x)\,dx\) when no antiderivative is available.

Each family has the same structure: start with something computable, iterate or accumulate, and keep track of how wrong you might be.


52.1 Root-finding

52.1.1 The problem

You have a function \(f(x)\) — read “\(f\) of \(x\)”, the function whose root you want — and you need to find \(x^*\) such that \(f(x^*) = 0\). This is every nonlinear equation in disguise. If you need to solve \(g(x) = h(x)\), define \(f(x) = g(x) - h(x)\) and find its root.

52.1.2 Bisection method

The simplest idea: if \(f(a) < 0\) and \(f(b) > 0\), and \(f\) is continuous, then somewhere in \((a, b)\) the function must cross zero. The midpoint \(m = (a+b)/2\) is either a root, or the root lies in \((a, m)\), or it lies in \((m, b)\). Evaluate \(f(m)\) and keep the half that changes sign. Repeat.

Algorithm:

  1. Choose \([a, b]\) with \(f(a)\) and \(f(b)\) having opposite signs.
  2. Set \(m = (a + b)/2\).
  3. If \(|f(m)| < \varepsilon\) (your tolerance), stop and return \(m\) as the approximate root.
  4. If \(f(a) \cdot f(m) < 0\), the root is in \([a, m]\): set \(b = m\). Otherwise set \(a = m\).
  5. Return to step 2.

After \(n\) steps the bracket width is \((b - a)/2^n\), so the error is bounded by \((b - a)/2^{n+1}\). This is the bisection error bound. To guarantee an error below \(\varepsilon\), you need at most \(\lceil \log_2((b-a)/\varepsilon) \rceil\) iterations.

The method is slow — the error halves each step — but it always converges as long as the function is continuous and the initial bracket is valid. Nothing else is assumed.
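The loop is short enough to sketch directly. This is a minimal Python sketch, not a library routine; the name `bisect` and the tolerance convention are illustrative:

```python
def bisect(f, a, b, tol=1e-6):
    """Find a root of f in [a, b] by bisection.

    Assumes f is continuous and f(a), f(b) have opposite signs.
    The returned midpoint is within tol of a true root.
    """
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while (b - a) / 2 > tol:      # half the bracket width bounds the error
        m = (a + b) / 2
        if f(a) * f(m) <= 0:      # sign change in [a, m]: keep the left half
            b = m
        else:                     # sign change in [m, b]: keep the right half
            a = m
    return (a + b) / 2

# Root of x^3 - x - 2 in [1, 2], the example worked below
root = bisect(lambda x: x**3 - x - 2, 1.0, 2.0)
```

With \(b - a = 1\) and \(\varepsilon = 10^{-6}\), the loop runs the \(\lceil \log_2(10^6) \rceil = 20\) iterations the bound predicts.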

Worked example. Find the positive real root of \(f(x) = x^3 - x - 2\), starting in \([1, 2]\).

Note that \(f(1) = 1 - 1 - 2 = -2 < 0\) and \(f(2) = 8 - 2 - 2 = 4 > 0\). A root exists in \([1, 2]\).

  Step   \(a\)    \(b\)     \(m\)      \(f(m)\)                                New bracket
  1      1        2         1.5        \(1.5^3 - 1.5 - 2 = -0.125\)            \([1.5, 2]\)
  2      1.5      2         1.75       \(1.75^3 - 1.75 - 2 \approx 1.609\)     \([1.5, 1.75]\)
  3      1.5      1.75      1.625      \(1.625^3 - 1.625 - 2 \approx 0.666\)   \([1.5, 1.625]\)

After three steps the root is bracketed in \([1.5, 1.625]\), width \(0.125 = 1/2^3\). The exact root is \(x^* \approx 1.5214\).

52.1.3 Newton-Raphson method

Bisection is robust but slow. If \(f\) is differentiable, you can do much better. The idea: at your current estimate \(x_n\) — read “\(x\) sub \(n\)”, the \(n\)th iterate — draw the tangent line to \(y = f(x)\). Follow the tangent to where it hits zero; call that \(x_{n+1}\).

The tangent line at \(x_n\) has equation:

\[y - f(x_n) = f'(x_n)(x - x_n)\]

where \(f'(x_n)\) — read “\(f\) prime at \(x_n\)” — is the derivative of \(f\) at the current point. Setting \(y = 0\) and solving for \(x\):

\[x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}\]

This is the Newton-Raphson iteration. Apply it repeatedly until \(|f(x_n)|\) is below your tolerance.

Why it converges so fast. Define the error \(e_n = x_n - x^*\). Substituting a Taylor expansion of \(f\) about the true root \(x^*\) into the Newton iteration and simplifying gives:

\[e_{n+1} \approx -\frac{f''(x^*)}{2f'(x^*)} e_n^2\]

The next error is proportional to the square of the current error. If you are accurate to 3 decimal places, one more step gives you roughly 6. This is quadratic convergence — the number of correct digits doubles each step. Bisection adds about 0.3 correct digits per step; Newton-Raphson roughly doubles them.

The method fails if \(f'(x_n) = 0\) (the tangent is flat) or if the starting point is far from the root (the iteration diverges or cycles). Always check convergence.

Worked example. Compute \(\sqrt{2}\) using Newton-Raphson. The positive root of \(f(x) = x^2 - 2\) is \(\sqrt{2}\), so \(f'(x) = 2x\).

The iteration is:

\[x_{n+1} = x_n - \frac{x_n^2 - 2}{2x_n} = \frac{x_n + 2/x_n}{2}\]

Starting from \(x_0 = 1\):

  \(n\)   \(x_n\)      \(f(x_n) = x_n^2 - 2\)   \(|f(x_n)|\)
  0       1.000000     −1.000000                1.000000
  1       1.500000     0.250000                 0.250000
  2       1.416667     0.006944                 0.006944
  3       1.414216     0.000006                 \(6 \times 10^{-6}\)

Three steps to six-digit accuracy. Quadratic convergence in action.
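The same table can be generated by a short sketch; the guard clauses reflect the failure modes noted above (the function name and defaults are illustrative, not a standard API):

```python
def newton(f, fprime, x0, tol=1e-10, max_iter=50):
    """Newton-Raphson iteration; stops when |f(x)| < tol."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        slope = fprime(x)
        if slope == 0:
            raise ZeroDivisionError("flat tangent: f'(x_n) = 0")
        x -= fx / slope           # follow the tangent line to its zero
    raise RuntimeError("no convergence: try a better starting point")

# sqrt(2) as the positive root of x^2 - 2, starting from x0 = 1
r = newton(lambda x: x * x - 2, lambda x: 2 * x, 1.0)
```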

52.1.4 Secant method

Newton-Raphson requires the derivative \(f'(x)\). When that is expensive or unavailable, replace it with a finite-difference approximation using the two most recent iterates:

\[x_{n+1} = x_n - f(x_n) \cdot \frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})}\]

This is the secant method. It needs two starting values \(x_0\) and \(x_1\). The convergence order is approximately 1.618 (the golden ratio) — super-linear but slower than quadratic. No derivative evaluation required.
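A sketch of the update, assuming two starting values reasonably close to the root (names and defaults are illustrative):

```python
def secant(f, x0, x1, tol=1e-10, max_iter=50):
    """Secant method: Newton's iteration with a finite-difference slope."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if abs(f1) < tol:
            return x1
        if f1 == f0:
            raise ZeroDivisionError("flat secant: f(x_n) = f(x_{n-1})")
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)   # the secant update
        x0, f0 = x1, f1                         # shift the two-point window
        x1, f1 = x2, f(x2)
    raise RuntimeError("no convergence")

# Same root as the bisection example: x^3 - x - 2 = 0
r = secant(lambda x: x**3 - x - 2, 1.0, 2.0)
```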

Which root-finder to use

Bisection is your fallback: slow, guaranteed, needs only function values and a sign change. Newton-Raphson is your workhorse when you have a derivative and a decent starting point: fast, elegant, can fail. The secant method sits between them: no derivative, faster than bisection, needs two starting points. In practice, hybrid methods (Brent’s method, Illinois algorithm) combine bisection safety with Newton-like speed.


52.2 Interpolation

52.2.1 The problem

You have \(n+1\) data points \((x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)\) and you want to estimate \(f(x)\) at a new \(x\) lying between them. The classical approach: find a polynomial that passes exactly through all the given points, then evaluate it at \(x\).

52.2.2 Lagrange interpolation

For \(n+1\) points there is exactly one polynomial of degree \(\leq n\) passing through all of them. Write it as:

\[P_n(x) = \sum_{k=0}^{n} y_k \, L_k(x)\]

where each Lagrange basis polynomial \(L_k(x)\) is:

\[L_k(x) = \prod_{\substack{j=0 \\ j \neq k}}^{n} \frac{x - x_j}{x_k - x_j}\]

Read this as: \(L_k(x)\) equals 1 at \(x_k\) and equals 0 at every other node \(x_j\). So \(P_n(x_k) = y_k\) for each \(k\) — the polynomial interpolates every point exactly.

Worked example. Interpolate \(f\) at \(x = 1.5\) using the points \((1, 1)\), \((2, 8)\), \((3, 27)\) — the cube function sampled at three integers.

With nodes \(x_0 = 1\), \(x_1 = 2\), \(x_2 = 3\):

\[L_0(x) = \frac{(x-2)(x-3)}{(1-2)(1-3)} = \frac{(x-2)(x-3)}{2}\]

\[L_1(x) = \frac{(x-1)(x-3)}{(2-1)(2-3)} = \frac{(x-1)(x-3)}{-1} = -(x-1)(x-3)\]

\[L_2(x) = \frac{(x-1)(x-2)}{(3-1)(3-2)} = \frac{(x-1)(x-2)}{2}\]

At \(x = 1.5\):

\[L_0(1.5) = \frac{(1.5-2)(1.5-3)}{2} = \frac{(-0.5)(-1.5)}{2} = \frac{0.75}{2} = 0.375\]

\[L_1(1.5) = -(1.5-1)(1.5-3) = -(0.5)(-1.5) = 0.75\]

\[L_2(1.5) = \frac{(1.5-1)(1.5-2)}{2} = \frac{(0.5)(-0.5)}{2} = -0.125\]

\[P_2(1.5) = 1 \cdot 0.375 + 8 \cdot 0.75 + 27 \cdot (-0.125) = 0.375 + 6 - 3.375 = 3\]

The exact value is \(1.5^3 = 3.375\). The polynomial through three points of \(x^3\) is not perfect — it is a degree-2 polynomial approximating a degree-3 function. Adding the fourth node \((4, 64)\) would give exact interpolation.
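The sum and product in the formula map directly onto two nested loops. A minimal sketch (the helper name `lagrange_eval` is illustrative) that reproduces the worked example and confirms the fourth-node claim:

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for k, (xk, yk) in enumerate(zip(xs, ys)):
        Lk = 1.0
        for j, xj in enumerate(xs):
            if j != k:
                Lk *= (x - xj) / (xk - xj)   # basis polynomial L_k at x
        total += yk * Lk
    return total

p2 = lagrange_eval([1, 2, 3], [1, 8, 27], 1.5)          # degree-2 fit: 3.0
p3 = lagrange_eval([1, 2, 3, 4], [1, 8, 27, 64], 1.5)   # add (4, 64): 3.375 exactly
```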

52.2.3 Newton’s divided-difference form

The Lagrange form is conceptually clean but computationally inconvenient: to add a new node you must recompute everything. Newton’s divided-difference form builds the polynomial incrementally.

Define the divided differences recursively:

\[f[x_k] = y_k\]

\[f[x_k, x_{k+1}] = \frac{f[x_{k+1}] - f[x_k]}{x_{k+1} - x_k}\]

\[f[x_k, \ldots, x_{k+j}] = \frac{f[x_{k+1}, \ldots, x_{k+j}] - f[x_k, \ldots, x_{k+j-1}]}{x_{k+j} - x_k}\]

To see how this builds, use two of the cube-function points: \(x_0 = 1\), \(y_0 = 1\) and \(x_1 = 2\), \(y_1 = 8\). The first divided difference is \(f[x_0, x_1] = (8-1)/(2-1) = 7\). Newton’s form gives \(P_1(x) = 1 + 7(x - 1)\) — the straight line through \((1,1)\) and \((2,8)\). Adding a third node brings in the next divided difference and bends the curve to pass through that point too.

The interpolating polynomial is then:

\[P_n(x) = f[x_0] + f[x_0,x_1](x - x_0) + f[x_0,x_1,x_2](x-x_0)(x-x_1) + \cdots\]

Adding a new node \(x_{n+1}\) requires only one new divided difference and one new term — the previous work is retained.
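The divided-difference table can be built in place, one column at a time, and the polynomial evaluated by nested multiplication. A sketch under the assumption that all nodes are distinct:

```python
def divided_differences(xs, ys):
    """Return the Newton-form coefficients f[x0], f[x0,x1], f[x0,x1,x2], ..."""
    coef = [float(y) for y in ys]
    n = len(xs)
    for j in range(1, n):                      # column j of the table
        for i in range(n - 1, j - 1, -1):      # overwrite from the bottom up
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef

def newton_eval(xs, coef, x):
    """Evaluate the Newton form by nested multiplication."""
    result = coef[-1]
    for k in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[k]) + coef[k]
    return result

# The cube-function nodes from the Lagrange example
xs, ys = [1, 2, 3], [1, 8, 27]
coef = divided_differences(xs, ys)   # f[x0] = 1, f[x0,x1] = 7, f[x0,x1,x2] = 6
```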

52.2.4 Runge’s phenomenon and why splines exist

High-degree polynomial interpolation over equally-spaced nodes can oscillate wildly near the endpoints, even when the underlying function is smooth. This is Runge’s phenomenon. The standard fix is spline interpolation: use low-degree polynomials (typically cubics) on each sub-interval, stitched together with continuity conditions. Splines are covered in the next chapter.


52.3 Numerical integration

52.3.1 The problem

Compute \(\int_a^b f(x)\,dx\) when \(f\) has no closed-form antiderivative, or when \(f\) is known only at discrete points (sampled data from a sensor, say).

Partition \([a, b]\) into \(n\) equal panels of width \(h = (b - a)/n\). Let \(x_i = a + ih\) for \(i = 0, 1, \ldots, n\), and write \(f_i = f(x_i)\).

52.3.2 Trapezoid rule

Approximate \(f\) on each panel by a straight line joining \((x_i, f_i)\) and \((x_{i+1}, f_{i+1})\). The area of each trapezoid is \(\frac{h}{2}(f_i + f_{i+1})\). Summing over all panels:

\[\int_a^b f(x)\,dx \approx T_n = \frac{h}{2}\bigl(f_0 + 2f_1 + 2f_2 + \cdots + 2f_{n-1} + f_n\bigr)\]

The error is \(O(h^2)\) — halving the step size \(h\) reduces the error by a factor of four. More precisely:

\[\text{Error} = -\frac{(b-a)h^2}{12} f''(\xi)\]

for some \(\xi \in (a,b)\) — a point whose exact location is unknown, but which lies somewhere in the integration interval. You don’t need to know where \(\xi\) is; you only need an upper bound on \(|f''|\) over the interval. The error is proportional to the second derivative of \(f\): large curvature somewhere in \((a,b)\) means larger error.

Worked example. Estimate \(\int_0^1 e^x\,dx\) with \(n = 4\) panels.

\(h = 0.25\), nodes at \(x = 0, 0.25, 0.5, 0.75, 1\).

  \(x_i\)   \(f_i = e^{x_i}\)
  0         1.000000
  0.25      1.284025
  0.5       1.648721
  0.75      2.117000
  1         2.718282

\[T_4 = \frac{0.25}{2}\bigl(1 + 2(1.284025) + 2(1.648721) + 2(2.117000) + 2.718282\bigr)\]

\[= 0.125 \times (1 + 2.568050 + 3.297442 + 4.234000 + 2.718282)\]

\[= 0.125 \times 13.817774 = 1.727222\]

The exact value is \(e - 1 \approx 1.718282\). Error: \(|1.727222 - 1.718282| \approx 0.008940\).

52.3.3 Simpson’s 1/3 rule

Replace the straight-line approximation on each pair of panels with a quadratic through three consecutive nodes. Require \(n\) even.

\[\int_a^b f(x)\,dx \approx S_n = \frac{h}{3}\bigl(f_0 + 4f_1 + 2f_2 + 4f_3 + \cdots + 4f_{n-1} + f_n\bigr)\]

The coefficients alternate \(4, 2, 4, 2, \ldots, 4\) for interior nodes. The error is \(O(h^4)\):

\[\text{Error} = -\frac{(b-a)h^4}{180} f^{(4)}(\xi)\]

Halving \(h\) reduces the error by a factor of sixteen. Simpson’s rule is exact for polynomials of degree up to 3, even though it uses degree-2 approximations — a consequence of symmetry.

Worked example. Apply Simpson’s 1/3 rule to \(\int_0^1 e^x\,dx\) with \(n = 4\).

Using the same nodes as above:

\[S_4 = \frac{0.25}{3}\bigl(f_0 + 4f_1 + 2f_2 + 4f_3 + f_4\bigr)\]

\[= \frac{0.25}{3}\bigl(1 + 4(1.284025) + 2(1.648721) + 4(2.117000) + 2.718282\bigr)\]

\[= \frac{0.25}{3}\bigl(1 + 5.136100 + 3.297442 + 8.468000 + 2.718282\bigr)\]

\[= \frac{0.25}{3} \times 20.619824 = 1.718319\]

Error: \(|1.718319 - 1.718282| \approx 0.000037\). A factor of 241 smaller than the trapezoid rule with the same number of function evaluations.
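The alternating weights translate into one branch per interior node. A sketch, assuming \(n\) is even:

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's 1/3 rule; n must be even."""
    if n % 2 != 0:
        raise ValueError("Simpson's 1/3 rule needs an even number of panels")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        weight = 4 if i % 2 == 1 else 2      # interior weights alternate 4, 2, ..., 4
        total += weight * f(a + i * h)
    return h / 3 * total

S4 = simpson(math.exp, 0.0, 1.0, 4)          # about 1.718319
```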

52.3.4 Simpson’s 3/8 rule

A variant using groups of three panels instead of two, again with a cubic approximant. Require \(n\) divisible by 3.

\[\int_a^b f(x)\,dx \approx \frac{3h}{8}\bigl(f_0 + 3f_1 + 3f_2 + 2f_3 + 3f_4 + 3f_5 + 2f_6 + \cdots + f_n\bigr)\]

Error is also \(O(h^4)\). The 1/3 rule is more commonly used; the 3/8 rule appears when \(n\) is odd so the standard Simpson pairing fails.

52.3.5 Gaussian quadrature

The rules above use evenly-spaced nodes. If you are free to choose both the node locations and the weights, you can do much better.

Gauss-Legendre quadrature on \([-1, 1]\) with \(n\) nodes is exact for polynomials of degree up to \(2n-1\) — twice the order you might expect. The nodes are the roots of the Legendre polynomial \(P_n(x)\), and the weights are determined by the condition of exactness.

The Legendre polynomials are a specific family of functions whose roots have an extremal property: placing quadrature nodes there squeezes the most accuracy out of a given number of evaluations. You do not derive these roots by hand — they are tabulated in every numerical methods reference and built into every serious quadrature library. For \(n = 2\), the nodes are \(x_{1,2} = \pm 1/\sqrt{3}\) and the weights are \(w_{1,2} = 1\):

\[\int_{-1}^1 f(x)\,dx \approx f\!\left(-\frac{1}{\sqrt{3}}\right) + f\!\left(\frac{1}{\sqrt{3}}\right)\]

This is exact for all polynomials of degree \(\leq 3\).

To integrate over \([a, b]\), substitute \(x = \frac{a+b}{2} + \frac{b-a}{2}t\) where \(t \in [-1, 1]\):

\[\int_a^b f(x)\,dx = \frac{b-a}{2} \int_{-1}^1 f\!\left(\frac{a+b}{2} + \frac{b-a}{2}t\right) dt\]

Worked example. Estimate \(\int_0^1 e^x\,dx\) using 2-point Gauss-Legendre.

Change variables to \([-1,1]\): \(x = \frac{1}{2} + \frac{1}{2}t\), \(dx = \frac{1}{2}\,dt\).

Nodes in \(t\): \(t_{1,2} = \pm 1/\sqrt{3} \approx \pm 0.577350\).

Corresponding \(x\)-values: \(x_1 = 0.5 - 0.288675 = 0.211325\), \(x_2 = 0.5 + 0.288675 = 0.788675\).

\[\int_0^1 e^x\,dx \approx \frac{1}{2}\Bigl(e^{0.211325} + e^{0.788675}\Bigr) = \frac{1}{2}(1.235314 + 2.200479) = 1.717897\]

Error: \(|1.717897 - 1.718282| \approx 0.000385\). Two function evaluations; better accuracy than the 5-point trapezoid rule.
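A sketch of the 2-point rule with the change of variables built in (the tabulated \(n = 2\) nodes are hard-coded; the function name is illustrative):

```python
import math

def gauss2(f, a, b):
    """2-point Gauss-Legendre quadrature on [a, b]; exact for cubics."""
    mid, half = (a + b) / 2, (b - a) / 2     # map t in [-1, 1] to x = mid + half*t
    t = 1 / math.sqrt(3)                     # tabulated nodes for n = 2
    return half * (f(mid - half * t) + f(mid + half * t))

G = gauss2(math.exp, 0.0, 1.0)
cubic = gauss2(lambda x: x**3, 0.0, 1.0)     # matches the exact 1/4 to rounding
```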

Summary: error orders

  Method                        Error                             Error reduction when \(n\) doubles
  Trapezoid                     \(O(h^2)\)                        4× smaller
  Simpson’s 1/3                 \(O(h^4)\)                        16× smaller
  Gauss–Legendre (\(n\) pts)    exact for degree \(\leq 2n-1\)    —

When \(f\) is smooth and you control the nodes, Gaussian quadrature wins. When \(f\) comes from sampled data at fixed points, use Simpson or trapezoid.


52.4 Where this goes

The three families here — root-finding, interpolation, numerical integration — are the building blocks for almost everything else in numerical computation. Numerical linear algebra (the next chapter in this section) extends root-finding ideas to systems: you want \(\mathbf{x}\) such that \(A\mathbf{x} = \mathbf{b}\), and for large systems you solve it iteratively using methods (Gauss-Seidel, conjugate gradient) that are direct descendants of the Newton-Raphson idea.

Numerical integration connects back to ordinary differential equations: Euler’s method for ODEs is the forward-rectangle (left-endpoint) rule applied to the integral form of the ODE — at each step you approximate the integrand by its value at the left endpoint, which is first-order accurate. Runge-Kutta methods are higher-order quadrature applied at each step. Knowing the error orders from this chapter tells you immediately why RK4 (fourth-order) is preferred over Euler (first-order): halving the step size reduces RK4’s error by 16×, Euler’s by only 2×.

Where these methods appear

  • Structural engineering (FEA): each element stiffness matrix is assembled using Gaussian quadrature over the element domain.
  • Signal processing: interpolation underlies sample-rate conversion (resampling audio or images); Newton-Raphson is used in carrier phase estimation.
  • Options pricing: Newton-Raphson is the standard method for computing implied volatility from the Black-Scholes formula — the exact inverse doesn’t exist.
  • Astrophysics: orbital trajectories are integrated numerically step by step using adaptive-step Runge-Kutta, which uses error estimates derived from the quadrature theory here.
  • Machine learning: most optimisers (Adam, L-BFGS) are Newton-like methods applied to high-dimensional loss functions. The convergence order matters enormously at scale.

52.5 Exercises

These are puzzles. Each has a clean numerical answer. The interesting work is carrying the method through correctly and checking the error bound.

1. Apply three iterations of bisection to find the positive root of \(f(x) = x^3 - x - 2 = 0\), starting with the bracket \([1, 2]\). State the midpoint after each iteration and the final bracket width.


2. Apply three iterations of Newton-Raphson to compute \(\sqrt{2}\), starting from \(x_0 = 1\). Use \(f(x) = x^2 - 2\). Record \(x_n\) and \(|f(x_n)|\) at each step and comment on the convergence rate.


3. Use the Lagrange interpolation formula to estimate \(f(1.5)\), given the data points \((1, 1)\), \((2, 8)\), \((3, 27)\). Compare with the exact value \(1.5^3 = 3.375\) and explain the discrepancy.


4. Apply the trapezoid rule to \(\int_0^1 e^x\,dx\) with \(n = 4\) panels. Compute the numerical result and the absolute error against the exact value \(e - 1 \approx 1.718282\).


5. Apply Simpson’s 1/3 rule to the same integral \(\int_0^1 e^x\,dx\) with \(n = 4\) panels. Compare your error with Exercise 4 and confirm the improvement is consistent with the \(O(h^4)\) order.


6. A diode circuit satisfies \(I \cdot R + V_T \ln(I / I_0) = V_s\), where \(R = 100\,\Omega\), \(V_T = 0.026\,\text{V}\), \(I_0 = 10^{-12}\,\text{A}\), \(V_s = 5\,\text{V}\). Starting from the initial guess \(I^{(0)} = 0.04\,\text{A}\), apply one step of Newton-Raphson to find a better estimate. (Linearise the equation about the starting point; note that \(I_0\) is the saturation current, not an iterate.)