We have seen the power of Richardson extrapolation in a previous article: Want to get high order?. This time we show an even more powerful application: constructing Runge-Kutta methods of arbitrary (even) order $p$. This is the Gragg-Bulirsch-Stoer (GBS) extrapolation algorithm.
Constructing Runge-Kutta methods of arbitrary order $p$ is a notoriously difficult problem. The GBS algorithm gives an elegant way to do it, and it yields an upper bound on the minimal number of stages required for a given order, a bound that has not been improved in the 60 years since the method was introduced in 1964.
The GBS algorithm uses the modified midpoint method for a single step of size $H$ subdivided into $n$ substeps ($h = H/n$):
$$y_0 = y(t_0), \qquad y_1 = y_0 + h f(t_0, y_0),$$
$$y_{k+1} = y_{k-1} + 2h f(t_0 + kh,\, y_k), \quad k = 1, \dots, n-1.$$
After $n$ substeps, you get $y_n$, which is 2nd-order accurate. Then the extrapolation step is applied to increase the order.
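To make this concrete, here is a minimal Python sketch of one macro-step of the modified midpoint rule; the function name `modified_midpoint` and the right-hand-side signature `f(t, y)` are our choices, not fixed by the method:

```python
import numpy as np

def modified_midpoint(f, t0, y0, H, n):
    """One macro-step of size H with n substeps (h = H/n); returns the
    2nd-order approximation y_n of y(t0 + H)."""
    h = H / n
    y_prev = np.asarray(y0, dtype=float)
    y_curr = y_prev + h * f(t0, y_prev)            # Euler starting step
    for k in range(1, n):                          # midpoint sweeps
        y_prev, y_curr = y_curr, y_prev + 2.0 * h * f(t0 + k * h, y_curr)
    return y_curr
```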
Theorem
The error of the above modified midpoint method can be expanded in even powers of $h$, i.e., as a series in $h^2$.
Proof (Stetter). Let $h^* = 2h$ and $t_k^* = t_0 + k h^*$ (so $t_k^* = t_0 + 2kh$), and set $u_0 = v_0 = y_0$. Let
$$u_k = y_{2k}$$
and
$$v_k = y_{2k+1} - h f(t_0 + 2kh,\, y_{2k}) = \tfrac{1}{2}\left(y_{2k+1} + y_{2k-1}\right),$$
where the second equality follows from the recursion $y_{2k+1} = y_{2k-1} + 2h f(t_0 + 2kh,\, y_{2k})$.
In this way, the modified midpoint method can be written as the following one-step method in vector form:
$$\begin{pmatrix} u_{k+1} \\ v_{k+1} \end{pmatrix} = \begin{pmatrix} u_k \\ v_k \end{pmatrix} + h^* \begin{pmatrix} f\!\left(t_k^* + \frac{h^*}{2},\; v_k + \frac{h^*}{2} f(t_k^*, u_k)\right) \\ \frac{1}{2}\left[f(t_k^* + h^*, u_{k+1}) + f(t_k^*, u_k)\right] \end{pmatrix}.$$
This is a symmetric one-step method: one can verify that the formula is invariant under the exchange $u_k \leftrightarrow u_{k+1}$, $v_k \leftrightarrow v_{k+1}$, $t_k^* \to t_k^* + h^*$, $h^* \to -h^*$. Therefore all coefficients of odd powers of $h^*$ in its error expansion vanish, which implies that the original modified midpoint method has an error expansion in powers of $h^2$.
GBS constructs a table of approximations using step numbers $n_1 < n_2 < \cdots < n_m$. The approximations $y(H, n_i)$ are extrapolated to zero step size:
$$y_{\text{extrapolated}} = \sum_i c_i\, y(H, n_i).$$
This is analogous to Richardson extrapolation.
Suppose we want to integrate from $t_0$ to $t_0 + H$. We compute approximations using the modified midpoint rule with different numbers of substeps:
$$n_1 < n_2 < \cdots < n_m$$
Let's denote the approximations by:
$$T_{i,1} = y(H, n_i), \quad i = 1, 2, \dots, m$$
Here:
- $n_i$ = number of substeps in the modified midpoint method.
- $T_{i,1}$ = approximation using $n_i$ substeps (order 2, since the modified midpoint method is 2nd-order).
The idea of Richardson extrapolation is:
$$y_{\text{exact}} = y(H, n_i) + C \cdot \left(\frac{H}{n_i}\right)^2 + O\!\left(\left(\frac{H}{n_i}\right)^4\right)$$
- $C$ is an unknown constant.
- The leading error term is proportional to $(H/n_i)^2$, since the modified midpoint method is 2nd-order (see the quick numerical check below).
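As a sanity check of this error model (reusing the `modified_midpoint` sketch above; the test problem $y' = -y$, $y(0) = 1$ is an arbitrary choice), doubling $n$ at fixed $H$ should shrink the error by roughly a factor of 4:

```python
# Error of the modified midpoint rule on y' = -y, y(0) = 1, over H = 1.
f = lambda t, y: -y
errs = [abs(modified_midpoint(f, 0.0, np.array([1.0]), 1.0, n)[0] - np.exp(-1.0))
        for n in (8, 16)]
print(errs[0] / errs[1])   # close to 4, confirming the (H/n)^2 leading term
```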
You can eliminate the $(H/n_i)^2$ error term by combining two approximations:
$$T_{i,2} = T_{i,1} + \frac{T_{i,1} - T_{i-1,1}}{(n_i/n_{i-1})^2 - 1}, \quad i = 2, \dots, m$$
- This gives a 4th-order approximation $T_{i,2}$.
- The denominator $(n_i/n_{i-1})^2 - 1$ comes from the error scaling.
For higher orders, we construct the table recursively:
$$T_{i,k} = T_{i,k-1} + \frac{T_{i,k-1} - T_{i-1,k-1}}{(n_i/n_{i-k+1})^2 - 1}, \quad k = 2, 3, \dots, i$$
- $T_{i,k}$ = approximation of order $2k$ (the modified midpoint method is 2nd-order and each extrapolation raises the order by 2).
- $T_{i,k}$ is computed from the previous column of the table.
- The last entry of the last column, $T_{m,m}$, is the final GBS result: the highest-order extrapolated approximation (see the sketch below).
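A minimal sketch of one full GBS step built on the pieces above; the name `gbs_step` and the default sequence `ns=(2, 4, 6)` are our assumptions:

```python
def gbs_step(f, t0, y0, H, ns=(2, 4, 6)):
    """One GBS macro-step: fill the extrapolation table T[i][k]
    (0-based indices) and return the highest-order entry T_{m,m}."""
    m = len(ns)
    T = [[None] * m for _ in range(m)]
    for i, n in enumerate(ns):
        T[i][0] = modified_midpoint(f, t0, y0, H, n)        # T_{i,1}
    for k in range(1, m):                                   # columns
        for i in range(k, m):                               # rows
            ratio = (ns[i] / ns[i - k]) ** 2                # (n_i / n_{i-k+1})^2
            T[i][k] = T[i][k - 1] + (T[i][k - 1] - T[i - 1][k - 1]) / (ratio - 1)
    return T[m - 1][m - 1]
```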
From the recursive formula, we can derive a closed-form expression for the final extrapolated value $T_{m,m}$ as a linear combination of the original approximations $T_{1,1}, \dots, T_{m,1}$ (the values computed with different numbers of substeps).
The formula comes from the Neville / Lagrange interpolation point of view.
- Let $T_{i,1} = y(H, n_i)$ be the modified midpoint approximations.
- The final extrapolated value $T_{m,m}$ can be written as a weighted sum:
$$T_{m,m} = \sum_{i=1}^{m} w_i T_{i,1}$$
- The weights $w_i$ depend only on the extrapolation nodes $x_i = (1/n_i)^2$; specifically $w_i = \ell_i(0)$, where $\ell_i(x)$ is the Lagrange basis polynomial for the $i$-th node $x_i$.
- Why $(1/n_i)^2$? Because the error in $T_{i,1}$ is proportional to $(H/n_i)^2$.
The idea is as follows:
- Consider the function $F(x) = y(H, n)$ as a function of $x = (1/n)^2$.
- $T_{i,1} = F(x_i) = F(0) + O(x_i)$, where $F(0)$ is the exact solution.
- Then the extrapolated value $T_{m,m}$ is the value at $x = 0$ of the Lagrange polynomial interpolating $F$:
$$T_{m,m} = \sum_{i=1}^{m} T_{i,1} \prod_{\substack{j=1 \\ j \neq i}}^{m} \frac{0 - x_j}{x_i - x_j} = \sum_{i=1}^{m} T_{i,1} \prod_{\substack{j=1 \\ j \neq i}}^{m} \frac{-x_j}{x_i - x_j}$$
- This is exactly the Lagrange interpolation formula: we interpolate $F(x)$ at the nodes $x_1, \dots, x_m$ and evaluate at $x = 0$.
We omit the rigorous proof here; it can be done either by induction (showing equivalence to the recursive formula) or by using the exactness of Lagrange interpolation on polynomials.
- Recall $x_i = (1/n_i)^2$. Then:
$$T_{m,m} = \sum_{i=1}^{m} T_{i,1} \prod_{\substack{j=1 \\ j \neq i}}^{m} \frac{-(1/n_j)^2}{(1/n_i)^2 - (1/n_j)^2} = \sum_{i=1}^{m} T_{i,1} \prod_{\substack{j=1 \\ j \neq i}}^{m} \frac{-n_i^2}{n_j^2 - n_i^2}$$
That is the general formula for the final extrapolated value in terms of the substep numbers $n_i$.
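The weights are easy to generate in exact rational arithmetic; a small sketch (the helper name `gbs_weights` is ours):

```python
from fractions import Fraction

def gbs_weights(ns):
    """Extrapolation weights w_i = l_i(0) for the nodes x_i = 1/n_i^2."""
    ws = []
    for i, ni in enumerate(ns):
        w = Fraction(1)
        for j, nj in enumerate(ns):
            if j != i:
                w *= Fraction(-ni * ni, nj * nj - ni * ni)
        ws.append(w)
    return ws

print(gbs_weights((2, 4, 6)))   # [1/24, -16/15, 81/40]
```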
Suppose $n_1 = 2$, $n_2 = 4$, $n_3 = 6$ substeps.
- Compute $T_{1,1}, T_{2,1}, T_{3,1}$ using the modified midpoint method.
- First extrapolation column:
$$T_{2,2} = T_{2,1} + \frac{T_{2,1} - T_{1,1}}{(4/2)^2 - 1} = T_{2,1} + \frac{T_{2,1} - T_{1,1}}{3}$$
$$T_{3,2} = T_{3,1} + \frac{T_{3,1} - T_{2,1}}{(6/4)^2 - 1} = T_{3,1} + \frac{T_{3,1} - T_{2,1}}{5/4} = T_{3,1} + 0.8\,(T_{3,1} - T_{2,1})$$
- Second extrapolation column (higher order):
$$T_{3,3} = T_{3,2} + \frac{T_{3,2} - T_{2,2}}{(6/2)^2 - 1} = T_{3,2} + \frac{T_{3,2} - T_{2,2}}{8}$$
- $T_{3,3}$ is now 6th-order accurate (order $2k$ with $k = 3$).
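A numerical order check, under the assumptions of the earlier sketches: the one-step (local) error of $T_{3,3}$ should be $O(H^7)$, consistent with global order 6, so halving $H$ should scale the error by about $2^7 = 128$:

```python
# Observed local order of T_{3,3} on y' = -y, y(0) = 1.
f = lambda t, y: -y
errs = [abs(gbs_step(f, 0.0, np.array([1.0]), H)[0] - np.exp(-H))
        for H in (0.5, 0.25)]
print(np.log2(errs[0] / errs[1]))   # approximately 7
```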
Every extrapolation method can be rewritten as a single explicit RK scheme, because the result is a linear combination of evaluations of $f(t, y)$ at specific points.
- Let $f_0 = f(t_0, y_0)$, $f_1 = f(t_0 + h, y_1)$, etc.
- Each modified midpoint sequence produces intermediate values $y_k$ as linear combinations of $y_0$ and the $f_j$'s.
- The extrapolation is then a linear combination of these $y_{n_i}$ values, which themselves are linear combinations of $y_0$ and all the $f_j$'s.
Hence, the final extrapolated value can be written as
$$y_1 = y_0 + H \sum_i b_i\, f(t_0 + c_i H,\, Y_i)$$
where the $Y_i$ are stage values depending on previously computed $f_j$'s (so it fits the explicit RK form).
Let us resume with the example of the last section and express it as an RK method. We need
$$T_{i,1} = y(H, n_i), \quad i = 1, 2, 3,$$
computed via $n_i$ substeps respectively, with substep sizes $h_i = H/n_i$. All three sequences start at $Y_1 = y_0$.
- $T_{1,1}$ uses 2 substeps: $Y_2$ is the intermediate stage and $T_{1,1}$ is the final value:
$$\begin{aligned} Y_2 &= y_0 + h_1 f(Y_1) \\ T_{1,1} &= Y_1 + 2 h_1 f(Y_2) = y_0 + 2 h_1 f(Y_2) \end{aligned}$$
where $h_1 = H/2$.
- $T_{2,1}$ uses 4 substeps; we assign $Y_3$ through $Y_5$ to the 3 intermediate stages:
$$\begin{aligned}
Y_3 &= y_0 + h_2 f(Y_1) \\
Y_4 &= y_0 + 2 h_2 f(Y_3) \\
Y_5 &= Y_3 + 2 h_2 f(Y_4) = y_0 + h_2 \left[ f(Y_1) + 2 f(Y_4) \right] \\
T_{2,1} &= Y_4 + 2 h_2 f(Y_5) = y_0 + h_2 \left[ 2 f(Y_3) + 2 f(Y_5) \right]
\end{aligned}$$
where $h_2 = H/4$.
- $T_{3,1}$ uses 6 substeps; we assign $Y_6$ through $Y_{10}$ to the 5 intermediate stages:
$$\begin{aligned}
Y_6 &= y_0 + h_3 f(Y_1) \\
Y_7 &= y_0 + 2 h_3 f(Y_6) \\
Y_8 &= Y_6 + 2 h_3 f(Y_7) = y_0 + h_3 \left[ f(Y_1) + 2 f(Y_7) \right] \\
Y_9 &= Y_7 + 2 h_3 f(Y_8) = y_0 + h_3 \left[ 2 f(Y_6) + 2 f(Y_8) \right] \\
Y_{10} &= Y_8 + 2 h_3 f(Y_9) = y_0 + h_3 \left[ f(Y_1) + 2 f(Y_7) + 2 f(Y_9) \right] \\
T_{3,1} &= Y_9 + 2 h_3 f(Y_{10}) = y_0 + h_3 \left[ 2 f(Y_6) + 2 f(Y_8) + 2 f(Y_{10}) \right]
\end{aligned}$$
where $h_3 = H/6$.
Next, apply the explicit formula to compute $T_{3,3}$:
$$T_{3,3} = \sum_{i=1}^{3} T_{i,1} \prod_{\substack{j=1 \\ j \neq i}}^{3} \frac{-n_i^2}{n_j^2 - n_i^2} = \frac{1}{24} T_{1,1} - \frac{16}{15} T_{2,1} + \frac{81}{40} T_{3,1}$$
which can be expressed as
$$y_1 = T_{3,3} = y_0 + H \sum_{i=1}^{10} b_i f(Y_i)$$
where the values of $b_i$ can be obtained by straightforward substitution (a programmatic version is sketched below). This is a 6th-order RK method with 10 stages.
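One way to carry out that substitution programmatically is to track every stage value as $y_0 + H \cdot \langle \text{row},\, (f(Y_1), \dots, f(Y_s)) \rangle$. The sketch below (the helper name `gbs_tableau` is ours; it reuses `gbs_weights` from earlier) assembles the full Butcher tableau $(A, b, c)$ in exact arithmetic:

```python
def gbs_tableau(ns=(2, 4, 6)):
    """Assemble (A, b, c) of the underlying explicit RK scheme by tracking
    each substep value as y0 + H * (coefficient row over the stage slopes)."""
    s = 1 + sum(n - 1 for n in ns)            # 10 stages for (2, 4, 6)
    A = [[Fraction(0)] * s]                   # stage Y_1 = y_0
    c = [Fraction(0)]
    T_rows = []                               # each T_{i,1} as a coefficient row
    for n in ns:
        h = Fraction(1, n)                    # substep size in units of H
        rows = [[Fraction(0)] * s]            # coefficient row of y_0
        stages = [0]                          # stage index of y_0, y_1, ...
        for k in range(n):                    # build y_{k+1}
            if k == 0:
                row = [Fraction(0)] * s
                row[stages[0]] += h           # y_1 = y_0 + h f(Y_1)
            else:
                row = rows[k - 1][:]          # y_{k+1} = y_{k-1} + 2h f(y_k)
                row[stages[k]] += 2 * h
            rows.append(row)
            if k < n - 1:                     # y_{k+1} is a genuine stage
                A.append(row)
                c.append(Fraction(k + 1, n))
                stages.append(len(A) - 1)
            else:                             # y_n = T_{i,1}: f is never evaluated there
                T_rows.append(row)
    w = gbs_weights(ns)
    b = [sum(wi * row[j] for wi, row in zip(w, T_rows)) for j in range(s)]
    return A, b, c

A, b, c = gbs_tableau()
print(sum(b))   # 1, as consistency requires
print(b)        # b_2 = 1/24, b_3 = b_5 = -8/15, b_6 = b_8 = b_10 = 27/40; rest 0
```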
- The number of stages equals the total number of distinct function evaluations across all substep sequences; $f(Y_1)$ is shared, so for $(2, 4, 6)$ we get $1 + 1 + 3 + 5 = 10$.
- The tableau is typically sparse, because each stage depends on only a few previous stages.
- Many different RK tableaus can be constructed this way, but choosing $\{n_i^*\}$ in the vector (Stetter) representation as the harmonic sequence $1, 2, 3, \dots$ (which yields $\{n_i\} = 2, 4, 6, \dots$), combined with the modified midpoint method (whose error expansion is in powers of $h^2$), appears to be the best choice.
- For low-order extrapolation (say $n = 2, 4$), the computation is feasible by hand; for higher orders it is easier to do numerically.
Reference: Hairer, Nørsett, Wanner, *Solving Ordinary Differential Equations I: Nonstiff Problems*, 2nd edition, Section II.9, which gives a detailed account of GBS extrapolation methods.