I. K. ARGYROS (Lawton, Okla.)
A NEW KANTOROVICH-TYPE THEOREM FOR NEWTON’S METHOD
Abstract. A new Kantorovich-type convergence theorem for Newton’s method is established for approximating a locally unique solution of an equation $F(x) = 0$ defined on a Banach space. It is assumed that the operator $F$ is twice Fréchet differentiable, and that $F'$, $F''$ satisfy Lipschitz conditions. Our convergence condition differs from earlier ones and therefore it has theoretical and practical value.
1. Introduction. In this study we are concerned with the problem of approximating a locally unique solution $x^*$ of the equation
(1) $F(x) = 0$,
where $F$ is a twice Fréchet differentiable operator defined on a convex subset $D$ of a Banach space $E_1$ with values in a Banach space $E_2$.
Newton’s method
(2) $x_{n+1} = x_n - F'(x_n)^{-1}F(x_n)$ $(n \ge 0)$, $x_0 \in D$,
has been used extensively by many authors (see [1]–[6] and the references there) to generate a sequence $\{x_n\}_{n \ge 0}$ converging to $x^*$. In particular the following conditions have been used:
Condition A (Kantorovich [6]). Let $F : D \subseteq E_1 \to E_2$ be Fréchet differentiable in $D$, $F'(x_0)^{-1} \in L(E_2, E_1)$ for some $x_0 \in D$, where $L(E_2, E_1)$ is the set of bounded linear operators from $E_2$ into $E_1$, and assume
(3) $\|F'(x_0)^{-1}[F'(x) - F'(y)]\| \le l\|x - y\|$ for all $x, y \in D$,
(4) $\|F'(x_0)^{-1}F(x_0)\| \le a$
and
(5) $2al \le 1$.
1991 Mathematics Subject Classification: 65J15, 47H15, 49M17.
Key words and phrases: Newton’s method, Banach space, Kantorovich hypothesis, Lipschitz–Hölder condition.
Under Condition A, one can obtain error estimates, existence and uniqueness regions of solutions, and know whether $x_0$ is a convergent initial guess, i.e., whether Newton’s method (2) starting from $x_0$ converges to $x^*$. Sometimes, however, Condition A fails, and then it cannot be used to decide whether the Newton iteration (2) starting from $x_0$ converges.
Example 1.1. Let $E_1 = E_2 = \mathbb{R}$, $D = [\sqrt{2} - 1, \sqrt{2} + 1]$, $x_0 = \sqrt{2}$ and define the real polynomial $F$ on $D$ by
(6) $F(x) = \frac{1}{6}x^3 - \alpha$, $\alpha = \frac{2^{3/2}}{6} + .23$.
Using (3), (4), (6) and the above choices we get $a = .23$ and $l = 2.4142136$.
Condition (5) is not satisfied since
$2al = 1.1105383 > 1$.
Therefore under Condition A we cannot determine whether Newton’s method (2) starting from $x_0 = \sqrt{2}$ converges.
That is why in this study we introduce a new condition and a new theorem under which we will see that Newton’s method starting from $x_0 = \sqrt{2}$ in Example 1.1 converges.
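The numbers in Example 1.1 can be reproduced with a short numerical check. The following is a minimal sketch in Python; the helper names `F` and `dF` and the choice of six Newton steps are illustrative assumptions, not part of the paper. The Newton run at the end is only an empirical observation that the iterates settle on a zero of $F$; it is not a substitute for a convergence theorem.

```python
import math

# Data of Example 1.1: F(x) = x^3/6 - alpha on D = [sqrt(2)-1, sqrt(2)+1].
x0 = math.sqrt(2.0)
alpha = 2.0 ** 1.5 / 6.0 + 0.23


def F(x):
    return x ** 3 / 6.0 - alpha


def dF(x):
    return x ** 2 / 2.0


# a bounds |F'(x0)^{-1} F(x0)|, as in (4).
a = abs(F(x0) / dF(x0))

# l is a Lipschitz constant of F'(x0)^{-1} F' on D, as in (3):
# |F'(x) - F'(y)| = |x + y| / 2 * |x - y| <= (sqrt(2) + 1) * |x - y| on D.
l = (math.sqrt(2.0) + 1.0) / dF(x0)

kantorovich_product = 2.0 * a * l  # left-hand side of (5)
print(f"a = {a:.7f}, l = {l:.7f}, 2al = {kantorovich_product:.7f}")

# Empirical Newton run (2) from x0; the step count is an arbitrary choice.
x = x0
for _ in range(6):
    x = x - F(x) / dF(x)
print(f"x after 6 steps = {x:.12f}, residual = {F(x):.3e}")
```

The product `kantorovich_product` exceeds 1, so (5) fails, while the computed iterates nevertheless approach a zero of $F$ inside $D$.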
From now on we assume:
Condition B. Let $F : D \subseteq E_1 \to E_2$ be twice Fréchet differentiable in $D$, with $F'(x) \in L(E_1, E_2)$, $F''(x) \in L(E_1, L(E_1, E_2))$ $(x \in D)$, $F'(x_0)^{-1}$ exists at some $x_0 \in D$, and assume
(7) $0 < \|F'(x_0)^{-1}F(x_0)\| \le a$, $\|F'(x_0)^{-1}F''(x_0)\| \le b$,
(8) $\|F'(x_0)^{-1}[F'(x) - F'(x_0)]\| \le c\|x - x_0\|$, $c > 0$,
(9) $\|F'(x_0)^{-1}[F''(x) - F''(x_0)]\| \le d\|x - x_0\|$ for all $x \in D$,
and
(10) $2ka \le 1$,
where either
(11) $k = \max\{c, b + 2ad\}$,
or, if the function
(12) $f(t) = t^3 - 2bt^2 - (2d - b^2)t + 2d(b + ad)$
has two positive zeros $k_1$, $k_2$ such that
(13) $[b, b + 2ad] \subseteq [k_1, k_2]$,
then $k \ge c$ and
(14) $k \in [b, b + 2ad]$.
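To see how Condition B can be checked in practice, the sketch below evaluates (7)–(11) for the polynomial of Example 1.1. The values of `b`, `c` and `d` are not quoted from the paper: they are assumptions worked out here from $F'(x) = x^2/2$ and $F''(x) = x$ on $D = [\sqrt{2} - 1, \sqrt{2} + 1]$, and the variable names are illustrative.

```python
import math

x0 = math.sqrt(2.0)
a = 0.23                 # bound in (7), taken from Example 1.1
dF_x0 = x0 ** 2 / 2.0    # F'(x0) = x0^2 / 2 (equals 1 up to rounding)

# Assumed constants, derived here for F(x) = x^3/6 - alpha (not from the paper):
b = abs(x0) / dF_x0                  # |F'(x0)^{-1} F''(x0)|, since F''(x) = x
c = (2.0 * x0 + 1.0) / 2.0 / dF_x0   # sup over D of |x + x0| / 2, for (8)
d = 1.0 / dF_x0                      # F'' has Lipschitz constant 1, for (9)

k = max(c, b + 2.0 * a * d)          # choice (11)
condition_10 = 2.0 * k * a           # left-hand side of (10)

# The ball of radius 1/c about x0 lies in D = [x0 - 1, x0 + 1] when 1/c <= 1.
ball_in_D = 1.0 / c <= 1.0

print(f"b = {b:.7f}, c = {c:.7f}, d = {d:.7f}")
print(f"k = {k:.7f}, 2ka = {condition_10:.7f}, ball in D: {ball_in_D}")
```

Under these assumed constants the quantity `condition_10` stays below 1, so (10) holds for the same starting point at which (5) fails.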
2. Convergence analysis. We need the following lemma:
Lemma 2.1. Let $a$, $k$ be given positive constants. Define the real polynomial $p$ on $[0, \infty)$ by
(15) $p(t) = \frac{k}{2}t^2 - t + a$
and the iteration $\{t_n\}_{n \ge 0}$ by
(16) $t_0 = 0$,
(17) $t_{n+1} = t_n - \frac{p(t_n)}{p'(t_n)}$.
Assume
(18) $2ka \le 1$.
Then the equation
(19) $p(t) = 0$
has two positive roots $r_1$, $r_2$ with $r_1 \le r_2$ and the iteration $\{t_n\}_{n \ge 0}$ generated by (16)–(17) is such that $t_0 < t_1 < \dots < t_n < t_{n+1} < \dots < r_1$ with $\lim_{n \to \infty} t_n = r_1$.
Proof. Using (15) and (18) we deduce that the equation $p(t) = 0$ has two positive roots
(20) $r_1 = \frac{1 - \sqrt{1 - 2ka}}{k}$ and $r_2 = \frac{1 + \sqrt{1 - 2ka}}{k}$
with $r_1 \le r_2$. Moreover the function $t - p(t)/p'(t)$ increases on $[0, r_1]$, since $p'(t) < 0$, $p''(t) > 0$ and $p(t) > 0$ on $[0, r_1]$. Furthermore if $t_i \in [0, r_1]$ for all integers $i \le n$, then we obtain
$t_n \le t_n - \frac{p(t_n)}{p'(t_n)} = t_{n+1}$ and $t_{n+1} = t_n - \frac{p(t_n)}{p'(t_n)} \le r_1 - \frac{p(r_1)}{p'(r_1)} = r_1$.
Hence $\{t_n\}_{n \ge 0}$ is monotone and bounded above by $r_1$, so it converges, and its limit is a zero of $p$ in $[0, r_1]$, which must be $r_1$.
We set $\overline{U}(x_0, s) = \{x \in E_1 \mid \|x - x_0\| \le s\}$ and $U(x_0, s) = \{x \in E_1 \mid \|x - x_0\| < s\}$.
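The behaviour described in Lemma 2.1 is easy to observe numerically. The following minimal sketch implements the scalar iteration (16)–(17) and the roots (20); the function names and the sample values `a = 0.23`, `k = 1.9` are illustrative assumptions chosen only so that (18) holds.

```python
import math


def majorizing_sequence(a, k, steps):
    """Return [t_0, ..., t_steps] from (16)-(17) for p(t) = k/2 t^2 - t + a."""
    def p(t):
        return 0.5 * k * t * t - t + a

    def dp(t):
        return k * t - 1.0

    ts = [0.0]
    for _ in range(steps):
        t = ts[-1]
        ts.append(t - p(t) / dp(t))
    return ts


def roots(a, k):
    """Return the roots r1 <= r2 of p given by (20); requires 2ka <= 1."""
    disc = math.sqrt(1.0 - 2.0 * k * a)
    return (1.0 - disc) / k, (1.0 + disc) / k


a, k = 0.23, 1.9          # illustrative values with 2ka = 0.874 <= 1
r1, r2 = roots(a, k)
ts = majorizing_sequence(a, k, 5)

for n, t in enumerate(ts):
    print(f"t_{n} = {t:.12f}   r1 - t_{n} = {r1 - t:.3e}")
```

Only five steps are taken because the gap $r_1 - t_n$ shrinks quadratically and soon falls below floating-point resolution, at which point strict monotonicity can no longer be observed in double precision.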
Lemma 2.2. The following estimates are true for $x \in U(x_0, 1/c)$:
(21) $\|F'(x)^{-1}F'(x_0)\| \le (1 - c\|x - x_0\|)^{-1}$
and
(22) $\|F'(x_0)^{-1}F''(x)\| \le b + d\|x - x_0\|$.
Proof. If $x \in U(x_0, 1/c)$, then by (8) we have the estimate
$\|F'(x_0)^{-1}(F'(x) - F'(x_0))\| \le c\|x - x_0\| < 1$,
so by the Banach lemma on invertible operators [6] the operator $F'(x)$ has a continuous inverse on $U(x_0, 1/c)$ and
$\|F'(x)^{-1}F'(x_0)\| \le (1 - c\|x - x_0\|)^{-1}$.
Moreover by (7) and (9) we get
$\|F'(x_0)^{-1}F''(x)\| \le \|F'(x_0)^{-1}F''(x_0)\| + \|F'(x_0)^{-1}(F''(x) - F''(x_0))\| \le b + d\|x - x_0\|$.
We can now prove the following semilocal result concerning the convergence of Newton’s method (2).
Theorem 2.3. Let $F$ be the operator defined in (1). Let $p$ be the polynomial defined in (15). Assume that the closed ball $\overline{U}(x_0, 1/c) \subseteq D$ and Condition B holds.
Then Newton’s iteration $\{x_n\}_{n \ge 0}$ generated by (2) is well defined, remains in the closed ball $\overline{U}(x_0, r_1)$ for all $n \ge 0$, and converges to a solution $x^* \in \overline{U}(x_0, r_1)$ of the equation $F(x) = 0$, which is unique in the open ball $U(x_0, r_2)$ if $r_1 < r_2$. If $r_1 = r_2$ the solution $x^*$ is unique in $\overline{U}(x_0, r_1)$. Moreover the following estimates hold for all $n \ge 0$:
(23) $\|x_{n+1} - x_n\| \le t_{n+1} - t_n$
and
(24) $\|x_n - x^*\| \le r_1 - t_n = (r_1/r_2)^{2^n}(r_2 - t_n)$,
where $r_1$ and $r_2$ are the roots of the quadratic equation $p(t) = 0$ given by (20).
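The equality on the right of (24) involves only the scalar sequence $\{t_n\}$, so it can be checked directly. The sketch below compares $r_1 - t_n$ with $(r_1/r_2)^{2^n}(r_2 - t_n)$ for the first few $n$; the sample values of `a` and `k` are illustrative assumptions satisfying (18), not data from the paper.

```python
import math

a, k = 0.23, 1.9   # illustrative values with 2ka <= 1 (not from the paper)
disc = math.sqrt(1.0 - 2.0 * k * a)
r1, r2 = (1.0 - disc) / k, (1.0 + disc) / k   # roots (20)

t = 0.0            # t_0 from (16)
gaps = []          # pairs (r1 - t_n, closed form) for n = 0, ..., 5
for n in range(6):
    closed_form = (r1 / r2) ** (2 ** n) * (r2 - t)
    gaps.append((r1 - t, closed_form))
    p = 0.5 * k * t * t - t + a
    t = t - p / (k * t - 1.0)                 # Newton step (17)

for n, (gap, closed_form) in enumerate(gaps):
    print(f"n = {n}: r1 - t_n = {gap:.6e}, closed form = {closed_form:.6e}")
```

The two columns agree up to rounding, which illustrates the quadratic decay of the a priori error bound when $r_1 < r_2$.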
Proof. Using induction on $n$ we first show estimate (23). The approximation $x_1$ is defined and
$\|x_1 - x_0\| = \|F'(x_0)^{-1}F(x_0)\| \le a = t_1 - t_0 < r_1$.
It follows that $x_1 \in U(x_0, r_1)$ and (23) holds for $n = 0$.
Assume that (23) holds for all integer values $i \le n$. Using (2) we can write in turn
(25) $F'(x_0)^{-1}F(x_{i+1}) = F'(x_0)^{-1}[F(x_{i+1}) - F(x_i) - F'(x_i)(x_{i+1} - x_i)]$
$= F'(x_0)^{-1}\Big\{\int_0^1 [F''[x_i + t(x_{i+1} - x_i)] - F''(x_0)](1 - t)\,dt\,(x_{i+1} - x_i)^2 + \frac{1}{2}F''(x_0)(x_{i+1} - x_i)^2\Big\}$.
Using the induction hypothesis we have
$\|x_{i+1} - x_0\| \le \sum_{j=1}^{i+1}\|x_j - x_{j-1}\| \le \sum_{j=1}^{i+1}(t_j - t_{j-1}) = t_{i+1} - t_0 = t_{i+1} < r_1$
and
$\|x_i + t(x_{i+1} - x_i) - x_0\| \le t_i + t(t_{i+1} - t_i) < r_1$.
Hence, by (7), (9), (15), (22), (23) and (25) we get
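(26) $\|F'(x_0)^{-1}F(x_{i+1})\| \le \frac{1}{2}\Big[b + d\|x_i - x_0\| + \frac{d}{3}\|x_{i+1} - x_i\|\Big]\|x_{i+1} - x_i\|^2$
$\le \frac{1}{2}\Big[b + dt_i + \frac{d}{3}(t_{i+1} - t_i)\Big](t_{i+1} - t_i)^2$
$\le \frac{1}{2}\Big[b + \frac{2}{3}dt_i + \frac{dt_{i+1}}{3}\Big](t_{i+1} - t_i)^2$
$\le \frac{1}{2}\Big[b + \frac{2}{3}dr_1 + \frac{dr_1}{3}\Big](t_{i+1} - t_i)^2$
$\le \frac{k}{2}(t_{i+1} - t_i)^2 \le p(t_{i+1})$.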
By (2), (17), (21) and (26) we obtain
$\|x_{i+2} - x_{i+1}\| \le -\frac{p(t_{i+1})}{p'(t_{i+1})} = t_{i+2} - t_{i+1}$,
which shows (23) for all $n \ge 0$.
By Lemma 2.1 and estimate (23) it follows that $\{x_n\}_{n \ge 0}$ is a Cauchy sequence in the Banach space $E_1$ and so it converges to some limit $x^*$ in the closed ball $\overline{U}(x_0, r_1)$ (since $\overline{U}(x_0, r_1)$ is a closed set). By (2) and the continuity of $F$, we get $F(x^*) = 0$. To show uniqueness let $y \in U(x_0, r_2)$ be such that $F(y) = 0$. Using (2) we obtain
(27) $y - x_{n+1} = -[F'(x_n)^{-1}F'(x_0)]\{\int_0^1 F'(x_0)^{-1}(F''(x_n + t(y - x_n)) - F''(x_0))(1 - t)\,dt\,(y - x_n)^2 +$
1
\