K. MOSZYŃSKI (Warszawa)
ON SOLVING LINEAR ALGEBRAIC EQUATIONS WITH AN ILL-CONDITIONED MATRIX
1. Introduction. Consider a system of N linear algebraic equations
(1.1) Ax = b
with an invertible N × N real matrix A. The matrix A has the spectral decomposition
(1.2) A = Σ_{j=1}^p (λ_j P_j + N_j),
where:
• P_j, j = 1, . . . , p, are the spectral projectors,
• N_j, j = 1, . . . , p, are the spectral nilpotents,
• λ_j, j = 1, . . . , p, are the eigenvalues of A.
The following conditions hold:
• P_k P_l = P_l P_k = δ_{kl} P_k,
• P_k N_l = N_l P_k = δ_{kl} N_k for k, l = 1, . . . , p;
• if s_k = dim(P_k R^N), then s_1 + . . . + s_p = N and N_j^{s_j} = 0;
• Σ_{j=1}^p P_j = I_N, where I_N is the N × N identity matrix.
It is easy to see that
(1.3) A^{-1} = Σ_{j=1}^p ( (1/λ_j) P_j + Σ_{s=1}^{s_j−1} ((−1)^s/λ_j^{s+1}) N_j^s );
1991 Mathematics Subject Classification: Primary 65F30.
Key words and phrases: linear algebraic systems, matrix decomposition.
Supported by Research Grant No 211689101 of the Polish Scientific Research Council (KBN).
hence, for the solution x of (1.1) we get
(1.4) x = A^{-1}b = Σ_{j=1}^p ( (1/λ_j) P_j b + Σ_{s=1}^{s_j−1} ((−1)^s/λ_j^{s+1}) N_j^s b ).
We can read the formula (1.4) as follows: if we consider some class of matrices A close to symmetric matrices, for example a class for which ‖N_j^s‖/|λ_j^s| ≤ C ‖P_j‖/s_j, j = 1, . . . , p; s = 1, . . . , s_j, with some not very large constant C independent of s and j, then the part of the matrix A corresponding to the eigenvalues of the smallest moduli has the strongest influence on the solution x. If A is ill-conditioned, the influence of large eigenvalues may be negligible, while they will disturb any process of numerical solution of (1.1). In such a case, an (approximate) decomposition of A into parts corresponding to eigenvalues of small and large moduli seems to be useful. We can look at such an operation as a kind of preconditioning. But, in general, preconditioning is not the only purpose of decomposing A. Another purpose is to enable parallel computing. Clearly, such a decomposition is closely related to (approximate) invariant subspaces of A, or equivalently, to some matrices (approximately) commuting with A.
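This dominance of the smallest eigenvalues is easy to observe numerically. The sketch below uses a made-up symmetric test matrix (so all N_j = 0 and (1.4) is a plain eigenexpansion) and checks that the two terms belonging to the eigenvalues of smallest modulus already carry almost the whole solution; the data and thresholds are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical ill-conditioned symmetric matrix with a widely spread spectrum.
rng = np.random.default_rng(0)
N = 6
V, _ = np.linalg.qr(rng.standard_normal((N, N)))   # orthonormal eigenvectors
lam = np.array([1e-3, 2e-3, 1.0, 2.0, 1e3, 2e3])
A = V @ np.diag(lam) @ V.T
b = rng.standard_normal(N)

x = np.linalg.solve(A, b)

# Keep only the terms of (1.4) coming from the two eigenvalues of smallest modulus:
# x_small = sum_j (1/lam_j) P_j b with P_j = v_j v_j^T.
x_small = sum((V[:, j] @ b) / lam[j] * V[:, j] for j in range(2))

# The remainder, contributed by the large eigenvalues, is relatively negligible.
rel = np.linalg.norm(x - x_small) / np.linalg.norm(x)
```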
2. Generalities. Put X = R^N and Y = span{q_1, . . . , q_r} ⊂ X, where q_1, . . . , q_r are linearly independent elements of X and r ≤ N. Denote by Q = [q_1 . . . q_r] the matrix with columns q_1, . . . , q_r; it is an N × r matrix of rank r.
(2.1) Proposition. For any N × N matrix U, Y = UX iff there exist linearly independent vectors f_1, . . . , f_r in X = R^N such that U = QF, where F is the r × N matrix with rows f_1^T, . . . , f_r^T. Moreover, U^2 = U (U is a projector) iff FQ = I_r.
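Proposition (2.1) can be illustrated directly; all the data below are hypothetical. For a generic F the product U = QF is not a projector, but rescaling F so that FQ = I_r (which keeps the same row space) makes it one:

```python
import numpy as np

rng = np.random.default_rng(1)
N, r = 5, 2
Q = rng.standard_normal((N, r))     # columns q_1, q_2 (rank r)
F = rng.standard_normal((r, N))     # rows f_1^T, f_2^T
U = Q @ F                           # rank-r matrix with U R^N = span(Q)

# Generic F: U is not a projector.
not_proj = np.linalg.norm(U @ U - U) > 1e-8

# Enforce F Q = I_r by replacing F with (F Q)^{-1} F.
F2 = np.linalg.solve(F @ Q, F)
U2 = Q @ F2
assert np.allclose(F2 @ Q, np.eye(r))
assert np.allclose(U2 @ U2, U2)     # now U2 is a projector onto span(Q)
```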
Let U be a matrix satisfying the conditions of (2.1). Then U = QF, and the r × r matrices Q^TQ and FF^T are both invertible. If U is nearly a projector, then FQ should be at least invertible.
We are interested in matrices (approximately) commuting with A. Assume first that U exactly commutes with A: UA − AU = 0, i.e.
(2.2) QFA − AQF = 0,
and
(2.3) AQ = QC_1, Q = [q_1 . . . q_r],
with the r × r matrix C_1 = FAF^T(FF^T)^{-1}; in other words, Y = span{q_1, . . . , q_r} is an invariant subspace of A. From (2.2) we immediately deduce
(2.4) FA = C_2 F
with the r × r matrix C_2 = (Q^TQ)^{-1}Q^TAQ. From (2.3) we get C_1 = C_2. Condition (2.4) means that Z = span{f_1, . . . , f_r} is an invariant subspace of A^T.
If FQ is invertible, then (2.2) implies
(2.5) AQ = QC_3 with C_3 = FAQ(FQ)^{-1},
and now from (2.4) it follows that C_3 = C_2 = C_1. Finally, if FQ is invertible, then by (2.2) we obtain
(2.6) FA = C_4 F
with C_4 = (FQ)^{-1}FAQ. Formula (2.6) together with the previous conditions implies
(2.7) C_1 = C_2 = C_3 = C_4 = C.
Denote by σ(B) the spectrum of a matrix B. It is easy to see that
(2.8) σ(C_i) ⊂ σ(A), i = 1, 2, 3, 4.
In fact, if (λ − C_1)x = 0 for x ≠ 0, then Q(λ − C_1)x = 0 and hence, applying (2.3), we get (Qλ − AQ)x = 0, or (λ − A)Qx = 0 with Qx ≠ 0, because Q = [q_1 . . . q_r] and q_1, . . . , q_r are linearly independent. In view of (2.8), if A is invertible, then so is C = C_1 = C_2 = C_3 = C_4. If U commutes with A, then C contains the entire information concerning A relating to the invariant subspace Y = UX. This fact is expressed more precisely by the following:
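A small numerical illustration of (2.2)–(2.8), under the assumption that U is the exact sum of spectral projectors of a made-up diagonalizable matrix (so R = 0): the compressions C_2 and C_3 coincide, and their eigenvalues are exactly the eigenvalues of A selected by U.

```python
import numpy as np

rng = np.random.default_rng(2)
N, r = 5, 2
V = rng.standard_normal((N, N))
lam = np.array([0.01, 0.05, 1.0, 3.0, 10.0])
A = V @ np.diag(lam) @ np.linalg.inv(V)    # diagonalizable, eigenvalues lam

W = np.linalg.inv(V)
Q, F = V[:, :r], W[:r, :]                  # U = sum of P_j over the two smallest eigenvalues
U = Q @ F
assert np.allclose(U @ A, A @ U)           # exact commutation: R = 0

C2 = np.linalg.solve(Q.T @ Q, Q.T @ A @ Q)         # C_2 = (Q^TQ)^{-1} Q^T A Q
C3 = F @ A @ Q @ np.linalg.inv(F @ Q)              # C_3 = F A Q (F Q)^{-1}
assert np.allclose(C2, C3)
# sigma(C) is the selected part of sigma(A)
assert np.allclose(np.sort(np.linalg.eigvals(C2).real), lam[:r])
```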
(2.9) Proposition. If U commutes with A, then
C = Σ_{j=1, UP_j≠0}^p (λ_j P_{jC} + N_{jC}),
where
(2.10) P_{jC} = (Q^TQ)^{-1}Q^T P_j Q = F P_j F^T(FF^T)^{-1},
(2.10′) N_{jC} = (Q^TQ)^{-1}Q^T N_j Q = F N_j F^T(FF^T)^{-1}, j = 1, . . . , p,
are the spectral projectors and nilpotents of C.
Proof. Observe first that formulae (2.3)–(2.8) follow from commutativity of U and A. Since U commutes with (λ − A)^{-1} (if (λ − A)^{-1} exists), the condition similar to (2.7) holds for
P_j = (1/(2πi)) ∫_Γ (λ − A)^{-1} dλ.
Here Γ is a Jordan curve containing only one eigenvalue λ_j in its interior domain, and such that σ(A) ∩ Γ is empty. We then have (2.10). By an easy transformation we prove that U commutes with N_j as well, and (2.10′) follows.
Now
P_{kC} P_{lC} = (Q^TQ)^{-1}Q^T P_k QF P_l F^T(FF^T)^{-1}
= (Q^TQ)^{-1}Q^T U P_k P_l F^T(FF^T)^{-1}
= (Q^TQ)^{-1}Q^T U δ_{kl} P_k F^T(FF^T)^{-1}
= δ_{kl} (Q^TQ)^{-1}Q^TQF P_k F^T(FF^T)^{-1}
= δ_{kl} P_{kC}, k, l = 1, . . . , p,
i.e. the P_{kC} are spectral projectors. Similarly we can prove that the N_{jC} are the spectral nilpotents of C. This completes the proof.
Now assume UA − AU = R and R ≠ 0. Then neither (2.7) nor (2.9) is true. The only thing we may expect is that (2.7) and (2.9) hold approximately if R is small enough (and (FQ)^{-1} exists). More precisely, if U_0, Q_0, F_0 commute with A, (F_0 Q_0)^{-1} exists and U = (Q_0 + ΔQ)(F_0 + ΔF), then (2.7) and (2.9) hold asymptotically for U as ‖ΔQ‖ → 0 and ‖ΔF‖ → 0. Moreover, if (F_0 Q_0)^{-1} exists and ‖ΔQ‖ and ‖ΔF‖ are small enough, then the C_j, j = 1, 2, 3, 4, are invertible. Since FF^T, Q^TQ, and FQ are invertible, it follows that FAF^T, Q^TAQ, and FAQ are invertible as well.
It may be of some interest to know the inverses of Q^TAQ, FAF^T, and FAQ. It is easy to verify that, under the above assumptions,
(2.11) (Q^TAQ)^{-1} = FA^{-1}F^T(Q^TQFF^T)^{-1}(I − Δ_1)^{-1} = FA^{-1}F^T(Q^TQFF^T)^{-1} + O(R)
with Δ_1 = Q^T R A^{-1} F^T(Q^TQFF^T)^{-1} = O(R), and
(2.12) (FAF^T)^{-1} = (I + Δ_2)^{-1}(Q^TQFF^T)^{-1}Q^TA^{-1}Q = (Q^TQFF^T)^{-1}Q^TA^{-1}Q + O(R)
with Δ_2 = (Q^TQFF^T)^{-1}Q^TA^{-1}RF^T = O(R); finally, if (FQ)^{-1} exists, then
(2.13) (FAQ)^{-1} = (FA^{-1}Q)(FQ)^{-2}(I − Δ_3)^{-1} = (FA^{-1}Q)(FQ)^{-2} + O(R)
with Δ_3 = FRA^{-1}Q(FQ)^{-2} = O(R).
Let now σ_0 be a spectral set, i.e. σ_0 ⊂ σ(A). Consider U of the following form:
(2.14) U = Σ_{λ_j∈σ_0} P_j + ε,
where the P_j are spectral projectors of A and ε is a small matrix. We have R = UA − AU = εA − Aε = O(ε). We are interested in the asymptotic behaviour of (Q^TAQ)^{-1}, (FAF^T)^{-1}, and (FAQ)^{-1} as ε → 0. Notice that:
(2.15) F = (Q^TQ)^{-1}Q^TU, or F = (FQ)^{-1}FU if (FQ)^{-1} exists; and Q = UF^T(FF^T)^{-1}, or Q = UQ(FQ)^{-1} if (FQ)^{-1} exists.
Let us consider (Q^TAQ)^{-1} only. The discussion of the remaining inverses is similar. From (2.11) and (2.15) we get
(Q^TAQ)^{-1} = FA^{-1}F^T(Q^TQFF^T)^{-1} + O(R)
= (Q^TQ)^{-1}Q^TUA^{-1}F^T(Q^TQFF^T)^{-1} + O(R).
From (1.2) and (1.3) it follows that
UA^{-1} = ( Σ_{λ_j∈σ_0} P_j + ε ) Σ_{j=1}^p ( (1/λ_j) P_j + Σ_{s=1}^{s_j−1} ((−1)^s/λ_j^{s+1}) N_j^s )
= Σ_{λ_j∈σ_0} ( (1/λ_j) P_j + Σ_{s=1}^{s_j−1} ((−1)^s/λ_j^{s+1}) N_j^s ) + O(R).
Hence
(2.16) (Q^TAQ)^{-1} = (Q^TQ)^{-1}Q^T [ Σ_{λ_j∈σ_0} ( (1/λ_j) P_j + Σ_{s=1}^{s_j−1} ((−1)^s/λ_j^{s+1}) N_j^s ) ] F^T(Q^TQFF^T)^{-1} + O(ε),
i.e. the principal component of (Q^TAQ)^{-1} depends on the spectral elements of A related to σ_0. Observe also that, in general, (2.16) is not the spectral decomposition of (Q^TAQ)^{-1}.
The matrices Q^TAQ and FAQ will play a very important role in the method presented below. FAF^T plays a similar role to Q^TAQ, but for equations with the transposed matrix A^T.
3. Approximate decomposition of (1.1). Consider now a matrix U = QF, where Q = [q_1 . . . q_r] is a matrix of rank r, r ≤ N, such that Y = span{q_1, . . . , q_r} is a sufficiently good approximation of some invariant subspace of A related to a spectral set σ_0. We are going to decompose the system (1.1) into two parts corresponding to σ_0 and to σ(A)\σ_0. Multiplying (1.1) from the left by U, we get a new system:
(3.1) UAx = Ub.
Since UA = AU + R (with R small), we have AUx + Rx = Ub, or
(3.2) AQy + Rx = Ub,
where y = Fx.
Observe that, in general, (3.1) has many solutions (x is one of them), while (3.2), regarded as a system with unknown y and x given, has exactly one solution y = Fx. This follows from the fact that the N × r matrix AQ has the maximal possible rank r. If we multiply the solution y of (3.2) by Q, we get
Qy = QFx = Ux.
This is exactly the component of the solution x = A^{-1}b of (1.1) related to U (or, in other words, to the spectral set σ_0 ⊂ σ(A)).
Assume Q^TAQ to be invertible. Multiplying now (3.2) from the left by Q^T, we obtain a system equivalent to (3.2):
(3.3) Q^TAQy + Q^TRx = Q^TUb.
Another possibility is to multiply (3.2) from the left by F:
(3.4) FAQy + FRx = FUb.
If FAQ is invertible (see Section 2), then (3.4) and (3.2) are equivalent. Both systems (3.3) and (3.4) are of dimension r × r, and both satisfy the condition
(3.5) Ux = Qy.
Algorithms for computing the matrices Q^TAQ, FAQ and the vectors Q^TUb, FUb will be proposed in Section 4.
We may stop here if we only want to have an approximate vector y (or Ux), under the assumption that R is sufficiently small. In order to compute y we can solve one of the systems
(3.6) Q^TAQv = Q^TUb
or
(3.7) FAQw = FUb,
provided that the corresponding r × r matrix Q^TAQ or FAQ is invertible (see Section 2).
To estimate the errors ‖v − y‖/‖y‖ and ‖w − y‖/‖y‖ we can apply the well known inequality given in
(3.8) Lemma. Let B be an invertible matrix and consider two systems of linear algebraic equations
Bu = d and (B + E)(u + Δ) = d + δ.
Then
‖Δ‖/‖u‖ ≤ cond(B) (‖δ‖/‖d‖ + ‖E‖/‖B‖) / (1 − cond(B)‖E‖/‖B‖),
provided that cond(B)‖E‖/‖B‖ < 1, where cond(B) = ‖B‖ ‖B^{-1}‖.
Proof. By simple verification.
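Lemma (3.8) is the classical perturbation bound for linear systems; a quick numerical check on random, purely hypothetical data (2-norms throughout):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
B = rng.standard_normal((n, n)) + 3 * np.eye(n)   # comfortably invertible
d = rng.standard_normal(n)
E = 1e-6 * rng.standard_normal((n, n))            # small matrix perturbation
delta = 1e-6 * rng.standard_normal(n)             # small right-hand-side perturbation

u = np.linalg.solve(B, d)
u_pert = np.linalg.solve(B + E, d + delta)

cond = np.linalg.cond(B)                          # cond(B) = ||B|| ||B^{-1}||
lhs = np.linalg.norm(u_pert - u) / np.linalg.norm(u)
q = cond * np.linalg.norm(E) / np.linalg.norm(B)  # must be < 1 for the bound
rhs = cond * (np.linalg.norm(delta) / np.linalg.norm(d)
              + np.linalg.norm(E) / np.linalg.norm(B)) / (1 - q)
assert q < 1 and lhs <= rhs
```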
Applying now (3.8) to (3.6) and (3.7), we get
(3.9) ‖v − y‖/‖y‖ ≤ cond(Q^TAQ) ‖Q^TRx‖/‖Q^TUb‖ = O(R),
‖w − y‖/‖y‖ ≤ cond(FAQ) ‖FRx‖/‖FUb‖ = O(R).
Observe that if A is ill-conditioned and σ_0 ⊂ σ(A) is properly chosen, then we should expect that
cond(Q^TAQ) < cond(A), cond(FAQ) < cond(A).
Moreover, sometimes it is possible to get a full decomposition of (1.1) corresponding to the decomposition of the spectrum
σ(A) = σ_0 ∪ (σ(A)\σ_0).
Assume that, as above, the matrix U = QF is known; then we can also use the matrix I − U. If U is a projector of rank r, then I − U is a projector of rank N − r. This is the most interesting situation. In general, if U is not a projector, then I − U is a matrix of rank s, where N − r ≤ s ≤ N. Let
(3.10) I − U = SG,
where S is an N × s matrix of rank s (N − r ≤ s ≤ N) and G is an s × N matrix of rank s. If U is a projector then s = N − r. Observe that
(3.11) (I − U)A − A(I − U) = −R;
hence, multiplying now (1.1) from the left by U and by I − U, and taking into account (3.10) and (3.11), we get
AQy + Rx = Ub and ASz − Rx = (I − U)b,
where y = Fx and z = Gx with x = A^{-1}b. Since x = Ux + (I − U)x = Qy + Sz, we can write
(3.12) AQy + RQy + RSz = Ub,
ASz − RSz − RQy = (I − U)b,
and, finally, multiplying (3.12) by Q^T and S^T, we obtain
(3.13) Q^TAQy + Q^TRQy + Q^TRSz = Q^TUb,
S^TASz − S^TRSz − S^TRQy = S^T(I − U)b.
Another possibility is to multiply (3.12) by F and G to get
(3.14) FAQy + FRQy + FRSz = FUb,
GASz − GRSz − GRQy = G(I − U)b.
Both systems (3.13) and (3.14) are of dimension s + r, N ≤ s + r. We shall prove:
(3.15) Theorem. (a) If Q^TAQ and S^TAS are invertible and ‖R‖ is small enough, then (3.13) and (1.1) are equivalent.
(b) If FAQ and GAS are invertible and ‖R‖ is small enough, then (3.14) and (1.1) are equivalent.
Proof. We shall prove (a); the proof of (b) is analogous. We have to show that:
(i) (y, z) is a solution of (3.13), where y = Fx, z = Gx, and x = A^{-1}b;
(ii) (3.13) has a unique solution.
To verify (i), insert y and z into the first equation of (3.13) to get
(Q^TAQF + Q^TRQF + Q^TRSG)x − Q^TUb
= Q^T{[AU + RU + R(I − U)]x − Ub}
= Q^T{[AU + R]x − Ub} = Q^TU[Ax − b] = 0.
Similar calculations show that the second equation of (3.13) is also satisfied.
(ii) will be proved if we show that the matrix of (3.13) is invertible. This matrix has the following block form:
(3.16)
[ Q^TAQ + Q^TRQ    Q^TRS         ]
[ −S^TRQ           S^TAS − S^TRS ]
=
[ Q^TAQ(I + (Q^TAQ)^{-1}Q^TRQ)    Q^TRS                        ]
[ −S^TRQ                          S^TAS(I − (S^TAS)^{-1}S^TRS) ].
Observe now that, in general, the matrix
[ A_11  A_12 ]
[ A_21  A_22 ]
is invertible if A_11^{-1} and A_22^{-1} exist and ‖A_12‖ and ‖A_21‖ are small enough.
In fact, we have to find a matrix
[ B_11  B_12 ]
[ B_21  B_22 ]
such that
(3.17) A_11 B_11 + A_12 B_21 = I,  A_11 B_12 + A_12 B_22 = 0,
A_21 B_11 + A_22 B_21 = 0,  A_21 B_12 + A_22 B_22 = I.
The first block column of (3.17) yields
(I − A_11^{-1} A_12 A_22^{-1} A_21) B_11 = A_11^{-1} and B_21 = −A_22^{-1} A_21 B_11.
If ‖A_12‖ and ‖A_21‖ are small enough, the matrix in brackets in the first formula is invertible; hence B_11 and B_21 are well defined. The same argument applies to the second block column of (3.17). In particular, for suitable A_ij, i, j = 1, 2, we obtain (a).
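The block-inverse construction in (3.17) can be checked numerically. The sketch below uses the equivalent Schur-complement form B_11 = (A_11 − A_12 A_22^{-1} A_21)^{-1} (an algebraic rearrangement of the formula above, not the paper's notation) and made-up blocks with small off-diagonal parts:

```python
import numpy as np

rng = np.random.default_rng(7)
A11 = rng.standard_normal((3, 3)) + 3 * np.eye(3)   # invertible diagonal blocks
A22 = rng.standard_normal((2, 2)) + 3 * np.eye(2)
A12 = 1e-2 * rng.standard_normal((3, 2))            # small off-diagonal blocks
A21 = 1e-2 * rng.standard_normal((2, 3))

# First block column of (3.17), solved in Schur-complement form.
B11 = np.linalg.inv(A11 - A12 @ np.linalg.inv(A22) @ A21)
B21 = -np.linalg.inv(A22) @ A21 @ B11

assert np.allclose(A11 @ B11 + A12 @ B21, np.eye(3))          # A_11 B_11 + A_12 B_21 = I
assert np.allclose(A21 @ B11 + A22 @ B21, np.zeros((2, 3)))   # A_21 B_11 + A_22 B_21 = 0
```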
Observe that the matrix R can be expressed by means of A, Q, S, F, G.
To make use of the decomposition (3.13) or (3.14) of the system (1.1) one can apply the following iterative procedure: we start with an arbitrary pair of vectors y_0 ∈ R^r and z_0 ∈ R^s; then the consecutive vectors y_{k+1} and z_{k+1} are computed from the system of algebraic equations
(3.18) Q^TAQ y_{k+1} + Q^TRQ y_k + Q^TRS z_k = Q^TUb,
S^TAS z_{k+1} − S^TRS z_k − S^TRQ y_k = S^T(I − U)b
(for (3.13)), or
(3.19) FAQ y_{k+1} + FRQ y_k + FRS z_k = FUb,
GAS z_{k+1} − GRS z_k − GRQ y_k = G(I − U)b
(for (3.14)).
Observe that each iteration step using (3.18) or (3.19) consists in solving two independent systems of dimension r and s with matrices Q^TAQ and S^TAS, or FAQ and GAS, respectively. If the decomposition is done properly, then the conditioning of these systems should be much better than that of the original system (1.1). Moreover, the two systems admit parallel solution.
Now let us transform slightly the systems (3.18) and (3.19) to obtain new systems, more convenient for computations. If we express R in terms of Q, F, and S, G, then we easily obtain a new form of (3.18):
(3.20) Q^TAQ v_{k+1} = Q^TU r_k,  S^TAS w_{k+1} = S^T(I − U) r_k,
where
x_k = Q y_k + S z_k, v_{k+1} = y_{k+1} − F x_k, r_k = b − A x_k, w_{k+1} = z_{k+1} − G x_k.
An analogous transformation applied to (3.19) gives
(3.21) FAQ v_{k+1} = FU r_k,  GAS w_{k+1} = G(I − U) r_k,
with the same definitions of x_k, r_k, v_{k+1}, and w_{k+1}. In the next section we discuss the algorithms of computation of the entries in (3.20) and (3.21). We also present possible simplifications of (3.20) and (3.21). The following theorem makes use of equations (3.13) and (3.14), which are more satisfactory for theoretical investigations than (3.20) and (3.21) respectively.
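A minimal runnable sketch of the process (3.20), under the simplifying assumption (mine, not the paper's) that U = QQ^T and I − U = SS^T are orthogonal projectors built from an approximately invariant subspace of a made-up symmetric ill-conditioned matrix; then F = Q^T and G = S^T. The two small solves in each step are independent, as noted above.

```python
import numpy as np

rng = np.random.default_rng(4)
N, r = 8, 3
V, _ = np.linalg.qr(rng.standard_normal((N, N)))
lam = np.array([1e-3, 2e-3, 5e-3, 1.0, 2.0, 5.0, 8.0, 10.0])
A = V @ np.diag(lam) @ V.T                    # symmetric, cond(A) = 1e4
b = rng.standard_normal(N)

# Q spans a small perturbation of the eigenspace of the 3 smallest eigenvalues,
# S its orthogonal complement; so R = UA - AU is small.
Q, _ = np.linalg.qr(V[:, :r] + 1e-6 * rng.standard_normal((N, r)))
M, _ = np.linalg.qr(np.hstack([Q, rng.standard_normal((N, N - r))]))
S = M[:, r:]

QAQ, SAS = Q.T @ A @ Q, S.T @ A @ S
y, z = np.zeros(r), np.zeros(N - r)
for _ in range(25):
    x = Q @ y + S @ z                         # x_k = Q y_k + S z_k
    rk = b - A @ x                            # r_k = b - A x_k
    v = np.linalg.solve(QAQ, Q.T @ rk)        # Q^TAQ v_{k+1} = Q^T U r_k  (Q^TU = Q^T here)
    w = np.linalg.solve(SAS, S.T @ rk)        # S^TAS w_{k+1} = S^T(I-U) r_k
    y = v + Q.T @ x                           # y_{k+1} = v_{k+1} + F x_k
    z = w + S.T @ x                           # z_{k+1} = w_{k+1} + G x_k

x = Q @ y + S @ z
xs = np.linalg.solve(A, b)
err = np.linalg.norm(x - xs) / np.linalg.norm(xs)
```

Each of the two subsystems here is far better conditioned than A itself, which is the point of the decomposition.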
(3.22) Theorem. (a) If Q^TAQ and S^TAS are invertible, and
cond(Q^TAQ) ‖R‖/‖Q^TAQ‖ + cond(S^TAS) ‖R‖/‖S^TAS‖
is small enough, then for any y_0 and z_0 the process (3.20) converges to the solution x = A^{-1}b of (1.1).
(b) If FAQ and GAS are invertible, and
cond(FAQ) ‖R‖/‖FAQ‖ + cond(GAS) ‖R‖/‖GAS‖
is small enough, then for any y_0 and z_0 the process (3.21) converges to the solution x = A^{-1}b of (1.1).
Proof. We prove (a) only; the proof of (b) is analogous. Observe that the iterative process under consideration is of the general form
[ Ã  0 ] [ y_{k+1} ]   [ Δ_A  C   ] [ y_k ]   [ b_1 ]
[ 0  B̃ ] [ z_{k+1} ] + [ D    Δ_B ] [ z_k ] = [ b_2 ],
while the system of equations we are going to solve (see (3.15)) is
[ Ã + Δ_A  C        ] [ y ]   [ b_1 ]
[ D        B̃ + Δ_B ] [ z ] = [ b_2 ],
with some matrices Ã, B̃, Δ_A, Δ_B, C, D, and vectors b_1 and b_2. Let ỹ_k = y − y_k and z̃_k = z − z_k; then
[ ỹ_{k+1} ]     [ Ã^{-1}Δ_A  Ã^{-1}C   ] [ ỹ_k ]     [ ỹ_k ]
[ z̃_{k+1} ] = − [ B̃^{-1}D   B̃^{-1}Δ_B ] [ z̃_k ] = T [ z̃_k ].
The norm of the iteration matrix T can be easily estimated in the standard way:
‖T‖ ≤ ‖Ã^{-1}Δ_A‖ + ‖Ã^{-1}C‖ + ‖B̃^{-1}D‖ + ‖B̃^{-1}Δ_B‖.
Since in our case Ã = Q^TAQ, B̃ = S^TAS, Δ_A = Q^TRQ, Δ_B = −S^TRS, C = Q^TRS, D = −S^TRQ, we immediately get a sufficient condition for convergence of the process:
‖T‖ ≤ K ( cond(Q^TAQ) ‖R‖/‖Q^TAQ‖ + cond(S^TAS) ‖R‖/‖S^TAS‖ ) < 1
with some constant K depending on Q and S. This completes the proof of (a).
4. Algorithms. In this section we propose two algorithms giving entries for the iterative processes (3.20) and (3.21). As before, let U be a matrix related to some spectral set σ_0 ⊂ σ(A), approximately commuting with A. Certain suggestions on the construction of such a matrix U are discussed in Section 5.
Gram–Schmidt Process. The Gram–Schmidt Process is suitable for equation (3.20). We consider two applications of this process.
A. We try to express the matrix U (of rank r < N) in the form U = QF, where Q is an N × r matrix with orthonormal columns:
(4.1) Q^TQ = I_r.
The splitting U = QF is exactly the Gram–Schmidt Process applied to the consecutive columns of U. In this case, F is an upper triangular matrix of the coefficients of the Gram–Schmidt Process. The same algorithm should be applied to I − U: I − U = SG and S^TS = I_s. The equations obtained are a little simpler than (3.20), namely
(4.2) Q^TAQ v_{k+1} = F r_k,  S^TAS w_{k+1} = G r_k,
with x_k, r_k, v_{k+1}, and w_{k+1} as defined after (3.20) (because Q^TU = Q^TQF = F, and S^T(I − U) = S^TSG = G).
B. We modify the Gram–Schmidt Process, applied also to the consecutive columns of U = QF. Now instead of the orthogonality assumption (4.1), we impose the condition
Q^TAQ = I_r.
The matrix F is again upper triangular. As before, the same algorithm should be applied to I − U: I − U = SG and S^TAS = I_s. If we succeed in this operation (this is always possible when A is positive definite), the iterative process (3.20) will be explicit:
(4.3) y_{k+1} = F x_k + Q^TU r_k,  z_{k+1} = G x_k + S^T(I − U) r_k,
with x_k = Q y_k + S z_k, r_k = b − A x_k, or, equivalently,
(4.4) x_{k+1} = x_k + [QQ^TU + SS^T(I − U)] r_k,
with x_k → x as k → ∞ under the assumptions of Theorem (3.22).
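The explicit scheme (4.4) can be sketched as follows. The sketch assumes A symmetric positive definite; as an implementation convenience (not the paper's method) the A-orthonormalization Q^TAQ = I_r, S^TAS = I_s is done with a Cholesky factorization instead of column-by-column Gram–Schmidt, and U is taken to be an orthogonal projector onto an approximately invariant subspace of made-up data.

```python
import numpy as np

rng = np.random.default_rng(5)
N, r = 8, 3
V, _ = np.linalg.qr(rng.standard_normal((N, N)))
lam = np.array([1e-3, 2e-3, 5e-3, 1.0, 2.0, 5.0, 8.0, 10.0])
A = V @ np.diag(lam) @ V.T               # symmetric positive definite test matrix
b = rng.standard_normal(N)

def a_orthonormalize(B):
    # Basis with span = span(B) and B'^T A B' = I (Cholesky variant of Gram-Schmidt).
    L = np.linalg.cholesky(B.T @ A @ B)
    return B @ np.linalg.inv(L.T)

B1, _ = np.linalg.qr(V[:, :r] + 1e-6 * rng.standard_normal((N, r)))
M, _ = np.linalg.qr(np.hstack([B1, rng.standard_normal((N, N - r))]))
B2 = M[:, r:]                            # orthonormal complement of span(B1)
U = B1 @ B1.T                            # projector, approximately commuting with A
Q = a_orthonormalize(B1)                 # Q^T A Q = I_r
S = a_orthonormalize(B2)                 # S^T A S = I_s

x = np.zeros(N)
for _ in range(20):
    rk = b - A @ x
    # (4.4): x_{k+1} = x_k + [Q Q^T U + S S^T (I - U)] r_k
    x = x + Q @ (Q.T @ (U @ rk)) + S @ (S.T @ (rk - U @ rk))

xs = np.linalg.solve(A, b)
err = np.linalg.norm(x - xs) / np.linalg.norm(xs)
```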
Lanczos Process. To define the entries for the iterative process (3.21) we can apply the Lanczos Process to the matrix UA: we find an N × r matrix Q with orthonormal columns and an r × r lower Hessenberg matrix T (quasi-triangular, i.e. triangular with one additional diagonal, nearest to the main diagonal) such that
UAQ = QT^T,  Q^TQ = I_r.
In such a way we obtain an orthonormal basis of the space
Y = UR^N = span{q_1, . . . , q_r},
where Q = [q_1 . . . q_r]. Hence U = QF for some matrix F. Moreover, QFAQ = QT^T and the orthogonality condition Q^TQ = I_r gives
FAQ = T^T.
In other words, the Lanczos Process defines directly the matrices Q and FAQ = T^T; then FAQ is an upper Hessenberg matrix. If A is symmetric, then T^T = FAQ is symmetric tridiagonal. The analogous procedure applied to the matrix (I − U)A = SGA gives
(I − U)AS = SZ^T
with a lower Hessenberg matrix Z, and S satisfying S^TS = I_s. Hence GAS = Z^T is also an upper Hessenberg matrix. In this way we have determined the principal entries for the process (3.21).
Now we only need to transform the right-hand side of (3.21). Observe that F = Q^TQF = Q^TU and G = S^TSG = S^T(I − U). Hence (3.21) becomes
(4.5) T^T v_{k+1} = Q^TU^2 r_k,  Z^T w_{k+1} = S^T(I − U)^2 r_k.
Both subsystems of (4.5) are Hessenberg (tridiagonal if A is symmetric).
There is another known version of the Lanczos Process which results in tridiagonal T and Z for any matrix A. However, this version is considered to be less stable.
We are looking for N × r matrices Q_1 and Q_2 such that
(4.6) UAQ_1 = Q_1 T_1^T,  A^TU^TQ_2 = Q_2 T_2^T,
and
(4.7) Q_2^TQ_1 = Q_1^TQ_2 = I_r.
Since U = Q_1 F, (4.6) and (4.7) imply
Q_2^TUAQ_1 = T_1^T = T_2 = FAQ_1.
Since both T_1 and T_2 are lower Hessenberg matrices, it follows that
(4.8) UAQ_1 = Q_1 T^T,  A^TU^TQ_2 = Q_2 T,
with Q_2^TQ_1 = Q_1^TQ_2 = I_r and T = Q_1^TA^TU^TQ_2 tridiagonal. From the condition U = Q_1 F we get F = Q_2^TU.
Applying the similar procedure to I − U, with S_1, S_2, and Z in place of Q_1, Q_2, and T, we obtain the tridiagonal version of (3.21):
(4.9) T^T v_{k+1} = Q_2^TU^2 r_k,
Z^T w_{k+1} = S_2^T(I − U)^2 r_k,
with x_k, y_k, z_k, v_k, w_k defined as before.
5. The matrix U. Certainly there are many possibilities of constructing a suitable matrix U. Here we give some remarks on applications of polynomials of the matrix A. It seems that nice results can be obtained in the case of A diagonalizable and with real spectrum σ(A), using Bernstein-like polynomials approximating step functions (see [3]).
Let W_n be a polynomial of degree n. Then, for A = Σ_{j=1}^p (λ_j P_j + N_j), we have
(5.1) W_n(A) = Σ_{j=1}^p ( P_j W_n(λ_j) + Σ_{s=1}^{n} (W_n^{(s)}(λ_j)/s!) N_j^s ).
Let Ω ⊂ R be an interval containing σ(A); let Ω_1 be a subset of Ω and χ_{Ω_1} the characteristic function of Ω_1. Assume that W_n^{(k)}(z) → χ_{Ω_1}^{(k)}(z) as n → ∞ at any point z ∈ R of continuity of χ_{Ω_1}, and that σ(A) ∩ ∂Ω_1 ∩ Ω = ∅. Then W_n(λ_j) → 1 as n → ∞ for λ_j ∈ σ(A) ∩ Ω_1, W_n(λ_j) → 0 as n → ∞ for λ_j ∈ σ(A) ∩ (Ω\Ω_1), while W_n^{(s)}(λ_j) → 0 as n → ∞ for λ_j ∈ σ(A), s > 0. In other words, U_n = W_n(A) → Σ_{λ_j∈Ω_1} P_j as n → ∞.
As an example, consider the following sequence of Bernstein-like polynomials B_n(x_1, x_2, λ) (see also [3]), which can be applied when A has real spectrum in [−1, 1]. The function B_n(x_1, x_2, λ) of three real variables x_1, x_2, λ, with −1 < x_1 < x_2 < 1, is a polynomial of degree n with respect to λ:
B_n(x_1, x_2, λ) = Σ_{j_n(x_1) ≤ j ≤ j_n(x_2)} C(n, j) ((1 + λ)/2)^j ((1 − λ)/2)^{n−j},
where C(n, j) denotes the binomial coefficient and j_n(x) = (1/2) n (1 + x), |x| < 1.
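The defining sum is straightforward to evaluate directly; the helper name below is hypothetical, and the two sample points merely illustrate the step-function behaviour of B_n on [−1, 1].

```python
from math import comb

def bernstein_step(n, x1, x2, lam):
    # B_n(x1, x2, lambda) = sum over j_n(x1) <= j <= j_n(x2) of
    #   C(n, j) ((1+lambda)/2)^j ((1-lambda)/2)^(n-j)
    j_lo = n * (1 + x1) / 2.0
    j_hi = n * (1 + x2) / 2.0
    t = (1 + lam) / 2.0
    return sum(comb(n, j) * t**j * (1 - t)**(n - j)
               for j in range(n + 1) if j_lo <= j <= j_hi)

inside = bernstein_step(31, -0.5, 0.5, 0.0)    # well inside [x1, x2]: close to 1
outside = bernstein_step(31, -0.5, 0.5, 0.9)   # well outside [x1, x2]: close to 0
```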
The graph of B_31(−0.5, 0.5, λ) is shown in Figure 1.

Fig. 1. The graph of B_31(−0.5, 0.5, λ)
Define U = B_n(x_1, x_2, A) with fixed x_1, x_2 and n. This matrix corresponds to the spectral set σ_0 = [x_1, x_2] ∩ σ(A) and approximately cuts off the part of the spectrum σ(A) contained in [−1, x_1] ∪ [x_2, 1].
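A sketch of this cut-off on a small made-up symmetric matrix: U = B_31(−0.5, 0.5, A) is compared with the exact sum of spectral projectors over the eigenvalues lying in [−0.5, 0.5]. For n = 31 the agreement is only rough, since B_n is just an approximate characteristic function.

```python
import numpy as np
from math import comb

def bernstein_matrix(n, x1, x2, A):
    # U = B_n(x1, x2, A), evaluated term by term as a matrix polynomial.
    N = A.shape[0]
    Bp = (np.eye(N) + A) / 2.0
    Bm = (np.eye(N) - A) / 2.0
    j_lo, j_hi = n * (1 + x1) / 2.0, n * (1 + x2) / 2.0
    U = np.zeros_like(A)
    for j in range(n + 1):
        if j_lo <= j <= j_hi:
            U += comb(n, j) * np.linalg.matrix_power(Bp, j) @ np.linalg.matrix_power(Bm, n - j)
    return U

rng = np.random.default_rng(6)
V, _ = np.linalg.qr(rng.standard_normal((4, 4)))
lam = np.array([-0.8, -0.1, 0.2, 0.9])       # real spectrum in [-1, 1]
A = V @ np.diag(lam) @ V.T
U = bernstein_matrix(31, -0.5, 0.5, A)

# Exact projector for the eigenvalues -0.1 and 0.2 (those inside [-0.5, 0.5]).
P = V[:, 1:3] @ V[:, 1:3].T
err = np.linalg.norm(U - P)                  # small but not tiny for n = 31
```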
A good method to compute values of the polynomial B_n(x_1, x_2, λ) is to apply the so-called Newton formula:
B_n(x_1, x_2, λ) = b_{n0}(x) + b_{n1}(x)(λ − λ_1) + b_{n2}(x)(λ − λ_1)(λ − λ_2) + . . . + b_{nn}(x)(λ − λ_1) . . . (λ − λ_n).
The coefficients b_{nj}(x) are divided differences of B_n(x, ·). The suitable choice of the knots λ_1, . . . , λ_n was discussed for example in [2] and [3].
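The Newton formula can be evaluated with divided differences and a Horner-type recurrence. The functions and the sample knots below are illustrative assumptions (they are not the paper's table); the check uses a simple cubic, which the degree-3 Newton form must reproduce exactly.

```python
def divided_differences(knots, values):
    # In-place computation: b[j] becomes the divided difference on knots[0..j].
    b = list(values)
    n = len(knots)
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            b[i] = (b[i] - b[i - 1]) / (knots[i] - knots[i - j])
    return b

def newton_eval(b, knots, lam):
    # Horner scheme for b0 + b1 (l - k1) + b2 (l - k1)(l - k2) + ...
    acc = b[-1]
    for j in range(len(b) - 2, -1, -1):
        acc = acc * (lam - knots[j]) + b[j]
    return acc

# Reproduce p(l) = l^3 - 2l from its values at four hypothetical knots.
knots = [-0.9, -0.3, 0.4, 0.8]
p = lambda l: l**3 - 2 * l
coeffs = divided_differences(knots, [p(k) for k in knots])
assert abs(newton_eval(coeffs, knots, 0.25) - p(0.25)) < 1e-9
```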
Below we give tables of the Chebyshev knots and the coefficients b_{31,j}, j = 0, 1, 2, . . . , 31, for B_31(−0.5, 0.5, λ). In this case, only 15 (even) coefficients do not vanish.

Chebyshev knots for n = 31          The coefficients b_{31,j} of B_31(−0.5, 0.5, λ)
λ_1 = .998795456205172              b_{31,0} = 0.0
λ_3 = .049067674327418              b_{31,2} = −1.0
λ_5 = .740951125354959              b_{31,4} = 1.667
λ_7 = .671558954847018              b_{31,6} = −0.5881
λ_9 = .941544065183021              b_{31,8} = −2.2957
λ_11 = .336889853392220             b_{31,10} = 11.4375
λ_13 = .903989293123443             b_{31,12} = −4.3486
λ_15 = .427555093430282             b_{31,14} = −34.0687
λ_17 = .989176509964781             b_{31,16} = 29.9361
λ_19 = .148730474455362             b_{31,18} = 33.4487
λ_21 = .803207531480645             b_{31,20} = −36.0957
λ_23 = .595699304492433             b_{31,22} = −18.8784
λ_25 = .970031253194544             b_{31,24} = 20.4324
λ_27 = .242980179903264             b_{31,26} = 9.2076
λ_29 = .857728610000272             b_{31,28} = 1.1829
λ_31 = .514102744193222             b_{31,30} = 0.3135

The b_{31,j} for j odd all vanish; λ_{2j} = −λ_{2j+1}.

6. Final remarks. The crucial point of the method discussed above
is the decomposition U = QF and I − U = SG when U is given. All processes presented here generate a kind of orthogonal basis of the space Y = UR^N (the columns of the matrix Q). When dealing with the implementation of the numerical process of splitting U = QF, it is important to incorporate some criteria in this implementation. These criteria should enable us to decide which elements of the basis under construction are nearly linearly dependent on those already accepted. Such elements have to be rejected. Only after accepting or rejecting consecutive elements of the basis is the matrix U really defined by means of its factors Q and F. At this point a non-polynomial intervention occurs. It seems that both processes of orthogonalization (Gram–Schmidt and Lanczos) are suitable to this end.
Let us remark at the end that the choice of U with U^2 = U (a projector) is favorable. This condition implies that s = N − r; hence the decomposed systems (3.20) and (3.21) are both of dimension N. Possibilities of construction of projectors U will be discussed elsewhere.
References
[1] G. G. Lorentz, Bernstein Polynomials, University of Toronto Press, 1953.
[2] V. I. Lebedev and S. A. Finogenov, On the order of choosing parameters in Chebyshev cyclic iterative processes, Zh. Vychisl. Mat. i Mat. Fiz. 11 (1971), 425–438 (in Russian).
[3] K. Moszyński, Bernstein polynomials and eigenvalue problems, report of KBN Grant No 211689191, Department of Mathematics, Computer Science and Mechanics, University of Warsaw, 1992.
KRZYSZTOF MOSZYŃSKI
DEPARTMENT OF MATHEMATICS, COMPUTER SCIENCE AND MECHANICS
UNIVERSITY OF WARSAW
BANACHA 2
02-097 WARSZAWA, POLAND
E-mail: KMOSZYNS@MIMUW.EDU.PL