DOI: 10.2478/v10006-007-0017-0
ITERATIVE ESTIMATORS OF PARAMETERS IN LINEAR MODELS WITH PARTIALLY VARIANT COEFFICIENTS

SHAOLIN HU*,**, KARL MEINKE**, RUSHAN CHEN*, OUYANG HUAJIANG***

* Nanjing University of Science and Technology, 210071 Nanjing, China
e-mail: hshaolin@ustc.edu.cn

** Department of Computer Science, Royal Institute of Technology, 100-44 Stockholm, Sweden
e-mail: karlm@csc.kth.se

*** University of Liverpool, Liverpool L69 3GH, United Kingdom
e-mail: h.ouyang@liverpool.ac.uk
A new kind of linear model with partially variant coefficients is proposed, and a series of iterative algorithms is introduced and verified. The new generalized linear model includes the ordinary linear regression model as a special case. The iterative algorithms efficiently overcome the computational difficulties caused by multidimensional inputs and a continually growing set of parameters. An important application is described at the end of this article, which shows that this new model is reasonable and applicable in practical fields.
Keywords: linear model, parameter estimation, iterative algorithms, variant coefficients
1. Introduction
Over the last decades, many statisticians and mathematicians have considered the following kind of linear regression model:

  Y_i = B_i \beta + \varepsilon_i,    (1)

where Y_i \in R^p, B_i \in R^{p \times r} (i = 1, 2, \dots, n), the vector \beta \in R^r is a constant parameter vector to be estimated, and \varepsilon_i \in R^p are errors from measurements or stochastic noise from disturbances.
Some excellent theories and practical results were published for statistical inference and stochastic decisions under this model. This model was also successfully used in many different kinds of practical fields (see Draper and Smith, 1981; Frank and Harrell, 2002; Graybill and Iyer, 1994; Hu and Sun, 2001).
Further research into this model shows that the restriction to constant coefficients in (1) is severe and critical. In other words, there are practical situations in which this linear model cannot be applied (see Brown, 1964; Hu and Sun, 2001). Although statisticians and researchers (see Fahrmeier and Tutz, 2001; Dodge and Kova, 2000) have done a lot to generalize and/or to adapt the linear model (1), the restriction to constant coefficients has not been essentially removed.
In order to overcome this limitation on the model structure, we set up a new linear model with partially variant coefficients as follows:
  Y_i = A_i X_i + B_i \beta + \varepsilon_i,    (2)

where Y_i \in R^p, A_i \in R^{p \times q}, B_i \in R^{p \times r}, and {X_i \in R^q} is a variant vector series. Generally, the dimension p of the measurement output must be greater than the dimension q of the variant coefficient, i.e., p > q, so as to ensure that the structure of the time-variant multidimensional linear system is identifiable.
Obviously, the ordinary linear regression model (1) is just a special case of the generalized model (2). If there are no time-variant parts, namely q = 0, the model (2) degenerates to the ordinary linear model (1).
In Section 2, much attention is paid to estimating all the model coefficients in (2) under the Gauss-Markov assumptions (Kala and Kłaczyński, 1988). A series of iterative algorithms is built to estimate the coefficients, which include the constant parameter vector \beta \in R^r and the variant vector series {X_i \in R^q}. In Section 3, a practical application is described, and computational results show that this new model is valuable.
2. Iterative Estimators of Variant Coefficients
In order to make the results of this section universal, we first assume that the coefficient series {X_i \in R^q} are not related at different sampling points. For simplicity, we adopt the notation of stacked vectors \Phi_n = (\beta^\tau, X_1^\tau, \dots, X_n^\tau)^\tau \in R^{r+nq}, where the superscript \tau denotes the matrix transpose. Moreover,

  H_n = \begin{pmatrix} B_1 & A_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ B_n & 0 & \cdots & A_n \end{pmatrix} \in R^{np \times (r+nq)}.    (3)
Under the following famous Gauss-Markov assumptions (Kala and Kłaczyński, 1988; Rencher, 2000) on the random errors {\varepsilon_i \in R^p}:

(i) each error \varepsilon_i has expected value 0,

(ii) the error series {\varepsilon_i, i = 1, 2, \dots} are uncorrelated, and

(iii) the error series {\varepsilon_i, i = 1, 2, \dots} are homoscedastic, i.e., they all have the same variance,

the least-squares (LS) estimators of the coefficients in the model (2) can be expressed as follows:

  \hat{\Phi}_n^{LS(n)} = \arg\min_{\beta \in R^r, X_i \in R^q} \sum_{i=1}^{n} \| Y_i - (A_i X_i + B_i \beta) \|^2.    (4)
We can directly deduce a compact formula, which is very similar to the LS estimator of the model (1). The compact formula of the LS estimator of the coefficients in the model (2) is as follows:
  \hat{\Phi}_n^{LS(n)} = (H_n^\tau H_n)^{-1} H_n^\tau \bar{Y}_n,    (5)

where \bar{Y}_n = (Y_1^\tau, \dots, Y_n^\tau)^\tau \in R^{np}.
In order to guarantee that the matrix H_n^\tau H_n is invertible, the dimensions of the model (2) must satisfy the requirement p > q and the sample cardinality must satisfy n > r/(p - q). Otherwise, the matrix inversion in (5) must be replaced by the pseudo-inverse operator "+", namely, (H_n^\tau H_n)^+.
Theorem 1. Assume that the number of sampling points satisfies n > r/(p - q). Then the LS estimators of the coefficients in the model (2) can be computed iteratively by

  \begin{cases}
  \hat{\beta}^{LS(n+1)} = \hat{\beta}^{LS(n)} + (L_n + B_{n+1}^\tau B_{n+1})^{-1} B_{n+1}^\tau (R_{n+1}^{-1} - \Xi_{n+1}) R_{n+1} (Y_{n+1} - B_{n+1} \hat{\beta}^{LS(n)}), \\
  \hat{X}_i^{LS(n+1)} = \hat{X}_i^{LS(n)} + (A_i^\tau A_i)^{-1} A_i^\tau B_i (\hat{\beta}^{LS(n)} - \hat{\beta}^{LS(n+1)}), \quad i = 1, \dots, n, \\
  \hat{X}_{n+1}^{LS(n+1)} = (A_{n+1}^\tau R_{n+1} A_{n+1})^{-1} A_{n+1}^\tau R_{n+1} (Y_{n+1} - B_{n+1} \hat{\beta}^{LS(n)}),
  \end{cases}    (6)
where

  L_n = \sum_{i=1}^{n} B_i^\tau [I - A_i (A_i^\tau A_i)^{-1} A_i^\tau] B_i \in R^{r \times r},

  \Xi_{n+1} = A_{n+1} (A_{n+1}^\tau R_{n+1} A_{n+1})^{-1} A_{n+1}^\tau \in R^{p \times p},

  R_{n+1} = I - B_{n+1} (L_n + B_{n+1}^\tau B_{n+1})^{-1} B_{n+1}^\tau \in R^{p \times p}.

Here the superscript in \hat{X}_i^{LS(n)} denotes the LS estimator of X_i, i = 1, 2, \dots, n, based on the first n samples.
The proof of Theorem 1 is given in the Appendix.
Obviously, the algorithm (6) is iterative and well suited to practical engineering applications. Some obvious advantages are the following:
• \hat{\beta}^{LS(n+1)} is a linear combination of the estimator \hat{\beta}^{LS(n)} and the innovation Y_{n+1} - B_{n+1} \hat{\beta}^{LS(n)} from the new sampling datum.

• \hat{\beta}^{LS(n+1)} can be computed from the estimator \hat{\beta}^{LS(n)} without directly involving the old sampling data {Y_i, i = 1, \dots, n} or the estimates {\hat{X}_i^{LS(n)}}.

• It is reasonable that the estimator \hat{X}_{n+1}^{LS(n+1)} is determined by the innovation Y_{n+1} - B_{n+1} \hat{\beta}^{LS(n)}.

• The estimate \hat{X}_i^{LS(n+1)} of X_i (i \le n) can be adjusted accurately in succession using the estimation error of the constant parameter vector \beta.
In order to use these iterative algorithms to effectively solve practical problems, the initial estimates for the iterative algorithms (6) must be carefully selected. Generally, the initial estimators can be chosen as the LS estimators computed in batch as follows:
  \hat{\Phi}_{n_0}^{LS(n_0)} = (H_{n_0}^\tau H_{n_0})^{-1} H_{n_0}^\tau \bar{Y}_{n_0},    (7)

where n_0 \in N must satisfy the constraint n_0 > r/(p - q). If the disturbance {\varepsilon_n \in R^p, n \le n_0} is a stationary Gaussian white noise process with zero mean, then it can easily be shown that the ordinary LS estimators given by (4) are unbiased.
Theorem 2. Given the LS estimators (7) as an initial estimate of the coefficients of (2), if the disturbance {\varepsilon_n \in R^p, n \in N} is a stationary Gaussian white noise process with zero mean, then the iterative estimators (6) are unbiased.
Proof. We just have to show that E\hat{\beta}^{LS(n)} = \beta. We get

  E\{\hat{\beta}^{LS(n)}\} = E\{\hat{\beta}^{LS(n-1)}\} + \tilde{L}_n^{-1} B_n^\tau [R_n^{-1} - \Xi_n] R_n (A_n X_n + B_n (\beta - E\{\hat{\beta}^{LS(n-1)}\}))
                          = \beta + \tilde{L}_n^{-1} B_n^\tau [I - A_n (A_n^\tau R_n A_n)^{-1} A_n^\tau R_n] A_n X_n = \beta,    (8)

where

  \tilde{L}_n = \sum_{i=1}^{n-1} B_i^\tau [I - A_i (A_i^\tau A_i)^{-1} A_i^\tau] B_i + B_n^\tau B_n.

Using (6), we get

  E\{\hat{X}_i^{LS(n)}\} = E\{\hat{X}_i^{LS(n-1)}\} + (A_i^\tau A_i)^{-1} A_i^\tau B_i (E\{\hat{\beta}^{LS(n-1)}\} - E\{\hat{\beta}^{LS(n)}\})
                        = X_i + (A_i^\tau A_i)^{-1} A_i^\tau B_i (\beta - \beta) = X_i, \quad i = 1, 2, \dots, n-1,    (9)

  E\{\hat{X}_n^{LS(n)}\} = (A_n^\tau R_n A_n)^{-1} A_n^\tau R_n (E\{Y_n\} - B_n E\{\hat{\beta}^{LS(n-1)}\})
                        = (A_n^\tau R_n A_n)^{-1} A_n^\tau R_n A_n X_n = X_n.    (10)

Mathematical induction is implicitly used in the above proof, which justifies its correctness.
3. Applications
The new linear model (2) can be widely used in many different fields, e.g., in data fusion, in modeling and identifying a computer-controlled system, in signal processing, in spacecraft control engineering, etc. In this section, we present an application of the model (2) to the trajectory of a rocket.
Suppose that there are m transducers suitably located at different sites. These m devices are simultaneously used to track a carrier rocket M in space. Using these transducers, we get a series of measurement data {(A_j(t_i), E_j(t_i)) | i = 1, 2, \dots, n; j = 1, 2, \dots, m}, where A_j(t_i) denotes the azimuth and E_j(t_i) the elevation of the rocket M at time t_i with respect to a reference frame fixed at the center of Transducer j.
In order to simplify the expressions below, we use the shorthand notation A_{ij} = A_j(t_i), E_{ij} = E_j(t_i), etc. Accordingly, the error decomposition models used in determining the location of the spacecraft M can be set up as follows (Brown, 1964; Hu and Sun, 2001):
  \begin{cases}
  A_{ij} = \tan^{-1}\dfrac{x - x_{0j}}{y - y_{0j}} + \alpha_{j1} + \alpha_{j3} \tan(E_{ij}) \sin(A_{ij}) + \alpha_{j4} \tan(E_{ij}) \cos(A_{ij}) + \alpha_{j5} \tan(E_{ij}) + \alpha_{j6} \sec(E_{ij}) + \varepsilon_{A_{ij}}, \\
  E_{ij} = \tan^{-1}\dfrac{z - z_{0j}}{[(x - x_{0j})^2 + (y - y_{0j})^2]^{1/2}} + \alpha_{j2} + \alpha_{j3} \cos(A_{ij}) - \alpha_{j4} \sin(A_{ij}) + \varepsilon_{E_{ij}},
  \end{cases}    (11)

where the coefficients (\alpha_{j1}, \alpha_{j2}) are the non-zero (bias) errors of Transducer j in measuring the azimuth and elevation of the spacecraft, the coefficients (\alpha_{j3}, \dots, \alpha_{j6}) are non-orthogonality coefficients representing measurement errors arising from departures from right angles between each pair of axes in the measurement equipment (the mechanical axis, the laser axis and the electrical axis), and (\varepsilon_A, \varepsilon_E) are stochastic errors included in the measurement data.
Assuming that we get a series of imprecise location data P_i^* = (x_i^*, y_i^*, z_i^*) for the spacecraft M at different sampling times t_i (i = 1, 2, \dots), what we want to do is to estimate all of the instrument error coefficients as well as the precise location of the spacecraft M.
According to the geometrical relationship between the coordinates and the measurement data from the radars, two functions are defined as follows:

  f_j(x, y, z) = \tan^{-1}\dfrac{x - x_{0j}}{y - y_{0j}},
  g_j(x, y, z) = \tan^{-1}\dfrac{z - z_{0j}}{[(x - x_{0j})^2 + (y - y_{0j})^2]^{1/2}},

and the design matrix is

  \Theta_{ij} = \begin{pmatrix} 1 & 0 & \tan(E_{ij})\sin(A_{ij}) & \tan(E_{ij})\cos(A_{ij}) & \tan(E_{ij}) & \sec(E_{ij}) \\ 0 & 1 & \cos(A_{ij}) & -\sin(A_{ij}) & 0 & 0 \end{pmatrix}.
Then we get the following linear model:

  \begin{pmatrix} \Delta\tilde{A}_{ij} \\ \Delta\tilde{E}_{ij} \end{pmatrix} = J_j(P_i)\big|_{P = P_i^*} \begin{pmatrix} \Delta x_i \\ \Delta y_i \\ \Delta z_i \end{pmatrix} + \Theta_{ij} \begin{pmatrix} \alpha_{j1} \\ \vdots \\ \alpha_{j6} \end{pmatrix} + \begin{pmatrix} \varepsilon_{A_{ij}} \\ \varepsilon_{E_{ij}} \end{pmatrix},    (12)

for j = 1, \dots, m and i = 1, 2, \dots, where

  \Delta\tilde{A}_{ij} = A_{ij} - f_j(P_i^*), \quad \Delta\tilde{E}_{ij} = E_{ij} - g_j(P_i^*), \quad J_j(P) = \frac{\partial(f_j, g_j)}{\partial(x, y, z)}.
Integrating all m instruments, we get the following integrated error decomposition model:

  \begin{pmatrix} \Delta\tilde{A}_{i1} \\ \Delta\tilde{E}_{i1} \\ \vdots \\ \Delta\tilde{A}_{im} \\ \Delta\tilde{E}_{im} \end{pmatrix} = \begin{pmatrix} J_1(P) \\ \vdots \\ J_m(P) \end{pmatrix}\bigg|_{P = P_i^*} \begin{pmatrix} \Delta x_i \\ \Delta y_i \\ \Delta z_i \end{pmatrix} + \bar{B}_i \begin{pmatrix} \alpha_{11} \\ \vdots \\ \alpha_{m6} \end{pmatrix} + \begin{pmatrix} \varepsilon_{A_{i1}} \\ \varepsilon_{E_{i1}} \\ \vdots \\ \varepsilon_{A_{im}} \\ \varepsilon_{E_{im}} \end{pmatrix}, \quad i = 1, 2, \dots,    (13)

where \bar{B}_i = diag{\Theta_{i1}, \dots, \Theta_{im}}.
Obviously, the model (13) is very similar to the linear model (2) with partially variant parameters. Consequently, we can use the iterative algorithm (6) to calibrate the error coefficients of the transducers and, at the same time, to accurately determine the rocket trajectory.
In our simulations, four transducers track a rocket in space. Selecting the computation parameter n_0 = 100 [s], we use (6) to get the modification values. Table 1 contains the values of the error coefficients estimated at 110 seconds. Table 2 includes the increments in the rocket trajectories after n_0 = 100 [s].
The computation results given in Tables 1 and 2 show that the iterative algorithms given in Section 2 not only decrease the computation time, but also efficiently improve the precision of the rocket trajectory. What is more, this practical application shows that this new kind of linear model with variant coefficients is reasonable and valuable not only in theory, but also in various engineering fields.
Table 1. Estimation of error coefficients [mrad].

Transducer   α_j1      α_j2      α_j3      α_j4      α_j5      α_j6
j = 1         1.203     0.686     0.018    −0.006     0.006     0.003
j = 2        −0.058    −0.070    −0.000     0.000    −0.000     0.001
j = 3         1.087    −1.837     0.041    −0.016    −0.019    −0.004
j = 4        −0.514     1.449    −0.024     0.009     0.015     0.001
Table 2. Increments in values of trajectories [m].

i = 100+    Δx_i         Δy_i         Δz_i
1          −0.662445     0.631514     0.001622
2          −0.763551     0.687524    −0.006546
3          −0.760541     0.677017    −0.012729
4          −0.752472     0.673932    −0.017189
5          −0.793005     0.699980    −0.023827
6           0.835997     0.730210    −0.029760
7          −0.832480     0.731190    −0.036693
8          −0.802443     0.710644    −0.037840
9          −0.739471     0.661947    −0.036633
10         −0.739483     0.660732    −0.038780
4. Discussion
The paper not only presents a new kind of linear model, but also builds a series of convenient algorithms. This new model usefully generalizes the widely used ordinary linear regression model. The new model can be used in many different kinds of fields, e.g., in data fusion, process monitoring, control engineering, etc.
As for the new algorithms, their advantages are evident. Obviously, if we use the old LS algorithm (5), we must compute the inverse of a large matrix (H_n^\tau H_n)^{-1} \in R^{(r+nq) \times (r+nq)}. What is more, its dimension continually increases as the number n of samples grows or the process moves on. On the other hand, if we use the new iterative algorithm (6), we just need to deal with a series of lower-dimensional inverse matrices, the highest dimension of which is equal to max{p, q, r}. In fact, the iterative algorithm (6) involves three inverse matrices: (A_i^\tau A_i)^{-1} \in R^{q \times q}, R_{n+1}^{-1} \in R^{p \times p} and (L_n + B_{n+1}^\tau B_{n+1})^{-1} \in R^{r \times r}.
Acknowledgments

We gratefully acknowledge the partial financial support from the National Natural Science Foundation of China (NSFC 90305007), the SI Project (SI-210-05483) of the Swedish Institute, the Jiangsu Natural Science Fund (BK-06200) and the NSFC-RS Joint Project (NSFC-RS/0510881-207043).
References

Brown D.C. (1964): The Error Model Best Estimation Trajectory. — Tech. Rep. AD 602799, http://stinet.dtic.mil/oai/oai?&verb=getRecorg&metadataPrefix=html&identifier=AD0602799

Dodge Y. and Kova J. (2000): Adaptive Regression. — Berlin: Springer.

Draper N.R. and Smith H. (1981): Applied Regression Analysis. — New York: Wiley.

Eubank R., Chunfeng H., Maldonado Y., Naisyin W., Suojin W. and Buchanan R.J. (2004): Smoothing spline estimation in varying coefficient models. — J. Roy. Stat. Soc. B, Vol. 66, No. 3, pp. 653–667.

Fahrmeier L. and Tutz G. (2001): Multivariate Statistical Modeling Based on Generalized Linear Models. — Berlin: Springer.

Frank E. and Harrell J. (2002): Regression Modeling Strategies with Applications to Linear Models, Logistic Regression, and Survival Analysis. — New York: Springer.

Graybill F.A. and Iyer H.K. (1994): Regression Analysis: Concepts and Applications. — Massachusetts: Duxbury Press.

Hu Shaolin and Sun Guoji (2001): Process Monitoring Technique and Applications. — Beijing: National Defense Industry Press.

Kala R. and Kłaczyński K. (1988): Recursive improvement of estimates in a Gauss-Markov model with linear restrictions. — Canad. J. Stat., Vol. 16, No. 3, pp. 301–305.

Rencher A. (2000): Linear Models in Statistics. — New York: Wiley.
Appendix

In order to prove Theorem 1, we recall two lemmas without proofs. They are fundamental in linear algebra.

Lemma 1. If a block matrix A and its submatrix A_{11} are invertible, then we have

  \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}^{-1} = \begin{pmatrix} A_{11}^{-1} & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} A_{11}^{-1} A_{12} \\ -I \end{pmatrix} (A_{22} - A_{21} A_{11}^{-1} A_{12})^{-1} \begin{pmatrix} A_{21} A_{11}^{-1} & -I \end{pmatrix}.    (A1)

Similarly, if the matrix A and its submatrix A_{22} are invertible, then we have

  \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}^{-1} = \begin{pmatrix} 0 & 0 \\ 0 & A_{22}^{-1} \end{pmatrix} + \begin{pmatrix} -I \\ A_{22}^{-1} A_{21} \end{pmatrix} (A_{11} - A_{12} A_{22}^{-1} A_{21})^{-1} \begin{pmatrix} -I & A_{12} A_{22}^{-1} \end{pmatrix}.    (A2)

Lemma 2. If the matrices F and G are invertible and the inverse matrix (F - H G^{-1} K)^{-1} exists, then

  (F - H G^{-1} K)^{-1} = F^{-1} + F^{-1} H (G - K F^{-1} H)^{-1} K F^{-1}.    (A3)

The proofs of these two lemmas can be found in the references (Draper and Smith, 1981).
Proof of Theorem 1. With the model (2) and n samples, Eqn. (5) shows that the LS estimator is

  \hat{\Phi}_n^{LS(n)} = (H_n^\tau H_n)^{-1} H_n^\tau \bar{Y}_n.

If another sampling datum

  Y_{n+1} = A_{n+1} X_{n+1} + B_{n+1} \beta + \varepsilon_{n+1}

is added to the sampling set, the LS estimators of all the coefficients in the model (2) must be changed in accordance with the following expressions:

  \hat{\Phi}_{n+1}^{LS(n+1)} = (\Psi_n^\tau \Psi_n)^{-1} \Psi_n^\tau \begin{pmatrix} \bar{Y}_n \\ Y_{n+1} \end{pmatrix} = \begin{pmatrix} D_{11} & C_{n+1}^\tau A_{n+1} \\ A_{n+1}^\tau C_{n+1} & A_{n+1}^\tau A_{n+1} \end{pmatrix}^{-1} \Psi_n^\tau \begin{pmatrix} \bar{Y}_n \\ Y_{n+1} \end{pmatrix},    (A4)

where

  \hat{\Phi}_{n+1}^{LS(n+1)} = \begin{pmatrix} \hat{\Phi}_n^{LS(n+1)} \\ \hat{X}_{n+1}^{LS(n+1)} \end{pmatrix}, \quad \Psi_n = \begin{pmatrix} H_n & 0 \\ C_{n+1} & A_{n+1} \end{pmatrix},

  C_{n+1} = (B_{n+1}, 0) \in R^{p \times (r+nq)}, \quad D_{11} = H_n^\tau H_n + C_{n+1}^\tau C_{n+1}.
Now, using the notation

  D_{22} = A_{n+1}^\tau A_{n+1}, \quad D_{12} = C_{n+1}^\tau A_{n+1} = D_{21}^\tau, \quad \Omega = D_{22} - D_{21} D_{11}^{-1} D_{12},
the following formula can be directly derived from Lemma 1:

  \begin{pmatrix} \hat{\Phi}_n^{LS(n+1)} \\ \hat{X}_{n+1}^{LS(n+1)} \end{pmatrix} = \left[ \begin{pmatrix} D_{11}^{-1} & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} D_{11}^{-1} D_{12} \\ -I \end{pmatrix} \Omega^{-1} \begin{pmatrix} D_{21} D_{11}^{-1} & -I \end{pmatrix} \right] \begin{pmatrix} H_n^\tau \bar{Y}_n + C_{n+1}^\tau Y_{n+1} \\ A_{n+1}^\tau Y_{n+1} \end{pmatrix} = E_n H_n^\tau \bar{Y}_n + F_n Y_{n+1},    (A5)

where
  F_n = \begin{pmatrix} D_{11}^{-1} [C_{n+1}^\tau + D_{12} \Omega^{-1} D_{21} D_{11}^{-1} C_{n+1}^\tau - D_{12} \Omega^{-1} A_{n+1}^\tau] \\ -\Omega^{-1} [D_{21} D_{11}^{-1} C_{n+1}^\tau - A_{n+1}^\tau] \end{pmatrix}

and

  E_n = \begin{pmatrix} D_{11}^{-1} + D_{11}^{-1} D_{12} \Omega^{-1} D_{21} D_{11}^{-1} \\ -\Omega^{-1} D_{21} D_{11}^{-1} \end{pmatrix}.
Step 1. We analyze the expression E_n. From the expression for the block matrix D_{11} and Lemma 1 we have

  D_{11}^{-1} = \begin{pmatrix} \sum_{i=1}^{n+1} B_i^\tau B_i & B_1^\tau A_1 & \cdots & B_n^\tau A_n \\ A_1^\tau B_1 & A_1^\tau A_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A_n^\tau B_n & 0 & \cdots & A_n^\tau A_n \end{pmatrix}^{-1}.

It can be shown that the first block column (the first r columns) of the matrix D_{11}^{-1} can be expressed as follows:

  D_{11}^{-1} = \begin{pmatrix} \left( \sum_{i=1}^{n+1} B_i^\tau B_i - \sum_{i=1}^{n} B_i^\tau U_i A_i^\tau B_i \right)^{-1} & * \\ -T_n \left( \sum_{i=1}^{n+1} B_i^\tau B_i - \sum_{i=1}^{n} B_i^\tau U_i A_i^\tau B_i \right)^{-1} & * \end{pmatrix},    (A6)

where

  U_i = A_i (A_i^\tau A_i)^{-1}, \quad T_n = (B_1^\tau U_1, \dots, B_n^\tau U_n)^\tau,

and the asterisk '*' denotes an omitted matrix block which is rather complicated and does not affect the following deduction.
Analyzing the formulas for the matrix blocks D_{22} and D_{12} = D_{21}^\tau, we get

  D_{11}^{-1} + D_{11}^{-1} D_{12} \Omega^{-1} D_{21} D_{11}^{-1} = [I + D_{11}^{-1} C_{n+1}^\tau V_{n+1} C_{n+1}] D_{11}^{-1},    (A7)

where

  V_{n+1} = A_{n+1} (A_{n+1}^\tau A_{n+1} - A_{n+1}^\tau C_{n+1} D_{11}^{-1} C_{n+1}^\tau A_{n+1})^{-1} A_{n+1}^\tau,

and

  D_{11}^{-1} C_{n+1}^\tau V_{n+1} C_{n+1} = D_{11}^{-1} \begin{pmatrix} B_{n+1}^\tau A_{n+1} W_{n+1}^{-1} A_{n+1}^\tau B_{n+1} & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} \tilde{L}_{n+1}^{-1} B_{n+1}^\tau A_{n+1} W_{n+1}^{-1} A_{n+1}^\tau B_{n+1} & 0 \\ -T_n \tilde{L}_{n+1}^{-1} B_{n+1}^\tau A_{n+1} W_{n+1}^{-1} A_{n+1}^\tau B_{n+1} & 0 \end{pmatrix},    (A8)

where

  W_{n+1} = A_{n+1}^\tau (I - B_{n+1} \tilde{L}_{n+1}^{-1} B_{n+1}^\tau) A_{n+1},

  \tilde{L}_{n+1} = \sum_{i=1}^{n} B_i^\tau [I - U_i A_i^\tau] B_i + B_{n+1}^\tau B_{n+1}.
The matrix D_{11}^{-1} can also be expressed as follows:

  D_{11}^{-1} = (H_n^\tau H_n)^{-1} - (H_n^\tau H_n)^{-1} C_{n+1}^\tau [I + C_{n+1} (H_n^\tau H_n)^{-1} C_{n+1}^\tau]^{-1} C_{n+1} (H_n^\tau H_n)^{-1}.    (A9)

Next, using the notation
  L_n = \sum_{i=1}^{n} B_i^\tau [I - A_i (A_i^\tau A_i)^{-1} A_i^\tau] B_i,

we have

  (H_n^\tau H_n)^{-1} = \begin{pmatrix} \sum_{i=1}^{n} B_i^\tau B_i & B_1^\tau A_1 & \cdots & B_n^\tau A_n \\ A_1^\tau B_1 & A_1^\tau A_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A_n^\tau B_n & 0 & \cdots & A_n^\tau A_n \end{pmatrix}^{-1}