
A Method for the Calculation of
ORTHOGONAL TRANSFORMATION MATRICES
and its Application to Photogrammetry and other Disciplines

Proefschrift (doctoral dissertation)

submitted for the degree of doctor in the technical sciences at the Technische Hogeschool Delft, by authority of the rector magnificus dr. ir. C. J. D. M. Verhagen, professor in the department of Technical Physics, to be defended before a committee of the senate on Tuesday 16 December 1969 at 4 o'clock in the afternoon

by

Menno Tienstra

geodetic engineer, born in Wageningen

This dissertation has been approved by the promotor, Prof. dr. O. Bottema.

CONTENTS

Chapter I  Review of procedures for obtaining orthogonal matrices
  Introduction
  Properties of orthogonal transformation matrices
  Review of methods for obtaining three-dimensional orthogonal matrices
  Direct solution methods for three-dimensional orthogonal transformations

Chapter II  A general procedure for the calculation of overdetermined orthogonal transformations
  Introduction
  The function to be minimized and its derivatives
  Elimination of the matrix A, determination of the scale factor
  The structure of the matrices K, T, L and M
  Determination of matrix A
  Derivation of a formula for the transformation matrix of the two-dimensional orthogonal transformation
  Derivation of a formula for the transformation matrix of the three-dimensional orthogonal transformation
  Second procedure for the determination of matrix A
  Orthogonal transformations and the theory of the adjustment of normally distributed observations

Chapter III  Numerical procedures and applications
  Numerical procedure and flow diagram for the general case
  Numerical procedure and flow diagram for the three-dimensional transformation
  Applications and results
  Final remarks

Summary
Samenvatting (Summary in the Dutch language)

Chapter I

REVIEW OF PROCEDURES FOR OBTAINING ORTHOGONAL MATRICES

1. Introduction

A linear transformation

y_1 = a_11 x_1 + a_12 x_2 + ... + a_1n x_n
y_2 = a_21 x_1 + a_22 x_2 + ... + a_2n x_n
..........
y_m = a_m1 x_1 + a_m2 x_2 + ... + a_mn x_n    (1.1)

establishes a relation between two sets of variables, y_1 ... y_m and x_1 ... x_n, where m is not necessarily equal to n.

The matrix

    [ a_11  a_12 ... a_1n ]
A = [ a_21  a_22 ... a_2n ]    (1.2)
    [  ...   ...      ... ]
    [ a_m1  a_m2 ... a_mn ]

is the transformation matrix.

Now if y = (y_1 ... y_m) and x = (x_1 ... x_n) are conceived as column vectors, the right hand sides of the equations (1.1) are the elements of the product matrix Ax, which is a column vector with m elements. Thus (1.1) indicates the equality of the corresponding elements of the column vectors y and Ax, so that in matrix notation the transformation (1.1) takes the simple form

y = Ax    (1.3)

The matrix A as defined in (1.2) is arbitrary. We can obtain special types of linear transformations by requiring that the transformation matrix fulfils certain conditions. Orthogonal transformations are obtained if:

a. the matrix A is a square matrix,

b. the length of a vector x is equal to the length of the transformed vector y.

From (1.3) we find the condition which the matrix A should fulfil in order to satisfy this second requirement. The square of the length of a vector is obtained by calculating the sum of the squares of its elements. In matrix notation this means that we have to premultiply a column vector by its transposed.

Application to (1.3) gives:

y'y = (Ax)'(Ax)

where the ' sign denotes transposition. Or:

y'y = x'A'Ax    (1.4)

According to our condition we should have:

y'y = x'x    (1.5)

Comparing (1.4) and (1.5), we see that the condition (1.5) is satisfied if

A'A = I    (1.6)

where I denotes the unit matrix, the order of which is equal to that of A.

Matrix equation (1.6) represents n² quadratic equations which should be satisfied by the elements of matrix A. As the product matrix A'A is symmetrical, only ½n(n+1) of these equations are independent; consequently the number of independent parameters is n² − ½n(n+1) = ½n(n−1).

The lengths of vectors are not affected by an orthogonal transformation, so such a transformation can involve only rotations.

Hence, 2, 3, 4, and 5 dimensional orthogonal transformations permit 1, 3, 6 and 10 rotations respectively, a number which increases rapidly with the increase of n.

In the past, several mathematicians have given their attention to the problem of expressing the n² elements of the matrix as functions of ½n(n−1) parameters.

Leonhard Euler was the first to formulate the properties of especially three-dimensional transformation matrices in a paper read in 1770 for the Petersburg Academy of Science [1]. Other mathematicians who contributed to the theory of orthogonal transformation matrices are Cayley, Rodrigues, Hamilton and Study. It is understandable that most effort was directed to the investigation of three-dimensional orthogonal transformations: the two-dimensional case does not present any difficulties and cases larger than three-dimensional ones have little physical meaning.

These investigations resulted in a number of possibilities for expressing the nine elements of the three-dimensional orthogonal transformation as functions of three independent parameters. A review of these procedures will be given subsequently.

The results, however, are not entirely satisfactory, as the expressions obtained are, on the whole, complicated, which makes the numerical application of these matrices rather awkward.

Orthogonal transformations are frequently applied in surveying and related disciplines, for instance photogrammetry. Here, coordinates measured or calculated in a local coordinate system frequently have to be transformed into another system, for instance the country's geodetic system. All coordinates in such systems depend in one way or another on observations and they are therefore subject to errors which are inherent in the observational process. To reduce the effect of such unavoidable errors, in general an adjustment procedure is preferred. That is to say, more data are used than are strictly necessary to determine the unknown quantities of the problem. In this way a best fit is obtained. Iterative procedures are required if the above mentioned expressions for three-dimensional orthogonal matrices are applied, because the formulae for the elements of the matrices are non-linear. Although this is in itself not a serious objection to the use of these matrices, it is nevertheless necessary to determine approximate values for the unknowns, the transformation parameters. (The expression "approximate values" will be used in this context for the zero values or starting data of an iterative process.) The determination of approximate values for the transformation parameters can be a rather complicated matter, requiring a special computational procedure.

In order to avoid calculating approximate values, an attempt was made to determine the three-dimensional orthogonal matrix directly from the relations between coordinates in two systems, making use of course of the relations which exist among the elements of the orthogonal matrix. This proved to be entirely successful. Moreover, it was found that the derived procedure is generally valid for the calculation of n-dimensional orthogonal transformation matrices.

2. Properties of orthogonal transformation matrices

The condition (1.6) for orthogonal matrices

A'A = I    (1.6)

leads to a number of conditions among the elements of the matrix A.

Carrying out the matrix multiplication in (1.6) we immediately see that:

I. The sum of the squares of the elements in each column is equal to one.

II. The sum of the products of elements in corresponding positions in each two columns is equal to zero.

Furthermore we see that if A is an orthogonal matrix, its inverse is equal to its transposed:

A' = A⁻¹    (2.1)

This property leads to a third set of relations among the elements of the matrix:

III. Each element is either equal (if the determinant of A is positive) or opposite (if the determinant is negative) to its cofactor.


Another conclusion to be drawn from (2.1) is that the determinant of A has to be plus or minus one.

In the former case the transformation is called proper orthogonal (the transformation formulae involve only rotation), in the latter case it is called improper orthogonal (the transformation formulae involve rotation and reflection).

Finally we may conclude that if matrix A is an orthogonal one, its transposed will also be an orthogonal matrix.

So, the relations which exist among the elements of the columns of A have to exist also among the elements of the rows of A.

The number of relations which we have now established far exceeds the number of ½n(n+1) independent ones that must be contained in the condition of orthogonality. Euler has already proved, in his paper mentioned before [1], that the ½n(n+1) relations contained in conditions I and II given above are independent, and that all other relations can be derived from those. As in most literature, only proper orthogonal matrices will be considered here (this restriction is of advantage if successive transformations are applied: the resultant transformation matrix is again a proper orthogonal one). Besides, it is quite simple to transform an improper orthogonal transformation into a proper orthogonal one, for instance by an appropriate change of the signs of the coordinates.

3. Review of methods for obtaining three-dimensional orthogonal matrices

As was mentioned in the previous sections, the number of independent parameters by which an n-dimensional orthogonal matrix may be defined is ½n(n−1). Probably because of the physical significance of three-dimensional orthogonal matrices, methods for the construction of such matrices have in the past been thoroughly investigated.

To illustrate how complicated these matrices become, an outline of various methods will be given here.

These methods may be grouped into three major categories:

1. methods expressing the nine elements of the matrix as goniometric functions of three angles of rotation about three properly chosen axes;

2. methods expressing six elements of the matrix as functions of the three remaining elements;

3. methods expressing the nine elements of the matrix as functions of three well chosen parameters, other than those used above.

Of course, all methods lead basically to the same results, and quite often a matrix can be transferred from one category into another by means of a simple substitution. Schut [2] has made an interesting study on this subject. His findings will be given here in brief, together with some comments.

Historically the oldest methods are those derived by Euler in his paper of 1770, mentioned before. Euler chooses as parameters three angles ζ, η and θ. He then proceeds as follows:

Defining a_11 = cos ζ, the sum of squares of the elements of row one will be equal to unity if a_12 and a_13 are defined as the product of sin ζ by the cosine and sine of the second angle η respectively:

a_12 = sin ζ cos η
a_13 = sin ζ sin η

Likewise, the sum of squares of the elements of column one will be equal to unity if a_21 and a_31 are defined as the product of sin ζ by the cosine and sine of the third angle θ respectively:

a_21 = sin ζ cos θ
a_31 = sin ζ sin θ

The remaining elements are determined by considering the inverse of the orthogonal matrix. In this way two equations in a_22 and a_33 and two equations in a_23 and a_32 are obtained, from which these elements can be solved:

a_22 = −cos ζ cos η cos θ − sin η sin θ
a_33 = −cos ζ sin η sin θ − cos η cos θ
a_23 = −cos ζ sin η cos θ + cos η sin θ
a_32 = −cos ζ cos η sin θ + sin η cos θ

The angles ζ, η and θ are now known as Eulerian angles. As a second example, matrix (1) of table I has been constructed, starting from a_33 = cos ζ.

In a following section of his paper Euler starts with a two-dimensional orthogonal transformation matrix. A three-dimensional transformation matrix is composed of three such two-dimensional transformations, one in each of the coordinate planes. This leads to three sets of transformation formulae, in each of which one of the coordinates is invariant.

If in each case the proper orthogonal form is chosen, their matrices become:

      [ 1     0      0    ]         [ cos β  0  sin β ]         [ cos γ   sin γ  0 ]
A_α = [ 0   cos α  sin α  ] ; A_β = [   0    1    0   ] ; A_γ = [ −sin γ  cos γ  0 ]
      [ 0  −sin α  cos α  ]         [ −sin β 0  cos β ]         [   0       0    1 ]

The product of the successive transformations A_α, A_β and A_γ leads to a matrix of type (2) in table I.

A property of this construction is that it defines a primary, a secondary and a tertiary rotation. In this example, α is the primary, β the secondary and γ the tertiary rotation.
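As a small numerical aside (not part of the original text), the composition of the three elementary rotations and its orthogonality can be checked directly; the angle values below are arbitrary, and the order of multiplication shown is one possible convention for the primary, secondary and tertiary rotation.

```python
import numpy as np

def A_alpha(a):
    # rotation in the x2-x3 plane; x1 is invariant (primary rotation)
    return np.array([[1, 0, 0],
                     [0, np.cos(a), np.sin(a)],
                     [0, -np.sin(a), np.cos(a)]])

def A_beta(b):
    # rotation in the x1-x3 plane; x2 is invariant (secondary rotation)
    return np.array([[np.cos(b), 0, np.sin(b)],
                     [0, 1, 0],
                     [-np.sin(b), 0, np.cos(b)]])

def A_gamma(g):
    # rotation in the x1-x2 plane; x3 is invariant (tertiary rotation)
    return np.array([[np.cos(g), np.sin(g), 0],
                     [-np.sin(g), np.cos(g), 0],
                     [0, 0, 1]])

alpha, beta, gamma = 0.3, -0.7, 1.2              # arbitrary test angles
A = A_alpha(alpha) @ A_beta(beta) @ A_gamma(gamma)

print(np.allclose(A.T @ A, np.eye(3)))           # True: A'A = I
print(np.isclose(np.linalg.det(A), 1.0))         # True: proper orthogonal
```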

Since six relations exist among the nine coefficients (as Euler already remarked), three can be chosen at will. According to this principle, the British Ordnance Survey constructs the matrix from three of its elements, without using angles. Starting from three off-diagonal elements, the others are computed by means of the properties of

Table I. Matrices constructed with three independent parameters

(1)
[ −cos η cos θ − cos ζ sin η sin θ    +sin η cos θ − cos ζ cos η sin θ    +sin ζ sin θ ]
[ +cos η sin θ − cos ζ sin η cos θ    −sin η sin θ − cos ζ cos η cos θ    +sin ζ cos θ ]
[ +sin ζ sin η                        +sin ζ cos η                        +cos ζ       ]

(2)
[ +cos β cos γ                        +cos β sin γ                        +sin β       ]
[ −sin α sin β cos γ − cos α sin γ    −sin α sin β sin γ + cos α cos γ    +sin α cos β ]
[ −cos α sin β cos γ + sin α sin γ    −cos α sin β sin γ − sin α cos γ    +cos α cos β ]

(3)
[ +√(1 − a_12² − a_13²)                        a_12                                        a_13 ]
[ (−a_12 a_33 − a_11 a_23 a_13)/(1 − a_13²)    (a_11 a_33 − a_12 a_23 a_13)/(1 − a_13²)    a_23 ]
[ (a_12 a_23 − a_11 a_33 a_13)/(1 − a_13²)     (−a_11 a_23 − a_12 a_33 a_13)/(1 − a_13²)   a_33 ]

with a_11 = +√(1 − a_12² − a_13²) and a_33 = +√(1 − a_13² − a_23²)

(4)
[ a_11                ½(√(AB) − √(CD))    ½(√(AC) + √(BD)) ]
[ ½(√(AB) + √(CD))    a_22                ½(√(BC) − √(AD)) ]
[ ½(√(AC) − √(BD))    ½(√(BC) + √(AD))    a_33             ]

where A = 1 + a_11 − a_22 − a_33,  B = 1 − a_11 + a_22 − a_33,  C = 1 − a_11 − a_22 + a_33,  D = 1 + a_11 + a_22 + a_33

orthogonality. Choosing the elements a_12, a_13 and a_23 as independent parameters, matrix (3) of table I is obtained.

Another approach is to choose the elements a_11, a_22 and a_33 as independent parameters. The derivation of the functions for the remaining six elements is rather complicated and will be omitted here. The result however is given as matrix (4) of table I.

The matrices (5), (6) and (7) of table I are merely being presented for reasons of completeness; their practical applicability is rather limited.

A more interesting way of obtaining an orthogonal matrix is by using Cayley's formula (also known as Cayley-Klein Matrix):

Table I (continued)

(5)
[ sin ζ      cos ζ sin η      cos ζ cos η    ]
[ sin ζ'     cos ζ' sin η'    cos ζ' cos η'  ]
[ sin ζ''    cos ζ'' sin η''  cos ζ'' cos η'' ]

where tan ζ = N/cos(η' − η''), tan ζ' = N/cos(η'' − η), tan ζ'' = N/cos(η − η'),
and N² = tan²ζ tan²ζ' tan²ζ'' = −cos(η' − η'') cos(η'' − η) cos(η − η')

(6)
[ λ²(1 − cos α) + cos α      λμ(1 − cos α) − ν sin α    νλ(1 − cos α) + μ sin α ]
[ μλ(1 − cos α) + ν sin α    μ²(1 − cos α) + cos α      μν(1 − cos α) − λ sin α ]
[ νλ(1 − cos α) − μ sin α    μν(1 − cos α) + λ sin α    ν²(1 − cos α) + cos α   ]

where λ² + μ² + ν² = 1

(7)
[ 1 − g(a_2² + a_3²)    g a_1 a_2 − a_3       g a_3 a_1 + a_2    ]
[ g a_1 a_2 + a_3       1 − g(a_3² + a_1²)    g a_2 a_3 − a_1    ]
[ g a_3 a_1 − a_2       g a_2 a_3 + a_1       1 − g(a_1² + a_2²) ]

where g = (1 − cos α)/sin²α,  sin²α = a_1² + a_2² + a_3²,  0 < α < π

(8)
[ 1 + a² − b² − c²    2ab − 2c            2ac + 2b         ]
[ 2ab + 2c            1 − a² + b² − c²    2bc − 2a         ]   divided by a² + b² + c² + 1
[ 2ac − 2b            2bc + 2a            1 − a² − b² + c² ]

(9)
[ d² + a² − b² − c²    2ab − 2cd             2ac + 2bd          ]
[ 2ab + 2cd            d² − a² + b² − c²     2bc − 2ad          ]
[ 2ac − 2bd            2bc + 2ad             d² − a² − b² + c²  ]

A = (I + S)(I − S)⁻¹    (3.1)

where I is the unit matrix and S the skew-symmetric matrix (sometimes called the Rodrigues matrix):

    [  0   −c   +b ]
S = [ +c    0   −a ]    (3.2)
    [ −b   +a    0 ]

Proof of the orthogonality of A is obtained by equating the transposed of A to its inverse, which is a necessary and sufficient condition for the orthogonality of A. Matrix multiplication of Cayley's formula gives matrix (8) of table I.
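A minimal numerical sketch (not in the original) of Cayley's formula (3.1) with the skew-symmetric matrix (3.2); the parameter values are arbitrary.

```python
import numpy as np

a, b, c = 0.4, -0.2, 0.9                  # arbitrary parameters
S = np.array([[0.0,  -c,   b],
              [  c, 0.0,  -a],
              [ -b,   a, 0.0]])           # Rodrigues matrix (3.2)
I = np.eye(3)

A = (I + S) @ np.linalg.inv(I - S)        # Cayley's formula (3.1)

print(np.allclose(A.T @ A, I))            # True: A is orthogonal
print(np.isclose(np.linalg.det(A), 1.0))  # True: proper orthogonal
```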

Substitution of a = λ tan ½α, b = μ tan ½α and c = ν tan ½α brings this matrix into the form (6).

This means however that matrices of type (8) become undetermined if α = π. This situation would occur for instance if the transformation consists of just one rotation of π around one of the coordinate axes.

Accepting this limitation of matrices of type (8), a variant to this type may be introduced which is slightly more symmetrically built.

A fourth parameter d can be introduced, so that a² + b² + c² + d² = l², where l² can be interpreted as a scale factor by which each of the elements of the matrix has to be divided in order to give the determinant of the matrix the value 1. It is of course possible to choose that value for d which makes l² = 1.

The orthogonal matrix can in this case be computed from the matrix product:

A = (dI + S)(dI − S)⁻¹    (3.3)

The result of this multiplication is given as matrix (9) of table I.

Euler had already mentioned this matrix, though with the opposite sign for d; Cayley derived it using the skew-symmetrical matrix (3.2).

There exists a very simple relation between this matrix and matrix type (4). Substituting

a = ½√A    b = ½√B    c = ½√C    d = ½√D

matrix (4) follows directly from matrix (9). In this case the factor a² + b² + c² + d² becomes equal to unity.

From this review it becomes clear that none of these matrices is a very convenient one if its parameters have to be computed from sets of known coordinates, as the relations which express their elements as functions of the parameters are non-linear.

Consequently, iterative methods have been developed for obtaining orthogonal matrices from coordinates. As approximate values for the elements of the matrix are rather difficult to obtain, most iterative methods are practical only if the transformation involves small rotations, in which case the unit matrix may be chosen as a first approximation. Another complication of such iterative procedures is that the orthogonality properties are no longer valid for the linearized transformation formulae. Hence special care has to be taken to keep the matrix orthogonal from iteration step to iteration step.

It is understandable therefore, also from a practical point of view, that investigations aimed at the derivation of direct solution methods for orthogonal transformation matrices have been carried out. These will be outlined in the next section.

4. Direct solution methods for three-dimensional orthogonal transformations

Historically the first method published was by Thompson [3]; Schut however gave a more elegant derivation of this method [4] and later improved and extended it [5]. A combination of these versions will be discussed here.

In general, a three-dimensional orthogonal coordinate transformation contains not only the three parameters of the orthogonal transformation, but also three shifts and a scale factor.

In this method, it is assumed that the scale factor has been already computed (for instance by comparing distances) and that two conjugate points have been brought into coincidence in order to eliminate the shifts. This common point will act as a common origin for the two coordinate systems.

Of the various approaches, the simplest is to apply the orthogonal matrix of type (9):

X = Ax    (4.1)

in which X and x are known position vectors (matrices consisting of one column only) in the two coordinate systems, with components X, Y, Z and x, y and z respectively.

Making use of Cayley's formula, equation (4.1) can be written as:

X = (dI + S)(dI − S)⁻¹ x    (4.2)

d may be chosen so that the determinant of the orthogonal matrix is equal to unity.

As the transposed of an orthogonal matrix is equal to its inverse, premultiplication of both sides by the transposed yields:

(dI − S)(dI + S)⁻¹ X = x    (4.3)

or, after premultiplication of both sides by (dI + S):

(dI − S)X = (dI + S)x    (4.4)

This matrix equation is equivalent to three homogeneous equations in the four unknowns a, b, c and d.

These equations are obtained if we write (4.4) as

[  d    c   −b ] [ X ]   [  d   −c    b ] [ x ]
[ −c    d    a ] [ Y ] = [  c    d   −a ] [ y ]    (4.5)
[  b   −a    d ] [ Z ]   [ −b    a    d ] [ z ]

and perform the matrix multiplication on both sides of (4.5). We then obtain:

−(Z + z)b + (Y + y)c + (X − x)d = 0
 (Z + z)a − (X + x)c + (Y − y)d = 0    (4.6)
−(Y + y)a + (X + x)b + (Z − z)d = 0

The rank of the coefficient matrix of equations (4.6) is two, as can easily be checked. The equations (4.6) form therefore a system which is not only homogeneous, but also dependent.

Before proceeding to the solution of the unknowns from the equations (4.6) we will first give another method for deriving these equations, making use of quaternion algebra.

A real quaternion q is defined by:

q = d + ai + bj + ck    (4.7)

As this definition shows, a quaternion can be considered as the sum of a real number and a vector with real components. Quaternion addition, subtraction and multiplication by scalars follow the rules that are valid in vector and matrix algebra. To quaternion multiplication, which is distributive, the following rules should be applied:

i² = j² = k² = −1

ij = k,   jk = i,   ki = j
ji = −k,  kj = −i,  ik = −j

A unit quaternion is one for which a² + b² + c² + d² = 1 and the reciprocal of a unit quaternion, denoted by q⁻¹, is equal to its conjugate: q⁻¹ = d − ai − bj − ck.
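A small sketch (not in the original) of these multiplication rules; the component order (d, a, b, c) and the helper name are choices made here for illustration.

```python
import numpy as np

def qmul(p, q):
    # Hamilton product of quaternions stored as (d, a, b, c) = scalar + i, j, k parts
    d1, a1, b1, c1 = p
    d2, a2, b2, c2 = q
    return np.array([d1*d2 - a1*a2 - b1*b2 - c1*c2,
                     d1*a2 + a1*d2 + b1*c2 - c1*b2,
                     d1*b2 - a1*c2 + b1*d2 + c1*a2,
                     d1*c2 + a1*b2 - b1*a2 + c1*d2])

i = np.array([0, 1, 0, 0])
j = np.array([0, 0, 1, 0])
k = np.array([0, 0, 0, 1])

print(qmul(i, i), qmul(j, j), qmul(k, k))   # each gives (-1, 0, 0, 0)
print(qmul(i, j), qmul(j, k), qmul(k, i))   # k, i, j respectively

q = np.array([0.8, 0.1, -0.3, 0.5])
q = q / np.linalg.norm(q)                    # unit quaternion
q_inv = q * np.array([1, -1, -1, -1])        # reciprocal of a unit quaternion = conjugate
print(np.allclose(qmul(q, q_inv), [1, 0, 0, 0]))   # True
```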

A rotation in three-dimensional space about a line through the origin is expressed in quaternion algebra by:

X = q x q⁻¹    (4.8)

Here X = Xi + Yj + Zk and x = xi + yj + zk are the previously defined known position vectors, q is the quaternion q = d + ai + bj + ck and q⁻¹ is the reciprocal of q.

Postmultiplication of both sides of (4.8) by q gives:

Xq = qx    (4.9)

This equation expresses the equality of two quaternions. Therefore their corresponding components are equal. Thus this quaternion equation contains four equations, which are obviously linear and homogeneous, in the parameters a, b, c and d.

By performing the two multiplications, equating corresponding elements and collecting the terms with the same parameters, the following four equations are obtained

−(Z + z)b + (Y + y)c + (X − x)d = 0
 (Z + z)a − (X + x)c + (Y − y)d = 0    (4.10)
−(Y + y)a + (X + x)b + (Z − z)d = 0
 (X − x)a + (Y − y)b + (Z − z)c = 0

The first three of these equations are the same as obtained by the approach using Cayley's formula (4.6); the fourth one is obtained only using quaternion algebra. The fourth equation is of course dependent on the other three: it can be obtained as a linear combination of these three.

Each point other than the origin will give rise to four of these equations. The minimum number of independent equations necessary for the computation of the unknowns is three, which means that the coordinates of two points other than the origin are necessary for the computation of the unknowns.

As the equations are homogeneous and as the four unknowns are not independent, one unknown may be chosen equal to unity and the other three can be computed. Afterwards the elements of the orthogonal matrix are scaled by dividing them by

l² = a² + b² + c² + d²

If the coordinates of more points than the required minimum number are known in the two coordinate systems, an adjustment is possible. For each point the four equations (4.10) are used as correction equations. It does not matter that these equations are linearly dependent, as the purpose of an adjustment procedure is anyhow to make a number of linearly independent correction equations linearly dependent, by correcting the observations.

The advantage of using all four equations (4.10) is, that all available coordinates are used in the same way and that the normal equations take a very regular form. The normal equations are obtained by premultiplication of the correction equations (4.10) by the transposed of their coefficient matrix (c.f. [5]).


The ratios of the unknowns can be solved from the resulting system.

In his procedure, Schut first eliminates the unknowns a and b. Then d is chosen equal to unity and c is computed. If c becomes very large, then c is chosen equal to unity and d is computed. If c and d are known (or, strictly speaking, if the ratio of c and d is known), then a and b are computed. Finally the orthogonal matrix of type (9) is calculated.
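As a sketch only (the function names are assumptions made here, and the eigenvector shortcut below replaces Schut's own elimination scheme), the correction equations (4.10) can be stacked for all points and the ratios a : b : c : d obtained from the normal equations:

```python
import numpy as np

def quaternion_from_points(X, x):
    """X, x: (k, 3) arrays of conjugate points, common origin, scale already removed.
    Builds the correction equations (4.10) and returns (a, b, c, d) up to a common factor."""
    rows = []
    for (Xi, Yi, Zi), (xi, yi, zi) in zip(X, x):
        rows.append([0.0,        -(Zi + zi),  (Yi + yi), (Xi - xi)])
        rows.append([(Zi + zi),   0.0,       -(Xi + xi), (Yi - yi)])
        rows.append([-(Yi + yi), (Xi + xi),   0.0,       (Zi - zi)])
        rows.append([(Xi - xi),  (Yi - yi),  (Zi - zi),   0.0])
    B = np.array(rows)
    # normal equations B'B; the best ratios are the eigenvector belonging to the
    # smallest eigenvalue (exactly zero for error-free data)
    w, V = np.linalg.eigh(B.T @ B)
    return V[:, 0]

def matrix_type_9(a, b, c, d):
    # matrix (9) of table I, scaled by l^2 = a^2 + b^2 + c^2 + d^2
    l2 = a*a + b*b + c*c + d*d
    return np.array([[d*d + a*a - b*b - c*c, 2*a*b - 2*c*d,         2*a*c + 2*b*d],
                     [2*a*b + 2*c*d,         d*d - a*a + b*b - c*c, 2*b*c - 2*a*d],
                     [2*a*c - 2*b*d,         2*b*c + 2*a*d,         d*d - a*a - b*b + c*c]]) / l2
```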

A few disadvantages are inherent in this method. First of all, the solution breaks down, as Schut shows, if the rotation angle α about the line through the origin is π, as the parameters a, b, c and d are proportional to λ sin ½α, μ sin ½α, ν sin ½α and cos ½α respectively. Here λ, μ and ν are the direction cosines of that line through the origin.

Secondly, the determination of the shifts and of the scale factor are not in accordance with the method of least squares. For the shifts this could easily be remedied, but for the scale factor the matter is more complicated, as will be shown in the treatment given in the next chapter.

Finally, a formulation of the elements of the orthogonal matrix as functions of the observed coordinates is hardly possible, which for several applications should anyhow be considered a disadvantage.

An entirely different approach is the method developed by Van den Hout [6]. Without having to choose independent parameters for the orthogonal matrix, the nine elements of the matrix, the scale factor and the three shifts are computed directly, considering the orthogonality relations among the elements of the matrix, from the necessary seven relationships among coordinates in the two systems. Here also the computation of the scale factor is a weak point. The method was designed however to provide approximate values for the transformation parameters as starting data for an iterative procedure and it certainly fulfils its purpose. The method is basically one of elimination and resubstitution and does not lend itself to an extension to overdetermined problems.

Its major point of interest however lies in the fact that it proved to be possible to manipulate the non-linear properties of orthogonality so that a numerical result was obtained. This was so encouraging that it led to the conviction that a general procedure could also be derived without having to select independent parameters.

Chapter II

A GENERAL PROCEDURE FOR THE CALCULATION OF OVERDETERMINED ORTHOGONAL TRANSFORMATIONS

1. Introduction

From the review in the preceding chapter we see that a number of procedures are possible for the calculation of (especially) three-dimensional orthogonal transformation matrices. In particular, Schut's method for the solution of overdetermined transformations is quite a satisfactory one for many applications. Nevertheless we intend to develop yet another procedure. This then should have practical and/or theoretical advantages over other methods.

Reviewing the inadequacies of existing methods, we may list possible improvements as follows:

- a solution should be possible without exceptions,

- if an adjustment procedure is applied, all parameters of the problem should be solved by means of that procedure,

- the parameters of the transformation should be formulated as rational functions of the observations.

The procedure which will be derived subsequently will not entirely satisfy these conditions, as it does not yield an expression for the scale factor as a rational function of the observations. On the other hand, the procedure is a general one for n-dimensional transformations from which special formulae can be derived for the most common cases, the three- and two-dimensional ones. In this way relations are established between formulae for orthogonal transformation matrices of all dimensions.

The known coordinates from which the transformation matrices are to be computed are either observed quantities or are calculated from observed quantities. They therefore are subject to the normal errors which are inherent in any set of observations. In order to minimize the influence of such errors the transformation matrix is calculated from more observations than are strictly necessary, thus giving a "best fit" for the transformation of one set of points onto the other.

The method to be applied to the current problem is the method of least squares, which minimizes the sum of squares of residual discrepancies. In this problem the discrepancies between the coordinates in the first system and the coordinates in the second system after transformation into the first one will be minimized. (The formulae are given in the next section.)

The minimum problem which is thus obtained is solved in the normal way by differentiation of a function with respect to each of the unknowns successively and by equating these derivatives to zero.

However, as the minimum problem contains all n² elements of the orthogonal matrix, the ½n(n+1) independent relations between these elements have to be considered. This is possible by treating the minimum problem as one with subsidiary conditions or constraints.

The ½n(n+1) relations between the transformation parameters then have to be added to the given function, each multiplied by an unknown, a Lagrange multiplier.

2. The function to be minimized and its derivatives

We will derive the desired function and calculate its derivatives only for the three-dimensional transformation; from the general appearance of the formulae it will become clear however that this is not an essential restriction, the resultant formulae also hold true for n-dimensional transformations.

Introducing residual discrepancies v_ji, as mentioned in the previous section, we choose as transformation formulae:

y_1i + v_1i = a_11 x_1i + a_12 x_2i + a_13 x_3i + b_1
y_2i + v_2i = a_21 x_1i + a_22 x_2i + a_23 x_3i + b_2    (2.1)
y_3i + v_3i = a_31 x_1i + a_32 x_2i + a_33 x_3i + b_3    (i = 1 ... k)

These equations are conventionally called the correction equations. The suffix i stands for the point identification, the a_ij are the elements of the orthogonal matrix A in which a scale factor s is included. The coordinates x_ji and y_ji are respectively the coordinates before and after transformation. The b_j represent shifts of the origin of the x_ji system.

Because of the above chosen definition for the matrix A, it is strictly speaking no longer an orthogonal matrix. Its properties are now:

|A| = +s³ (or, in general, |A| = s^n)
A' = s²A⁻¹    (2.2)
A'A = s²I
AA' = s²I

The determinant of A is defined to be positive, as we will only consider proper orthogonal transformations. The scale factor s is also by definition positive.

The function to be minimized may now be formed. It is basically:

E = Σ (i=1..k) (v_1i² + v_2i² + v_3i²)    (2.3)

E = Σ (i=1..k) { (−y_1i + a_11 x_1i + a_12 x_2i + a_13 x_3i + b_1)² +
               + (−y_2i + a_21 x_1i + a_22 x_2i + a_23 x_3i + b_2)² +
               + (−y_3i + a_31 x_1i + a_32 x_2i + a_33 x_3i + b_3)² } +
    + λ_11(a_11² + a_21² + a_31² − s²) + λ_22(a_12² + a_22² + a_32² − s²) +
    + λ_33(a_13² + a_23² + a_33² − s²) + 2λ_23(a_12 a_13 + a_22 a_23 + a_32 a_33) +
    + 2λ_13(a_11 a_13 + a_21 a_23 + a_31 a_33) + 2λ_12(a_11 a_12 + a_21 a_22 + a_31 a_32)    (2.4)

Here λ_11, λ_22, λ_33, 2λ_23, 2λ_13 and 2λ_12 are the Lagrange multipliers.

The function (2.4) contains thirteen variables and six unknown multipliers:

9 elements a_ij of the transformation matrix
3 shifts b_j
1 scale factor s
6 multipliers λ_ij

To determine the value of the unknowns for which (2.4) attains a minimum, we have to differentiate this function E with respect to each of the thirteen variables successively and to equate each of these derivatives to zero. Together with the six constraints we obtain a system of nineteen equations for nineteen unknowns, from which the latter may be solved.

The partial derivatives of E can be simplified if we choose the origin of the x_j (j = 1, 2, 3) system in the centre of gravity of the given set of k points, making:

Σ (i=1..k) x_1i = Σ (i=1..k) x_2i = Σ (i=1..k) x_3i = 0

Differentiation with respect to the shifts b_j (j = 1, 2, 3) then yields immediately

b_1 = (1/k) Σ (i=1..k) y_1i,   b_2 = (1/k) Σ (i=1..k) y_2i,   b_3 = (1/k) Σ (i=1..k) y_3i    (2.5)

Differentiation with respect to the scale factor s gives:

λ_11 + λ_22 + λ_33 = 0    (2.6)

And differentiation with respect to the nine elements of A finally results in:

(λ_11 + Σx_1i²)a_11 + (λ_12 + Σx_1i x_2i)a_12 + (λ_13 + Σx_1i x_3i)a_13 = Σx_1i y_1i
(λ_12 + Σx_1i x_2i)a_11 + (λ_22 + Σx_2i²)a_12 + (λ_23 + Σx_2i x_3i)a_13 = Σx_2i y_1i    (2.7a)
(λ_13 + Σx_1i x_3i)a_11 + (λ_23 + Σx_2i x_3i)a_12 + (λ_33 + Σx_3i²)a_13 = Σx_3i y_1i

(λ_11 + Σx_1i²)a_21 + (λ_12 + Σx_1i x_2i)a_22 + (λ_13 + Σx_1i x_3i)a_23 = Σx_1i y_2i
(λ_12 + Σx_1i x_2i)a_21 + (λ_22 + Σx_2i²)a_22 + (λ_23 + Σx_2i x_3i)a_23 = Σx_2i y_2i    (2.7b)
(λ_13 + Σx_1i x_3i)a_21 + (λ_23 + Σx_2i x_3i)a_22 + (λ_33 + Σx_3i²)a_23 = Σx_3i y_2i

(λ_11 + Σx_1i²)a_31 + (λ_12 + Σx_1i x_2i)a_32 + (λ_13 + Σx_1i x_3i)a_33 = Σx_1i y_3i
(λ_12 + Σx_1i x_2i)a_31 + (λ_22 + Σx_2i²)a_32 + (λ_23 + Σx_2i x_3i)a_33 = Σx_2i y_3i    (2.7c)
(λ_13 + Σx_1i x_3i)a_31 + (λ_23 + Σx_2i x_3i)a_32 + (λ_33 + Σx_3i²)a_33 = Σx_3i y_3i

(all sums over i = 1 ... k)

Investigation of the equations (2.7) shows that the coefficient matrices of the three groups of three equations are the same.

Substituting:

L = [ λ_11  λ_12  λ_13 ]        K = [ Σx_1i²       Σx_1i x_2i   Σx_1i x_3i ]
    [ λ_12  λ_22  λ_23 ]            [ Σx_1i x_2i   Σx_2i²       Σx_2i x_3i ]
    [ λ_13  λ_23  λ_33 ]            [ Σx_1i x_3i   Σx_2i x_3i   Σx_3i²     ]

and

T = [ Σx_1i y_1i   Σx_2i y_1i   Σx_3i y_1i ]
    [ Σx_1i y_2i   Σx_2i y_2i   Σx_3i y_2i ]
    [ Σx_1i y_3i   Σx_2i y_3i   Σx_3i y_3i ]

the formulae (2.7) can be combined into the matrix equation:

A(L + K) = T    (2.8)

in which K and T are known matrices, A and L unknown ones. Or, introducing a symmetrical matrix M with elements μ_ij,

M = L + K    (2.9)

we finally obtain:

AM = T    (2.10)

According to (2.6) the trace of matrix L has to be zero; thus it follows from (2.9) that the trace of matrix M has to be equal to the trace of K. Hence:

μ_11 + μ_22 + μ_33 = Σ (i=1..k) (x_1i² + x_2i² + x_3i²)

or, substituting p_1 = Σ (x_1i² + x_2i² + x_3i²),

μ_11 + μ_22 + μ_33 = p_1    (2.11)

This last relation will prove to be of the utmost importance for the solution of the problem, especially for the determination of the scale factor s.

As the three shifts b_j (j = 1, 2, 3) have been solved from (2.5), we are left with sixteen unknowns: the nine elements a_ij of A, the scale factor s and the six elements μ_ij of M, the latter having replaced the six Lagrange multipliers. These unknowns have to be solved from the matrix equation (2.10) (equivalent to nine equations), equation (2.11) and the six constraints.
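As an illustration only (the function and variable names are assumptions made here, not the author's), the known matrices K and T of (2.8) and the quantity p_1 can be set up from two sets of conjugate points as follows, after moving the origin of the x system to its centre of gravity as in (2.5):

```python
import numpy as np

def build_K_T(x, y):
    """x, y: (k, 3) arrays of point coordinates before and after transformation."""
    xc = x - x.mean(axis=0)       # centre of gravity of the x system as origin
    X = xc.T                      # 3 x k matrix of the x-coordinates
    Y = y.T                       # 3 x k matrix of the y-coordinates
    K = X @ X.T                   # K = X X'   (4.6)
    T = Y @ X.T                   # T = Y X'   (4.7)
    p1 = np.trace(K)              # p1 = sum of x_1i^2 + x_2i^2 + x_3i^2   (2.11)
    return K, T, p1
```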

At this point we may conclude however that equation (2.10) holds irrespective of the dimension of the transformation and that equation (2.11) is not restricted to three-dimensional transformations either. It may be written as:

Σ (i=1..n) μ_ii = Σ (i=1..k) Σ (j=1..n) x_ji² = p_1    (2.11)

In general we may therefore solve the n² elements of matrix A, the scale factor s and the ½n(n+1) Lagrange multipliers from the n² equations contained in (2.10), from (2.11) and from the ½n(n+1) constraints.

3. Elimination of the matrix A, determination of the scale factor

In order to determine the scale factor s we will first eliminate matrix A. Transposing (2.10) we obtain, as M is symmetrical,

MA' = T'    (3.1)

and A can be eliminated if we premultiply (2.10) by (3.1):

MA'AM = T'T    (3.2)

so, because A'A = s²I:

s²M² = T'T    (3.3)

For convenience we may put:

T'T = D    (3.4)

where the known symmetrical matrix D has elements d_ij.

Matrix sM could now be obtained as the square root of matrix D. Once sM is known, the scale factor s can be computed as the ratio of the traces of sM and M, as the latter trace is known to be p_1, according to (2.11).

The square root of a matrix is in general not uniquely defined. However, matrices M and D have a few special properties which make a unique solution possible. First of all, both matrices are symmetrical, which considerably reduces the number of possible solutions. Moreover, we will now prove that matrix M has to be positively semi-definite. Finally we will prove that in that case there is only one solution for matrix M.

Function E (2.4) is a sum of squares, to which the constraints, whose contribution to the value of E is zero, are added. The value of E will therefore be definitely non-negative. So function E will certainly have a minimum among its stationary points.

We will now investigate how the minimum of E can be distinguished from its other stationary points.

We consider a point a_ij0 (i, j = 1 ... n) at which function E is stationary, that is, a point at which all first partial derivatives of E vanish. The occurrence of an extreme value is connected with the question whether the expression

E(a_ij0 + c_ij) − E(a_ij0)    (3.5)

has or has not the same sign for all sufficiently small values of c_ij (i, j = 1 ... n).

If we expand (3.5) by Taylor's theorem, we obtain, as all first derivatives vanish,

E(a_ij0 + c_ij) − E(a_ij0) = ½ Σ (∂²E/∂a_ij ∂a_kl) c_ij c_kl + R_2    (i, j, k, l = 1 ... n)    (3.6)

In this particular case, the remainder R_2 vanishes because function E does not contain terms with an order higher than two.

From this we see that in a sufficiently small neighbourhood of the point a_ij0 the behaviour of the functional difference (3.5) is essentially determined by the expression

F(c_ij) = ½ Σ (∂²E/∂a_ij ∂a_kl) c_ij c_kl    (3.7)

The function (3.7) is a quadratic form in the c_ij. The coefficients of this form are the second partial derivatives of function E with respect to the a_ij. We assume that the coefficients of (3.7) do not all vanish.

Function E will attain a minimum in the point a_ij0 (i, j = 1 ... n) if the quadratic form F (3.7) assumes positive values only when the c_ij assume all values, and vanishes only if all c_ij are zero. The form F is then said to be positively definite. If F vanishes for other values of the c_ij than all c_ij = 0, but otherwise assumes positive values only, then F is called positively semi-definite. The point a_ij0 might be a minimum of function E in that case, but is not necessarily one.

The coefficients of the quadratic form F in the current problem are easily obtained by differentiating the first partial derivatives (2.7) of function E again with respect to the parameters a_ij.

We find for the three-dimensional case:

F(c_ij) = μ_11c_11² + μ_22c_12² + μ_33c_13² + 2μ_12c_11c_12 + 2μ_13c_11c_13 + 2μ_23c_12c_13 +
        + μ_11c_21² + μ_22c_22² + μ_33c_23² + 2μ_12c_21c_22 + 2μ_13c_21c_23 + 2μ_23c_22c_23 +    (3.8)
        + μ_11c_31² + μ_22c_32² + μ_33c_33² + 2μ_12c_31c_32 + 2μ_13c_31c_33 + 2μ_23c_32c_33

This form should be positively semi-definite if function E is to have a minimum. This requirement leads to a condition which should be fulfilled by the μ_ij. A quadratic form is positively semi-definite if its coefficient matrix is positively semi-definite.

This coefficient matrix is:

    [ M  0  0 ]
N = [ 0  M  0 ]    (3.9)
    [ 0  0  M ]

where

    [ μ_11  μ_12  μ_13 ]
M = [ μ_12  μ_22  μ_23 ]
    [ μ_13  μ_23  μ_33 ]

Matrix N is positively semi-definite if the determinants of all its principal minors are either positive or zero.

This however means that matrix M must be positively semi-definite, as M is one of the principal minors of N.

We have now proved that function E may attain a minimum if

a. the first partial derivatives of E vanish;

b. matrix M is positively semi-definite.

These conditions are necessary but not always sufficient. We will later investigate the case where matrix M is singular and determine the circumstances under which function E does or does not attain a minimum. We will see that the answer to this question depends on the rank of M.

Considering again equation (2.10)

AM = T

we now know that A is an orthogonal matrix and M a positive semi-definite symmetrical one.

Because the scale factor presents a complication, not matrix M itself, but matrix sM can be obtained as the positively semi-definite square root of matrix D (3.3) (c.f. [7], [9]), as s is defined to be positive.

Matrix D is obtained, according to (3.3), as the product of matrix T and its transposed, so matrix D is a positively semi-definite symmetrical one, according to a theorem of matrix algebra.

In this case the positively semi-definite matrix sM is called, according to Gantmacher [7], the arithmetical square root of D and it is denoted by

sM = +√D    (3.10)

As both matrices sM and D are positively semi-definite, their eigenvalues are non-negative. Denoting the eigenvalues of sM by σ_i and those of D by τ_i (i = 1 ... n) we immediately find the simple relation

σ_i = +√τ_i ≥ 0    (3.11)

With this result we can now calculate the scale factor s. As the trace of any matrix is equal to the sum of the eigenvalues of that matrix, we find that

trace sM = Σ (i=1..n) σ_i = Σ (i=1..n) +√τ_i    (3.12)

As was stated before, the scale factor s is equal to the ratio of the traces of matrices sM and M, so, with (2.11),

s = trace sM / trace M = (1/p_1) Σ (i=1..n) √τ_i    (3.13)

or, in words, the scale factor is equal to the sum of the positive square roots of the eigenvalues of the matrix T'T = D, divided by the quantity p_1.

For the calculation of the scale factor it is necessary to calculate the eigenvalues of matrices D and sM. Once the eigenvalues of sM are known, it is relatively simple to calculate matrix sM itself, and by dividing the elements of the latter by s, matrix M can be obtained.

First we calculate the eigenvectors of D. Because D is a positively semi-definite real symmetrical matrix, it will have, if its order is n, n mutually perpendicular eigenvectors. Furthermore, as sM is the arithmetical square root of D, its eigenvectors will be the same as those of D.

Let C be the matrix whose columns are the eigenvectors of D (or of sM) and let S be the diagonal matrix with the eigenvalues of sM on the diagonal. The eigenvalues should be placed on the diagonal of S in the same sequence as their corresponding eigenvectors are placed in matrix C.

As the eigenvectors of sM are mutually perpendicular, matrix C is an orthogonal one and its inverse therefore exists. Matrix M is then uniquely defined by:

M = (1/s) C S C⁻¹    (3.14)

Proof of the various theorems of matrix calculus which were applied here is omitted, they can be found in, for instance, reference [7].

In case M is non-singular, also A is uniquely defined, for:

A = TM⁻¹    (3.15)

However, in case M is singular, the solution of A according to (3.15) is not possible. We will subsequently show that we must permit M to be singular with rank (n−1), as in that case the transformation can still be determined. We will therefore have to derive a formula where A has a coefficient which can be inverted even if the rank of M is (n−1).

The proof that matrix M might become singular will follow from a general investigation of the various matrices we have so far introduced.

We will however first develop a special procedure for the determination of the scale factor of the three-dimensional transformation.

The formula which we derived for the scale factor is simple in appearance, but, being deduced from an eigenvalue problem, is rather difficult to handle.

So, although on most electronic computers a subroutine will be available for the determination of eigenvalues, we may still derive a rather simple iteration formula for the determination of the scale factor of the three-dimensional transformation.

The formula for the scale factor of the three-dimensional transformation is, according to (3.13):

s = (1/p_1)(√τ_1 + √τ_2 + √τ_3)

where the τ_i are the eigenvalues of the positively semi-definite matrix D. Taking the square of the above formula we get

s² = (1/p_1²){ τ_1 + τ_2 + τ_3 + 2(√(τ_2τ_3) + √(τ_3τ_1) + √(τ_1τ_2)) }

The sum of the eigenvalues of D is known, so

½{ s² − (1/p_1²)(d_11 + d_22 + d_33) } = (1/p_1²){ √(τ_2τ_3) + √(τ_3τ_1) + √(τ_1τ_2) }

Squaring this equation again yields:

¼{ s² − (1/p_1²)(d_11 + d_22 + d_33) }² = (1/p_1⁴){ τ_2τ_3 + τ_3τ_1 + τ_1τ_2 + 2√(τ_1τ_2τ_3)(√τ_1 + √τ_2 + √τ_3) }

where we may substitute:

q_1 = d_11 + d_22 + d_33

The sum of the products of the eigenvalues of D is equal to the sum of the second order principal minors of D:

τ_2τ_3 + τ_3τ_1 + τ_1τ_2 = q_2

The square root of the product of the three eigenvalues of D is equal to the product (which is non-negative) of the eigenvalues of T:

√(τ_1τ_2τ_3) = |T|

and finally the sum of the square roots of the eigenvalues of D is equal to p_1 s. So:

¼{ s² − q_1/p_1² }² = (1/p_1⁴){ q_2 + 2 p_1 s |T| }

or, drawing the square root twice,

s = √( q_1/p_1² + (2/p_1²) √( q_2 + 2 p_1 s |T| ) )

Putting

q_1' = q_1/p_1²,   q_2' = q_2/p_1⁴   and   |T|' = |T|/p_1³

we obtain finally:

s = √( q_1' + 2 √( q_2' + 2 s |T|' ) )    (3.16)

Using this formula an iterative calculation of s is possible. The procedure is even an exact one if T is singular. The speed of convergence of the procedure is quite high, as experience has shown. Moreover, it requires certainly fewer operations than the eigenvalue procedure, which is also iterative.
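A sketch of this iteration (the starting value and the fixed iteration count are choices made here, not prescribed by the text):

```python
import numpy as np

def scale_factor_iterative(T, p1, iterations=20):
    D = T.T @ T
    q1 = np.trace(D) / p1**2                                   # q1' = (d11 + d22 + d33)/p1^2
    # q2' = sum of the second-order principal minors of D, divided by p1^4
    q2 = 0.5 * (np.trace(D)**2 - np.trace(D @ D)) / p1**4
    detT = np.linalg.det(T) / p1**3                            # |T|'
    s = np.sqrt(q1)                                            # starting value
    for _ in range(iterations):
        s = np.sqrt(q1 + 2.0 * np.sqrt(q2 + 2.0 * s * detT))   # (3.16)
    return s
```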

4. The structure of the matrices K, T, L and M

Until now, we have introduced, besides the unknown orthogonal matrix A, the known matrices K and T and the unknown matrix L. In addition we have denoted the sum of K and L by M.

In the previous section we have already found that matrix M has to be positively semi-definite, but still other properties of the four matrices concerned must be investigated.

We will first investigate matrix L.

Restricting ourselves again to the three-dimensional case, we may write all 3 × k equations (2.1) as:

[ y_11+v_11 ... y_1k+v_1k ]   [ a_11 a_12 a_13 b_1 ]   [ x_11 ... x_1k ]
[ y_21+v_21 ... y_2k+v_2k ] = [ a_21 a_22 a_23 b_2 ] · [ x_21 ... x_2k ]    (4.1)
[ y_31+v_31 ... y_3k+v_3k ]   [ a_31 a_32 a_33 b_3 ]   [ x_31 ... x_3k ]
                                                       [  1   ...  1   ]

Postmultiplication of (4.1) by

[ x_11  x_21  x_31  1 ]
[  ...   ...   ...  . ]
[ x_1k  x_2k  x_3k  1 ]

yields, with Σx_1i = Σx_2i = Σx_3i = 0:

[ Σx_1iy_1i + Σx_1iv_1i   Σx_2iy_1i + Σx_2iv_1i   Σx_3iy_1i + Σx_3iv_1i   Σy_1i + Σv_1i ]
[ Σx_1iy_2i + Σx_1iv_2i   Σx_2iy_2i + Σx_2iv_2i   Σx_3iy_2i + Σx_3iv_2i   Σy_2i + Σv_2i ]
[ Σx_1iy_3i + Σx_1iv_3i   Σx_2iy_3i + Σx_2iv_3i   Σx_3iy_3i + Σx_3iv_3i   Σy_3i + Σv_3i ]

    [ a_11 a_12 a_13 b_1 ]   [ Σx_1i²      Σx_1ix_2i   Σx_1ix_3i   0 ]
  = [ a_21 a_22 a_23 b_2 ] · [ Σx_1ix_2i   Σx_2i²      Σx_2ix_3i   0 ]    (4.2)
    [ a_31 a_32 a_33 b_3 ]   [ Σx_1ix_3i   Σx_2ix_3i   Σx_3i²      0 ]
                             [ 0           0           0           k ]

If we perform the matrix multiplication of the right hand side of (4.2) and use the rule that if two matrices are equal, all corresponding elements are equal, we find by equating the elements of the last columns:

Σy_1i + Σv_1i = k b_1
Σy_2i + Σv_2i = k b_2    (4.3)
Σy_3i + Σv_3i = k b_3

If we compare (4.3) with (2.5) we obtain:

Σv_1i = Σv_2i = Σv_3i = 0

which is a well-known property of linear transformations. Moreover, we find from (4.2), by substituting

    [ Σx_1iv_1i   Σx_2iv_1i   Σx_3iv_1i ]
U = [ Σx_1iv_2i   Σx_2iv_2i   Σx_3iv_2i ]
    [ Σx_1iv_3i   Σx_2iv_3i   Σx_3iv_3i ]

the relation:

T + U = AK    (4.4)

Earlier we had found the relation (2.8):

T = A(L + K)

Substitution of (2.8) in (4.4) gives the relation between the matrices U and L:

AL = −U    (4.5)

This result shows that if the two sets of points in the two coordinate systems fit exactly onto each other and therefore all discrepancies v will be zero, then U and consequently L will be zero matrices. So, if we know that there exists an exact orthogonal relationship between the two sets of points, the orthogonal transformation matrix A may be computed from AK = T.

But, if the coordinates depend in some way or other on observations and are therefore subject to observational errors, the relationship will not be an exact orthogonal one.

However, we may assume that the observational errors are small compared to the observed quantities. This means that the elements of L will be small compared to the elements of K. Matrix L may thus be interpreted as a correction of matrix K, necessary to ensure the orthogonality of matrix A; this was exactly the reason for the introduction of the Lagrange multipliers.

Furthermore we will investigate the matrices K and T, which are of course quite similar.

They were introduced as:

K = X X'    (4.6)

and

T = Y X'    (4.7)

where X is the 3 × k matrix of the coordinates x_ji and Y the 3 × k matrix of the coordinates y_ji.

First of all we notice that matrix K is obtained as the product of a matrix by its transposed. So matrix K is a positively semi-definite one.

K is positively semi-definite, rather than definite, if its determinant might become zero. The determinant of K is zero, according to Gram's theorem, if the column vectors of the matrix X' are linearly dependent. But these column vectors will be linearly dependent if there are not more than two independent row vectors amongst those of X'.

However, if this is the case, then the set of points forms a subspace of the three-dimensional space. This subspace may be two- or one-dimensional. In the first case, all points lie in a plane, in the second one on a straight line.

We may thus conclude that in general the rank of K will be n. However, if the set of points forms an (n−1)-dimensional subspace of the n-dimensional space, the rank of K will be (n−1). Thus we find that the rank of K is equal to the dimension of the subspace formed by the set of points involved in the transformation.

Our conclusion must be that, as the transformation of a plane in a three-dimensional space or of a line in a plane is a very normal procedure, we should allow matrix K a loss of rank of one. A greater loss of rank causes the transformation to become undetermined.

As matrices K and T have matrix X' as a common factor, it follows that if K is singular, also T will be singular. Moreover, from (2.10) it is clear that if T is singular also M will be singular, as A is obviously not singular.

Finally, if M is not singular, the sign of |T| will be equal to the sign of |A|, but as A is defined as a proper orthogonal matrix, the sign of |T| will be positive if T is not singular. This can also be seen from (4.7): the matrices X and Y are, except for rotation and possibly scale, virtually the same. Therefore the signs of |K| and |T| will differ only if the transformation involves also a reflection (a possibility which we have excluded in this treatise).

We see therefore that the ranks of the matrices K, T and M are equal and that they might be (n−1) in an n-dimensional transformation. So we have to stipulate that the matrix A can also be determined in this case.

So the solution

A = TM⁻¹    (3.15)

is inadequate for calculating A.

5. Determination of matrix A

In section 3 of this chapter we have already stated that we will have to derive a formula in which the coefficient of matrix A is invertible even if the rank of matrix M is equal to (n−1). So the coefficient of A in such a formula should be either a scalar which does not vanish for |M| = 0 or a matrix which has rank n even if M has rank (n−1). We can derive such a formula from the characteristic polynomial of matrix M. Having obtained this formula we will then derive special formulae for two- and three-dimensional transformations. In these cases we will see that it is possible to eliminate matrix M altogether, so it does not have to be calculated.

The characteristic polynomial of M is

M^n − p_1M^(n−1) − p_2M^(n−2) − ... − p_(n−1)M − p_nI = 0    (5.1)

The p_i in this equation are the sums of the respective principal minors of M, with their proper signs. The quantity p_1 is of course the same as the one introduced in (2.11).

As the quantity p_n is equal to (−1)^(n−1)|M|, equation (5.1) may be written as:

M(M^(n−1) − p_1M^(n−2) − ... − p_(n−1)I) = (−1)^(n−1)|M| I    (5.2)

We find the relation:

M^(n−1) − p_1M^(n−2) − ... − p_(n−1)I = (−1)^(n−1)M*    (5.3)

where M* denotes the adjugate or cofactor matrix of M.

If we now premultiply equation (5.3) by A, we obtain an expression for p_(n−1)A. This gives:

p_(n−1)A = A(M^(n−1) − p_1M^(n−2) − p_2M^(n−3) − ... − p_(n−2)M) + (−1)^n AM*    (5.4)

But AM = T (2.10), and for AM* an expression independent of A can be found.

Forming the adjugate of both sides of (2.10), we get:

M*A* = T*

but:

A* = |A|A⁻¹ = s^n A⁻¹ = s^(n−2)A'

so:

s^(n−2) M*A' = T*

or, after transposition:

AM* = (1/s^(n−2)) T*'    (5.5)

Substitution of (2.10) and (5.5) in (5.4) finally yields:

p_(n−1)A = T(M^(n−2) − p_1M^(n−3) − p_2M^(n−4) − ... − p_(n−2)I) + (−1)^n (1/s^(n−2)) T*'    (5.6)

As we had presupposed that the rank of M is not less than (n−1), the p_(n−1) is different from zero. Matrix A is therefore defined by formula (5.6).

For the determination of the polynomial in M in formula (5.6), the method of Faddeev for simultaneous computation of the coefficients of the characteristic polynomial and of the adjugate matrix may be applied. The method is presented here; for proof we refer to [7].

The formulae are:

M_1 = M               p_1 = tr M_1                        N_1 = M_1 − p_1 I
M_2 = M N_1           p_2 = ½ tr M_2                      N_2 = M_2 − p_2 I
...
M_(n−2) = M N_(n−3)   p_(n−2) = (1/(n−2)) tr M_(n−2)      N_(n−2) = M_(n−2) − p_(n−2) I
M_(n−1) = M N_(n−2)   p_(n−1) = (1/(n−1)) tr M_(n−1)      N_(n−1) = M_(n−1) − p_(n−1) I
M_n = M N_(n−1)       p_n = (1/n) tr M_n                  N_n = M_n − p_n I = 0

The polynomial N_n is the characteristic polynomial of M, the polynomial N_(n−1) is equal to minus (for even n) or plus (for odd n) the adjugate of M, and the polynomial N_(n−2) has to be computed for formula (5.6).

It is clear that, once M is known, the computation of N_(n−2) is relatively simple. Only to obtain formula (5.5) has the fact that A is a proper orthogonal matrix been used. It is therefore quite simple now to obtain a formula for improper orthogonal transformation matrices: in formula (5.6) only the sign of the factor (1/s^(n−2))T*' has to be changed.
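As a sketch only (the function name is chosen here), Faddeev's recursion can be written down directly; it returns the coefficients p_1 ... p_n and the matrices N_1 ... N_n, of which N_(n−2) is the one needed in (5.6):

```python
import numpy as np

def faddeev(M):
    """Faddeev's recursion for a square matrix M."""
    n = M.shape[0]
    I = np.eye(n)
    ps, Ns = [], []
    Mk = M.copy()                        # M_1 = M
    for k in range(1, n + 1):
        pk = np.trace(Mk) / k            # p_k = (1/k) tr M_k
        Nk = Mk - pk * I                 # N_k = M_k - p_k I
        ps.append(pk)
        Ns.append(Nk)
        Mk = M @ Nk                      # M_{k+1} = M N_k
    return ps, Ns                        # N_n is the zero matrix (Cayley-Hamilton)
```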

We will now proceed to apply formula (5.6) to two- and three-dimensional transformations. For the two-dimensional case this is merely to show that the well-known formulae are obtained, and for the three-dimensional one we will derive special formulae for the computation of orthogonal transformation matrices.

6. Derivation of a formula for the transformation matrix of the two-dimensional orthogonal transformation

Here n is two, so formula (5.6) yields:

p_1A = T + T*'    (6.1)

Matrix T is in this case:

T = [ Σx_1iy_1i   Σx_2iy_1i ]
    [ Σx_1iy_2i   Σx_2iy_2i ]

and the transposed of its adjugate is consequently:

T*' = [  Σx_2iy_2i   −Σx_1iy_2i ]
      [ −Σx_2iy_1i    Σx_1iy_1i ]

Adding T and T*' we obtain for the elements of A:

a_11 = a_22 = (1/p_1)(Σx_1iy_1i + Σx_2iy_2i)    (6.2)

a_12 = −a_21 = (1/p_1)(Σx_2iy_1i − Σx_1iy_2i)    (6.3)

These formulae are indeed the usual formulae for the two-dimensional orthogonal transformation. It is interesting to note that no separate determination of either the scale factor or matrix JVf is required.
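A minimal sketch of (6.2)-(6.3) (names chosen here); the shifts are removed by reducing both point sets to their centres of gravity:

```python
import numpy as np

def plane_transformation(x, y):
    """x, y: (k, 2) arrays of plane coordinates before and after transformation.
    Returns the 2 x 2 matrix A of (6.2)-(6.3); the scale factor is included in A."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    p1 = (xc**2).sum()                                          # p1 = sum of x_1i^2 + x_2i^2
    a11 = (xc[:, 0]*yc[:, 0] + xc[:, 1]*yc[:, 1]).sum() / p1    # (6.2)
    a12 = (xc[:, 1]*yc[:, 0] - xc[:, 0]*yc[:, 1]).sum() / p1    # (6.3)
    return np.array([[a11, a12],
                     [-a12, a11]])
```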

7. Derivation of a formula for the transformation matrix of the three-dimensional orthogonal transformation

Here n = 3, so formula (5.6) yields:

p_2A = T(M − p_1I) − (1/s) T*'    (7.1)

If the scale factor is to be determined by means of the eigenvalue procedure, calculation of matrix M is little extra work. If however the scale factor is known, or is determined by means of the iterative procedure (3.16), then the calculation of M requires a considerable amount of additional computations. We will therefore derive yet another formula which does not contain matrix M.

If we consider the characteristic polynomial of M for n = 3:

M³ − p_1M² − p_2M − p_3I = 0

we find, after premultiplication by A and substitution of AM = T:

−A(p_1M² + p_3I) = T(−M² + p_2I)    (7.2)

Equation (7.1) can be written as:

−A(M² − p_2I) = −p_1T − (1/s) T*'    (7.3)

After multiplication by p_1, formula (7.3) is subtracted from formula (7.2), giving

−A(p_3 + p_1p_2) = T(−M² + p_2I + p_1²I) + (p_1/s) T*'

or, with M² = s⁻²T'T:

A = 1/(p_3 + p_1p_2) · { (1/s²) T T'T − T(p_2 + p_1²) − (p_1/s) T*' }    (7.4)

We should ensure that the factor p_3 + p_1p_2 does not vanish. We may write:

p_3 + p_1p_2 = −p_1³ + p_1·p_1² + p_2p_1 + p_3    (7.5)

Comparing (7.5) with the characteristic equation of M:

λ³ − p_1λ² − p_2λ − p_3 = 0    (7.6)

we see that (7.5) will vanish if p_1 = λ, i.e. if p_1 is an eigenvalue of M. However, p_1 is equal to the sum of the non-negative eigenvalues of M. So p_1 can only be an eigenvalue of M if the other two eigenvalues are both zero. But in that case the rank of M would be one. We have shown earlier that the rank of M should not be less than two, so we may reject the possibility that p_1 is an eigenvalue of M, and may conclude therefore that the factor p_3 + p_1p_2 will not vanish as long as the rank of M remains either three or two.

As we do not want to calculate the matrix M, we should find a way to compute p_2 and p_3 without knowing M.

The latter is simple; from AM = T (2.10) we see directly that

|M| = (1/s³)|T|

and, as p_3 = +|M|,

p_3 = (1/s³)|T|    (7.7)

The computation of p_2 is slightly more complicated. From Faddeev's method we see that p_2 is computed from

p_2 = ½ tr M(M − p_1I)

or

p_2 = ½ tr M² − ½ tr p_1M

The trace of M² is known, as M² = s⁻²T'T = s⁻²D according to (3.3) and (3.4), so

tr M² = (1/s²) tr D = (1/s²)(d_11 + d_22 + d_33) = q_1/s²

As the trace of M is p_1, the trace of p_1M equals p_1², and we obtain

p_2 = ½(q_1/s² − p_1²)    (7.8)

(7.7) and (7.8) can now be substituted in (7.4). As a result we have obtained a formula for the orthogonal transformation matrix A in which the elements of the matrix are expressed as functions of observations. These functions are not rational only because the scale factor s cannot be expressed as a rational function of the observations.
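A minimal computational sketch of this result, assuming numpy, is given below; the function name is introduced here for the example only, and the scale factor s together with $p_1 = \mathrm{tr}\,M$ are taken as given, as they would be supplied by the procedures of section 3 (not reproduced here). The small check uses error-free data with s = 1, where $p_1$ can be read off from the singular values of T.

```python
import numpy as np

def direct_solution_3d(T, s, p1):
    # Formula (7.4), with p2 from (7.8) and p3 from (7.7); M itself is never formed.
    # T must be non-singular here, since the transposed adjugate is taken via the inverse.
    D = T.T @ T                                       # D = T'T, cf. (3.3)/(3.4)
    p2 = 0.5 * (np.trace(D) / s**2 - p1**2)           # (7.8)
    p3 = np.linalg.det(T) / s**3                      # (7.7)
    T_adj_t = np.linalg.det(T) * np.linalg.inv(T).T   # transposed adjugate T*'
    return (T @ D / s**2 - (p2 + p1**2) * T - (p1 / s) * T_adj_t) / (p3 + p1 * p2)

# Check with error-free data and s = 1:  T = sum_i y_i x_i' = R (sum_i x_i x_i')
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
R = Q if np.linalg.det(Q) > 0 else -Q                 # a proper rotation
X = rng.normal(size=(3, 10))
T = (R @ X) @ X.T
p1 = np.linalg.svd(T, compute_uv=False).sum()         # tr M; here taken from the singular values of T
print(np.allclose(direct_solution_3d(T, 1.0, p1), R)) # True
```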

8. Second procedure for the determination of matrix A

In section 4 we proved that matrix M in our basic relation

$$AM = T \qquad (2.10)$$

has to be a positive semi-definite matrix.

Once this was known, we in fact applied the theorem which states that every square linear operator (T in this case) can be represented as the product of two linear operators, a proper orthogonal one and a positive semi-definite one. Investigating the proof (cf. [9]) of this theorem, we will obtain yet another procedure for the calculation of matrix A.

Applying the Gaussian transformation to matrix T, we can obtain the two positive semi-definite matrices T'T and TT'.

Because T is a square matrix, these two matrices possess the same eigenvalues $\tau_i$. Denoting the eigenvectors of T'T by $c_i$ and those of TT' by $b_i$, we obtain the relations:

$$T'T\,c_i = \tau_i\,c_i \qquad (8.1a)$$
$$TT'\,b_i = \tau_i\,b_i \qquad (8.1b)$$

We assume that the two sets of eigenvectors $c_i$ and $b_i$ have been determined such that they form two orthonormal systems.

For every non-zero eigenvalue $\tau_i$ we now determine a new vector by means of the equation

$$T c_i = \sigma_i b_i \qquad (8.2a)$$

where the $\sigma_i$ denote the positive square roots of the $\tau_i$. We will prove that the $b_i$ introduced in (8.1b) are equal to those defined by (8.2a). Premultiplication of (8.2a) by T' yields, with (8.1a),

$$\sigma_i\,T' b_i = T'T c_i = \tau_i c_i = \sigma_i^2 c_i$$

and, as $\sigma_i > 0$,

$$T' b_i = \sigma_i c_i \qquad (8.2b)$$

Premultiplication of (8.2b) by T gives:

$$\sigma_i\,T c_i = TT' b_i$$

or, considering (8.2a),

$$\sigma_i^2\,b_i = TT' b_i = \tau_i b_i \qquad (8.3)$$

in agreement with (8.1b). So we see that the vectors $b_i$ defined by (8.2a) are eigenvectors of matrix TT'.

The vectors $b_i$ are orthonormal, because from (8.2a) it follows for two vectors $b_i$ and $b_j$:

$$\sigma_i\sigma_j\,b_i' b_j = c_i'\,T'T\,c_j = \tau_j\,c_i' c_j$$

Because the $c_i$ are orthonormal we see that

$$b_i' b_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

For eigenvalues $\tau_i = 0$ the vectors $c_i$ and $b_i$ are determined in the usual way. It follows from (8.1a) and (8.1b) that for such eigenvalues

$$T c_i = 0 \qquad (8.4a)$$
$$T' b_i = 0 \qquad (8.4b)$$

We have therefore obtained two sets of orthogonal unit vectors which can be grouped into two proper orthogonal matrices

$$C = (c_1, c_2, \ldots, c_n), \qquad B = (b_1, b_2, \ldots, b_n) \qquad (8.5)$$

Introducing also the diagonal matrix

$$S = \begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \end{pmatrix} \qquad (8.6)$$

we can write equations (8.2a) and (8.2b) in the form

$$TC = BS \qquad (8.7a)$$
$$T'B = CS \qquad (8.7b)$$

From (8.7a) it follows that, as B is a proper orthogonal matrix:

$$B'TC = S \qquad (8.8)$$

In section 3 we have obtained the relation

$$M = \frac{1}{s}\,CSC' \qquad (3.14)$$

Substitution of (8.8) in (3.14) yields:

$$M = \frac{1}{s}\,CB'TCC' = \frac{1}{s}\,CB'T$$

or, as C and B are proper orthogonal matrices:

$$sBC'M = T \qquad (8.9)$$

Comparing (8.9) to (2.10) we find, as the product matrix BC' is again a proper orthogonal one,

$$A = sBC' \qquad (8.10)$$

We see, therefore, that the required matrix A can be obtained as the product of the matrices of eigenvectors of the matrices TT' and T'T.
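As an illustration (not from the treatise, and assuming numpy): a single singular value decomposition T = BSC' provides the eigenvectors B of TT' and C of T'T at once (the routine returns C'), after which formula (8.10) is immediate. The sketch assumes the regular case |T| > 0, in which the product BC' is automatically proper orthogonal.

```python
import numpy as np

def transformation_matrix(T, s):
    # Formula (8.10): T = B S C' (numpy returns C'), A = sBC'.
    # Assumes |T| > 0, so that the product BC' is proper orthogonal.
    B, _, Ct = np.linalg.svd(T)
    return s * (B @ Ct)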

The geometrical interpretation of the matrix A can now be that it transforms the eigenvectors of the matrix T'T into those of TT', as

$$AC = sB \qquad (8.11)$$

The value of formula (8.10) for practical calculations is of course rather limited

because it requires the calculation of eigenvalues and eigenvectors of both matrices T'T and TT'.

This derivation contributes, however, to the insight into the problem of the calculation of orthogonal transformation matrices. It shows, for instance, to what extent matrix A is defined if matrix T is singular. If matrix T'T possesses only one eigenvalue equal to zero, an eigenvector belonging to this eigenvalue can be uniquely determined if we require A to be proper orthogonal. If, however, matrix T'T possesses more than one eigenvalue equal to zero, the eigenvectors belonging to these values are no longer uniquely defined. Consequently, matrix A is also no longer uniquely defined.
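To make this last remark concrete (an added numerical illustration, not from the treatise): for a rank-one T the two null-space eigenvectors of T'T can be rotated freely, and different admissible choices of the matrix C then give different matrices A = sBC'.

```python
import numpy as np

# Rank-one T: T'T has two eigenvalues equal to zero.  Two legitimate choices of
# the eigenvector matrix C then lead to different matrices A = sBC'.
T = np.diag([1.0, 0.0, 0.0])
B = np.eye(3)
C1 = np.eye(3)                                   # one orthonormal eigenvector set of T'T
theta = 0.7                                      # rotate the two null-space eigenvectors
C2 = np.array([[1, 0, 0],
               [0, np.cos(theta), -np.sin(theta)],
               [0, np.sin(theta),  np.cos(theta)]])
for C in (C1, C2):
    assert np.allclose(B @ np.diag([1.0, 0.0, 0.0]) @ C.T, T)   # both reproduce T = BSC'
print(np.allclose(B @ C1.T, B @ C2.T))           # False: A = sBC' is not unique
```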

9. Orthogonal transformations and the theory of the adjustment of normally distributed observations

The procedure derived in the previous sections is a purely numerical one for the calculation of orthogonal transformation matrices. No attention has been paid to the influence of observational errors on the results that were obtained, although at the beginning it was stated that the reason for the application of overdetermined transformations was the occurrence of such observational errors. The theory of the adjustment of normally distributed observations, on the other hand, is primarily concerned with studies of the propagation of errors in an adjustment problem. It is the purpose of this section to show where the derived procedure deviates from the classical theory of adjustment and to investigate the consequences.

The classical theory of adjustment of normally distributed observations recognizes linear relationships only. The reason is that quantities which are obtained as linear functions of normally distributed quantities are also normally distributed. As the theory of adjustment was developed to handle those types of observations that are normally distributed, this means that the only kind of distribution encountered in an adjustment problem is the normal distribution. However, not all relationships that occur are linear. This complication is then solved by the linearization of such non-linear relationships by means of an expansion in a Taylor series, of which only the linear terms are considered.

The application of linearized formulae implies that the adjustment will be an iterative one. The number of iterations largely depends on the magnitude of the neglected second and higher order terms.

If the problem contains unknowns, initial values for these unknowns are required. For orthogonal transformations this means that initial, or approximate, values are required for the transformation parameters. For three-dimensional transformations this is a problem which, as may be recalled from chapter I, was solved only around 1960.

It should therefore be considered a great advantage of any direct solution method that approximate values for the unknowns are not required; even more so because the determination of approximate values often requires special computational procedures.

This, then, is the first and major difference between the procedure derived in this treatise and the theory of the adjustment of normally distributed observations: the
