THE GEOMETRY OF
GEODETIC INVERSE LINEAR MAPPING
AND NON-LINEAR ADJUSTMENT
DISSERTATION
submitted for the degree of Doctor of Technical Sciences at the Technische Hogeschool Delft, by authority of the Rector Magnificus, Prof. Dr. J. M. Dirken, to be defended in public before the Board of Deans on Thursday 5 September 1985 at 14.00 hours
by
PETER J.G. TEUNISSEN
Geodetic Engineer
born in Owerri (Nigeria)
This dissertation has been approved by the
SUMMARY
This dissertation deals with:
1° the theory of inverse linear mappings, and
2° the problem of non-linear adjustment.
After the introduction, which contains the motivation for the geometric approach adopted in this dissertation, chapter II treats the theory of inverse linear mappings. It is shown, among other things, that every inverse B of a given linear map A can be uniquely characterized through the choice of three linear subspaces, which we call S, C and V.
Chapter III shows the consequences of this for the inversion problem in 2- and 3-dimensional geodetic networks. For various situations, basis vectors are constructed that span the nullspace Nu(A). Next, the problem of connecting networks is discussed. Under fairly general assumptions concerning the degrees of freedom of the networks involved, three alternative solution methods are presented.
Chapter IV treats the problem of non-linear adjustment. After the problem statement and a concise introduction to differential geometry, the convergence behaviour of Gauss' iteration method (GM) is described. For both one-dimensional and higher-dimensional curved manifolds it is shown that the local behaviour of GM is in general asymptotically linear. Further important conclusions are that the local convergence behaviour of GM 1° is predominantly determined by the least-squares residual vector and the extrinsic curvature of the manifold, 2° is, in the asymptotically linear case, invariant under reparametrizations, 3° is asymptotically quadratic if the least-squares residual vector or the orthogonal normal vector field B equals zero, 4° is determined by the Christoffel symbols of the second kind in the case of asymptotically quadratic convergence, and 5° will hardly be accelerated by applying "line search strategies" if both the extrinsic curvature and the least-squares residual vector are small. Subsequently we give conditions that guarantee global convergence of GM to a local minimum.
We show that for a particular type of manifold, namely ruled surfaces, an important simplification through reduction of dimension is possible. Applying this idea made possible an inversion-free solution of a non-linear variant of the classical two-dimensional Helmert transformation. We have called this non-linear variant the Symmetric Helmert transformation. In addition, we give an inversion-free solution of the two-dimensional Symmetric Helmert transformation when a non-trivial rotationally invariant covariance structure is assumed. We then generalize the results to three dimensions. In the last sections of chapter IV we give some suggestions for estimating the extrinsic curvature, give upper bounds for the extrinsic curvature of some simple geodetic networks, and briefly describe some consequences of non-linearity for the statistical treatment of an adjustment. It is also shown that the bias in the least-squares residual vector is determined by the mean curvature of the manifold and that the bias in the least-squares parameter estimators is determined by the trace of the Christoffel symbols of the second kind.
In the concluding discussion we finally indicate which sub-problems still require further investigation.
CURRICULUM VITAE
Peter Teunissen was born on 9 October 1957 in Owerri (Nigeria). He attended primary school successively in Port Harcourt (Nigeria), Willemstad (Curaçao) and Paramaribo (Suriname). After one year of HBS in Paramaribo, he completed the Atheneum-B in the Netherlands, successively in Venray, Utrecht and Assen. He began his study of geodesy at the T.H. Delft in 1975. During his studies he took part in the Doppler project Upper Volta (now Burkina Faso) and in the Satricum excavation project (constructing a geometric base for the acropolis of the 2500-year-old city of Satricum, 50 km south of Rome). Besides his studies he was active in student life and in several governing bodies and committees of the Department of Geodesy. His practical training was done at the Topo. Dept. of Shell UK in London. In the final year of his studies he held a student assistantship at the Laboratorium voor Geodetische Rekentechniek. In November 1980 he graduated cum laude under prof. dr. ir. W. Baarda. After graduating he joined the Department of Geodesy as a research associate and was given the opportunity to carry out the study on which this dissertation is based.
In 1981, with a traineeship grant from the Netherlands Organization for the Advancement of Pure Research (Z.W.O.), he spent half a year at the University of Stuttgart carrying out research in geodetic differential geometry in cooperation with prof. dr.-ing. E.W. Grafarend. In 1984 he was a lecturer at the International School on Advanced Geodesy ("Optimization and Design of Geodetic Networks"). He is a member of Special Study Groups 4.56 ("Differential geometry of the gravity field") and 4.60 ("Statistical methods for estimation and testing of geodetic data") of the International Association of Geodesy.
Since September 1985 he has been employed by Z.W.O. as a research associate within the framework of the Constantijn en Christiaan Huygens programme.
ACKNOWLEDGEMENTS
For the support received during the research the author is greatly indebted to the following organizations:
The Netherlands Geodetic Commission (Rijkscommissie voor Geodesie) for awarding travel grants,
The Netherlands Organization for the Advancement of Pure Research (Z.W.O.) for awarding a traineeship grant, and
The Geodetic Institute of the University of Stuttgart for the facilities offered during the author's stay there.
Finally, the author is particularly grateful to Janna Blotwijk for the excellent way in which she took care of the often difficult and "nerve-racking" word processing of this dissertation.
Colophon:
Word processing & typing: M.J. Blotwijk
Drawings: M.G.G.J. Jutte, A.B. Smits
Printing: Meinema B.V., Delft.
THE GEOMETRY OF GEODETIC INVERSE LINEAR MAPPING AND NON-LINEAR ADJUSTMENT

SUMMARY
CURRICULUM VITAE
ACKNOWLEDGEMENTS

I INTRODUCTION

II GEOMETRY OF INVERSE LINEAR MAPPING
1. The Principles
2. Arbitrary Inverses Uniquely Characterized
3. Injective and Surjective Maps
4. Arbitrary Systems of Linear Equations and Arbitrary Inverses
5. Some Common Types of Inverses and their Relation to the Subspaces S, C and V
6. C- and S-Transformations

III GEODETIC INVERSE MAPPING
1. Introduction
2. Geodetic Networks and their Degrees of Freedom
2.1. Planar networks
2.2. Ellipsoidal networks
2.3. Three dimensional networks
3. (Free) Networks and their Connection
3.1. Types of networks considered
3.2. Three alternatives

IV GEOMETRY OF NON-LINEAR ADJUSTMENT
1. General Problem Statement
2. A Brief Introduction into Riemannian Geometry
3. Orthogonal Projection onto a Parametrized Space Curve
3.1. Gauss' iteration method
3.2. The Frenet frame
3.3. The "Kissing" circle
3.4. One dimensional Gauss- and Weingarten equations
3.6. Examples
3.7. Conclusions
4. Orthogonal Projection onto a Parametrized Submanifold
4.1. Gauss' method
4.2. The Gauss' equation
4.3. The normal field B
4.4. The local rate of convergence
4.5. Global convergence
5. Supplements and Examples
5.1. The two dimensional Helmert transformation
5.2. Orthogonal projection onto a ruled surface
5.3. The two dimensional Symmetric Helmert transformation
5.4. The two dimensional Symmetric Helmert transformation with a non-trivial rotational invariant covariance structure
5.5. The three dimensional Helmert transformation and its symmetrical generalization
5.6. The extrinsic curvatures estimated
5.7. Some two dimensional networks
6. Some Statistical Considerations
7. Epilogue
I. INTRODUCTION
This publication aims to contribute to the theory of geodetic adjustment. The two main topics discussed are
1° The problem of inverse linear mapping and
2° The problem of non-linear adjustment.
In our discussion of these two problems there is a strong emphasis on geometric thinking as a means of visualizing, and thereby improving our understanding of, methods of adjustment. It is our belief that a geometric approach to adjustment makes possible a more general and simpler treatment of various aspects of adjustment theory. It is thus possible to carry through quite rigorous trains of reasoning in geometrical terms without translating them into algebra. This gives a considerable economy both in thought and in communication of thought. It also enables us to recognize and understand more easily the basic notions and essential concepts involved. And most important, perhaps, is the fact that our geometrical imagery in two and three dimensions suggests results for more dimensions and offers us a powerful tool of inductive and creative reasoning. At the same time, when precise mathematical reasoning is required it will be carried out in terms of the theory of finite dimensional vector spaces. This theory may be regarded as a precise mathematical framework underlying the heuristic patterns of geometric thought.
In geodesy it is very common to use geometric reasoning. In fact, geodesy benefited considerably from the development of the study of differential geometry, which was begun very early in history. Practical tasks in cartography and geodesy caused and influenced the creation of the classical theory of surfaces (Gauss, 1827; Helmert, 1880). And differential geometry can now be said to constitute an essential part of the foundation of both mathematical and physical geodesy (Marussi, 1952; Hotine, 1969; Grafarend, 1973).
But it was not only in the development of geodetic models that geometry played such a pivotal role. In geodetic adjustment theory too, adjustment was soon considered as a geometrical problem. Very early on, (Tienstra, 1947; 1948; 1956) already advocated the use of the Ricci calculus in adjustment theory. It permits a consistent geometrization of the adjustment of correlated observations. His approach was later followed by (Baarda, 1967 a,b; 1969), (Kooimans, 1958) and many others.
More recently we have witnessed a renewed interest in the geometrization of adjustment theory. See e.g. (Vanicek, 1979), (Eeg, 1982), (Meissl, 1982), (Blais, 1983) or (Blaha, 1984). The incentive for this renewed interest is probably the introduction into geodesy of the modern theory of Hilbert spaces with kernel functions (Krarup, 1969). As (Moritz, 1979) has put it rather plainly, this theory can be seen as an infinite-dimensional generalization of Tienstra's theory of correlated observations in its geometrical interpretation.
Probably the best motivation for taking a geometric standpoint in discussing adjustment problems in linear models is given by the following discussion, which emphasizes the geometric interplay between best linear unbiased estimation and least-squares estimation:
Let $y$ be a random vector in the m-dimensional Euclidean space $M$ with metric tensor $\langle\cdot,\cdot\rangle_M$. We assume that $y$ has an expected value $\bar{y} \in M$, i.e.,

$$E\{y\} = \bar{y} \in M, \qquad (1.1)$$

where $E\{\cdot\}$ is the mathematical expectation operator, and that $y$ has a covariance map $Q_y : M^* \to M$, defined by

$$Q_y^{-1} y_1 = \langle y_1, \cdot\rangle_M \quad \forall\, y_1 \in M. \qquad (1.2)$$

The linear vector space $M^*$ denotes the dual space of $M$ and is defined as the set of all real-valued (homogeneous) linear functions defined on $M$. Thus each $y^* \in M^*$ is a linear function $y^* : M \to \mathbb{R}$. Instead of writing $y^*(y_1)$ we will use a more symmetric formulation, by considering $y^*(y_1)$ as a bilinear function in the two variables $y^*$ and $y_1$. This bilinear function is denoted by $(\cdot,\cdot) : M^* \times M \to \mathbb{R}$ and is defined by $(y^*, y_1) = y^*(y_1)$ $\forall\, y^* \in M^*$, $y_1 \in M$. The function $(\cdot,\cdot)$ is called the duality pairing of $M^*$ and $M$ into $\mathbb{R}$.
We define a linear model as

$$\bar{y} \in \bar{N} \subset M, \quad Q_y, \qquad (1.3)$$

where $\bar{N}$ is a linear manifold in $M$. A linear manifold can best be viewed as a translated subspace. We will assume that $\bar{N} = \{y_1\} + U$, where $y_1$ is a fixed vector of $M$ and $U$ is an n-dimensional proper subspace of $M$.
The problem of linear estimation can now be formulated as: given an observation $y_s$ on the random vector $y$, its covariance map $Q_y$ and the linear manifold $\bar{N}$, estimate the position of $\bar{y}$ in $\bar{N} \subset M$. If we restrict ourselves to Best Linear Unbiased Estimation (BLUE), then the problem of linear estimation can be formulated dually as: given a $y_s^* \in M^*$, find $\alpha \in \mathbb{R}$ and $\hat{y}^* \in M^*$ such that the inhomogeneous linear function $h(y) = \alpha + (\hat{y}^*, y)$ is a BLUE of $(y_s^*, \bar{y})$. The function $h(y)$ is said to be a BLUE of $(y_s^*, \bar{y})$ if

1° $h(y)$ is a linear unbiased estimator of $(y_s^*, \bar{y})$, i.e., if $E\{h(y)\} = (y_s^*, \bar{y})$ $\forall\, \bar{y} \in \bar{N}$, and

2° $h(y)$ is best, i.e., $\mathrm{Variance}\{h(y)\} \le \mathrm{Variance}\{g(y)\}$ for all linear unbiased estimators $g(y) = a + (y^*, y)$, $a \in \mathbb{R}$, $y^* \in M^*$, of $(y_s^*, \bar{y})$. (1.4)
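From (1.4.1°) it follows that

$$\alpha = (y_s^* - \hat{y}^*, \bar{y}) \quad \forall\, \bar{y} \in \bar{N},$$

or, since $\bar{N} = \{y_1\} + U$,

$$\alpha = (y_s^* - \hat{y}^*, y_1) \ \text{for some } y_1 \in \bar{N}, \quad \text{and} \quad (y_s^* - \hat{y}^*, U) = 0. \qquad (1.5)$$

The set of $y^* \in M^*$ for which $(y^*, U) = 0$ forms a subspace of $M^*$. It is called the annihilator of $U \subset M$ and is denoted by $U^0 \subset M^*$, i.e. $(U^0, U) = 0$. This gives for (1.5),

$$\alpha = (y_s^* - \hat{y}^*, y_1) \ \text{for some } y_1 \in \bar{N}, \quad \text{and} \quad y_s^* - \hat{y}^* \in U^0. \qquad (1.6)$$

From (1.4.2°) it follows with (1.6) that $\hat{y}^* \in \{y_s^*\} + U^0$ must satisfy

$$(\hat{y}^*, Q_y \hat{y}^*) \le (y^*, Q_y y^*) \quad \forall\, y^* \in \{y_s^*\} + U^0. \qquad (1.7)$$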
If we now define the dual metric of $M^*$ by pulling the metric of $M$ back by $Q_y$, i.e.,

$$\langle y^*, y_1^* \rangle_{M^*} = \langle Q_y y^*, Q_y y_1^* \rangle_M \quad \forall\, y^*, y_1^* \in M^*,$$

it follows that $\hat{y}^* \in \{y_s^*\} + U^0$ must satisfy

$$\langle \hat{y}^*, \hat{y}^* \rangle_{M^*} \le \langle y^*, y^* \rangle_{M^*} \quad \forall\, y^* \in \{y_s^*\} + U^0. \qquad (1.8)$$

Geometrically this problem can be seen as the problem of finding that point $\hat{y}^*$ in $\{y_s^*\} + U^0$ which has least distance to the origin of $M^*$. And it will be intuitively clear that $\hat{y}^*$ is found by orthogonally projecting $y_s^*$ onto the orthogonal complement $(U^0)^\perp$ of $U^0$ (see figure 1).

[figure 1]
Now, before we characterize the map which maps $y_s^*$ into $\hat{y}^*$, let us first present some generalities on linear maps.

Let $N$ and $M$ be two linear vector spaces of dimensions n and m respectively, and let $A : N \to M$ be a linear map between them. Then we define the image of $U \subset N$ under $A$ as

$$A U = \{\, y \in M \mid y = A x \ \text{for some } x \in U \,\}. \qquad (1.9)$$

The inverse image of $V \subset M$ under $A$ is defined as

$$A^{-1}(V) = \{\, x \in N \mid A x \in V \,\}. \qquad (1.10)$$

In the special case that $U = N$, the image of $U$ under $A$ is called the range space $R(A)$ of $A$. And the inverse image of $\{0\} \subset M$ under $A$ is called the nullspace $Nu(A)$ of $A$. It is easily verified that if $V$ and $U$ are linear subspaces of $M$ and $N$ respectively, so are $A U$ and $A^{-1}(V)$.

A linear map $A : N \to M$ is injective or one-to-one if for every $x_1, x_2 \in N$, $x_1 \ne x_2$ implies that $A x_1 \ne A x_2$. The map $A$ is surjective or onto if $A N = M$. And $A$ is called bijective or a bijection if $A$ is both injective and surjective.
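As a small numerical illustration of these definitions, the following sketch computes the dimensions of the range space and the nullspace of a matrix and derives injectivity and surjectivity from the rank. The matrix is a hypothetical example that is not taken from the text: it maps three heights to their three cyclic differences, so heights that differ by a common constant have the same image. The helper names `rank` and `classify` are our own.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for c in range(n_cols):
        # Look for a pivot in column c at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Clear column c in every other row.
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def classify(A):
    """A is an m x n matrix representing a linear map from R^n to R^m."""
    m, n = len(A), len(A[0])
    r = rank(A)
    return {
        "dim_range": r,          # dimension of R(A)
        "dim_nullspace": n - r,  # dimension of Nu(A)
        "injective": r == n,     # Nu(A) = {0}
        "surjective": r == m,    # R(A) = M
    }

# Hypothetical example: cyclic height differences between three points.
A = [[-1, 1, 0],
     [0, -1, 1],
     [1, 0, -1]]
info = classify(A)
```

For this matrix the nullspace is one-dimensional (spanned by the vector of ones), so the map is neither injective nor surjective.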
With the linear map $A : N \to M$ and the dual vector (or 1-form) $y^* \in M^*$ it follows that the composition $y^* \circ A$ is a linear function which maps $N$ into $\mathbb{R}$, i.e. $y^* \circ A \in N^*$. Since the map $A$ assigns the 1-form $y^* \circ A \in N^*$ to $y^* \in M^*$, we see that the map $A$ induces another linear map, $A^*$ say, which maps $M^*$ into $N^*$. This map $A^*$ is called the dual map to $A$ and is defined as $A^* y^* = y^* \circ A$. With the duality pairing it is easily verified that

$$(A^* y^*, x) = (y^*, A x). \qquad (1.11)$$
An important consequence of this bilinear identity is that for a non-empty inverse image of subspace $V \subset M$ under $A$, we have the duality relation

$$(A^{-1}(V))^0 = A^*(V^0). \qquad (1.12)$$

Note that here the four concepts of image, inverse image, annihilation and duality come together in one formula. For the special case that $V = \{0\}$ the relation reduces to $Nu(A)^0 = R(A^*)$.
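In coordinates, with 1-forms written as rows of coefficients, the dual map is represented by the transposed matrix. The sketch below checks the identity (1.11) and one instance of $Nu(A)^0 = R(A^*)$ numerically. The matrix, the vectors and the helper names are illustrative assumptions, not part of the text.

```python
def mat_vec(A, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def pairing(y_star, y):
    """Duality pairing (y*, y): the 1-form y* evaluated at the vector y."""
    return sum(a * b for a, b in zip(y_star, y))

# Hypothetical map from R^3 to R^3 with nullspace spanned by (1, 1, 1).
A = [[-1, 1, 0],
     [0, -1, 1],
     [1, 0, -1]]
A_dual = transpose(A)          # coordinate representation of A*

x = [2, 5, 7]                  # arbitrary element of N
y_star = [3, -1, 4]            # arbitrary element of M*

lhs = pairing(mat_vec(A_dual, y_star), x)   # (A* y*, x)
rhs = pairing(y_star, mat_vec(A, x))        # (y*, A x)

# Every element of R(A*) annihilates the nullspace of A.
null_vector = [1, 1, 1]
annihilated = pairing(mat_vec(A_dual, y_star), null_vector)
```

Both pairings evaluate to the same number, and the element $A^* y^*$ of the range of the dual map gives zero when evaluated on the nullspace vector.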
Maps that play an important role in linear estimation are the so-called projector maps. Assume that the subspaces $U$ and $V$ of $N$ are complementary, i.e. $N = U \oplus V$, with "$\oplus$" denoting the direct sum. Then for $x \in N$ we have the unique decomposition $x = x_1 + x_2$ with $x_1 \in U$, $x_2 \in V$. We can now define a linear map $P : N \to N$ through

$$P x = x_1 \quad \text{with } x = x_1 + x_2,\ x_1 \in U,\ x_2 \in V \ \text{and } N = U \oplus V. \qquad (1.13)$$

This map is called the projector which projects onto $U$ and along $V$. It is denoted by $P_{U,V}$ (see figure 2).

[figure 2]
If $P$ projects onto $U$ and along $V$ then $I - P$, with $I$ the identity map, projects onto $V$ and along $U$. Thus

$$I - P_{U,V} = P_{V,U}. \qquad (1.14)$$

For their images and inverse images we have

$$P_{U,V} N = U, \quad P_{U,V}^{-1}(0) = V, \quad P_{U,V}^{-1}(U) = N,$$
$$(I - P_{U,V}) N = V, \quad (I - P_{U,V})^{-1}(0) = U, \quad (I - P_{U,V})^{-1}(V) = N. \qquad (1.15)$$
It is easily verified that the dual $P^*$ of a projector $P$ is again a projector operating on the dual space. For we have with (1.12) and (1.15):

$$P_{U,V}^*(N^*) = (P_{U,V}^{-1}(0))^0 = V^0 \quad \text{and} \quad P_{U,V}^*(U^0) = (P_{U,V}^{-1}(U))^0 = N^0 = \{0\}.$$

Thus,

$$P_{U,V}^* = P_{V^0,U^0} \quad \text{and} \quad (I - P_{U,V})^* = P_{V,U}^* = P_{U^0,V^0}. \qquad (1.16)$$

Finally we mention that one can check whether a linear map is a projector by verifying whether the iterated operator coincides with the operator itself (idempotence).
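The projector relations can be made concrete in the plane. The sketch below builds the matrix of $P_{U,V}$ for two one-dimensional complementary subspaces of $\mathbb{R}^2$ and exposes the quantities needed to check idempotence, relation (1.14) and the dual relation (1.16), with the dual map represented by the transpose. The function name `projector_2d` and the chosen basis vectors are assumptions made for this illustration only.

```python
from fractions import Fraction

def projector_2d(u, v):
    """Matrix of P_{U,V} on R^2 with U = span{u} and V = span{v}."""
    det = Fraction(u[0] * v[1] - u[1] * v[0])
    if det == 0:
        raise ValueError("u and v must be linearly independent")
    # First row of the inverse of [u v]: coefficient of u in x = c1*u + c2*v.
    c1 = [v[1] / det, -v[0] / det]
    return [[u[0] * c1[0], u[0] * c1[1]],
            [u[1] * c1[0], u[1] * c1[1]]]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

u, v = (1, 0), (1, 1)              # hypothetical bases of U and V
P = projector_2d(u, v)             # P_{U,V}
identity = [[1, 0], [0, 1]]
I_minus_P = [[identity[i][j] - P[i][j] for j in range(2)] for i in range(2)]

# Annihilators written as coefficient rows: V^0 kills v, U^0 kills u.
v_ann, u_ann = (1, -1), (0, 1)
P_dual_expected = projector_2d(v_ann, u_ann)   # P_{V^0,U^0}
```

Here `P` equals `[[1, -1], [0, 0]]`; one can verify that `mat_mul(P, P)` reproduces `P`, that `I_minus_P` equals `projector_2d(v, u)`, and that `transpose(P)` equals `P_dual_expected`.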
Now let us return to the point where we left our BLUE problem. We noted that $\hat{y}^*$ could be found by orthogonally projecting $y_s^*$ onto $(U^0)^\perp$. Hence, the projector map needed is the one which projects onto $(U^0)^\perp$ and along $U^0$, i.e.,

$$\hat{y}^* = P_{(U^0)^\perp,\,U^0}\; y_s^*. \qquad (1.17)$$
From (1.6) and (1.17) it then follows that the linear function $h(y)$ is the unique BLUE of $(y_s^*, \bar{y})$:

$$h(y) = \alpha + (\hat{y}^*, y) = ((I - P_{(U^0)^\perp,\,U^0})\, y_s^*, y_1) + (P_{(U^0)^\perp,\,U^0}\, y_s^*, y),$$

or

$$h(y) = (y_s^*, y_1) + (P_{(U^0)^\perp,\,U^0}\, y_s^*, y - y_1),$$

where $y_1$ is an arbitrary element of $\bar{N}$.

Application of the definition of the dual map gives

$$h(y) = (y_s^*, y_1) + (y_s^*, P^*_{(U^0)^\perp,\,U^0}(y - y_1)).$$

And since $P^*_{(U^0)^\perp,\,U^0} = P_{U,\,U^\perp}$ we get

$$h(y) = (y_s^*, y_1 + P_{U,\,U^\perp}(y - y_1)), \qquad (1.18)$$

in which we recognize the least-squares estimate

$$\hat{y} = y_1 + P_{U,\,U^\perp}(y_s - y_1), \quad y_1 \in \bar{N}, \qquad (1.19)$$

which solves the dual problem

$$\langle y_s - \hat{y}, y_s - \hat{y} \rangle_M \le \langle y_s - y, y_s - y \rangle_M \quad \forall\, y \in \bar{N} \qquad (1.20)$$

(see figure 3).

[figure 3: the linear manifold $\bar{N} = \{y_1\} + U$]
Thus we have recovered the existing duality between BLUE and least-squares estimation. We minimize a sum of squares (1.20) and emerge with an optimum estimator, namely one which minimizes another sum of squares (1.8), the variance. From the geometrical viewpoint this arises simply from the duality between the so-called observation space $M$ and estimator space $M^*$, established by the duality pairing $(y^*, y)$.
The result given above is of course the well-known Gauss-Markov theorem, which probabilistically justifies least-squares estimation in the case of linear models.
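The duality can be checked on a deliberately small example that is our own and not from the text: two uncorrelated observations of one unknown quantity, with assumed variances 1 and 4. The manifold is then the line through the origin spanned by (1, 1), the metric is the diagonal weight matrix $Q_y^{-1}$, and the least-squares estimate (1.19) is an orthogonal projection in that metric. The sketch also scans a grid of linear unbiased estimators $a_1 y_1 + a_2 y_2$ with $a_1 + a_2 = 1$ and picks the one with the smallest variance; all names and numbers are hypothetical.

```python
from fractions import Fraction

def inner(a, b, w):
    """Metric <a, b>_M for a diagonal weight matrix w = Q_y^{-1}."""
    return sum(wi * ai * bi for wi, ai, bi in zip(w, a, b))

def least_squares(y_s, y_1, u, w):
    """Orthogonal projection of y_s onto {y_1} + span{u} in the metric w."""
    d = [a - b for a, b in zip(y_s, y_1)]
    coeff = inner(u, d, w) / inner(u, u, w)
    return [b + coeff * ui for b, ui in zip(y_1, u)]

variances = [Fraction(1), Fraction(4)]   # assumed diagonal of Q_y
w = [1 / v for v in variances]           # diagonal of Q_y^{-1}

# Least-squares side: project the sample (10, 15) onto span{(1, 1)}.
y_hat = least_squares([10, 15], [0, 0], [1, 1], w)

# BLUE side: variance of the unbiased estimator a1*y1 + (1 - a1)*y2.
def variance(a1):
    a2 = 1 - a1
    return a1 ** 2 * variances[0] + a2 ** 2 * variances[1]

ls_weight = w[0] / sum(w)                # weight that least squares gives y1
candidates = [Fraction(k, 100) for k in range(-100, 201)]
best = min(candidates, key=variance)
```

In this example the minimum-variance weight found on the grid coincides with the weight implied by the least-squares projection, which is the content of the Gauss-Markov theorem in its simplest form.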
Observe that the above discussion shows another advantage of geometric reasoning, namely that the language of geometry embodies an element of invariance. That is, geometric reasoning avoids unnecessary reference to particular sets of coordinate axes. Concepts such as linear projections and linear manifolds, for instance, may be visualized in a coordinate-free or invariant way. All results obtained by an invariant approach therefore necessarily apply to all possible representations of the linear manifold $\bar{N}$. That is, one could define $\bar{N}$ by a linear map $A$ from the parameter space $N$ into the observation space $M$ (in Tienstra's terminology this would be "standard problem II") or implicitly by a set of linear constraints ("standard problem I"). Even a mixed representation is possible. Consequently, if a coordinate representation is needed one can in general take the one which seems to be the most appropriate. The use of a convenient basis rather than a basis fixed at the outset is a good illustration of the fact that coordinate-free does not mean freedom from coordinates so much as it means freedom to choose the appropriate coordinates for the task at hand. With respect to our first topic, note that a direct consequence of the coordinate-free formulation is that the difficulties which might occur when a non-injective linear map $A$ is used to specify the linear model are evaded. This indicates that the actual problem of inverse linear mapping should not be considered an essential part of the problem of adjustment. That is, in the context of BLUE it is insignificant which pre-image of $\hat{y}$ under $A$ is taken. This viewpoint, however, still does not seem to be generally agreed upon. The usually merely algebraic approach often leads one to fail to distinguish between the actual adjustment problem and the actual inverse mapping problem. As a consequence, published studies in the geodetic literature dealing with the theory of inverse linear mapping often, in our view, overlook the essential concepts involved. We have therefore tried to present an alternative approach; one that is based on the idea that once the causes of the general inverse mapping problem are classified, the problem of inverse linear mapping itself is also solved. Our approach starts from the identification of the basic subspaces involved and next shows that the problem of inverse linear mapping can be reduced to a few essentials.
As to our second topic, that of non-linear adjustment, note that the Gauss-Markov theorem formulates a lot of "ifs" before it states why least-squares should be used: if the mean $\bar{y}$ lies in a linear manifold $\bar{N}$, if the covariance map is known to be $Q_y$, if we are willing to confine ourselves to estimates that are unbiased in the mean and if we are willing to apply the quality criterion of minimum variance, then the best estimate is to be had by least-squares. These are a lot of "ifs" and it would be interesting to ask "and if not?". For all "ifs" this would become a complicated task indeed. But it will be clear that the first "if", which called for the manifold $\bar{N}$ to be linear, already breaks down in the case of non-linear models. Furthermore, in non-linear models a restriction to linear estimators no longer seems reasonable, because any estimator of $\bar{y}$ must be a mapping from $M$ into $\bar{N}$, which will be curved in general. Hence, strictly speaking the Gauss-Markov theorem no longer applies in the non-linear case. And consequently one might question whether the excessive use of the theorem in the geodetic literature for theoretical developments is justifiable in all cases. Since almost all functional relations in our geodetic models are non-linear, one may be surprised to realize how little attention the complicated problem area of non-linear geodesic adjustment has received. One has used and is still predominantly using the ideas, concepts and results from the theory of linear estimation. Of course, one may argue that probably most non-linear models are only moderately non-linear and thus permit the use of a linear(ized) model. This is true. However, it in no way releases us from the obligation of really proving whether a linear(ized) model is sufficient as an approximation. What we need therefore is knowledge of how non-linearity manifests itself at the various stages of adjustment. Here we agree with (Kubik, 1967), who points out that a general theoretical and practical investigation into the various aspects of non-linear adjustment is still lacking.
In the geodetic literature we know of only a few publications in which non-linear adjustment problems are discussed. In the papers by (Pope, 1972), (Stark and Mikhail, 1973), (Pope, 1974) and (Celmins, 1981; 1982) some pitfalls to be avoided when applying variable transformations, or when updating and re-evaluating function values in an iteration procedure, are discussed. And in (Kubik, 1967) and (Kelley and Thompson, 1978) a brief review is given of some iteration methods. An investigation into the various effects of non-linearity was started in (Baarda, 1967 a,b), (Alberda, 1969), (Grafarend, 1970) and more recently in (Krarup, 1982a). (Alberda, 1969) discusses the effect of non-linearity on the misclosures of condition equations when a linear least-squares estimator is used and illustrates this with a quadrilateral. A similar discussion can be found in (Baarda, 1967b), where an expression is also derived for the bias in the estimators. (Grafarend, 1970) discusses a case where the circular normal distribution should replace the ordinary normal distribution. And finally (Baarda, 1967a) and (Krarup, 1982a) exemplify the effect of non-linearity with the aid of a circular model. Although we accentuate some different and new aspects of non-linear adjustment, our contribution to the problem of non-linear geodesic adjustment should be seen as a continuation of the work done by the above-mentioned authors. We must admit though that unfortunately we do not have a cut-and-dried answer to all questions. We do hope, however, that our discussion of non-linear adjustment will make one more aware of the intrinsic difficulties of non-linear adjustment and that the problem will receive more attention than it has received hitherto.
The plan of this publication is the following:
In chapter II we consider the geometry of inverse linear mapping. We will show that every inverse B of a linear map A can be uniquely characterized through the choice of three subspaces S, C and V. Furthermore, each of these three subspaces has an interesting interpretation of its own. In order to facilitate reference the basic results are summarized in table 1.
In chapter III we start by showing the consequences of the inverse mapping problem for 2- and 3-dimensional geodetic networks. This part is easy-going since the planar case has to some extent already been treated elsewhere in the geodetic literature. The second part of this chapter presents a discussion of the problem of connecting geodetic networks, a problem which is almost omnipresent in geodesy.
Finally, chapter IV makes a start with the problem of non-linear adjustment. A differential geometric approach is used throughout. We discuss Gauss' method in some detail and show how the extrinsic curvatures of the submanifold N affect its local behaviour. And amongst other things, we also show how in some cases the geometry of the problem suggests important simplifications. Typical examples are our generalizations of the classical Helmert transformation.
II. GEOMETRY OF INVERSE LINEAR MAPPING
1. The principles
Many problems in physical science involve the estimation or computation of a number of unknown parameters which bear a linear (or linearized) relationship to a set of experimental data. The data may be contaminated by (systematic or random) errors, insufficient to determine the unknowns, redundant, or all of the above, and consequently questions such as existence, uniqueness, stability, approximation and the physical description of the set of solutions are all of interest.
In econometrics for instance (see e.g. Neeleman, 1973) the problem of insufficient data is discussed under the heading of "multi-collinearity", and the consequent lack of determinability of the parameters from the observations is known there as the "identification problem". And in geophysics, where the physical interpretation of an anomalous gravitational field involves deduction of the mass distribution which produces the anomalous field, there is a fundamental non-uniqueness in potential field inversion, such that, for instance, even complete, perfect data on the earth's surface cannot distinguish between two buried spherical density anomalies having the same anomalous mass but different radii (see e.g. Backus and Gilbert, 1968).
Also in geodesy similar problems can be recognized. The fact that the data are generally only measured at discrete points leaves one in physical geodesy for instance with the problem of determining a continuous unknown function from a finite set of data (see e.g. Rummel and Teunissen, 1982). Also the non-uniqueness in coordinate-system definitions makes itself felt when identifying, interpreting, qualifying and comparing results from geodetic network adjustments (see e.g. Baarda, 1973). The problem of connecting geodetic networks, which will be studied in chapter three, is a prime example in this respect.
All the above mentioned problems are very similar and even formally equivalent, if they are described in terms of a possibly inconsistent and under-determined linear system
y = Ax , (1.1)
where A is a linear map from the n-dimensional parameter space N into the m-dimensional observation space M.
The first question that arises is whether a solution to (1.1) exists at all, i.e. whether the given vector y is an element of the range space $R(A)$, $y \in R(A)$. If this is the case we call the system consistent. The system is certainly consistent if the rank of A, which is defined as rank A = dim $R(A)$ = r, equals the dimension of M. In this case namely the range space $R(A)$ equals M and therefore $y \in M = R(A)$. In all other cases, r < dim M, consistency is no longer guaranteed, since it would be a mere coincidence if the given vector $y \in M$ lies in the smaller dimensioned subspace $R(A) \subset M$. Consistency is thus guaranteed if $y \in R(A) = Nu(A^*)^0$.
The second question is whether a solution, if one exists, is unique or not, i.e. whether the vector y contains enough information to determine the vector x. If not, the system is said to be under-determined. The solution is only unique if the rank of A equals the dimension of its domain space N, i.e. if r = dim N. To see this, assume $x_1$ and $x_2 \neq x_1$ to be two solutions to (1.1). Then $Ax_1 = Ax_2$ or $A(x_1 - x_2) = 0$ must hold. But this means that r < dim N.
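The two rank criteria above can be checked numerically. The following is a minimal sketch, assuming matrices are given as lists of rows; the helper names `rank` and `classify` are ours and purely illustrative. It computes r by exact Gaussian elimination and reports whether the system is consistent for every y (r = m) and whether a solution, if it exists, is unique (r = n).

```python
from fractions import Fraction

def rank(A):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in A]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        # look for a non-zero pivot in column c, at or below row r
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def classify(A):
    """Return (consistent for every y, solution unique if it exists)."""
    m, n, r = len(A), len(A[0]), rank(A)
    return (r == m, r == n)

print(classify([[1, 0], [0, 1]]))  # r = m = n       -> (True, True)
print(classify([[1, 1]]))          # r = m < n       -> (True, False)
print(classify([[1], [1]]))        # r = n < m       -> (False, True)
print(classify([[1, 1], [1, 1]]))  # r < min(m, n)   -> (False, False)
```

The last case is the one of interest in the remainder of this chapter: the system may be inconsistent and under-determined at the same time.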
From the above considerations it follows that it is the relation of r = dim $R(A)$ to m = dim M and n = dim N which decides on the general character of a linear system. In case r = m = n, we know that a unique inverse map B of the bijective map A exists, with the properties
B A = I and A B = I . (1.2)
For non-bijective maps A, however, in general no map B can be found for which (1.2) holds. For such maps therefore a more relaxed type of inverse property is used. Guided by the idea that an inverse-like map B should solve any consistent system, that is, map B should furnish for each $y \in R(A)$ some solution x = By such that y = ABy, one obtains as defining property of B
A B A = A . (1.3)
Maps $B: M \to N$ which satisfy this relaxed type of inverse condition are now called generalized inverses of A.
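As a small illustration of how weak condition (1.3) is, consider the following sketch. The matrices and the helper names `matmul` and `is_generalized_inverse` are our own illustrative choices, not taken from the text. For the rank-one map A below, the condition ABA = A fixes only a single entry of B; all remaining entries are free.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def is_generalized_inverse(A, B):
    """True when B satisfies the defining property A B A = A."""
    return matmul(matmul(A, B), A) == A

A = [[1, 0],
     [0, 0]]            # rank 1: neither injective nor surjective

B1 = [[1, 0], [0, 0]]
B2 = [[1, 2], [3, 4]]   # differs from B1 everywhere except in the (1,1) entry
B3 = [[2, 0], [0, 0]]   # changes the (1,1) entry and violates the condition

print(is_generalized_inverse(A, B1))  # True
print(is_generalized_inverse(A, B2))  # True
print(is_generalized_inverse(A, B3))  # False
```

Both B1 and B2 are generalized inverses of the same A, which already hints at the non-uniqueness analysed in the next section.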
In the geodetic literature there is an overwhelming list of papers which deal with the theory of generalized inverses (see e.g. Teunissen, 1984a and the references cited in it). It more or less started with the pioneering work of Bjerhammar (Bjerhammar, 1951), who defined a generalized inverse for rectangular matrices. And after the publication of Penrose (Penrose, 1955) the literature on generalized inverses has proliferated rapidly.
Many of the published studies, however, follow a rather algebraic approach, making use of anonymous inverses which merely produce a solution to the linear system under consideration. As a consequence of this anonymity the essential concepts involved in the problem of inverse linear mapping often stay concealed. Sometimes it even seems that algebraic manipulations and the stacking of theorems, lemmas, corollaries, and what have you, are preferred to a clear geometric interpretation of what really is involved in the problem of inverse linear mapping.
In this chapter we therefore approach the problem of inverse mapping from a different viewpoint. Our approach is based on the idea that once the causes of the inverse mapping problem are classified, the problem of inverse mapping itself is also solved. The following reminder may be helpful. We know that a map is uniquely determined once its basis values are given. But as the theorem of the next section shows, condition (1.3) does not fully specify all the basis values of the map B. Hence its non-uniqueness. This means, however, that analogously to the case where a basis of a subspace can be extended in many ways to a basis which generates the whole space, various maps satisfying (1.3) can be found by specifying their failing basis values.
To give a pictorial explanation of our procedure, observe that in the general case of rank A = r < min(m,n), the nullspace $Nu(A) \subset N$ and range space $R(A) \subset M$ both are proper subspaces. That is, they do not coincide with respectively N and M (see figure 4).
[Figure 4: N: parameter space, with dim $Nu(A)$ = n - rank A; M: observation space, with dim $R(A)$ = rank A.]
Now, just like there are many ways in which a basis of a subspace can be extended to a basis which generates the whole space, there are many ways to extend the subspaces $Nu(A) \subset N$ and $R(A) \subset M$ to fill N and M respectively (see figure 5).
[Figure 5: the subspaces $Nu(A) \subset N$ and $R(A) \subset M$.]
Let us choose two arbitrary subspaces, say $S \subset N$ and $C^0 \subset M$, such that the direct sums $S \oplus Nu(A)$ and $R(A) \oplus C^0$ coincide with N and M (see figure 6).
[Figure 6: N: parameter space, with dim S = rank A, dim $Nu(A)$ = n - rank A, $N = S \oplus Nu(A)$; M: observation space, with dim $R(A)$ = rank A, dim $C^0$ = m - rank A, $M = R(A) \oplus C^0$.]
The complementarity of S and $Nu(A)$ then implies that the subspace S has a dimension which equals that of $R(A)$, i.e. dim S = dim $R(A)$. But this means that map A, when restricted to S, $A|_S$, is bijective. There exist therefore linear maps $B: M \to N$ which, when restricted to $R(A)$, become the inverse of $A|_S$ (see figure 7):
$B|_{R(A)}\, A|_S = I_S$ and $A|_S\, B|_{R(A)} = I_{R(A)}$ . (1.4)
[Figure 7: $S \subset N$ with dim S = rank A; $R(A) \subset M$ with dim $R(A)$ = rank A.]
The inverse-like properties (1.4) are thus the ones which replace (1.2) in the general case of rank A = r < min(m,n). The second equation of (1.4) can be rephrased as ABA = A, and therefore constitutes the classical definition of a generalized inverse of A. The first equation of (1.4) states that
$BAx = x$ , $\forall x \in S$ . (1.5)
In the next section we will prove what is already intuitively clear, namely that equation (1.5) is equivalent to the classical definition (1.3), and therefore (1.5) can just as well be used as a definition of a generalized inverse. In fact, (1.5) has the advantage over (1.3) that it clearly shows why generalized inverses are not unique. The image of S under A is namely only a proper subspace of M. To find a particular map B which satisfies (1.5), we therefore need to specify its failing basis values.
2. Arbitrary inverses uniquely characterized
In this section we will follow our lead that a map is only uniquely determined once its basis values are completely specified.
As said, the usual way to define generalized inverses B of A is by requiring
A B A = A . (2.1)
This expression, however, is not a very illuminating one, since it does not tell us what generalized inverses of A look like or how they can be computed. We will therefore rewrite expression (2.1) in such a form that it becomes relatively easy to understand the mapping characteristics of B. This is done by the following theorem:
Theorem
1° $ABA = A \iff$ for some unique $S \subset N$, complementary to $Nu(A)$, $BAx = x$, $\forall x \in S$, holds.
2° $ABA = A \iff ABy = y$, $\forall y \in R(A)$.
Proof of 1°
($\Rightarrow$) From premultiplying ABA = A with B follows BABA = BA. The map BA is thus idempotent and therefore a projector from N into N.
From ABA = A also follows that $Nu(BA) = Nu(A)$.
To see this, consider $x \in Nu(BA)$. Then BAx = 0 or ABAx = Ax = 0, which means that $x \in Nu(A)$. Thus $Nu(BA) \subset Nu(A)$. Conversely, if $x \in Nu(A)$, then Ax = 0 or BAx = 0, which means $x \in Nu(BA)$. Thus we also have $Nu(A) \subset Nu(BA)$. Hence $Nu(BA) = Nu(A)$. Now let us denote the subspace $R(BA)$ by S, i.e. $R(BA) = S$. The projector property of BA then implies that BAx = x, $\forall x \in S$. And it also implies that $N = R(BA) \oplus Nu(BA)$. With $R(BA) = S$ and $Nu(BA) = Nu(A)$ we therefore have that $N = S \oplus Nu(A)$. Hence the complementarity of S and $Nu(A)$.
($\Leftarrow$) From $N = S \oplus Nu(A)$ follows the complementarity of S and $Nu(A)$. We can therefore construct the projector $P_{S,Nu(A)} = I - P_{Nu(A),S}$. With this projector we can now replace
$BAx = x$ , $\forall x \in S$ ,
by
$BA\,P_{S,Nu(A)}\,x = P_{S,Nu(A)}\,x$ , $\forall x \in N$ .
And since $A\,P_{S,Nu(A)} = A\,(I - P_{Nu(A),S}) = A$, we get
$BA\,P_{S,Nu(A)}\,x = BAx = P_{S,Nu(A)}\,x$ , $\forall x \in N$ ,
or finally, after premultiplication with A,
$ABAx = Ax$ , $\forall x \in N$ .
Proof of 2°
($\Rightarrow$) Every $y \in R(A)$ can be written as y = Ax for some $x \in N$, so that ABy = ABAx = Ax = y. ($\Leftarrow$) For every $x \in N$ we have $Ax \in R(A)$, and therefore ABAx = Ax.
The above theorem thus makes precise what already was made intuitively clear in section one. There are now two important points which are put forward by the theorem. First of all, it states that every linear map $B: M \to N$ which satisfies
$BAx = x$ , $\forall x \in S$ , (2.2)
with $N = S \oplus Nu(A)$, is a generalized inverse of A. And since
$R(A) = AN = \{\, y \in M \mid y = Ax$ for some $x \in N \,\}$
$= \{\, y \in M \mid y = Ax$ for some $x = x_1 + x_2$, $x_1 \in S$, $x_2 \in Nu(A) \,\}$
$= \{\, y \in M \mid y = Ax$ for some $x \in S \,\} = AS$ ,
this implies that a generalized inverse B of A maps the subspace $R(A) \subset M$ onto a subspace $S \subset N$ complementary to $Nu(A)$. Map B therefore determines a one-to-one relation between $R(A)$ and S, and is injective when restricted to the subspace $R(A)$.
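The role of the subspace S can be made concrete with a small numerical sketch. The matrices and the helper name `analyse` are our own illustrative assumptions. For the same rank-one map A, two different generalized inverses both give an idempotent BA, but the subspace S = R(BA) on which BA acts as the identity differs from one inverse to the other.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def analyse(A, B):
    """Return (BA, BA idempotent?, basis vector s of S = R(BA), BA s == s?)."""
    BA = matmul(B, A)
    # In this example the second column of A is zero, so the second column
    # of BA is zero as well and the first column of BA spans R(BA).
    s = [[row[0]] for row in BA]
    return BA, matmul(BA, BA) == BA, s, matmul(BA, s) == s

A = [[1, 0],
     [0, 0]]                      # Nu(A) is spanned by (0, 1)

inverses = {
    "B1": [[1, 0], [0, 0]],
    "B2": [[1, 2], [3, 4]],
}

for name, B in inverses.items():
    BA, idempotent, s, fixed = analyse(A, B)
    print(name, BA, idempotent, s, fixed)
# B1 gives S spanned by (1, 0); B2 gives S spanned by (1, 3).
```

In both cases S is complementary to Nu(A), in line with point 1° of the theorem, but the choice of B decides which complement it is.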
A second point that should be noted about the theorem is that it gives a way of constructing arbitrary generalized inverses of A. To see this, consider expression (2.2). Since $R(A) = AN = AS$, expression (2.2) only specifies how B maps a subspace, namely $R(A)$, of M. Condition (2.2) is therefore not sufficient for determining map B uniquely. Thus in order to be able to compute a particular generalized inverse of A one also needs to specify how B maps a basis of a subspace complementary to $R(A)$. Let us denote such a subspace by $C^0 \subset M$, i.e. $M = R(A) \oplus C^0$. Then if $e_i$, i = 1,...,m, and $e_\alpha$, $\alpha$ = 1,...,n, are bases of M and N, and $C^{\perp\,i}_{p}\, e_i$, p = 1,...,(m-r), forms a basis of $C^0$ *), a particular generalized inverse B of A is uniquely characterized by specifying in addition to (2.2) how it maps $C^0$, say:
$B\, C^{\perp\,i}_{p}\, e_i = D^{\alpha}_{p}\, e_\alpha$ , i = 1,...,m; $\alpha$ = 1,...,n; p = 1,...,(m-r) (2.3)
(Einstein's summation convention).
Thus if V denotes the subspace spanned by $D^{\alpha}_{p}\, e_\alpha$, we have
$B\,C^0 = V \subset N$ , with $M = R(A) \oplus C^0$ . (2.4)
*) The kernel letter "C" expresses the fact that $C^{\perp\,i}_{p}\, \delta_{ij}\, C^{j}_{q} = 0$, i,j = 1,...,m; p = 1,...,(m-r); q = 1,...,r, or in matrix notation that $(C^\perp)^t C = 0$.
Although the choice for $V \subset N$ is completely free, we will show that one can impose an extra condition, namely $V \subset Nu(A)$, without affecting generality. Note that point 2° of the theorem says that AB is a projector, projecting onto the range space $R(A)$ and along a space complementary to $R(A)$, namely $Nu(AB)$. With (2.4) we therefore get that
$P_{R(A),\,Nu(AB)}\; C^0 = AB\,C^0 = AV$ .
But this means that if B is characterized by mapping $C^0$ onto V, there exists another subspace of M complementary to $R(A)$ which is mapped by B to a subspace of $Nu(A)$. We can therefore just as well start characterizing a particular generalized inverse B of A by (2.2) and (2.4), but now with the additional condition that $V \subset Nu(A)$.
Summarizing, we have for the images of the two complementary subspaces $R(A) = AS$ and $C^0$ under B:
$BAS = S$ and $BC^0 = V$ , with $N = S \oplus Nu(A)$ , $M = R(A) \oplus C^0$ and $V \subset Nu(A)$ . (2.5)
A few things are depicted in figure 8.
[Figure 8: N: parameter space, with dim S = rank A, dim $Nu(A)$ = n - rank A, $N = S \oplus Nu(A)$, $V \subset Nu(A)$; M: observation space, with dim $R(A)$ = rank A, dim $C^0$ = m - rank A, $M = R(A) \oplus C^0$.]
Our objective of finding a unique representation of an arbitrary generalized inverse B of A can now be reached in a very simple way indeed. The only thing we have to do is to combine (2.2) and (2.3). If we take the coordinate expressions of B and A to be
$B e_i = B^{\alpha}_{i}\, e_\alpha$ and $A e_\alpha = A^{i}_{\alpha}\, e_i$ ,
where $e_i$, i = 1,...,m, and $e_\alpha$, $\alpha$ = 1,...,n, are bases of M and N, and if we take as bases of S, $C^0$ and V,
$S^{\alpha}_{q}\, e_\alpha$ , $C^{\perp\,i}_{p}\, e_i$ and $D^{\alpha}_{p}\, e_\alpha$ , p = 1,...,(m-r); q = 1,...,r,
then (2.2) and (2.3) can be expressed as
$BA\, S^{\alpha}_{q}\, e_\alpha = S^{\alpha}_{q}\, BA\, e_\alpha = S^{\alpha}_{q} A^{i}_{\alpha}\, B e_i = S^{\alpha}_{q} A^{i}_{\alpha} B^{\beta}_{i}\, e_\beta = S^{\beta}_{q}\, e_\beta$
and
$B\, C^{\perp\,i}_{p}\, e_i = C^{\perp\,i}_{p} B^{\beta}_{i}\, e_\beta = D^{\beta}_{p}\, e_\beta$ ,
or as
$B^{\beta}_{i}\, \big( A^{i}_{\alpha} S^{\alpha}_{q} \;\vdots\; C^{\perp\,i}_{p} \big)\, e_\beta = \big( S^{\beta}_{q} \;\vdots\; D^{\beta}_{p} \big)\, e_\beta$ ,
which gives in matrix notation
$B\, \big( AS \;\vdots\; C^\perp \big) = \big( S \;\vdots\; D \big)$ , (2.6)
with B of order n x m, A of order m x n, S of order n x r, $C^\perp$ of order m x (m-r) and D of order n x (m-r).
Now, since the subspaces $R(A) = AS$ and $C^0$ are complementary, the m x m matrix $(AS \;\vdots\; C^\perp)$ has full rank and is thus invertible. The unique representation of a particular generalized inverse B of A therefore becomes
$B = \big( S \;\vdots\; D \big)\, \big( AS \;\vdots\; C^\perp \big)^{-1}$ . (2.7)
A more symmetric representation is obtained if we substitute the easily verified matrix identity
$\big( AS \;\vdots\; C^\perp \big)^{-1} = \begin{pmatrix} (C^t A S)^{-1} C^t \\ \big( (U^\perp)^t C^\perp \big)^{-1} (U^\perp)^t \end{pmatrix}$ ,
with $U^0 = R(A)^0 = Nu(A^*)$, into (2.7) (recall that $C^\perp$ and $U^\perp$ are matrix representations of respectively the subspaces $C^0$ and $U^0$):
$B = S\, (C^t A S)^{-1} C^t + D\, \big( (U^\perp)^t C^\perp \big)^{-1} (U^\perp)^t$ . (2.8)
With (2.7) or (2.8) we thus have found one expression which covers all the generalized inverses of A. Furthermore we have the important result that each particular generalized inverse of A, defined through (2.2) and (2.3), is uniquely characterized by the choices made for the subspaces S, complementary to $Nu(A)$, $C^0$, complementary to $R(A)$, and V, a subspace of $Nu(A)$.
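The representations (2.7) and (2.8) are easily tried out on a small example. The sketch below is illustrative only: the 2 x 3 matrix A of rank 1 and the particular choices for S, C^0 and D are our own assumptions, the helper functions (`mat`, `mul`, `tr`, `add`, `hcat`, `inv`) are hypothetical names, and exact rational arithmetic is used so that equalities can be tested without rounding.

```python
from fractions import Fraction

def mat(rows):
    return [[Fraction(v) for v in row] for row in rows]

def mul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def tr(X):
    return [list(col) for col in zip(*X)]

def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def hcat(X, Y):
    """Partitioned matrix (X : Y)."""
    return [r + s for r, s in zip(X, Y)]

def inv(X):
    """Inverse of a regular square matrix by exact Gauss-Jordan elimination."""
    n = len(X)
    aug = [row[:] + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(X)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

A      = mat([[1, 1, 0], [2, 2, 0]])   # m = 2, n = 3, rank r = 1
S      = mat([[1], [0], [0]])          # spans S, complementary to Nu(A)
C_perp = mat([[0], [1]])               # spans C^0, complementary to R(A)
C      = mat([[1], [0]])               # satisfies (C_perp)^t C = 0
U_perp = mat([[2], [-1]])              # spans U^0: (U_perp)^t A = 0
D      = mat([[1], [-1], [3]])         # spans V inside Nu(A): A D = 0

# (2.7): B = (S : D) (AS : C_perp)^{-1}
B_27 = mul(hcat(S, D), inv(hcat(mul(A, S), C_perp)))

# (2.8): B = S (C^t A S)^{-1} C^t + D ((U_perp)^t C_perp)^{-1} (U_perp)^t
B_28 = add(mul(mul(S, inv(mul(mul(tr(C), A), S))), tr(C)),
           mul(mul(D, inv(mul(tr(U_perp), C_perp))), tr(U_perp)))

print([[int(v) for v in row] for row in B_28])  # [[-1, 1], [2, -1], [-6, 3]]
print(B_27 == B_28)                              # True
print(mul(mul(A, B_28), A) == A)                 # True: A B A = A
```

Changing S, C_perp or D (within the stated complementarity conditions) produces a different generalized inverse of the same A, which is the content of the uniqueness statement above.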
In the next two sections we will give the interpretation associated with the three subspaces S, $C^0$ and V. Also the relation with the problem of solving an arbitrary system of linear equations will become clear then.
3. Injective and surjective maps
From the theorem of the previous section we learnt that the inverse-like properties
$B|_{R(A)}\, A|_S = I_S$ and $A|_S\, B|_{R(A)} = I_{R(A)}$ (3.1)
hold for any arbitrary generalized inverse B of A. That is, the maps BA and AB behave like identity maps on respectively the subspaces $S \subset N$ and $R(A) \subset M$. Thus in the special case that rank A = r = n, the generalized inverses of A become left-inverses, since then BA = I. And similarly they become right-inverses if rank A = r = m, because then AB = I holds.
In order to give an interpretation of the subspace $S \subset N$, let us now first concentrate on the special case that rank A = r = m.
If rank A = r = m then $R(A) = M$, which implies that the subspaces complementary to $R(A)$ reduce to $C^0 = \{0\}$. With (2.5) we then also have that $V = \{0\}$ (see figure 9). The general expression of right-inverses therefore readily follows from (2.8) as
$B = S\, (AS)^{-1}$ , with $N = S \oplus Nu(A)$ , (3.2)
where B is of order n x m, S of order n x r and AS of order m x m.
[Figure 9: N: parameter space, with dim S = r, dim $Nu(A)$ = n - r, $N = S \oplus Nu(A)$; M: observation space, with dim $R(A)$ = r = m, $M = R(A)$.]
Thus the only subspaces which play a role in the inverses of surjective maps are the subspaces S complementary to $Nu(A)$.
In order to find out how (3.2) is related to the problem of solving a system of linear equations
$y = A\,x$ , (3.3)
with y of order m x 1, A of order m x n and x of order n x 1, for which matrix A has full row rank m, first observe that the system is consistent for all $y \in R^m$. With a particular generalized inverse (right-inverse), say B (of order n x m), of A, and $V^0 = Nu(A)$, the solution set of (3.3), which actually represents a linear manifold in N, can therefore be written as
$\{x\} = \{\, x \mid x = By + V^\perp \alpha \,\}$ ,
with $V^\perp$ of order n x (n-r) and $\alpha$ of order (n-r) x 1.
By choosing $\alpha$, say $\alpha := \alpha_1$, we get thus as a particular solution $x_1 \in \{x\}$:
$x_1 = By + V^\perp \alpha_1$ , (3.4)
where $\alpha_1$ so to say contributes the extra information, which is lacking in y, to determine $x_1$. Since $R(B) = S$, it follows from (3.4) that
$(S^\perp)^t x_1 = \big( (S^\perp)^t V^\perp \big)\, \alpha_1 =: c_1$ , (3.5)
with $(S^\perp)^t$ of order (n-r) x n, $(S^\perp)^t V^\perp$ of order (n-r) x (n-r) and $c_1$ of order (n-r) x 1.
But this means that, since $\alpha_1$, or $c_1$, contributes the extra information which is lacking in y to determine $x_1$, equations (3.5) and (3.3) together suffice to determine $x_1$ uniquely. Or in other words, the solution of the uniquely solvable system
$\begin{pmatrix} y \\ c_1 \end{pmatrix} = \begin{pmatrix} A \\ (S^\perp)^t \end{pmatrix} x$ , (3.6)
with an extended matrix of order (m+n-r) x n, is precisely $x_1$:
$x_1 = \begin{pmatrix} A \\ (S^\perp)^t \end{pmatrix}^{-1} \begin{pmatrix} y \\ c_1 \end{pmatrix} = \Big( S\,(AS)^{-1} \;\vdots\; V^\perp \big( (S^\perp)^t V^\perp \big)^{-1} \Big) \begin{pmatrix} y \\ c_1 \end{pmatrix}$ , with $V^0 = Nu(A)$ . (3.7)
We have thus found the rule that, in order to find a particular solution to (3.3), we need to extend the system of linear equations from (3.3) to (3.6) by introducing the additional equations $c_1 = (S^\perp)^t x$, so that the extended matrix
$\begin{pmatrix} A \\ (S^\perp)^t \end{pmatrix}$
of order (m+n-r) x n becomes square and regular. Furthermore the corresponding right-inverse of A is obtainable from the inverse of this extended matrix.
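The row-wise extension can be sketched for the smallest possible case, a single equation in two unknowns. The numbers, the choice of S and the helper names `inv2` and `apply` are our own illustrative assumptions. The first m columns of the inverse of the extended matrix give the right-inverse belonging to the chosen S, and applying the full inverse to (y, c_1) gives the particular solution x_1.

```python
from fractions import Fraction

def inv2(X):
    """Exact inverse of a regular 2x2 matrix."""
    (a, b), (c, d) = X
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def apply(X, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

A = [[1, 1]]              # m = 1, n = 2, full row rank r = m
S_perp_t = [[0, 1]]       # (S_perp)^t for S spanned by (1, 0)

extended = A + S_perp_t   # rows of A followed by the rows of (S_perp)^t
E_inv = inv2(extended)

B = [[row[0]] for row in E_inv]   # first m = 1 columns: right-inverse S (A S)^{-1}
x1 = apply(E_inv, [3, 2])         # y = 3 and c_1 = 2

print(B)             # the right-inverse, here the column (1, 0)
print(x1)            # the particular solution, here (1, 2)
print(apply(A, x1))  # reproduces y = 3
```

A different choice of S changes both the added row and the resulting right-inverse, while every such solution still reproduces y.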
Let us now consider the case rank A = r = n. Then all generalized inverses of A become left-inverses. Because of the injectivity of A we have that its nullspace reduces to $Nu(A) = \{0\}$. But this implies that $S = N$ and $V = \{0\}$, since $V \subset Nu(A)$ (see figure 10).
[Figure 10: N: parameter space, with dim S = n, $N = S$; M: observation space, with dim $R(A)$ = rank A = r = n, $M = R(A) \oplus C^0$.]
For the dual map $A^*: M^* \to N^*$ we therefore have a situation which is comparable to the one sketched in figure 9 (see figure 11). Now, taking advantage of our result (3.2), we find the general matrix-representation of an arbitrary generalized inverse $B^*$ of $A^*$ to be
$B^t = C\, (A^t C)^{-1}$ ,
with $B^t$ of order m x n, C of order m x n and $A^t C$ of order n x n.
[Figure 11: $M^*$: estimator space, with dim $Nu(A^*)$ = m - r, $M^* = C \oplus Nu(A^*)$; $N^*$: co-parameter space.]
The general expression of left-inverses therefore readily follows as
$B = (C^t A)^{-1} C^t$ , with $M = R(A) \oplus C^0$ . (3.8)
Thus dual to our result (3.2), we find that the only subspaces which play a role in the inverses of injective maps are the subspaces $C^0$ complementary to $R(A)$.
With the established duality relations it now also becomes easy to see how (3.8) is related to the problem of solving a generally inconsistent but otherwise uniquely determined system of linear equations
$y = A\,x$ , with rank A = r = n , (3.9)
with y of order m x 1, A of order m x n and x of order n x 1.
The dual of (3.6) modified to our present situation gives namely
$y = \big( A \;\vdots\; C^\perp \big) \begin{pmatrix} x \\ \lambda \end{pmatrix}$ , (3.10)
with $C^\perp$ of order m x (m-r) and $\lambda$ the (m-r) x 1 vector of additional unknowns.
And dual to (3.7), the unique solution of (3.10) is given by:
$\begin{pmatrix} x \\ \lambda \end{pmatrix} = \big( A \;\vdots\; C^\perp \big)^{-1} y = \begin{pmatrix} (C^t A)^{-1} C^t \\ \big( (U^\perp)^t C^\perp \big)^{-1} (U^\perp)^t \end{pmatrix} y$ , with $U^0 = Nu(A^*)$ . (3.11)
We therefore have recovered the dual rule that in order to find a particular solution to (3.9), we need to extend the system of linear equations from (3.9) to (3.10) by introducing additional unknowns such that the extended matrix
$\big( A \;\vdots\; C^\perp \big)$ (3.12)
of order m x (m+n-r) becomes square and regular. Furthermore the corresponding left-inverse of A is obtainable from the inverse of this extended matrix.
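The column-wise extension can be sketched in the same spirit for a map from a one-dimensional parameter space into a two-dimensional observation space. The numbers and the helper names `inv2` and `left_inverse` are our own illustrative assumptions. Two different choices of C^0 give two different left-inverses; the second choice takes C^0 orthogonal to R(A) in the standard inner product and coincides with the familiar (A^t A)^{-1} A^t.

```python
from fractions import Fraction

def inv2(X):
    """Exact inverse of a regular 2x2 matrix."""
    (a, b), (c, d) = X
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def left_inverse(A_col, C_perp_col):
    """First n = 1 rows of (A : C_perp)^{-1}, for a 2x1 matrix A given by its column."""
    extended = [[A_col[0], C_perp_col[0]],
                [A_col[1], C_perp_col[1]]]
    return inv2(extended)[0]

A = [1, 1]                             # single column of A: m = 2, n = r = 1

B_oblique = left_inverse(A, [0, 1])    # C^0 spanned by (0, 1)
B_orth    = left_inverse(A, [1, -1])   # C^0 spanned by (1, -1), orthogonal to R(A)

y = [1, 3]                             # inconsistent: y is not a multiple of (1, 1)
x_oblique = sum(b * v for b, v in zip(B_oblique, y))
x_orth    = sum(b * v for b, v in zip(B_orth, y))

print(B_oblique, x_oblique)   # left-inverse (1, 0), solution 1
print(B_orth, x_orth)         # left-inverse (1/2, 1/2), solution 2
```

Both maps satisfy BA = 1, but they resolve the inconsistency of y differently, because they project y onto R(A) along different complements C^0.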
4. Arbitrary systems of linear equations and arbitrary inverses
In the previous section we showed that a particular solution of an underdetermined but otherwise consistent system of linear equations could be obtained by extending the m x n matrix A rowwise. And especially the principal role played by the subspace $S \subset N$ complementary to $Nu(A)$ in removing the underdeterminability was demonstrated. Similarly we saw how consistency of an inconsistent, but otherwise uniquely determined system of linear equations was restored by extending the matrix A columnwise. And here the subspace $C^0 \subset M$ complementary to $R(A)$ played the decisive role. We also observed a complete duality between these results; for the dual of an injective map is surjective and vice versa.
These results are, however, still not general enough. In particular we note that the subspace $V \subset Nu(A)$ was annihilated as a consequence of the assumed injectivity and surjectivity. The reason for this will become clear if we consider the interpretation associated with the subspace V. Since $S \cap V = \{0\}$ it follows from expression (2.8) that $R(B) = S \oplus V$. With dim S = dim $R(A)$ = rank A we therefore have that rank B $\geq$ rank A, with equality if and only if $V = \{0\}$. But this shows why the subspace V gets annihilated in case of injective and surjective maps. The left (right) inverses have namely the same rank as the injective (surjective) maps. From the above it also becomes clear that the rank of B is completely determined by the choice made for V. In particular B will have minimum rank if V is chosen to be $V = \{0\}$, and maximum rank, rank B = min(m,n), if one can choose V such that dim V = min(m,n) - r. Now to see how the subspace $V \subset Nu(A)$ gets incorporated in the general case, we consider a system of linear equations
$y = A\,x$ , with rank A = r < min(m,n) , (4.1)
with y of order m x 1, A of order m x n and x of order n x 1, i.e. a system which is possibly inconsistent and underdetermined at the same time. From the rank-deficiency of A in (4.1) follows that the unknowns x cannot be determined uniquely, even if $y \in R(A)$. Thus the information contained in y is not sufficient to determine x uniquely. Following the same approach as before, we can at once remove this underdeterminability by extending (4.1) to
$\begin{pmatrix} y \\ c \end{pmatrix} = \begin{pmatrix} A \\ (S^\perp)^t \end{pmatrix} x$ , with $N = S \oplus Nu(A)$ , (4.2)
where the extended matrix is of order (m+n-r) x n.
But although the extended matrix of (4.2) has full column rank, the system can still be inconsistent. To remove possible inconsistency we therefore have to extend the matrix of (4.2) columnwise so that the resulting matrix becomes square and regular. Now since $M = R(A) \oplus C^0$, the following extension is a feasible one:
$\begin{pmatrix} A & C^\perp \\ (S^\perp)^t & 0 \end{pmatrix}$ , with $M = R(A) \oplus C^0$ .
But the most general extension would be
$\begin{pmatrix} y \\ c \end{pmatrix} = \begin{pmatrix} A & C^\perp \\ (S^\perp)^t & X \end{pmatrix} \begin{pmatrix} x \\ \lambda \end{pmatrix}$ , (4.3)
with $N = S \oplus Nu(A)$, $M = R(A) \oplus C^0$ and X, of order (n-r) x (m-r), being arbitrary; the extended matrix is of order (m+n-r) x (m+n-r) and $\lambda$ denotes the (m-r) x 1 vector of additional unknowns. The unique solution of (4.3) is then given by:
$\begin{pmatrix} x \\ \lambda \end{pmatrix} = \begin{pmatrix} A & C^\perp \\ (S^\perp)^t & X \end{pmatrix}^{-1} \begin{pmatrix} y \\ c \end{pmatrix} = \begin{pmatrix} S (C^t A S)^{-1} C^t - V^\perp \big( (S^\perp)^t V^\perp \big)^{-1} X \big( (U^\perp)^t C^\perp \big)^{-1} (U^\perp)^t & V^\perp \big( (S^\perp)^t V^\perp \big)^{-1} \\ \big( (U^\perp)^t C^\perp \big)^{-1} (U^\perp)^t & 0 \end{pmatrix} \begin{pmatrix} y \\ c \end{pmatrix}$ , (4.4)
with $N = S \oplus Nu(A)$, $M = R(A) \oplus C^0$, $V^0 = Nu(A)$ and $U^0 = Nu(A^*)$.
In this expression we recognize, if we put $-V^\perp \big( (S^\perp)^t V^\perp \big)^{-1} X = D$ or $X = -(S^\perp)^t D$, our general matrix representation (2.8) of an arbitrary generalized inverse B of A. Thus as a generalization of (3.7) and (3.11) we have:
$\begin{pmatrix} A & C^\perp \\ (S^\perp)^t & -(S^\perp)^t D \end{pmatrix}^{-1} \begin{pmatrix} y \\ c \end{pmatrix} = \begin{pmatrix} S (C^t A S)^{-1} C^t + D \big( (U^\perp)^t C^\perp \big)^{-1} (U^\perp)^t & V^\perp \big( (S^\perp)^t V^\perp \big)^{-1} \\ \big( (U^\perp)^t C^\perp \big)^{-1} (U^\perp)^t & 0 \end{pmatrix} \begin{pmatrix} y \\ c \end{pmatrix}$ , with $V^0 = Nu(A)$ and $U^0 = Nu(A^*)$ . (4.5)
This result then completes the circle. In section one namely, we started by describing the geometric principles behind inverse linear mapping. In section two these principles were made precise by the stated theorem. This theorem enabled us to find a unique representation concerning all generalized inverses B of a linear map A. In section three we then specialized to injective and surjective maps, showing the relation between the corresponding inverses and the solutions of the corresponding systems of linear equations. And finally this section generalized these results to arbitrary systems of linear equations, whereby our general expression of generalized inverses was again obtained.
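The bordered-matrix construction of (4.3) to (4.5) can be checked on a small rank-deficient example. Everything in the sketch below is an illustrative assumption of ours: the 2 x 3 matrix A of rank 1, the choices of S (through S_perp), C^0 and D, and the helper names `inv` and `mul`. The upper-left n x m block of the inverse of the extended matrix should then be a generalized inverse of A.

```python
from fractions import Fraction

def inv(X):
    """Inverse of a regular square matrix by exact Gauss-Jordan elimination."""
    n = len(X)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(X)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def mul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

A        = [[1, 1, 0], [2, 2, 0]]      # m = 2, n = 3, rank r = 1
C_perp   = [[0], [1]]                  # spans C^0, complementary to R(A)
S_perp_t = [[0, 1, 0], [0, 0, 1]]      # (S_perp)^t for S spanned by (1, 0, 0)
D        = [[1], [-1], [3]]            # spans V inside Nu(A): A D = 0

# X = -(S_perp)^t D, as in the substitution leading to (4.5)
X = [[-sum(s * d[0] for s, d in zip(row, D))] for row in S_perp_t]

# extended (m+n-r) x (m+n-r) matrix of (4.3)
extended = [a + c for a, c in zip(A, C_perp)] + [s + x for s, x in zip(S_perp_t, X)]
E_inv = inv(extended)

B = [row[:2] for row in E_inv[:3]]     # upper-left n x m block

print([[int(v) for v in row] for row in E_inv])
print(mul(mul(A, B), A) == A)          # True: the block is a generalized inverse
```

For these choices the block B equals the matrix obtained from (2.8) with the same S, C^0 and D, and the last row of the inverse reproduces $((U^\perp)^t C^\perp)^{-1}(U^\perp)^t$ followed by zeros.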
5. Some common types of inverses and their relation to the subspaces S, C and V
With our interpretation of the three subspaces S, $C^0$ and V, and an expression like (2.8), it now becomes very simple indeed to derive most of the standard results which one can find in the many textbooks available. See e.g. (Rao and Mitra, 1971). As a means of exemplification we show what role is played by the three subspaces S, $C^0$ and V in the more common types of inverses used:
— least-squares inverses —
Let M be Euclidean with metric tensor $(\cdot,\cdot)_M$ and let $Q_y: M^* \to M$ be the covariance map defined by $Q_y^{-1} y = (y, \cdot)_M$. We know from chapter one that for
$x = By$
to be a least-squares solution of $\min_x\, (y - Ax,\, y - Ax)_M$,
$AB = P_{U,U^\perp}$ , with $U = R(A)$ , (5.1)
must hold. From (2.8) follows, however, that in general
$AB = P_{U,C^0}$ , with $U = R(A)$ . (5.2)
Namely, expression (2.8) shows that
$AB = AS\, (C^t A S)^{-1} C^t$ . (5.3)
And since
$AS\, (C^t A S)^{-1} C^t \cdot C^\perp = 0$ and $AS\, (C^t A S)^{-1} C^t \cdot AS = AS$ ,
it follows that (5.3) is the matrix representation of the projector $P_{U,C^0}$. From comparing (5.1) and (5.2) we thus conclude that least-squares inverses are obtained by choosing
o ± 1