J Geod (2012) 86:15–33, DOI 10.1007/s00190-011-0489-4

ORIGINAL PAPER

GPS position time-series analysis based on asymptotic normality of M-estimation

A. Khodabandeh · A. R. Amiri-Simkooei · M. A. Sharifi

Received: 29 July 2010 / Accepted: 22 May 2011 / Published online: 7 June 2011
© Springer-Verlag 2011

A. Khodabandeh · M. A. Sharifi: Department of Surveying and Geomatics Engineering, Geodesy Division, Faculty of Engineering, University of Tehran, North Kargar Ave., Amir-Abad, Tehran, Iran (e-mail: khodabandeh@ut.ac.ir, sharifi@ut.ac.ir)
A. R. Amiri-Simkooei: Department of Surveying Engineering, Faculty of Engineering, University of Isfahan, 81746-73441 Isfahan, Iran; and Acoustic Remote Sensing Group, Faculty of Aerospace Engineering, Delft University of Technology, Kluyverweg 1, 2629 HS Delft, The Netherlands (e-mail: a.amirisimkooei@tudelft.nl)

Abstract  The efficacy of robust M-estimators is a well-known issue when dealing with observational blunders. When the number of observations is considerably large, as in long time series, one can take advantage of the asymptotic normality of M-estimation and compute reasonable estimates for the unknown parameters of interest. A few leading M-estimators have been employed to identify the most likely functional model for GPS coordinate time series, including the simultaneous detection of periodic patterns and offsets in the series. Estimates of the white noise, flicker noise, and random walk noise components are also obtained using robust M-estimators of (co)variance components, developed in the framework of the least-squares variance component estimation (LS-VCE) theory. The method allows one to compute confidence intervals for the (co)variance components in an asymptotic sense. Simulated time series containing white noise plus flicker noise show that, across different M-estimators, the estimates of random walk noise fluctuate more than those of flicker noise. This is because random walk noise is not an appropriate noise structure for the series. The same phenomenon is observed in the results of real GPS time series, which implies that the combination of white plus flicker noise describes GPS time series well. Some of the noise components estimated by LS-VCE differ significantly from those of the other M-estimators. This reveals that there are a large number of outliers in the series. This conclusion is also affirmed by statistical tests, which detect (large) parts of the outliers but can also leave parts undetected.

Keywords  M-estimation · Least-squares variance component estimation (LS-VCE) · Robust variance component estimation · GPS coordinate time series · Asymptotic normality

1 Introduction

GPS-derived coordinate time series are regarded as one of the most important geodetic measurements for studying geophysical phenomena such as plate tectonics and crustal deformation. Using the time series of daily positions of GPS sites, velocities are commonly estimated by the least squares method through linear regression on the individual coordinate components. Apart from the optimal properties of least squares estimation, the method is highly sensitive to outliers in the data (Koch 1999). Any environmental disturbance and/or equipment configuration change may introduce offsets or outliers into the coordinate time series (Kenyeres and Bruyninx 2004). The occurrence of these misspecifications may lead to biased estimates in both the functional and stochastic models.
To detect and correctly identify such unmodeled effects or structures, research is ongoing on both the functional and stochastic models.

(2) 16. In the case of functional models, Williams (2003b) discussed offset detection and estimation procedure and investigated the impact of offset estimation on rate uncertainty for three different noise types (white noise, flicker noise, and random walk noise). Ding et al. (2005) studied the seasonal and secular variations of the coordinate time series of some co-located GPS and VLBI stations and evaluated the effects of seasonal variations on the rate estimation. Perfetti (2006) applied the detection identification adaptation (DIA) procedure, introduced by Teunissen (2000), to the GPS coordinate time series of permanent stations for detection of offsets and for rejection of outliers. The occurrence of a large number of outlying observations may incorrectly lead to the identification of the above-mentioned misspecifications. As a consequence, the remaining unmodeled effects will be reflected in receiving unrealistic estimates for parameters in the functional model and inflate variance components in the stochastic model. Therefore, one may take recourse to applying robust estimation techniques to limit the influence of the possible blunders. In the case of stochastic models, Calais (1999) processed three permanent GPS sites using spectral analysis and found that the position time series best fit a noise model combining white noise and flicker noise. In global GPS solutions, Williams et al. (2004) demonstrated that a combination of white and flicker noise is an appropriate stochastic model for all three individual coordinate components. Amiri-Simkooei et al. (2007) proposed a methodology to assess the noise characteristics in GPS coordinate time series and recognized that the model consisting of white plus flicker noise generally best characterizes the noise in the coordinate components. They, in addition, acknowledged the presence of first-order autoregressive noise in the series. The traditional approach to dealing with outliers is based on statistical tests, through which one has to make a judgment about the validity of the data with the aid of a typical test statistic. To find possible blunders, the test statistic is checked as to whether it exceeds a critical value. The test statistic itself is a function of the estimates which are all prone to be biased by possible outliers. In addition, the critical value chosen for the test corresponds to a level of significance which is usually selected in a subjective fashion (Shao 2003). Consequently, some of the outliers may still remain in the observations. At the same time, some of the observations may incorrectly be flagged as outliers. Another drawback is that once suspicious data have been flagged, the common action is to remove them from the remaining observations, thereby losing some possible useful data. To deal with outliers more effectively, one can utilize robust estimation methods which are more resistant to such observations and they can, therefore, deliver more reliable results (Huber 1981; Hampel et al. 1986; Maronna et al. 2006). We highlight that robust estimators should not be thought as alternatives to the classical least squares method.. 123. A. Khodabandeh et al.. These estimators together with the method of least squares would improve the overall reliability and accuracy of the estimates as any insignificant difference between such estimates results from accurate observations (Maronna et al. 2006). The efficacy of using robust M-estimators, in treating observational blunders, is a well-known issue. 
The use of these estimators for gross error detection is currently subject to extensive research in geodetic literature (e.g., Yang 1994, 1999; Koch 1996; Hekimoglu 1997; Gui and Zhang 1998; Amiri-Simkooei 2003; Peng 2005; Baselga 2007; Erenoglu and Hekimoglu 2009; Khodabandeh and Amiri-Simkooei 2010). There is also research on robust estimation for the variance component models (e.g., Gervini and Yohai 1998; Dueck and Lohr 2005; Yang et al. 2005; Peng 2009). In the context of robust time-series analysis, one can also refer to previous work in the statistical literature such as Martin (1981); Martin and Yohai (1986, 1991); Rousseuw and Leroy (1987); Davis (1996), in which robust analysis is based on the assumption that the given time series are stationary and have been modeled using the autoregressive moving average (ARMA) framework (Maronna et al. 2006). The majority of time series encountered in the geophysical discipline are, however, known to be non-stationary (Vanicek 1969). Therefore, using the existing theory may then provide unsatisfactory results. This urges the robust analysis of time series to deserve further consideration. Motivated by the property of asymptotic normality of M-estimation, especially when we deal with a significant large number of observations like GPS time series, we can compute reasonable asymptotic confidence regions for the unknown parameters in the mathematical model. In a similar way, using Bahadur-type linear representation (Bahadur 1966; Babu 1989), Peng (2005) derived asymptotic covariance matrix for L1 norm estimates and utilized the asymptotic version of the Baarda test statistic for detecting observational blunders. In the present paper, we employ the following five leading M-estimators: L1 norm estimation, M-estimation of Huber (Huber 1981), M-estimation of Hampel (Hampel et al. 1986), M-estimation of Krarup (Krarup and Kubik 1983) and the method of IGGIII (Yang et al. 2001), to analyze mathematical models of the coordinate time series. In this respect, we use the theory of robust tests for the linear hypotheses and perform a likelihood ratio-type test (Shao 2003; Maronna et al. 2006) to correctly identify the periodic signal components along with offsets in the functional model of the time series in the presence of outliers. We also make a comparison between this methodology and the one relying on the least squares method. After identifying the most likely functional model and disregarding the outliers in the data, based on the LS-VCE theory introduced by Teunissen (1988), the amplitudes of the noise components of a few coordinate time series are estimated by the M-estimators. In other words, a new.

robust approach to estimating (co)variance components is proposed. In the case that the data are normally distributed, in contrast to the existing robust variance component estimation techniques, this new approach allows one to compute asymptotic confidence intervals for the noise components. The structure of this paper is organized as follows. In Sect. 2, the M-estimation principle along with its asymptotic normality in linear models is briefly recapitulated. In Sect. 3, a robust strategy for jointly detecting periodic signal components and the offsets in the time series will be presented. Section 4 is devoted to the estimation of the (co)variance components by means of the M-estimation technique. Numerical examples of the simulated and real data are given in Sect. 5. Finally, the conclusions are presented in Sect. 6.

2 M-estimation and its asymptotic normality in linear models

M-estimation is generally defined through the generalization of maximum likelihood estimation (MLE) (Maronna et al. 2006). This generalization has been conducted by replacing the log-likelihood function with the sum of some convex functions taken with a negative sign and, consequently, choosing a reasonable convex function that can make the estimate robust. The similarity of MLE and M-estimation, together with the asymptotic normality of MLE, may motivate one to investigate such an asymptotic property for the M-estimators. Since the asymptotic theory of M-estimation is mostly based on independent and identically distributed residuals (Huber 1981), one can utilize one of the decorrelation methods as an auxiliary transformation to transform the general covariance matrix of the observables, namely Q_y, into an almost diagonal one. There are different decorrelation algorithms (see e.g., Erenoglu and Hekimoglu 2009). Here, we consider the Cholesky algorithm to decorrelate the original covariance matrix. Applying the Cholesky decomposition to the inverse of the covariance matrix of the observables (i.e., Q_y^{-1} = G_y G_y^T), together with the transformation (˜) = G_y^T(·), leads to the following transformed functional model (Koch 1999)

ỹ = Ã x + ẽ    (1)

in which ỹ has an identity covariance matrix. The underscore indicates a random vector. Note that the covariance matrix Q_y is not often completely known. Therefore, the unknown part, formed as (co)variance components, needs to be estimated in order for the covariance matrix to be decomposed (see Sect. 5). Approximate values of the components can be provided using the existing methods of variance component estimation. The covariance matrix is then estimated through an iterative procedure.

In view of the close relationship between MLE and M-estimation, we begin with the criterion of MLE. The principle of MLE is to maximize the likelihood function f_y(ỹ|x), which reads (Koch 1999)

x̂ = arg max f_y(ỹ|x)    (2)

Making use of the concept of the log-likelihood function, the preceding equation might be turned into the following minimization problem:

x̂ = arg min { −log f_y(ỹ|x) }    (3)

Now, we focus our attention on the term −log f_y(ỹ|x), which must be minimized. In case of normally distributed observations, this term becomes proportional to a sum of squares of the residuals and the estimates will then coincide with the standard least squares solution. However, one may prefer to alleviate the adverse effects of the outlying observations by replacing the quadratic functions ẽ_i², i = 1, ..., m, with other appropriate functions of the residuals, say ρ(ẽ_i), i = 1, ..., m. Consequently, replacing −log f_y(ỹ|x) with the sum of the nonnegative convex functions ρ(ẽ_i) in Eq. (3) brings us the M-estimation criterion as follows (Huber 1981; Maronna et al. 2006)

x̂ = arg min Σ_{i=1}^{m} ρ(ẽ_i)    (4)

Differentiating the M-estimation criterion with respect to the parameter vector x, based on the linear functional model in Eq. (1), results in a system of equations similar to the normal equations system of the classical least squares method

Ã^T ψ̂ = 0,  with  ψ̂ = [ψ(ê_1) ψ(ê_2) ... ψ(ê_m)]^T    (5)

where ψ = dρ/de_i denotes the ψ-function (i.e., the derivative function of ρ), which uniquely specifies the method of M-estimation, and ê_i is the i-th component of the transformed adjusted residual vector ê to be estimated. According to Eq. (4), M-estimation encompasses several estimators, of which the least squares estimator is well known. The least squares criterion follows immediately by setting ρ(ẽ_i) = ½ ẽ_i², or equivalently, by setting the ψ-function equal to ψ(ẽ_i) = ẽ_i. Since Eq. (5) is a nonlinear system of equations in general, one may apply an iterative algorithm to compute the estimates. In particular, if the ψ-function can be approximated linearly at the origin, one may then define the following equivalent weights (Maronna et al. 2006):

w(ẽ_i) = ψ(ẽ_i)/ẽ_i,  ẽ_i ≠ 0,  i = 1, 2, ..., m    (6)

in which the equivalent weights w(ẽ_i), i = 1, 2, ..., m, separately approximate the derivative of the ψ-function at the origin.
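To make the ψ-functions and the equivalent weights of Eq. (6) concrete, the following Python sketch implements the least squares, L1 norm, and Huber choices and converts a ψ-function into the weights w(ẽ_i) = ψ(ẽ_i)/ẽ_i. It is an illustrative sketch only; the function names and the tuning constant γ = 1.5 are our own choices and not part of the paper.

```python
import numpy as np

# Minimal sketch of three psi-functions and their equivalent weights (Eq. 6).

def psi_least_squares(e):
    """psi(e) = e: the (non-robust) least squares choice."""
    return np.asarray(e, dtype=float)

def psi_l1(e):
    """psi(e) = e/|e|: the bounded psi-function of L1 norm estimation."""
    return np.sign(e)

def psi_huber(e, gamma=1.5):
    """Least squares behaviour inside [-gamma, gamma], L1-like behaviour outside."""
    e = np.asarray(e, dtype=float)
    return np.where(np.abs(e) <= gamma, e, gamma * np.sign(e))

def equivalent_weights(e, psi):
    """w(e_i) = psi(e_i)/e_i for e_i != 0 (Eq. 6); w is set to 1 (= psi'(0)) at e_i = 0."""
    e = np.asarray(e, dtype=float)
    w = np.ones_like(e)
    nonzero = np.abs(e) > 1e-12
    w[nonzero] = psi(e[nonzero]) / e[nonzero]
    return w

# A residual of 4 (a likely blunder) is strongly down-weighted by Huber,
# while small residuals keep unit weight, exactly as in least squares.
residuals = np.array([-0.3, 0.1, 1.2, 4.0])
print(equivalent_weights(residuals, psi_huber))   # -> [1.    1.    1.    0.375]
```

In the reweighting iteration described next, these weights take the place of the identity weight matrix of ordinary least squares.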

Table 1  ψ-functions of some M-estimators as functions of the standardized residuals, along with their asymptotic parameters for normally distributed data (σψ² is the variance of the ψ-function; μψ′ is the expected value of its derivative)

Least squares:  ψ(e_i) = e_i
                σψ² = 1; μψ′ = 1
L1 norm:        ψ(e_i) = e_i/|e_i|
                σψ² = 1; μψ′ = √(2/π)
Huber:          ψ(e_i) = e_i for |e_i| ≤ γ;  γ e_i/|e_i| for |e_i| > γ
                for γ = 1.5: σψ² ≈ 0.7872; μψ′ ≈ 0.8664
IGGIII:         ψ(e_i) = e_i for |e_i| ≤ γ_o;  γ_o e_i/|e_i| for γ_o < |e_i| ≤ γ_1;  0 for |e_i| > γ_1
                for γ_o = 2, γ_1 = 4.5: σψ² ≈ 0.9291; μψ′ ≈ 0.9545
Hampel:         ψ(e_i) = e_i for |e_i| ≤ γ_o;  γ_o e_i/|e_i| for γ_o < |e_i| ≤ γ_1;  γ_o (γ_2 − |e_i|) e_i / (|e_i| (γ_2 − γ_1)) for γ_1 < |e_i| ≤ γ_2;  0 for |e_i| > γ_2
                for γ_o = 1.5, γ_1 = 3, γ_2 = 6: σψ² ≈ 0.9872; μψ′ ≈ 0.9928
Krarup:         ψ(e_i) = e_i for |e_i| ≤ γ;  e_i exp(1 − |e_i|/γ) for |e_i| > γ
                for γ = 3: σψ² ≈ 1.0005; μψ′ ≈ 0.9989

Substituting Eq. (6) into Eq. (5), based on the relation ê = ỹ − Ã x̂, yields an iterative algorithm, the so-called iteratively reweighted least squares (Krarup et al. 1980)

x̂_{k+1} = (Ã^T W_k Ã)^{-1} Ã^T W_k ỹ    (7)

or, in the original form,

x̂_{k+1} = (A^T G_y W_k G_y^T A)^{-1} A^T G_y W_k G_y^T y    (8)

where the subscript k refers to the iteration counter and the diagonal matrix W_k contains the weights defined in Eq. (6) at step k. Note that Eq. (8) reduces to the familiar least squares estimation when the matrix W_k is an identity matrix. The functional relation between the M-estimates and the observations is generally nonlinear, which also follows from the implicit relation in Eq. (8). Therefore, applying the covariance propagation law is not always possible in such cases and, as a result, one may have to rely on numerical techniques such as bootstrapping methods (Efron and Tibshirani 1993) to compute the (co)variances of the estimated parameters. However, one may use the asymptotic version of the covariance matrix of the estimates, which can be obtained analytically and provides reasonable values for the (co)variances, especially when the number of observations is large. The task is now to investigate the asymptotic distribution of the M-estimates and consequently to derive the asymptotic covariance matrix of the estimated parameters. The asymptotic normality of M-estimation in linear models follows as (Appendix A.1)

(x̂ − x) →_d N_n( 0, (σψ²/μψ′²) (A^T Q_y^{-1} A)^{-1} )    (9)

where σψ² is the variance of the ψ-function, μψ′ is the expected value of the derivative of ψ(e_i) for an arbitrary index (e.g., i = 1), '→_d' stands for 'tends in distribution', and N_n(0, (σψ²/μψ′²)(A^T Q_y^{-1} A)^{-1}) denotes the multivariate normal distribution of order n with mean 0 and the indicated covariance matrix.

The important feature of the asymptotic normality of M-estimation is that it enables one to perform robust statistical testing and to provide confidence regions for the estimates as an approximate alternative to the classical ones (Shao 2003). This approximation improves as the number of observations increases. The ψ-functions of a few M-estimators, as functions of the standardized residuals, along with their asymptotic parameters σψ² and μψ′ (computed numerically), are given in Table 1. As can be seen, the main difference between the ψ-function of the least squares method and its robust counterparts is that the former is not bounded with respect to the residuals, i.e., ψ(ẽ_i) = ẽ_i. In fact, Hampel et al. (1986) showed that the ψ-function of an M-estimator should be bounded in order to fulfill the robustness property. As a popular example of robust M-estimators, one can mention L1 norm estimation, which, in the case of linear models, uses a special sufficient (and not redundant) subset of observations for estimating the parameters. As seen in Table 1, this estimator has a bounded ψ-function, i.e., ψ(e_i) = e_i/|e_i|. The other estimators listed in Table 1 have been designed to behave like the least squares estimation for the smallest residuals. For instance, the ψ-function of Huber is identical to its least squares counterpart for residuals bounded by the parameter γ. However, to mitigate the undesirable effects of blunders, the ψ-function of Huber changes to perform similarly to L1 norm estimation beyond that bound. By setting the ψ-function equal to zero for large residuals, one may also completely remove the contribution of the observations corresponding to such residuals. Of this type of M-estimator, the M-estimation of Hampel and the method of IGGIII have been included in Table 1. The parameter γ associated with the aforementioned estimators is usually chosen on the basis of the objective requirements of the problem (Yang et al. 2005).
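The reweighting iteration of Eqs. (7)–(8) and the asymptotic covariance matrix of Eq. (9) can be sketched in Python as follows. The sketch assumes an already decorrelated model (Q_y = I, so Ã = A and G_y = I) and uses the Huber weights with γ = 1.5; the constants σψ² ≈ 0.7872 and μψ′ ≈ 0.8664 are taken from Table 1, while the helper names and the convergence criterion are our own choices.

```python
import numpy as np

def huber_weights(res, gamma=1.5):
    """Equivalent weights w(e) = psi(e)/e for the Huber psi-function (Eq. 6)."""
    a = np.abs(res)
    return np.where(a <= gamma, 1.0, gamma / np.maximum(a, 1e-12))

def irls(A, y, weight_fun=huber_weights, tol=1e-8, max_iter=50):
    """Iteratively reweighted least squares, Eqs. (7)-(8), for a unit-variance model.

    Starts from the ordinary least squares solution (W_0 = I) and reweights the
    residuals with the equivalent weights until the estimate stabilizes.
    """
    x = np.linalg.lstsq(A, y, rcond=None)[0]
    for _ in range(max_iter):
        res = y - A @ x
        W = np.diag(weight_fun(res))                        # diagonal weights, Eq. (6)
        x_new = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)   # Eq. (7) with A-tilde = A
        if np.linalg.norm(x_new - x) < tol * (1.0 + np.linalg.norm(x)):
            return x_new
        x = x_new
    return x

def asymptotic_covariance(A, sigma_psi2=0.7872, mu_psi_prime=0.8664):
    """Asymptotic covariance of the M-estimate for Q_y = I, Eq. (9):
    (sigma_psi^2 / mu_psi'^2) * (A^T A)^{-1}. Defaults: Huber, gamma = 1.5 (Table 1)."""
    return (sigma_psi2 / mu_psi_prime**2) * np.linalg.inv(A.T @ A)

# Example: intercept, trend and an annual cosine term, plus 1% gross errors.
rng = np.random.default_rng(1)
t = np.arange(2000.0)
A = np.column_stack([np.ones_like(t), t, np.cos(2 * np.pi * t / 365.25)])
y = A @ np.array([2.0, 0.01, 3.0]) + rng.normal(0.0, 1.0, t.size)
y[rng.choice(t.size, 20, replace=False)] += 25.0
print(irls(A, y))                                   # close to [2, 0.01, 3]
print(np.sqrt(np.diag(asymptotic_covariance(A))))   # asymptotic standard deviations
```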

3 Robust functional model identification

Once the observations are collected, the next step is to find the most appropriate model that satisfactorily describes them. In the case of coordinate time-series analysis, except for sporadic significant deformation, a simple linear trend is usually a good representation of the deformation behavior for studying surface displacement (Amiri-Simkooei 2007). However, when there are periodic effects and/or jumps in the series, one has to extend the functional model to avoid biased estimates. To make a correct decision on choosing such a functional model, one may perform statistical testing on the basis of the likelihood ratio test (Teunissen et al. 2005). Note, however, that one always needs a firm background on the overall model to restrict the likely alternative hypotheses. Along these lines, Perfetti (2006) used the DIA procedure (Teunissen 2000) and detected offsets and outliers in the series in two separate phases. Amiri-Simkooei (2007) developed least squares harmonic estimation (LS-HE), based on the parameter significance test, to capture unmodeled periodic signals using trigonometric functions. To detect unknown offsets in a sequence of stationary white noise observations, Williams (2003b) proposed a change detection algorithm (Basseville and Nikiforov 1993), which is also based on statistical testing (see Appendix B). Since the detection of different types of functional effects (i.e., periodic effects and offsets) by statistical testing can be mutually influenced (see Sect. 5.1), one may prefer to apply such a testing procedure to the given coordinate time series simultaneously, detecting both the periodic effects and the offsets. One may also be tempted to use the robust version of the likelihood ratio test to limit the undesirable effects of possible outliers. In the same line as Amiri-Simkooei et al. (2007), the procedure of functional model identification may be conducted in a stepwise manner. In each step, the most recent functional model, as the null hypothesis H_o, is tested against the most likely alternative hypothesis H_a; at step k these are expressed as

H_o: E{y} = ι y_o + u r + A_{[k−1]} x_{[k−1]}    versus    H_a: E{y} = ι y_o + u r + A_{[k−1]} x_{[k−1]} + A_k x_k    (10)

where y_o and r denote the intercept and the slope of the linear trend, respectively, the m-vector ι contains ones, and the m-vector u contains the time instants t_i (i = 1, 2, ..., m). The m × q_k extended design matrix A_k, along with the q_k-vector x_k, describes the additional parameters to be added in the alternative hypothesis at step k; for the previous steps we also have A_{[k−1]} = [A_1 A_2 ... A_{k−1}] with x_{[k−1]} = [x_1^T x_2^T ... x_{k−1}^T]^T. Given the most likely alternative hypothesis, the task is to test H_o against H_a by a test statistic with known distribution. For the case that the variance of unit weight of the observations, namely σ², is unknown (i.e., Q_y = σ² Q_o), one may use the τ-distribution (Pope 1976).
In this case, the following test statistic can be used (Koch 1999; Teunissen et al. 2005)

T_{q_k} = x̂_k^T A_k^T Q_o^{-1} P_A^⊥ A_k x̂_k / (q_k σ̂²)    (11)

with

P_A^⊥ = I_m − A (A^T Q_o^{-1} A)^{-1} A^T Q_o^{-1},   A = [ι u A_{[k−1]}]    (12)

where σ̂² is the unbiased estimator of the variance of unit weight under H_o and x̂_k is the least squares estimate of the parameters x_k. When the observations are normally distributed, the square root of the test statistic introduced in Eq. (11) has a central τ-distribution under H_o. This follows because the least squares estimator has a normal distribution (Koch 1999). If, however, the least squares estimator x̂_k is replaced with its corresponding robust M-estimates, then, by the asymptotic normality of M-estimation (see Eq. 9), the square root of the test statistic T_{q_k} is approximately distributed as a central τ-distribution

√T_{q_k} ∼ τ( q_k, m − 2 − Σ_{i=1}^{k} q_i )    (13)

where '∼' stands for 'approximately distributed as'. Such a test, as a robust version of the likelihood ratio test, is known as a likelihood ratio-type test (Maronna et al. 2006). The test is approximately valid for a large number of observations. To find any potential periodicity or any offset in the time series, one first has to specify the corresponding quantities A_k and x_k in the alternative hypothesis. For periodic unmodeled effects with unknown frequency ω, one may write (Amiri-Simkooei et al. 2007)

A_k = [cos(ω t_i)  sin(ω t_i)]_{i=1,...,m}  (an m × 2 matrix),  with  x_k = [a_k b_k]^T  and  q_k = 2    (14)

and for offsets with unknown position t_off (Williams 2003b)

A_k = [p_1 p_2 ... p_m]^T,  with  x_k = c_k  and  q_k = 1,  such that  p_i = 1 if t_i > t_off and p_i = 0 if t_i ≤ t_off    (15)

To find the most likely alternative hypothesis, or equivalently to detect the misspecifications in the model, one may deal with the following maximization problem

max_λ  T_{q_k} / τ_α²,   such that   λ = ω ∈ ℝ for periodic effects, or λ = t_off ∈ {1, 2, ..., m} for offsets    (16)

where τ_α² denotes the critical value at the chosen significance level α. To solve the maximization problem in Eq. (16), one has to perform a numerical search in which the set {1, 2, ..., m} of offset positions t_off, together with a set of discrete values of the frequency ω, is used to evaluate the score function. The solution is the element of these sets for which the score function attains its maximum value (see Sect. 5). To check all possible frequencies, the step size used for constructing the set of discrete values is taken small (e.g., 0.1 or 0.01 in units of days). This procedure is iterated until no further significant unmodeled effect exists at the significance level α (e.g., α = 0.0001). The procedure can, therefore, be considered an extended version of LS-HE by which both the significant offsets and the periodic effects are captured simultaneously. It is also important to recall that an effect is considered a significant unmodeled effect if its corresponding test statistic T_{q_k} is greater than the critical value τ_α² (i.e., the score function in Eq. (16) should be greater than one). It can be shown that the proposed algorithm includes the change detection algorithm introduced by Basseville and Nikiforov (1993) as a special case (see Appendix B).
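The search of Eq. (16) can be sketched in Python as below. For readability, the sketch estimates x̂_k by ordinary least squares and assumes Q_o = I; in the robust variant, x̂_k would come from the reweighted solution of Eq. (8) and Q_o from the adopted noise model. The function names, the period grid, and the critical value τ_α² are illustrative choices rather than the paper's code, and no attempt is made at efficiency (P_A^⊥ is rebuilt for every candidate).

```python
import numpy as np

def projector_orth(A, Qo_inv):
    """P_A-perp = I - A (A^T Qo^-1 A)^-1 A^T Qo^-1   (Eq. 12)."""
    m = A.shape[0]
    return np.eye(m) - A @ np.linalg.solve(A.T @ Qo_inv @ A, A.T @ Qo_inv)

def test_statistic(y, A0, Ak, Qo_inv):
    """T_qk of Eq. (11): A0 is the design matrix under H_o, Ak the candidate extension."""
    m, q_k = Ak.shape
    P = projector_orth(A0, Qo_inv)
    N = Ak.T @ Qo_inv @ P @ Ak
    xk = np.linalg.solve(N, Ak.T @ Qo_inv @ P @ y)      # estimate of the added parameters
    e0 = P @ y                                          # residuals under H_o
    sigma2 = (e0 @ Qo_inv @ e0) / (m - A0.shape[1])     # variance of unit weight under H_o
    return (xk @ N @ xk) / (q_k * sigma2)

def candidate_periodic(t, omega):
    """A_k for a periodic effect with frequency omega (Eq. 14), q_k = 2."""
    return np.column_stack([np.cos(omega * t), np.sin(omega * t)])

def candidate_offset(t, t_off):
    """A_k for an offset at position t_off (Eq. 15), q_k = 1."""
    return (t > t_off).astype(float).reshape(-1, 1)

def most_likely_effect(y, t, A0, periods, tau_alpha2):
    """Evaluate the score T_qk / tau_alpha^2 of Eq. (16) over a grid of candidate
    periods (in days) and over all offset positions; return the best candidate.
    The candidate is significant when the returned score exceeds one."""
    Qo_inv = np.eye(len(y))                             # purely white noise assumption
    best = ("none", None, 0.0)
    for period in periods:
        Ak = candidate_periodic(t, 2.0 * np.pi / period)
        score = test_statistic(y, A0, Ak, Qo_inv) / tau_alpha2
        if score > best[2]:
            best = ("periodic", period, score)
    for t_off in t[1:-1]:
        Ak = candidate_offset(t, t_off)
        score = test_statistic(y, A0, Ak, Qo_inv) / tau_alpha2
        if score > best[2]:
            best = ("offset", t_off, score)
    return best
```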

4 M-estimation of variance components

In an ideal case where all functional effects in a given coordinate time series are fully understood, one would expect the stochastic behavior of the time series to be described by white noise. In practice, however, the noise of coordinate time series is not completely white. In other words, there are still some unmodeled effects remaining in the data, and these are reflected in the stochastic model. Although parts of such undetected effects can be of a deterministic nature, their variations cannot be distinguished from the actual noise and hence they can be interpreted as colored noise. Several studies have demonstrated that a combination of white noise and flicker noise is the most appropriate model to describe the noise characteristics of daily GPS coordinate time series (see e.g., Zhang et al. 1997; Mao et al. 1999; Calais 1999; Williams et al. 2004; Amiri-Simkooei et al. 2007). To compute the amplitudes of the noise components, one may employ one of the methods of variance component estimation. Apart from systematic unmodeled effects, the estimates of (co)variance components can be seriously biased by outlying observations. In fact, in addition to the first-order moment of the observables (i.e., the unknown parameters), the higher-order moments may also be affected by the outliers. To mitigate the effects of the outliers, similar to the functional model, one can apply robust variance component estimation to the stochastic model. Yang et al. (2005) used the equivalent weights and formed a robust Helmert variance component estimation.
Peng (2009) employed the expectation-maximization (EM) algorithm to jointly compute robust MLEs of the unknown parameters and variance components. Here, we have developed a method of variance component estimation that relies on the theory of M-estimation in linear models (see Sect. 2), with the help of the linear (co)variance component model derived in LS-VCE theory (Teunissen and Amiri-Simkooei 2008; Amiri-Simkooei 2007). For normally distributed data, the linear model for the (co)variance components (similar to the one given in Eq. (1)) can be represented as (Amiri-Simkooei 2007)

y_vh = A_vh σ + e_vh,   with   Q_{y_vh} = 2 D^+ (Q_t ⊗ Q_t) D^{+T}    (17)

where the p-vector σ = [σ_1 σ_2 ... σ_p]^T denotes the unknown (co)variance components to be estimated, and D^+ is the pseudo-inverse of the duplication matrix D, which, together with the covariance matrix of the misclosure vector, Q_t, constructs the covariance matrix Q_{y_vh} of the corresponding observables of the stochastic model. e_vh is the vector of the corresponding residuals and '⊗' is the Kronecker product. Using the vh-operator, the design matrix A_vh and the observation vector y_vh of the stochastic model are made as follows

y_vh = vh(t t^T),   A_vh = [vh(B^T Q_1 B), ..., vh(B^T Q_p B)]    (18)

where t = B^T y is the misclosure vector, the symmetric cofactor matrices Q_i (i = 1, 2, ..., p) are known, the covariance matrix Q_y = Σ_{i=1}^p σ_i Q_i is assumed to be positive definite, and B is an m × (m − n) basis matrix of which the columns span the null space of A^T (i.e., A^T B = 0).
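As a small illustration of Eq. (18), the following Python sketch forms the vh of a symmetric matrix and builds the stochastic-model observation vector y_vh and design matrix A_vh from a misclosure basis B and a list of cofactor matrices. The helper names and the SVD-based construction of B (the role played by MATLAB's null.m later in the paper) are our own choices; the dense construction is meant only for small examples.

```python
import numpy as np

def vh(M):
    """Stack the lower-triangular part of a symmetric matrix (diagonal included),
    column by column, into a vector: the vh-operator used in Eq. (18)."""
    rows, cols = np.tril_indices(M.shape[0])
    order = np.lexsort((rows, cols))   # column-major ordering of the lower triangle
    return M[rows[order], cols[order]]

def stochastic_model(B, y, cofactors):
    """Build t = B^T y, y_vh = vh(t t^T) and A_vh = [vh(B^T Q_1 B), ..., vh(B^T Q_p B)]."""
    t = B.T @ y
    y_vh = vh(np.outer(t, t))
    A_vh = np.column_stack([vh(B.T @ Q @ B) for Q in cofactors])
    return t, y_vh, A_vh

# Toy example: A is a single column of ones (estimating a mean) and the only
# cofactor matrix is Q_1 = I, so the model of Eq. (17) has one variance component.
m = 6
A = np.ones((m, 1))
U, _, _ = np.linalg.svd(A)         # columns 2..m of U span the null space of A^T
B = U[:, 1:]                       # m x (m - n) basis matrix with A^T B = 0
y = np.array([1.2, 0.8, 1.1, 0.9, 1.3, 0.7])
t, y_vh, A_vh = stochastic_model(B, y, [np.eye(m)])
print(y_vh.shape, A_vh.shape)      # ((m-n)(m-n+1)/2,) = (15,), and (15, 1)
```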

Theoretically, having the linear (co)variance component model of Eq. (17), the M-estimates of the (co)variance components along with their asymptotic covariance matrix are easily obtained through the derivations in Sect. 2. Because the size of the observations as well as the order of the design matrix A_vh is extremely large, this idea is not very attractive from a computational point of view. To overcome this difficulty, our task is to find a transformation, similar to the one mentioned in Eq. (1), by which the covariance matrix Q_{y_vh} becomes diagonal. This can be accomplished with the aid of the matrix Q_t, which is relatively small in size compared to the matrix Q_{y_vh}. Using the properties of the Kronecker product and vh-operator, one may propose the following transformation (see Eq. (1))

ỹ_vh = G_vh^T y_vh,   ẽ_vh = G_vh^T e_vh,   Ã_vh = G_vh^T A_vh    (19)

where the regular transformation matrix G_vh is given as

G_vh = D^T (G_t ⊗ G_t) D^{+T} Λ,   such that   Q_t^{-1} = G_t G_t^T    (20)

with

Λ = diag( 1/√2, 1, 1, ..., 1, 1/√2, 1, 1, ..., 1, ..., 1/√2 )    (21)

in which the successive runs of ones have lengths p − 1, p − 2, ..., and diag(.) is the diagonal operator. Applying the transformation introduced in Eq. (19) to the linear model of Eq. (17) converts the covariance matrix Q_{y_vh} to an identity matrix (see Appendix A.3). In other words, decorrelation of the covariance matrix of the misclosure vector t leads to the decorrelation of Q_{y_vh} (see Eq. (20)). This fact persuades one to work with the estimated misclosure vector t rather than the adjusted residuals ê_vh to form the equivalent weights. Given the equivalent weights for t as a diagonal matrix, namely W_t, the equivalent weights for the stochastic model can be expressed as

W_vh = (1/2) Λ^{-1} D^T (W_t ⊗ W_t) D Λ^{-1}    (22)

According to Eq. (8), one has to construct the normal matrix N and the right-hand side vector l as follows

N = A_vh^T G_vh W_vh G_vh^T A_vh,   l = A_vh^T G_vh W_vh G_vh^T y_vh    (23)

In a similar way to that presented by Amiri-Simkooei (2007), the elements of the normal matrix and of the right-hand side vector, namely n_pl and l_p respectively, are given as (see Appendix A.2)

n_pl = (1/2) tr(B^T Q_p B W B^T Q_l B W),   l_p = (1/2) t^T W B^T Q_p B W t    (24)

with

W = G_t W_t G_t^T    (25)

where tr(.) stands for 'trace' of a matrix. Note, if the matrix W_t is an identity matrix, then the relations obtained in Eq. (24) coincide with the ones derived by LS-VCE. Finally, the M-estimates of the (co)variance components are computed through iteration as follows (see Eq. (8))

σ̂_k = N_k^{-1} l_k    (26)

Now, one has to compute the remaining part of the equivalent weights W_t. As previously mentioned, the equivalent weights of the functional model given in Eq. (6) are made based on the standardized residuals (see Table 1). In a similar way, the diagonal matrix W_t can be made based on the standardized residuals of the misclosure vector t. The nonnegative adjusted residuals of the misclosure vector, namely t̂, may be computed as follows

t̂ ⊙ t̂ = vecdiag( \widehat{t t^T} ),   with   \widehat{t t^T} = abs( G_t^T t̂ t̂^T G_t − Σ_{i=1}^p σ̂_i G_t^T B^T Q_i B G_t )    (27)

where '⊙' is the elementwise product (Hadamard product), vecdiag(.) stands for the diagonal elements of a matrix, and abs(.) makes the elements of a matrix become nonnegative. In other words, the nonnegative adjusted residuals of the misclosure vector are the square roots of the elements of the vector t̂ ⊙ t̂. Once the vector t̂ is obtained, the diagonal elements of W_t (i.e., w(t̃_i) = ψ(t̃_i)/t̃_i, i = 1, 2, ..., m − n) are computed for each of the M-estimators. To get a better understanding of the above derivations, a brief review of basic concepts and results in LS-VCE theory has been provided in Appendix A.2. In the sequel, the performance of the method is investigated based on the M-estimators given in Table 1 and the results are compared with those obtained by the least squares method.
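A minimal Python sketch of the resulting robust variance component iteration is given below. It follows Eqs. (24)–(27) directly: decorrelate the misclosure vector with G_t, form the equivalent weights W_t from the standardized residuals of Eq. (27), build the normal matrix and right-hand side of Eq. (24), and iterate as in Eq. (26). The Huber weight function, the starting values, and the helper names are our own illustrative choices; with W_t = I the iteration reduces to ordinary LS-VCE.

```python
import numpy as np

def huber_weights(res, gamma=1.5):
    """Equivalent weights psi(t_i)/t_i of the Huber psi-function."""
    a = np.abs(res)
    return np.where(a <= gamma, 1.0, gamma / np.maximum(a, 1e-12))

def robust_vce(B, y, cofactors, sigma0, weight_fun=huber_weights, n_iter=20, tol=1e-6):
    """Robust M-estimation of (co)variance components, Eqs. (24)-(27).

    B: m x (m-n) misclosure basis, cofactors: [Q_1, ..., Q_p], sigma0: starting
    components. Returns the estimated components and the inverse normal matrix.
    Passing weight_fun=lambda r: np.ones_like(r) reproduces plain LS-VCE.
    """
    t = B.T @ y
    BQB = [B.T @ Q @ B for Q in cofactors]             # cofactors mapped to misclosures
    sigma = np.array(sigma0, dtype=float)
    p = len(cofactors)
    N = np.empty((p, p))
    rhs = np.empty(p)
    for _ in range(n_iter):
        Qt = sum(s * M for s, M in zip(sigma, BQB))     # Q_t = B^T Q_y B
        Gt = np.linalg.cholesky(np.linalg.inv(Qt))      # Q_t^{-1} = G_t G_t^T  (Eq. 20)
        # Standardized residuals of the misclosure vector: square roots of Eq. (27).
        S = np.abs(Gt.T @ np.outer(t, t) @ Gt
                   - sum(s * (Gt.T @ M @ Gt) for s, M in zip(sigma, BQB)))
        t_std = np.sqrt(np.diag(S))
        W = Gt @ np.diag(weight_fun(t_std)) @ Gt.T      # W = G_t W_t G_t^T  (Eq. 25)
        for i in range(p):
            for j in range(p):
                N[i, j] = 0.5 * np.trace(BQB[i] @ W @ BQB[j] @ W)   # n_pl of Eq. (24)
            rhs[i] = 0.5 * t @ W @ BQB[i] @ W @ t                   # l_p of Eq. (24)
        sigma_new = np.linalg.solve(N, rhs)             # Eq. (26)
        if np.max(np.abs(sigma_new - sigma)) < tol:
            sigma = sigma_new
            break
        sigma = sigma_new
    return sigma, np.linalg.inv(N)
```

With only a few components (e.g., white plus flicker noise) the p × p normal matrix is tiny, so the cost of each iteration is dominated by forming B^T Q_i B once per cofactor matrix.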

5 Numerical results and discussions

As a primary validation of the proposed method for variance component estimation, 50,000 sequences of normal white noise observations (m = 200) with a standard deviation of σ = 5 mm (zero-mean time series) are simulated using MATLAB. The data are then spoiled by outliers of size between 3σ and 6σ with a contamination fraction of ε = 0.05, where ε denotes the ratio of outliers to the whole set of observations. The goal is to estimate the amplitude of the noise and thus to investigate the overall performance of the given M-estimators against outlying observations using the mean values of the estimates. When there is no outlier in the data, the results obtained by LS-VCE approximately coincide with the theoretical value (Table 2, σ̂ = 4.8 mm). However, when outliers contaminate the data, the estimate is significantly inflated and deviates from its theoretical value. To reduce such effects, we may apply the other M-estimators. In contrast to the least squares estimate, they deliver more satisfactory results, which are not far from their theoretical counterpart (Table 2).

Table 2  Mean values of the noise amplitude estimates (mm), with true amplitude σ = 5 mm, obtained from a simulated data set of size 50,000. The fraction of outliers is 5%. The superscripts '−' and '+' mean 'free of outliers' and 'in the face of outliers', respectively.

Method      Least squares−  Least squares+  L1 norm+  Huber+  IGGIII+  Krarup+  Hampel+
Mean value  4.814           7.414           4.613     4.595   4.436    4.385    4.250

To illustrate the advantages of the robust M-estimators for obtaining reliable results in both the functional and stochastic models of coordinate time series, the proposed methods have been applied to a simulated time series and to the coordinate time series of a few permanent GPS stations. Three different types of stochastic models have been considered, namely, model I (purely white noise), model II (combination of white noise and flicker noise), and model III (combination of white noise and random walk noise), in which the cofactor matrices of flicker noise and random walk noise introduced by Zhang et al. (1997) are utilized. Note, however, that one can also utilize the Hosking flicker noise covariance matrix, introduced by Williams (2003a), which differs in magnitude. The flicker noise variances used here are therefore roughly one-half the size of those quoted in the paper by Williams (2003a).

5.1 Simulated data

An 8-year time series of daily positions containing white noise with an amplitude of σ_w = 4 mm plus flicker noise with an amplitude of σ_f = 6 mm is simulated. In addition to a simple linear trend, two trigonometric functions with amplitudes of 3 and 2 mm for the annual and semi-annual signals, respectively, have been added to the time series (i.e., annual period = 365.25 days, semi-annual period = 182.625 days). Three offsets with magnitudes of 2σ_w, 3σ_w and 1.5σ_w have been imposed on the data at the time epochs 290, 1,000 and 2,400, respectively. Now, a contaminated version of the time series with a contamination fraction of ε = 0.01 is considered, in which the size of the outliers is chosen between 15 and 30 mm. The aim is to find an appropriate functional model and to estimate the noise components of the time series. To remove outliers, one may perform statistical testing. Using a significance level of α = 0.01, 1.16% of the data are removed, which implies that some of the observations are incorrectly flagged as outliers. On the other hand, a significance level of α = 0.001 detects only 65% of the outliers. In the following, for the purpose of
On the basis of this assumption, we found nine significant unmodeled effects (five periodic effects plus four offsets), which exceeds the five simulated ones (two periodic effects plus three offsets). The numerical values of the score function of Eq. (16) are plotted in left panel of Fig. 1. Using white noise plus flicker noise model, our solutions turned out to be different. We set the approximate values of σw = 3 mm and σ f = 5 mm which result in Qo = σw2 I + σ 2f Q f , with Q f the cofactor of flicker noise. After removing the total five simulated effects, there are no other significant misspecifications in the data, i.e., all values of the score function are below one (Fig. 1, right panel). This is what we would expect, since improper knowledge about the stochastic model will be reflected as unmodeled effects in the functional part, and vice versa. The duality between the functional and stochastic models has been already pointed out by Amiri-Simkooei (2007). The reader can find more explanations of the anomalous periodic effects in GPS time series in Ray et al. (2008) or Langbein (2008). After removing offsets, the fluctuations due to the periodic terms show up themselves as some significant offsets in data (Fig. 1, right panel). Strictly speaking, it is possible that one type of effect to show itself as another effect. It is, therefore, advisable to jointly perform statistical testing for finding the periodic effects and the offsets. The estimates of the offsets positions along with the period of the periodic terms, based on the flicker noise structure, are computed stepwise (Table 3). In the case where the data are free from outliers, the offset positions as well as the annual/semi-annual periodicities are appropriately identified. In the contaminated cases, the results are slightly different. All of the methods estimate the position of the offset at epoch 1,000 much better than the other cases. It makes sense, since its position is not far from the center of the series and as Williams (2003b) demonstrated, the offsets positioned in the middle of the series have the highest chance of being correctly detected. For the offset with the smallest size (i.e.,.

(12) GPS position time-series analysis. 23. Fig. 1 Numerical values of the score function in Eq. (16) for offsets (red dashed line) and periodic effects (blue line); Left (using purely white noise structure): before removing offsets (top-left), after removing offsets (middle-left), after removing both simulated offsets and periodic effects (bottom-left); Right (using white noise and flicker noise. structure): before removing offsets (top-right), after removing offsets (middle-right), after removing both simulated offsets and periodic effects (bottom-right).The significance level of (0.0001) has been depicted by green dashed line. epoch 2400), the L1 norm method together with the method of IGGIII provide more realistic results over the others. After providing a proper functional model, one can use the proposed method of variance component estimation to assess the noise characteristics of the series. The basis matrix B introduced in Eq. (18) is computed by the subroutine ‘null.m’ of MATLAB software. The estimates of the noise components along with their asymptotic uncertainties are presented in Table 4. To evaluate the efficiency of the robust M-estimators, the least squares estimates, after removing outliers at significance level of α = 0.001, are also included in Table 4. As considered, the results obtained in this way are close to those computed by the M-estimators which show the robustness of the methods against outliers. Based on the results corresponding to model I, the inflated least squares estimate has been reduced from 6.32 to 5.83 mm after removing outliers. But, it is still larger than the one obtained using. the original data (σˆ w = 5.72 mm). This implies that there are some undetected outliers in the data as mentioned earlier using statistical testing. Another issue is that the random walk noise amplitude estimates vary more due to the presence of outliers than the corresponding flicker noise components. This is probably due to the fact that the stochastic model of the data has been simulated to behave like a combination of white noise and flicker noise. Since this model defines the random behavior of the data, it can be expected to be more resistant to outliers. We also highlight that essential care is needed to choose an appropriate stochastic model. For example, if one improperly over-parameterize a default model (e.g., purely white noise model), misspecifications may then manifest themselves unrealistically as incorrect estimates of additional parameters (e.g., colored noise components). This phenomenon is not restricted to the least squares adjustment and also holds for the other M-estimators. The reader can find. 123.

(13) 24. A. Khodabandeh et al.. Table 3 Offset position estimates (steps: 1, 2 and 4) and the annual (step 5), semi-annual (step 3) period estimates (days) in the simulated time-series for contamination fractions of 1% Method. Step 1. Step 2. Step 3 (semi-annual). Step 4. Step 5 (annual). Least squares_. 1,001. 286. 182.11. 2,445. 342.89. squares+. 1,000. 297. 191.80. 2,439. 350.19. Least squares∗. 1,001. 287. 183.39. 2,441. 341.95. 998. 292. 188.14. 2,413. 344.80. Huber +. 1,000. 294. 189.91. 2,433. 356.37. IGGIII+. 1,000. 294. 184.25. 2,413. 344.63. Krarup+. 1,001. 297. 189.82. 2,439. 350.24. Hampel+. 1,001. 294. 190.77. 2,439. 351.05. Least. L1 norm+. The superscripts ‘_ ’, ‘+ ’ and ‘∗ ’ mean ‘free of outliers’, ‘in the face of outliers’ and ‘after removing outliers (α = 0.001)’, respectively Table 4 Noise components estimates: white noise (WN ( mm)), flicker noise (FN ( mm)), random walk noise (RW ( mm/yr1/2 )) and their asymptotic precision of the simulated time series for three stochastic models: model (I), model (II) and model (III) Method. Stochastic model (I) WN. (II) WN & FN. (III) WN & RW. σˆ. σσˆ. σˆ. σσˆ. σˆ. σσˆ. σˆ. σσˆ. Least squares−. 5.72. 0.07. 3.95. 0.09. 6.23. 0.28. 4.66. 0.05. 13.30. 0.32. squares+. 6.32. 0.07. 4.67. 0.11. 6.42. 0.29. 5.32. 0.06. 14.16. 0.43. Least Squares∗. 5.83. 0.08. 4.21. 0.09. 6.29. 0.29. 4.74. 0.04. 13.49. 0.31. L1 norm+. 5.64. 0.11. 3.86. 0.12. 6.11. 0.41. 4.12. 0.06. 13.45. 0.41. Huber+. 5.31. 0.07. 3.88. 0.08. 6.29. 0.30. 4.34. 0.05. 12.93. 0.32. IGGIII+. 5.29. 0.06. 4.10. 0.08. 6.16. 0.27. 3.98. 0.05. 13.24. 0.32. Krarup+. 5.14. 0.06. 4.07. 0.08. 6.07. 0.29. 4.53. 0.06. 13.54. 0.32. Hampel+. 4.89. 0.06. 4.18. 0.08. 6.21. 0.29. 4.69. 0.05. 13.51. 0.31. Least. σˆ. σσˆ. The fraction of outliers is equal to 1%. The superscripts ‘_ ’, ‘+ ’ and ‘∗ ’ mean ‘free of outliers’, ‘in the face of outliers’ and ‘after removing outliers (α = 0.001)’, respectively. useful discussions about determining a proper noise model in Williams et al. (2004) or Amiri-Simkooei et al. (2007). According to Table 1, one would expect that the asymptotic variances of the robust M-estimates are larger than the ones of least squares estimates, which is not realized in some results. In other words, one expects that the least squares estimates are more precise than the other estimates. The above statement only holds when they deliver the component estimates of the same amplitude. More precisely, in case of (co)variance component estimation, the covariance matrix of observables and consequently the uncertainty of the components depend on the value of (co)variance component estimates. It means that the covariance matrix Qσˆ of the (co)variance components depends on the (affected) estimates ˆ σˆ rather than of (co)variance components σˆ . We then have Q Qσˆ . 5.2 Real-world data Eight years of daily solutions of the coordinate time series for 12 globally distributed permanent stations have been. 123. used. They are provided by the GPS Analysis Center at JPL (Beutler et al. 1999). The data were processed by the GIPSY software using the precise point positioning method (Zumberge et al. 1997). Each method of M-estimation is independently applied to remove the possible unmodeled effects (i.e., periodic effects and offsets). Amplitudes of the noise components of the three stochastic models are then estimated for the individual series. To provide a fair comparison between the results, a statistical test has been performed for removing possible blunders. 
The fractions of outliers on α = 0.01and α = 0.001 significance levels are shown in Fig. 2. The least squares estimates of the noise components, based on the refined data at α = 0.001 significance level, are also presented. White noise amplitude estimates of model I and their 95% asymptotic confidence intervals are depicted in Fig. 3. The noise amplitudes of the up components are approximately 2–3 times larger than the noise amplitudes of the horizontal components. As considered in Table 1, the methods of M-estimation, except for L1 norm estimation, have been designed to behave like the least squares estimation in the sense of.

(14) GPS position time-series analysis. 25. Fig. 2 The fractions of outliers in given time-series on 0.01 (left) and 0.001 (right) significance levels. The north, east and up components are shown in blue, green and red, respectively. having the smallest possible residuals. As a result, a significant difference between the least squares estimates and the other robust M-estimates indicates the occurrences of outliers in the data. Therefore, one can directly deduce that there exists a large number of outliers in the data, especially in the east components of ALGO and PETP, which follows from the considerable difference between the least squares estimates and other robust estimates. This issue may also be verified by performing the statistical tests (see Fig. 2). As shown in Fig. 4, it can be clearly seen that the results illustrated in Figs. 2 and 3 are relatively correlated. For example, the noise amplitude estimates corresponding to the north components of BSRY and MANA (Fig. 3) are close to each other, which correspond to the small fraction of outliers in Fig. 2. On the other hand, the large number of outliers in the east components of PETP and ALGO results in the large difference between the least squares estimate and those estimated by the other M-estimators (see Fig. 4). On average, for the. case of refined observations, the least squares estimates are relatively close to the M-estimates. At the same time, we observe that these estimates are a bit larger than their corresponding M-estimates in several cases. This may result from some undetected outliers that are still masked in the data. Figures 5 and 6 show the flicker noise and random walk noise amplitude estimates of models II and III, respectively. The behavior of the flicker noise amplitudes on average is relatively uniform for the M-estimation methods (i.e., the differences between the amplitudes are negligible). But this is not the case for the amplitudes of random walk noise. One may, therefore, infer, as shown in the simulated case, that the GPS position time series most likely contain noise closer to a flicker than a random walk structure, which is in agreement with the previous work of Zhang et al. (1997); Mao et al. (1999); Calais (1999); Williams et al. (2004); Amiri-Simkooei (2007). Therefore, in terms of the. 123.

(15) 26. A. Khodabandeh et al.. Fig. 3 White noise amplitude estimates of model I (purely white noise) and their 95% asymptotic confidence intervals (vertical lines) of 12 permanent GPS stations obtained by the methods of least squares (red),. least squares after removing outliers (grey), L1 norm (blue), Huber (black), IGGIII (green), Krarup (pink) and Hampel (orange). robustness, this also reveals that the stochastic model described as a combination of white noise and flicker noise, is likely to be more realistic than the one made by a combination of white noise and random walk noise.. 6 Conclusions. 123. Integrated with the least squares adjustment, robust M-estimators can serve as strong tools to guard mathematical.

(16) GPS position time-series analysis. 27. Fig. 4 The fractions of outliers on 0.001 significance level (red thick bars) versus the difference between the least squares estimate of white noise amplitudes and those obtained by the methods of least squares. after removing outliers (grey), L1 norm (blue), Huber (black), IGGIII (green), Krarup (pink) and Hampel (orange). models against abnormal data. At the same time, they are not always far superior to the traditional outlier removal methods. Combination of the least-squares method with M-estimators. was shown to produce a unified analysis of coordinate time series, in both the functional and stochastic models. Based on the asymptotic normality of M-estimation in linear models,. 123.

(17) 28. A. Khodabandeh et al.. Fig. 5 Flicker noise amplitude estimates of model II (white noise and flicker noise) and their 95% asymptotic confidence intervals (vertical lines) of 12 permanent GPS stations obtained by the methods. of least squares (red), least squares after removing outliers (grey), L1 norm (blue), Huber (black), IGGIII (green), Krarup (pink) and Hampel (orange). the likelihood ratio-type test as a robust version of the familiar likelihood ratio test was applied to correctly detect and identify the periodic effects and the offsets. Such a testing is. valid for the case where the number of observations is significantly larger than the unknowns, which generally holds for GPS coordinate time series.. 123.

(18) GPS position time-series analysis. 29. Fig. 6 Random walk noise amplitude estimates of model III (white noise and random walk noise) and their 95% asymptotic confidence intervals (vertical lines) of 12 permanent GPS stations obtained by the. methods of least squares (red), least squares after removing outliers (grey), L1 norm (blue), Huber (black), IGGIII (green), Krarup (pink) and Hampel (orange). Using the linear (co)variance component model provided by the LS-VCE theory, we developed a robust method of variance component estimation. Since this method relies on the. linear model, the corresponding properties of M-estimation in the linear functional model can be established in a straightforward manner for such a case. For example, this method,. 123.

(19) 30. A. Khodabandeh et al.. in addition to (co)variance component estimates, provides asymptotic uncertainty for the estimated components. This is in contrast to the other robust estimation methods on VCE. The method was applied to a few simulated and real coordinate time series and a comparison was made with the least squares method. The difference between the results affirmed that the data contain a large number of outliers. This also was investigated via statistical testing. As considered, choosing a subjective level of significance leads to a subjective set of observations to be flagged as outliers. In a similar fashion, choosing a subjective M-estimator provides subjective estimates for the parameters. On the basis of the results obtained from real GPS time series, it is inferred that the random behavior of the series is better described by a combination of white noise and flicker noise as verified by previous studies. We could also conclude that the least-squares estimate of the flicker noise component is less affected by the possible undetected blunders in the series. Acknowledgments The authors would like to acknowledge Dr. S. D. P. Williams, the editor, Prof. R. Klees, the editor-in-chief, and three anonymous reviewers for their valuable comments, which significantly improved the quality and presentation of this paper.. Appendix A A.1 Asymptotic normality of M-estimation in linear models ˆ into its Taylor series at x and substiExpanding the vector ψ tuting the result in Eq. (5) give the following expression: ' ' ˜Tψ + A ˜ T ∂e ψ A(ˆ ˜ x − x) + o p 'xˆ − x' = 0 (28) A where ∂e ψ is a diagonal matrix containing the derivative ' ' dψ(ei ) and the order term o p 'xˆ − x' functions ψ  (e ) = d(ei ). i. indicates the remainder in the Taylor series which tends in probability to zero as the number of observations tends to infinity (He and Shao 1996). ˜ T ∂e ψ A˜ can be regarded as linear combinations The term A of the random variables ψ  (ei ) (i = 1, 2, . . . , m), that is ˜ T ∂e ψA ˜ = A. m .   ˜ = a˜ 1 a˜ 2 . . . a˜ m T a˜ i a˜ iT ψ  (ei ), A. (29). i=1. where n-vectors a˜ i (i = 1, 2, . . . , m) construct the rows of ˜ the transformed design matrix A. With regard to the assumption that the residuals are independent and identically distributed, one may apply the law of large numbers to Eq. (29) which gives ˜ →p ˜ T ∂e ψA A. m . ˜TA ˜ a˜ i a˜ iT E{ψ  (e1 )} = μψ  A. as m → ∞. i=1. (30). 123. where E{.} is the mathematical expectation operator. ‘→ p ’ stands for ‘tends in probability’. Assuming that the distribution of the residuals is symmetric with respect to the zero and that ψ-function is odd and bounded, one can simply show that the expected value of the m-vector ψ is zero. Therefore, applying central limit theorem ˜ T ψ yields to A ˜ T A) ˜ ˜ T ψ →d Nn (0, σψ2 A A. (31). To obtain the asymptotic distribution of M-estimation, one may need the following lemma. Slutsky’s lemma: Let um be a sequence of random vector and m a sequence of random matrix with some random vector u and random matrix . If the following relations hold um →d u, m → p  then one can deduce m um →d  u For the proof, we refer to Maronna et al. (2006) or Shao (2003). Using the above lemma and setting um and m equal to ˜ −1 respectively, based on Eqs. (28), (30) ˜ T ∂e ψA) ˜ T ψ and (A A and (31), the asymptotic normality of M-estimation in linear models follows (see Eq. (9)). 
For getting information on the asymptotic properties of M-estimation in general models and for a rigorous proof, we can refer to Huber (1981) and Maronna et al. (2006). A.2 Background (LS-VCE) This appendix aims to provide a quick overview of the basic concepts and results in LS-VCE theory (Teunissen and Amiri-Simkooei 2008; Amiri-Simkooei 2007), which give us a useful insight into the previous discussions in Sect. 4. To draw a parallel between the theory and the method of least squares in linear models, it would be suitable to represent the functional and stochastic models in pairs. We therefore commence with the parametric form of the linear model of observation equations along with the stochastic model as follows E{y}= Ax; with E{(y − Ax)(y − Ax)T } = Q y. (32). Note that the second part of the above equation follows from the definition of the covariance matrix of the observables y. Eq. (32) can also be expressed in terms of the model of condition equations as (Teunissen et al. 2005) E{t} = E{BT y}= 0; with. E{t t T } = Qt = BT Q y B (33).

(20) GPS position time-series analysis. 31. p Given the (co)variance component model Q y = i=1 σi Qi , the second part of Eq. (33) can then be written as E{t t T } =. p . σi BT Qi B. (34). introduced in Eq. (24) goes along the same lines as the above steps. A.3 Decorrelation of the covariance matrix Q yvh. i=1. Using the preceding equation, we would like to find a linear model, like that given in Eq. (1), by which one can perform the standard least squares method. In other words, the (co)variance components σi , i = 1, . . . , p, do play the role of the unknown parameters to be estimated. A close look at Eq. (34) reveals that the main obstacle to achieve this goal lies in the bilinear form of the matrix t t T , which has to be transformed into the linear one in the first place. A simple way is to stack the columns of the matrix t t T under each other, thus converting the matrix to a vector. However, due to the symmetry of t t T , the off-diagonal elements would then appear twice. To avoid this, one can do the same by using the columns containing diagonal elements together with the lower ones. Such a transformation will be conducted via vh-operator (Magnus 1988). As a result, taking vh of Eq. (34) leads to Eq. (18) that makes our linear model for the (co)variance components. Although one can apply the least squares method with a user-defined weight matrix to this model, Teunissen and Amiri-Simkooei (2008) derived, in case of normally distributed data, the covariance matrix of the observables yvh = vh(t t T )as already introduced in Eq. (17). Taking the weight matrix as the inverse matrix −1 −1 1 T Q−1 yvh = 2 D (Qt ⊗ Qt )D, the elements of the normal matrix npl are given as follows (see Eq. 23) npl =. 1 T T −1 T vh (B Q p B)D T (Q−1 t ⊗ Qt )Dvh(B Ql B) (35) 2. In this section, we intend to show that using the transformation introduced in Eqs. (19), (20) and (21), results in decorrelation of the covariance matrix Q yvh given in Eq. (17). In other words, one should illustrate that the covariance matrix of the transformed observables y˜ vh is an identity matrix. To do so, one needs first the following four identities (Magnus 1988): (U ⊗ U)T = UT ⊗ UT. (38). D + (U ⊗ U)D D + = D + (U ⊗ U). (39). (U ⊗ S)(V ⊗ T) = (UV) ⊗ (ST) (D T D)−1 = D + D +T. (40) (41). Now, we are able to readily prove the claim. Applying the error propagation law, one can write T Q y˜vh = Gvh Q yvh Gvh. (42). With regard to Eqs. (17) and (20), the above equation can be formed as Q y˜vh = 2D + (GtT ⊗ GtT )D D + (Qt ⊗ Qt )D +T ×D T (Gt ⊗ Gt )D +T . (43). By using Eq. (39) and its transposition, the termsD D + and D +T D T can be left out, that is Q y˜vh = 2D + (GtT ⊗ GtT )(Qt ⊗ Qt )(Gt ⊗ Gt )D +T  (44). One can also find a similar expression for the elements of the right-hand side vector l p . Keeping this in mind, together with the following well-known identity (Magnus 1988; Koch 1999). Using Eq. (40) along with the relation GtT Qt Gt = I simplifies the above expression as follows. tr(UT TSVT ) = vh T (U)D T (V ⊗ T)Dvh(S). Based on the definition of the duplication matrix, the columns of Dare orthogonal to each other. Moreover, the columns corresponding to diagonal elements contain only a nonzero element equal to one, and the columns corresponding to offdiagonal elements contain two nonzero elements equal to one. Hence, one can achieve the following identity. (36). the elements npl and l p will finally be obtained as (see Eq. 24) 1 T −1 tr(BT Q p BQ−1 t B Ql BQt ), 2 1 T −1 l p = t T Q−1 t B Q p BQt t 2 npl =. (37). 
A.3 Decorrelation of the covariance matrix Q_{y_vh}

In this section, we intend to show that using the transformation introduced in Eqs. (19), (20) and (21) results in decorrelation of the covariance matrix Q_{y_vh} given in Eq. (17). In other words, one should illustrate that the covariance matrix of the transformed observables ỹ_vh is an identity matrix. To do so, one first needs the following four identities (Magnus 1988):

(U ⊗ U)^T = U^T ⊗ U^T    (38)
D^+ (U ⊗ U) D D^+ = D^+ (U ⊗ U)    (39)
(U ⊗ S)(V ⊗ T) = (UV) ⊗ (ST)    (40)
(D^T D)^{-1} = D^+ D^{+T}    (41)

Now, we are able to readily prove the claim. Applying the error propagation law, one can write

Q_{ỹ_vh} = G_vh Q_{y_vh} G_vh^T    (42)

With regard to Eqs. (17) and (20), the above equation can be formed as

Q_{ỹ_vh} = 2 D^+ (G_t^T ⊗ G_t^T) D D^+ (Q_t ⊗ Q_t) D^{+T} D^T (G_t ⊗ G_t) D^{+T}    (43)

By using Eq. (39) and its transposition, the terms D D^+ and D^{+T} D^T can be left out, that is

Q_{ỹ_vh} = 2 D^+ (G_t^T ⊗ G_t^T)(Q_t ⊗ Q_t)(G_t ⊗ G_t) D^{+T}    (44)

Using Eq. (40) along with the relation G_t^T Q_t G_t = I simplifies the above expression as follows

Q_{ỹ_vh} = 2 D^+ D^{+T}    (45)

Based on the definition of the duplication matrix, the columns of D are orthogonal to each other. Moreover, the columns corresponding to diagonal elements contain only one nonzero element equal to one, and the columns corresponding to off-diagonal elements contain two nonzero elements equal to one. Hence, one can achieve the following identity

D^T D = diag(1, 2, 2, ..., 2, 1, 2, 2, ..., 2, ..., 1)    (46)

in which the first group of 2's is repeated p − 1 times, the second group p − 2 times, and so on. Substituting Eq. (21) into Eq. (45), based on the relation given in Eq. (41), completes the proof. For a definition of the Kronecker product, the vh-operator, the duplication matrix and their properties, we refer to Amiri-Simkooei (2007), Teunissen and Amiri-Simkooei (2008) or Magnus (1988).
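The duplication-matrix identities used in the proof, in particular Eqs. (41) and (46), can be checked numerically. The following sketch is only an illustration: the helper functions vh and duplication_matrix are assumed constructions for this example and are not taken from the paper.

```python
import numpy as np

def vh(S):
    """Stack the lower-triangular part (including the diagonal) of S column-wise."""
    b = S.shape[0]
    return np.concatenate([S[i:, i] for i in range(b)])

def duplication_matrix(b):
    """Duplication matrix D such that vec(S) = D vh(S) for symmetric b x b matrices S."""
    D = np.zeros((b * b, b * (b + 1) // 2))
    col = 0
    for j in range(b):
        for i in range(j, b):
            E = np.zeros((b, b))
            E[i, j] = E[j, i] = 1.0
            D[:, col] = E.flatten(order='F')   # column-wise (vec) stacking
            col += 1
    return D

b = 4
D = duplication_matrix(b)
S = np.random.randn(b, b); S = S + S.T
print(np.allclose(D @ vh(S), S.flatten(order='F')))       # vec(S) = D vh(S)
DtD = D.T @ D
print(np.allclose(DtD, np.diag(np.diag(DtD))))            # D^T D is diagonal with 1's and 2's, cf. Eq. (46)
Dplus = np.linalg.pinv(D)
print(np.allclose(np.linalg.inv(DtD), Dplus @ Dplus.T))   # (D^T D)^-1 = D^+ D^+T, Eq. (41)
```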

Appendix B The change detection algorithm

For the purpose of offset detection in coordinate time series, Basseville and Nikiforov (1993) introduced the following maximization problem for white noise observations with the same mean (see Williams 2003b)

max_{t_off} { t_off (m − t_off)(μ_1 − μ_0)^2 }  such that  t_off ∈ {1, 2, ..., m}    (47)

with

μ_0 = (1/t_off) Σ_{i=1}^{t_off} y_i   and   μ_1 = (1/(m − t_off)) Σ_{i=t_off+1}^{m} y_i    (48)

Now, we show that the above algorithm in fact works on the basis of statistical testing and can be regarded as a special case of the procedure introduced in Sect. 3. The estimator σ̂^2 in Eq. (11) does not depend on t_off and can be left out. Further, we deal with offset detection only. As a result, the score function of Eq. (16) changes to

max_{t_off} { (A_k^T Q_o^{-1} P_A^⊥ A_k) x̂_k^2 }  such that  t_off ∈ {1, 2, ..., m}    (49)

For white noise observations with the same mean, one should set the following quantities to construct the test statistic given in Eq. (11)

A = ι = [1, 1, ..., 1]^T,   Q_o = I_m    (50)

Substituting Eq. (50) into Eq. (12) results in the following equation

P_A^⊥ = [r_ij]  with  r_ij = (m − 1)/m for i = j  and  r_ij = −1/m for i ≠ j    (51)

The least squares estimate x̂_k can be obtained as follows (Teunissen et al. 2005)

x̂_k = (A_k^T Q_o^{-1} P_A^⊥ A_k)^{-1} A_k^T Q_o^{-1} P_A^⊥ y    (52)

Using the matrix A_k introduced in Eq. (15) together with Eq. (51), one may obtain

P_A^⊥ A_k = (1/m) [−(m − t_off), ..., −(m − t_off), t_off, ..., t_off]^T    (53)

in which the first entry is repeated t_off times and the second (m − t_off) times. Consequently, Eq. (53) yields the following relations

A_k^T Q_o^{-1} P_A^⊥ A_k = (1/m) t_off (m − t_off)    (54)

and

A_k^T Q_o^{-1} P_A^⊥ y = (t_off (m − t_off)/m)(μ_1 − μ_0)    (55)

where, in the last equation, use has been made of the relations in Eq. (48). By substituting these results into Eq. (49), one can easily achieve

max_{t_off} { (1/m) t_off (m − t_off)(μ_1 − μ_0)^2 }  such that  t_off ∈ {1, 2, ..., m}    (56)

from which the claim follows.
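For illustration, the scan of Eqs. (47)–(48) can be coded in a few lines. The sketch below is an assumed implementation for a single offset in a white-noise series with constant mean; the function name detect_offset and the example numbers are not taken from the paper.

```python
import numpy as np

def detect_offset(y):
    """Return the epoch maximizing t_off (m - t_off) (mu_1 - mu_0)^2, cf. Eqs. (47)-(48).

    Assumes a single offset in a white-noise series with constant mean."""
    y = np.asarray(y, dtype=float)
    m = len(y)
    best_score, best_t = -np.inf, None
    for t_off in range(1, m):          # keep at least one sample on each side
        mu0 = y[:t_off].mean()         # mean before the candidate epoch
        mu1 = y[t_off:].mean()         # mean after the candidate epoch
        score = t_off * (m - t_off) * (mu1 - mu0) ** 2
        if score > best_score:
            best_score, best_t = score, t_off
    return best_t, best_score

# Example: a 100-sample series with a 5 mm offset introduced after epoch 60
rng = np.random.default_rng(1)
y = rng.normal(0.0, 2.0, 100)
y[60:] += 5.0
print(detect_offset(y))   # detected change epoch should be close to 60
```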
References

Amiri-Simkooei AR (2003) Formulation of L1 norm minimization in Gauss–Markov models. J Surv Eng 129(1): 37–43
Amiri-Simkooei AR (2007) Least-squares variance component estimation: theory and GPS applications. Ph.D. thesis, Delft University of Technology, Publications on Geodesy 64, Netherlands Geodetic Commission, Delft
Amiri-Simkooei AR, Tiberius CCJM, Teunissen PJG (2007) Assessment of noise in GPS coordinate time-series: methodology and results. J Geophys Res 112:B07413
Babu GJ (1989) Strong representations for LAD estimates in linear models. Probab Theory Relat Fields 83: 547–558
Bahadur RR (1966) A note on quantiles in large samples. Ann Math Stat 37: 577–580
Baselga S (2007) A global optimization solution of robust estimation. J Surv Eng 133(3): 123–128
Basseville M, Nikiforov IV (1993) Detection of abrupt changes: theory and application. Prentice-Hall, Englewood Cliffs, 469 pp
Beutler G, Rothacher M, Schaer S, Springer TA, Kouba J, Neilan RE (1999) The International GPS Service (IGS): an interdisciplinary service in support of Earth sciences. Adv Space Res 23(4): 631–635
Calais E (1999) Continuous GPS measurements across the Western Alps 1996–1998. Geophys J Int 138: 221–230
Davis RA (1996) Gauss–Newton and M-estimation for ARMA processes with infinite variance. Stoch Process Appl 63: 75–95
Ding XL, Zheng DW, Dong DN, Ma C, Chen YQ, Wang GL (2005) Seasonal and secular positional variations at eight colocated GPS and VLBI stations. J Geod 79(1-3): 71–81
Dueck A, Lohr S (2005) Robust estimation of multivariate covariance components. Biometrics 61: 162–169
Efron B, Tibshirani RJ (1993) An introduction to the bootstrap. Chapman & Hall, London
Erenoglu RC, Hekimoglu S (2009) An investigation into robust estimation applied to GPS networks. In: International Association of Geodesy Symposia, vol 133. Springer, Berlin, pp 639–644
Gervini D, Yohai VJ (1998) Robust estimation of variance components. Can J Stat 26: 419–430
Gui Q, Zhang J (1998) Robust biased estimation and its application in geodetic adjustments. J Geod 72: 430–435
Hampel FR, Ronchetti EM, Rousseeuw PJ, Stahel WA (1986) Robust statistics: the approach based on influence functions. John Wiley & Sons, New York
He X, Shao Q (1996) A general Bahadur representation of M-estimators and its application to linear regression with nonstochastic designs. Ann Stat 24(6): 2608–2630
Hekimoglu S (1997) The finite sample breakdown points of conventional iterative outlier detection procedures. J Surv Eng 123(1): 15–31
Huber PJ (1981) Robust statistics. John Wiley & Sons, New York
Kenyeres A, Bruyninx C (2004) EPN coordinate time-series monitoring for reference frame. GPS Solut 8: 200–209

Khodabandeh A, Amiri-Simkooei AR (2010) Recursive algorithm for L1 norm estimation in linear models. J Surv Eng. doi:10.1061/(ASCE)SU.1943-5428.0000031
Koch KR (1996) Robuste Parameterschätzung. Allgemeine Vermessungs-Nachrichten 103: 1–18
Koch KR (1999) Parameter estimation and hypothesis testing in linear models. Springer, Berlin
Krarup T, Juhl J, Kubik K (1980) Götterdämmerung over least squares adjustment. In: Proc 14th Congress of the International Society of Photogrammetry, Hamburg, vol B3, pp 369–378
Krarup T, Kubik K (1983) The Danish method: experience and philosophy. DGK Reihe A, Heft 98: 131–134
Langbein J (2008) Noise in GPS displacement measurements from Southern California and Southern Nevada. J Geophys Res 113(B5):B05405
Magnus JR (1988) Linear structures. London School of Economics and Political Science, Charles Griffin & Company, London; Oxford University Press, New York
Mao A, Harrison CGA, Dixon TH (1999) Noise in GPS coordinate time-series. J Geophys Res 104(B2): 2797–2816
Maronna RA, Martin RD, Yohai VJ (2006) Robust statistics: theory and methods. Wiley Series in Probability and Statistics, England
Martin RD (1981) Robust methods for time-series. In: Findley DF (ed) Applied time-series analysis II. Academic Press, New York, pp 683–759
Martin RD, Yohai VJ (1986) Influence functionals for time-series. Ann Stat 14: 781–818
Martin RD, Yohai VJ (1991) Bias robust estimation of autoregression parameters. In: Stahel W, Weisberg S (eds) Directions in robust statistics and diagnostics, Part I. IMA Volumes in Mathematics and its Applications, vol 30. Springer, Berlin
Peng JH (2005) The asymptotic variance-covariance matrix, Baarda testing and the reliability of L1 norm estimates. J Geod 78: 668–682
Peng JH (2009) Jointly robust estimation of unknown parameters and variance components based on expectation-maximization algorithm. J Surv Eng 135(1): 1–9
Perfetti N (2006) Detection of station coordinate discontinuities within the Italian GPS Fiducial Network. J Geod 80(7): 381–396
Pope AJ (1976) The statistics of residuals and the detection of outliers. NOAA Technical Report NOS65 NGS1, US Department of Commerce, National Geodetic Survey, Rockville, Maryland
Ray J, Altamimi Z, Collilieux X, van Dam T (2008) Anomalous harmonics in the spectra of GPS position estimates. GPS Solut 12: 55–64
Rousseeuw PJ, Leroy AM (1987) Robust regression and outlier detection. Wiley, New York
Shao J (2003) Mathematical statistics, 2nd edn. Springer, New York
Teunissen PJG (1988) Towards a least-squares framework for adjusting and testing of both functional and stochastic model. Internal research memo, Geodetic Computing Centre, Delft
Teunissen PJG (2000) Testing theory: an introduction. Series on Mathematical Geodesy and Positioning, Delft University Press, Delft, The Netherlands
Teunissen PJG, Simons DG, Tiberius CCJM (2005) Probability and observation theory. Lecture notes AE2-E01, Delft University of Technology, Delft, The Netherlands
Teunissen PJG, Amiri-Simkooei AR (2008) Least-squares variance component estimation. J Geod 82(2): 65–82
Vanicek P (1969) Approximate spectral analysis by least-squares fit. Astrophys Space Sci 4: 387–391
Williams SDP (2003a) The effect of coloured noise on the uncertainties of rates estimated from geodetic time-series. J Geod 76: 483–494
Williams SDP (2003b) Offsets in global positioning system time-series. J Geophys Res 108(B6):2310
Williams SDP, Bock Y, Fang P, Jamason P, Nikolaidis RM, Prawirodirdjo L, Miller M, Johnson DJ (2004) Error analysis of continuous GPS position time-series. J Geophys Res 109:B03412
Yang Y (1994) Robust estimation for dependent observations. Manuscr Geod 19: 10–17
Yang Y (1999) Robust estimation of geodetic datum transformation. J Geod 73: 268–274
Yang Y, Song L, Xu T (2001) Robust estimator for the adjustment of correlated GPS networks. In: Carosio A, Kutterer H (eds) First International Symposium on Robust Statistics and Fuzzy Techniques in Geodesy and GIS, vol 295, pp 199–207
Yang Y, Xu TH, Song LJ (2005) Robust estimation of variance components with application in global positioning system network adjustment. J Surv Eng 131(4): 107–112
Zhang J, Bock Y, Johnson H, Fang P, Williams S, Genrich J, Wdowinski S, Behr J (1997) Southern California permanent GPS geodetic array: error analysis of daily position estimates and site velocities. J Geophys Res 102: 18035–18055
Zumberge JF, Heflin MB, Jefferson DC, Watkins MM, Webb FH (1997) Precise point positioning for the efficient and robust analysis of GPS data from large networks. J Geophys Res 102: 5005–5017
