
Kriging - A Method of Statistical Interpolation of Spatial Data


ACTA UNIVERSITATIS LODZIENSIS

FOLIA OECONOMICA 206, 2007

Jan Kowalik*

KRIGING - A METHOD OF STATISTICAL INTERPOLATION OF SPATIAL DATA

Abstract. In analyses of spatial phenomena it can be impossible or very expensive, because of practical constraints, to obtain the value (realization) of a studied phenomenon at all of its locations. To estimate the values of the variable at those locations one can resort to a geostatistical method of data estimation (interpolation) called kriging. Kriging is a basic method of spatial data estimation used in geostatistics; it interpolates unknown values of a regionalized (spatial) variable from its known values at other locations. This paper presents the basic assumptions of geostatistics, the theoretical grounds of kriging, its types and its applications in various fields.

Key words: geostatistics, regionalized variable, kriging, analysis of variograms.

1. INTRODUCTION

Spatial continuity is typical of many natural, sociological and economic phenomena. In reality, however, it is impossible to obtain a realization of a studied phenomenon at every desired point in space. This is mainly because of financial and natural constraints, e.g. the landscape often makes it impossible or very expensive to conduct measurements at a given location. Thus, in order to determine the spatial distribution of a studied phenomenon, one can take advantage of geostatistical methods. Geostatistics is a branch of statistics that exploits the spatial continuity of phenomena and adapts the methods of classical regression in order to determine (estimate, interpolate) the spatial distributions of studied phenomena. Geostatistical theory is based on the following assumption: near a point at which a certain variable takes a given value there are points with similar values. It can therefore be said that realizations of a studied phenomenon at locations separated from one another are correlated.

The first scientist to draw attention to the importance of spatial continuity in estimating the distribution of phenomena was D. G. Krige, who in the middle of the 20th century (1951) used spatial continuity in the mining of gold deposits in South Africa. The theory put forward by Krige was developed by the French mathematician G. Matheron (1962), later a professor. Since then geostatistics has become a popular field with many applications, ranging from its earliest use in the mining industry to geology, pedology, environmental protection and economic issues.

The basic aim of this paper is to present kriging, a basic method of spatial data estimation used in geostatistics, together with its types and its applications in various fields.

2. RULES OF GEOSTATISTICS

The most characteristic aspect of geostatistics is the notion of regionalized variables, which have properties intermediate between those of random variables and of completely deterministic variables. A regionalized variable has two aspects: a local, random, erratic aspect and a structured aspect, which reflects the complexity of the phenomenon.

Geostatistics is based on the notion of the random function, according to which the set of values z(x) of a parameter at all locations x is regarded as a single realization of the set of dependent random variables Z(x).

The analysis of spatial data by geostatistical methods requires knowledge of the first two statistical moments of the random function for a given phenomenon, namely:

- the first-order statistical moment (mean)

$$E[Z(x)] = m(x) \tag{1}$$

- the second-order statistical moments:

a) the variance

$$D^2[Z(x)] = E\big[(Z(x) - m(x))^2\big] \tag{2}$$

b) the covariance, which is a function of the locations $x_1$ and $x_2$

$$C(x_1, x_2) = E\big[(Z(x_1) - m(x_1)) \cdot (Z(x_2) - m(x_2))\big] = E[Z(x_1) \cdot Z(x_2)] - m(x_1) \cdot m(x_2) \tag{3}$$

c) the semivariogram, defined as half the variance of the difference of the random variables at two different locations $x_1$ and $x_2$

$$\gamma(x_1, x_2) = \frac{1}{2} D^2[Z(x_1) - Z(x_2)] \tag{4}$$

In order to apply the various methods of geostatistical analysis, a given phenomenon or process has to be stationary, i.e. its properties must not change when the origin of the time scale or of the spatial scale is shifted. The stationarity of spatial data is specified by means of the so-called hypotheses of stationarity. Two main types of stationarity are usually distinguished (Meul, Van Meirvenne 2003): stationarity in the narrower sense (first-order stationarity) and stationarity in the wider sense (second-order stationarity or weak stationarity).

The random function is stationary in the narrower sense when all the moments of its distribution are invariant under translation by the displacement vector. In reality, however, this assumption is hardly ever fulfilled.

The most frequently used hypothesis is therefore second-order stationarity, which is restricted to the first two statistical moments of the random function. The random function Z(x) is weakly stationary when the mathematical expectation E[Z(x)] exists and does not depend on the location of the point x

$$E[Z(x)] = m \quad \forall x \tag{5}$$

and when, for each pair of random variables [Z(x), Z(x + h)], the covariance exists and depends only on the separation vector h

$$C(h) = E[Z(x + h) \cdot Z(x)] - m^2 \quad \forall x \tag{6}$$

The hypothesis of second-order stationarity can be weakened by assuming only the stationarity of the first two moments of the increments of the random function. This is called intrinsic stationarity. The hypothesis of intrinsic stationarity assumes that the mean and the variance of the increments [Z(x + h) - Z(x)] exist and do not depend on the location vector x

$$E[Z(x + h) - Z(x)] = 0 \quad \forall x \tag{7}$$

$$D^2[Z(x + h) - Z(x)] = E\big[(Z(x + h) - Z(x))^2\big] = 2\gamma(h) \quad \forall x \tag{8}$$

where $\gamma(h)$ is the semivariance.
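Under the stronger hypothesis of second-order stationarity the semivariance and the covariance carry the same information. The short derivation below is an added illustration, not part of the original text; it uses only equations (5), (6) and (8), and writes C(0) for the variance of Z(x), which is an assumed shorthand.

```latex
\begin{aligned}
\gamma(h) &= \tfrac{1}{2}\, E\big[(Z(x+h) - Z(x))^2\big] \\
          &= \tfrac{1}{2}\big(E[Z(x+h)^2] + E[Z(x)^2]\big) - E[Z(x+h) \cdot Z(x)] \\
          &= \big(C(0) + m^2\big) - \big(C(h) + m^2\big) \\
          &= C(0) - C(h)
\end{aligned}
```

The converse does not hold in general: an intrinsically stationary function has a semivariance but need not have a finite variance C(0), which is why the intrinsic hypothesis is the weaker of the two.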

Depending on the type of stationarity of the random function of a studied phenomenon, different methods of geostatistical analysis are used.

3. THE THEORY OF KRIGING

Kriging is a basic method of spatial data estimation used in geostatistics. It is a method of creating optimal unbiased estimates of regionalized variables at unsampled locations which uses the hypothesis of stationarity, the structural properties of the covariance and the initial set of data. In the literature the kriging estimator is described as the best linear unbiased estimator (BLUE). It is best because the variance of the estimation error is minimized. It is linear because the estimate Z* at an unsampled location is a weighted sum of the available data $Z_i$ (Borga, Vizzaccaro 1997)

$$Z^* = \sum_{i=1}^{n} \omega_i Z_i \tag{9}$$

where:

$\omega_i$ - the weights determined for each measurement point,

$Z_i$ - the values of the regionalized variable at the measurement points,

$n$ - the number of scattered measurement points in the data set.

The kriging weights sum to unity, which ensures the unbiasedness of the estimator:

$$\sum_{i=1}^{n} \omega_i = 1 \tag{10}$$

Kriging is based on the variogram function, also known as the semivariogram or semivariance. The semivariogram is a function describing the structure of the regionalized variable; it presents the spatial or temporal behavior of this variable in a studied data set.

The semivariance is defined as half the variance of the difference of the random variables separated by the vector h

$$\gamma(h) = \frac{1}{2} D^2[Z(x) - Z(x + h)] \tag{11}$$

Estimating the variogram boils down to calculating the empirical variogram of the studied regionalized variable and then finding a suitable theoretical variogram that matches its course.

The empirical variogram describes the spatial correlation of the random sample. It is a curve obtained by plotting the semivariance against the distance h between the measurement points. The experimental semivariogram for the distance h is determined from the equation (Ploner, Dutter 2000)

$$\gamma^*(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \big[Z(x_i) - Z(x_i + h)\big]^2 \tag{12}$$

where:

$N(h)$ - the number of pairs of points separated by the distance h.

The theoretical variogram describes the spatial correlation in the source population; it is a simple mathematical function that models the trend of the empirical variogram. The commonly used theoretical semivariograms (Ploner, Dutter 2000; Zawadzki 2002) are the nugget model, the spherical model, the exponential model, the Gaussian model and the linear model. These are referred to as basic models. The models are as follows:

a) the nugget model

b) the spherical model

c) the exponential model

d) the Gaussian model

$$\gamma(h) = c_0 + c_1\left(1 - e^{-h^2/a^2}\right)$$

e) the linear model

$$\gamma(h) = c_0 + bh,$$

where in all models:

$c_0$ - the nugget effect,

$a$ - the range,

$c_1$ - the sill,

$b$ - the ratio of the sill $c_1$ to the range $a$.
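Equation (12) translates almost directly into code. The following is a minimal Python sketch and not part of the original paper: the function name `empirical_semivariogram`, the use of NumPy and the grouping of pair distances into lag bins (`lag_edges`) are illustrative assumptions, made because scattered data rarely contain pairs separated by exactly the same distance h.

```python
import numpy as np

def empirical_semivariogram(coords, values, lag_edges):
    """Empirical semivariance per distance bin, following equation (12).

    coords    : (n, d) array of measurement locations
    values    : (n,) array of measured values z(x_i)
    lag_edges : increasing bin edges; a pair falls in a bin if lo < h <= hi
    Returns the mean distance, the semivariance and the pair count N(h)
    of every non-empty bin.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)

    # Every unordered pair of points (i < j) is used exactly once.
    i, j = np.triu_indices(len(values), k=1)
    dists = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diffs = (values[i] - values[j]) ** 2

    lags, gammas, counts = [], [], []
    for lo, hi in zip(lag_edges[:-1], lag_edges[1:]):
        in_bin = (dists > lo) & (dists <= hi)
        n_pairs = int(in_bin.sum())
        if n_pairs == 0:
            continue  # no pairs at this lag, so nothing to estimate
        lags.append(dists[in_bin].mean())
        gammas.append(sq_diffs[in_bin].sum() / (2.0 * n_pairs))
        counts.append(n_pairs)
    return np.array(lags), np.array(gammas), np.array(counts)

# Made-up example: four equally spaced points on a transect.
lags, gammas, counts = empirical_semivariogram(
    [[0.0], [1.0], [2.0], [3.0]], [1.0, 3.0, 2.0, 5.0], [0.0, 1.5, 2.5]
)
```

In this example three pairs are one unit apart and two pairs are two units apart; the single pair three units apart lies outside the last bin and is ignored.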

The nugget effect, the sill and the range are the three characteristic parameters of the semivariogram, as shown in Fig. 1.

Fig. 1. The most important characteristics of the semivariogram

Source: own elaboration.

The sill is the highest value of the variogram, at which the function no longer increases. The sill should be approximately equal to the sample variance.

The nugget effect is a constant that, in the variogram model, represents the variation of the data at a scale smaller than the sampling interval. It can also be caused by sampling error (Francois-Bongarcon 2004). In the semivariogram this quantity appears as the intercept (constant term).


The range is a scalar quantity that controls the degree of correlation between data points and is usually expressed as a distance. It is the distance from zero to the point at which the semivariogram reaches 95% of its constant value (the sill).
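The roles of the nugget effect, the sill and the range are easiest to see by evaluating the basic models. The sketch below assumes common textbook forms of the five models; the function names and the exact parameterization (in particular, no scaling factor in the exponent of the exponential and Gaussian models) are assumptions rather than the paper's own definitions. Here `c0` is the nugget effect, `c0 + c1` is the plateau and `a` is the range parameter.

```python
import numpy as np

def _zero_at_origin(h, g):
    # A semivariogram is 0 at h = 0; the nugget appears only for h > 0.
    return np.where(h > 0, g, 0.0)

def nugget_model(h, c0):
    h = np.asarray(h, dtype=float)
    return _zero_at_origin(h, c0 * np.ones_like(h))

def spherical_model(h, c0, c1, a):
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)  # the model is flat beyond the range a
    return _zero_at_origin(h, c0 + c1 * (1.5 * r - 0.5 * r**3))

def exponential_model(h, c0, c1, a):
    h = np.asarray(h, dtype=float)
    return _zero_at_origin(h, c0 + c1 * (1.0 - np.exp(-h / a)))

def gaussian_model(h, c0, c1, a):
    h = np.asarray(h, dtype=float)
    return _zero_at_origin(h, c0 + c1 * (1.0 - np.exp(-(h**2) / a**2)))

def linear_model(h, c0, b):
    h = np.asarray(h, dtype=float)
    return _zero_at_origin(h, c0 + b * h)

# Evaluate every model on the same distances (illustrative parameters).
h = np.linspace(0.0, 30.0, 7)
curves = {
    "nugget": nugget_model(h, 1.0),
    "spherical": spherical_model(h, 1.0, 4.0, 10.0),
    "exponential": exponential_model(h, 1.0, 4.0, 10.0),
    "gaussian": gaussian_model(h, 1.0, 4.0, 10.0),
    "linear": linear_model(h, 1.0, 4.0 / 10.0),
}
```

With these forms the spherical model reaches its plateau exactly at h = a, whereas the exponential and Gaussian models only approach it: they reach about 95% of `c1` at h = 3a and at h = sqrt(3)·a respectively. Other texts build a factor of 3 into the exponent so that the 95% point falls at h = a.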

After the empirical and the theoretical models of the semivariogram have been constructed, they can be used to determine the kriging weights $\omega_i$. For example, to perform an interpolation at point P on the basis of the data from the three surrounding points $P_1$, $P_2$ and $P_3$ we use the following kriging equation:

$$Z_p = \omega_1 Z_1 + \omega_2 Z_2 + \omega_3 Z_3 \tag{13}$$

Using the semivariogram and the condition that the kriging weights sum to one, we can construct the following system of equations to calculate the weights $\omega_i$

$$\begin{aligned}
\omega_1\gamma(h_{11}) + \omega_2\gamma(h_{12}) + \omega_3\gamma(h_{13}) + \lambda &= \gamma(h_{1p}) \\
\omega_1\gamma(h_{21}) + \omega_2\gamma(h_{22}) + \omega_3\gamma(h_{23}) + \lambda &= \gamma(h_{2p}) \\
\omega_1\gamma(h_{31}) + \omega_2\gamma(h_{32}) + \omega_3\gamma(h_{33}) + \lambda &= \gamma(h_{3p}) \\
\omega_1 + \omega_2 + \omega_3 + 0 &= 1
\end{aligned} \tag{14}$$

where:

$\gamma(h_{ij})$ - the semivariance for the distance h between the control points i and j,

$\gamma(h_{ip})$ - the semivariance for the distance h between the control point i and the interpolated point p,

$\lambda$ - the Lagrange multiplier (the constant term of the equation).

The value of the studied phenomenon at point P is obtained by solving the system of equations (14) for the weights $\omega_i$ and substituting them into equation (13).

An important characteristic of kriging is that the variogram can be used to determine the estimation error at every interpolated point. The estimation variance can be calculated from the following formula:

$$s_e^2 = \omega_1\gamma(h_{1p}) + \omega_2\gamma(h_{2p}) + \omega_3\gamma(h_{3p}) + \lambda \tag{15}$$
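The three-point example can be reproduced numerically. The sketch below is an illustration under stated assumptions: the coordinates, the measured values and the linear semivariogram γ(h) = h are invented for the example, and `krige_point` is a hypothetical helper, not a library function. It builds the system (14) for any number of control points, solves it for the weights and the constant term λ (the Lagrange multiplier), and then evaluates equations (13) and (15).

```python
import numpy as np

def linear_gamma(h, c0=0.0, b=1.0):
    """Assumed semivariogram model: c0 + b*h for h > 0, and 0 at h = 0."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, c0 + b * h, 0.0)

def krige_point(coords, values, target, gamma=linear_gamma):
    """Ordinary point kriging of one target location from n control points."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Distances between control points (h_ij) and to the target (h_ip).
    h_ij = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    h_ip = np.linalg.norm(coords - target, axis=1)

    # Left-hand side of system (14): semivariances bordered by ones,
    # with 0 in the corner for the sum-to-one condition.
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = gamma(h_ij)
    lhs[n, n] = 0.0

    # Right-hand side: semivariances to the target, then 1.
    rhs = np.ones(n + 1)
    rhs[:n] = gamma(h_ip)

    solution = np.linalg.solve(lhs, rhs)
    weights, lam = solution[:n], solution[n]

    estimate = float(weights @ values)         # equation (13)
    variance = float(weights @ rhs[:n] + lam)  # equation (15)
    return estimate, variance, weights

# Three control points on an equilateral triangle, target at its centroid.
points = [(0.0, 0.0), (2.0, 0.0), (1.0, np.sqrt(3.0))]
z = [10.0, 12.0, 17.0]
estimate, variance, weights = krige_point(points, z, (1.0, np.sqrt(3.0) / 3.0))
```

By symmetry the three weights are equal here, so the estimate is the arithmetic mean of the three values. Placing the target on a control point returns that point's value with zero estimation variance, because the assumed model has no nugget effect.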

4. TYPES OF KRIGING

Taking into account the various degrees of stationarity of the spatial data used in geostatistical analysis, we can distinguish two basic kriging techniques: ordinary kriging and universal kriging. Both can be used as point kriging (which estimates the value of a studied phenomenon at a given point) or as block kriging (which provides the mean value of a studied quantity over a certain 2D or 3D area).

Ordinary kriging is one of the basic kriging methods. At the unsampled location $x_0$ the value of the regionalized variable is estimated as (Rehman, Ghori 2000)

$$Z^*(x_0) = \sum_{i=1}^{n} \omega_i Z(x_i) \tag{16}$$

where:

$Z^*(x_0)$ - the estimated value of the random variable Z at the unsampled location $x_0$,

$\omega_i$ - the n weights determined for the observed points $Z(x_i)$.

The random function Z(x) can be decomposed into a trend component m(x) and a remainder component R(x) (Meul, Van Meirvenne 2003)

Z(x) = m(x) + R(x) (17)

Ordinary kriging assumes the stationarity of the mean, i.e. that m(x) is a constant but unknown value. It is also assumed that the data set has a constant variance. The remainder component R(x) is modeled as a stationary random function with mean equal to zero and, according to the assumption of intrinsic stationarity, its spatial dependence is defined by the semivariance

$$\gamma_R(h) = \frac{1}{2} E\big[(R(x + h) - R(x))^2\big] \tag{18}$$

where:

$R(x)$ - the remainder component,

$R(x + h)$ - the remainder component at a location separated from x by the vector h.

Ordinary kriging is distinguished by the high reliability of the estimates obtained and it is recommended for most data sets.

Universal kriging is a more complex method of kriging, as it is a two-stage procedure. Universal kriging assumes that m(x) is not stationary but changes smoothly within the local neighborhood, thus representing a local trend. The trend component m(x) is modeled as a weighted sum of the known functions $f_l(x)$ with the unknown coefficients $a_l$, $l = 0, \ldots, L$ (Meul, Van Meirvenne 2003)

$$m(x) = \sum_{l=0}^{L} a_l f_l(x) \tag{19}$$

In practice the semivariance of the remainders $\gamma_R(h)$ given in equation (18) is calculated before the trend m(x) is modeled. As the values of the attribute z(x) are the only available data, the semivariance of the remainders is calculated from observation pairs that are not influenced, or are only slightly influenced, by the trend.

A specific type of kriging is ordinary cokriging, a multivariate extension of ordinary kriging. To estimate the unknown value of the regionalized variable at location $x_0$ it uses the information from the sample points of both the main variable $z(x_i)$ and the secondary variable $z(y_j)$.

There is, of course, a certain statistical correlation between the main and the secondary variable. The cokriging estimator is written as

$$Z^* = \sum_{i} \omega_i z(x_i) + \sum_{j} \lambda_j z(y_j) \tag{20}$$

where:

$\omega_i$ and $\lambda_j$ are the weights for the main and the secondary variable, respectively.

The basic assumption of ordinary cokriging is the local stationarity of the main and the secondary variable in a certain neighborhood of the point $x_0$ at which the interpolation is made.

5. EXAMPLES OF APPLICATIONS OF KRIGING

Nowadays kriging methods, apart from their original applications in the mining industry (hard coal, copper and crude oil), are also used in such fields as environmental protection, pedology, agriculture, hydrology, fishing, forestry, meteorology and economics. The main reason for using kriging in so many fields is that it can substantially limit the cost of examining the distribution of a spatial phenomenon in a given area. On the basis of the information obtained at a few sampling locations, kriging methods make it possible to interpolate the value of the examined phenomenon at other locations without conducting expensive examinations.


An example of the application of kriging in environmental protection was its use to estimate the distribution of radiation in Belarus following the Chernobyl nuclear power station disaster. Kriging is also used in studies of solar radiation (Rehman, Ghori 2000). Applying kriging to determine the distribution of solar radiation helped to reduce the number of stations needed to gather these data.

Kriging methods are also widely used in hydrology, e.g. to determine the spatial distribution of precipitation in a given area (Borga, Vizzaccaro 1997). Kriging is also used in studies aimed at determining the distribution of the water-surface level of reservoirs and the state of groundwater, which is of great importance in the management of water resources.

Kriging methods can also be used in agriculture, e.g. to determine soil moisture in a given area (Usowicz 1999) or to determine the extent to which crops should be irrigated (Sousa, Pereira 1999).

Kriging methods are also used in economics. The Spanish scientists J. Mira and M. J. Sanchez (2004) present the concept of using kriging in an iterative procedure for detecting atypical observations (outliers) in economic time series. They suggest using the geostatistical kriging method to approximate the sampling distribution, taking into account the detection of atypical observations in a time series.

6. CONCLUSIONS

The presented examples of the use of kriging in various fields show that geostatistics is a rapidly developing branch of statistics. Its popularity results from the fact that the methods of geostatistical analysis give satisfactory results in estimating the distribution of spatial phenomena. Using kriging leads to a considerable reduction in the cost of obtaining information about the behavior of a given phenomenon in a defined area. Apart from these economic advantages, kriging also has purely statistical advantages over other methods used in the analysis of spatial data. The most important of them is that it provides an estimation error at every interpolated point. It is also described as the best linear unbiased estimator.

Despite the dynamic development and wide range of applications of geostatistical methods around the world, in Poland this methodology is still not used frequently enough and it should be popularized.


REFERENCES

Borga M., Vizzaccaro A. (1997), On the interpolation of hydrologic variables: formal equivalence of multiquadratic surface fitting and kriging, "Journal of Hydrology", 195, 160-171.

Francois-Bongarcon D. (2004), Theory of sampling and geostatistics: An intimate link, "Chemometrics and Intelligent Laboratory Systems", 74, 143-148.

Krige D. G. (1951), A statistical approach to some mine valuations and allied problems at the Witwatersrand, Master's thesis, University of the Witwatersrand, South Africa.

Matheron G. (1962-1963), Traité de Géostatistique Appliquée, Technip, Paris.

Meul M., Van Meirvenne M. (2003), Kriging soil texture under different types of nonstationarity, "Geoderma", 112, 217-233.

Mira J., Sanchez M. J. (2004), Prediction of deterministic functions: An application of a Gaussian kriging model to a time series outlier problem, "Computational Statistics & Data Analysis", 44, 477-491.

Ploner A., Dutter R. (2000), New directions in geostatistics, "Journal of Statistical Planning and Inference", 91, 499-509.

Rehman S., Ghori S. (2000), Spatial estimation of global solar radiation using geostatistics, "Renewable Energy", 21, 583-605.

Sousa V., Pereira L. S. (1999), Regional analysis of irrigation water requirements using kriging. Application to potato crop at Trás-os-Montes, "Agricultural Water Management", 40, 221-223.

Usowicz B. (1999), Implementation of geostatistical analysis and the theory of fractals in the study of the dynamics of the soil humidity in cultivable areas, "Acta Agrophysica", 22, 229-243.

Zawadzki J. (2002), Implementation of geostatistical methods in the analysis of spatial data, "Wiadomości Statystyczne", 12, 23-37.

Jan Kowalik

KRIGING - METODA STATYSTYCZNEJ INTERPOLACJI DANYCH PRZESTRZENNYCH (KRIGING - A METHOD OF STATISTICAL INTERPOLATION OF SPATIAL DATA)

In analyses of spatial phenomena one may encounter a situation in which, because of practical constraints, it is impossible or very expensive to obtain the value (realization) of the studied phenomenon at all locations. In such a case, the geostatistical method of data estimation (interpolation) called kriging is used to determine the values of the variables at those points. Kriging is the basic method of spatial data estimation used in geostatistics; it interpolates unknown values of a regionalized (spatial) variable on the basis of its known values at other locations. This paper presents the basic assumptions of geostatistics, the theoretical foundations of the kriging method, its types and examples of its applications in various fields of knowledge.
