Z. Ciesielski (Poznań)
Gaussian processes*
The purpose of this lecture is to present a construction of a Gaussian process in terms of the associate covariance and in terms of a Schauder basis of the space of continuous functions.
Let us start with a probability space $(\Omega, \mathcal{F}, P)$. Let $X$ be a random variable normally distributed with the variance $\sigma^2 > 0$ and the mean $m$, i.e.

$$P(X < a) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{a} \exp\left(-\frac{(u-m)^2}{2\sigma^2}\right) du.$$
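As a quick numerical sanity check (our own illustration, not part of the lecture; the function names and step counts are arbitrary), the distribution function above can be approximated by integrating the normal density directly and compared against the closed form in terms of the error function:

```python
import math

def normal_cdf(a, m=0.0, sigma=1.0, steps=200_000):
    """Approximate P(X < a) by trapezoidal integration of the normal density."""
    lo = m - 10 * sigma          # the density is negligible below this point
    h = (a - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        u = lo + i * h
        w = 0.5 if i in (0, steps) else 1.0   # trapezoidal end-point weights
        total += w * math.exp(-(u - m) ** 2 / (2 * sigma ** 2))
    return total * h / (sigma * math.sqrt(2 * math.pi))

def normal_cdf_erf(a, m=0.0, sigma=1.0):
    """The same probability via the error function."""
    return 0.5 * (1 + math.erf((a - m) / (sigma * math.sqrt(2))))

approx = normal_cdf(1.0, m=0.5, sigma=2.0)
exact = normal_cdf_erf(1.0, m=0.5, sigma=2.0)
```

The two values agree to high accuracy, confirming that the density integrates to the stated distribution function.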
The random vector $X = (X_1, X_2, \ldots, X_n)$ is said to have $n$-dimensional normal distribution if there is a positive definite (symmetric) $n \times n$ matrix $(b_{kl})$ and a vector $m = (m_1, \ldots, m_n)$ such that the density of the distribution
$$F_X(a_1, \ldots, a_n) = P(X_1 < a_1, \ldots, X_n < a_n)$$

is given by

$$p_X(x_1, \ldots, x_n) = \frac{|B|^{1/2}}{(2\pi)^{n/2}} \exp\left(-\frac{1}{2}\sum_{k,l=1}^{n} b_{kl}(x_k - m_k)(x_l - m_l)\right),$$

where $|B| > 0$ is the determinant of $(b_{kl})$.
Let us now take a look at the characteristic functions of the random variable $X$ and of the random vector $X$. In the first case we have

(1) $\varphi_X(t) = \exp\left(imt - \tfrac{1}{2}\sigma^2 t^2\right)$.
For $\sigma = 0$ the function $\varphi_X(t)$ defined by (1) is again a characteristic function of a random variable, which we shall call the degenerate normal random variable. This random variable has a trivial distribution, i.e.

$$P(X = m) = 1.$$
* This is the content of the lecture delivered at the Instructional Conference on Mathematical Probability, Durham, March 28th - April 11th, 1963.
Similarly, in the general case we have the following formula for the characteristic function of $X$:

(2) $\varphi_X(t) = \exp\left(i(m, t) - \frac{1}{2}\sum_{k,l=1}^{n} a_{kl} t_k t_l\right)$,
where $(a_{kl})$ is the inverse matrix to $(b_{kl})$, hence $(a_{kl})$ is also positive definite, i.e.

(3) $\sum_{k,l=1}^{n} a_{kl} t_k t_l > 0$

if and only if $t = (t_1, \ldots, t_n) \neq 0$. Now, as in the previous case, we allow the matrix $(a_{kl})$ to be non-negative definite, i.e. we only assume that for any $t$

(4) $\sum_{k,l=1}^{n} a_{kl} t_k t_l \geq 0$.
Under the weakened assumption (4) the function $\varphi_X$ defined by (2) still represents a characteristic function of a random vector $X$. It is obvious that some of the components of such $X$ may have degenerate normal distribution. From now on, a Gaussian random vector (variable) will be any $X$ ($X$) which has (2) ((1)) as its characteristic function with (4) ($\sigma \geq 0$) satisfied.
For the one-dimensional case we have $m = E(X)$ and $\sigma^2 = E[(X - m)^2]$.
In general, differentiating (2) we get

$$\left.\frac{\partial \varphi_X(t)}{\partial t_k}\right|_{t=0} = iE(X_k)$$

and

$$\left.\frac{\partial^2 \varphi_X(t)}{\partial t_k \partial t_l}\right|_{t=0} = -E(X_k X_l),$$

hence

$$m_k = E(X_k) \quad \text{and} \quad a_{kl} = E[(X_k - m_k)(X_l - m_l)].$$

Therefore, $\varphi_X(t)$ is uniquely determined by $\{E(X_k)\}$ and by the non-negative definite matrix $\{E[(X_k - m_k)(X_l - m_l)]\}$. Thus, the corresponding distribution $F_X(a_1, \ldots, a_n)$ is uniquely determined by these parameters.
Notice that the density does not exist unless $(a_{kl})$ is nonsingular.
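Even when $(a_{kl})$ is singular and the density fails to exist, the characteristic function (2) still determines the distribution, and one can sample from it through a matrix square root of $(a_{kl})$. The sketch below (our own illustration; the function name and the rank-one example are not from the lecture) uses an eigendecomposition, which, unlike a Cholesky factor or the density, tolerates degenerate covariances:

```python
import numpy as np

def sample_gaussian_vector(m, A, size, rng):
    """Draw samples of X with mean m and covariance matrix A = (a_kl).

    A only needs to be non-negative definite (degenerate components are
    allowed), so we build a symmetric square root from an eigendecomposition.
    """
    w, V = np.linalg.eigh(A)                     # A = V diag(w) V^T, w >= 0
    root = V * np.sqrt(np.clip(w, 0.0, None))    # a square root of A
    Z = rng.standard_normal((size, len(m)))      # independent standard normals
    return np.asarray(m) + Z @ root.T

rng = np.random.default_rng(0)
# A singular (rank-one) covariance: X2 = 2*X1 almost surely.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
X = sample_gaussian_vector([0.0, 0.0], A, size=50_000, rng=rng)
```

The empirical covariance of the samples reproduces $A$, and the degenerate linear relation between the components holds along every sample path.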
Let $(X_t, t \in I)$, $I = [0, 1]$, be a stochastic process such that for any sequence $0 \leq t_1 < \ldots < t_n \leq 1$ the random vector $(X_{t_1}, \ldots, X_{t_n})$ is Gaussian. Then $(X_t)$ is said to be a Gaussian process.
Since, for each $t \in I$, $X_t$ is a Gaussian random variable, we have

$$|E(X_t)| \leq E(|X_t|) \leq [E(X_t^2)]^{1/2} < \infty,$$

hence the function $m(t) = E(X_t)$ is well defined. Moreover,

$$|E[(X_t - m(t))(X_s - m(s))]| \leq [E(X_t - m(t))^2]^{1/2}\,[E(X_s - m(s))^2]^{1/2}.$$
Therefore, the function

(5) $\varrho(t, s) = E[(X_t - m(t))(X_s - m(s))]$

is well defined in $I^2 = I \times I$. Without loss of generality we shall assume that $m(t) = 0$ in $I$, hence

(6) $E(X_t) = 0$ and $\varrho(t, s) = E(X_t X_s)$.

The function (5) is known as the covariance associated with the Gaussian process $(X_t, t \in I)$.
Let $a(t, s)$ be a function defined on $I^2$. We shall say that $a$ is non-negative definite if for any sequence $0 \leq t_1 < \ldots < t_n \leq 1$ and for any $u = (u_1, \ldots, u_n)$ we have

(7) $\sum_{k,l=1}^{n} a(t_k, t_l)\, u_k u_l \geq 0$.
A covariance associated with a Gaussian process $(X_t)$ is non-negative definite. Indeed, let $\varrho$ be this covariance; then

$$\sum_{k,l=1}^{n} \varrho(t_k, t_l)\, u_k u_l = E\left[\left(\sum_{k=1}^{n} X_{t_k} u_k\right)^2\right] \geq 0.$$
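The inequality (7) is easy to probe numerically for a concrete covariance. The sketch below (our own illustration, using $\varrho(t,s) = \min(t,s)$, the covariance of the Wiener process which appears later in the lecture) checks that the matrix $(\varrho(t_k, t_l))$ built on a grid has no negative eigenvalues and that a random quadratic form is non-negative:

```python
import numpy as np

# Spot-check (7) for rho(t, s) = min(t, s) on a grid:
# the matrix (rho(t_k, t_l)) must be non-negative definite.
t = np.linspace(0.01, 1.0, 50)              # 0 < t_1 < ... < t_n <= 1
R = np.minimum.outer(t, t)                  # matrix of min(t_k, t_l)

eigenvalues = np.linalg.eigvalsh(R)         # all should be >= 0

rng = np.random.default_rng(1)
u = rng.standard_normal(50)
quad_form = u @ R @ u                       # sum rho(t_k, t_l) u_k u_l
```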
A symmetric and non-negative definite function we shall call a covariance. Therefore, to every Gaussian process $(X_t)$ satisfying (6) there corresponds a covariance, namely $\varrho$ itself. Conversely, to every covariance $\varrho$ there exists a Gaussian process $(X_t)$ satisfying (6), i.e. having $\varrho$ as its associate covariance. To prove this it is sufficient to construct in terms of $\varrho$ a family of finite-dimensional distributions with the permutation property, satisfying the throwing out condition, and such that the corresponding Gaussian process will have $\varrho$ as its associate covariance. Let
(8) $F_{t_1 \ldots t_n}(a_1, \ldots, a_n)$

be the distribution function corresponding to the characteristic function

$$\varphi_{t_1 \ldots t_n}(u_1, \ldots, u_n) = \exp\left(-\frac{1}{2}\sum_{k,l=1}^{n} \varrho(t_k, t_l)\, u_k u_l\right),$$

where $0 \leq t_1 < \ldots < t_n \leq 1$. Obviously the permutation property for the family (8) is satisfied. Now, since $\varphi_{t_1 \ldots t_n}(u_1, \ldots, u_{n-1}, 0)$ is a characteristic function of $F_{t_1 \ldots t_n}(a_1, \ldots, a_{n-1}, \infty)$ and

$$\varphi_{t_1 \ldots t_{n-1}}(u_1, \ldots, u_{n-1}) = \varphi_{t_1 \ldots t_n}(u_1, \ldots, u_{n-1}, 0),$$

we get

$$F_{t_1 \ldots t_n}(a_1, \ldots, a_{n-1}, \infty) = F_{t_1 \ldots t_{n-1}}(a_1, \ldots, a_{n-1}).$$

Therefore the throwing out condition is satisfied. At last we choose any stochastic process associated with the family (8).
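The throwing out condition amounts to the elementary fact that setting the last argument of the characteristic function to zero simply drops the corresponding coordinate. A small sketch (our own; `char_fn` and the chosen points are illustrative) makes this concrete for $\varrho(t,s) = \min(t,s)$:

```python
import math

def char_fn(ts, us, rho):
    """phi_{t_1...t_n}(u_1,...,u_n) = exp(-1/2 sum rho(t_k,t_l) u_k u_l)."""
    q = sum(rho(tk, tl) * uk * ul
            for tk, uk in zip(ts, us)
            for tl, ul in zip(ts, us))
    return math.exp(-0.5 * q)

rho = min                                 # Wiener covariance rho(t, s) = min(t, s)
ts, us = [0.2, 0.5, 0.9], [1.3, -0.7]

lhs = char_fn(ts, us + [0.0], rho)        # phi_{t1 t2 t3}(u1, u2, 0)
rhs = char_fn(ts[:2], us, rho)            # phi_{t1 t2}(u1, u2)
```

The two values coincide, which is exactly the consistency needed for the extension argument.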
We should like to recall a few known results from the theory of integral equations and from the theory of Schauder bases of the space of continuous functions. Let us take the homogeneous Fredholm integral equation of the second kind

(9) $f(t) = \lambda \int_I \beta(t, s) f(s)\, ds$.
Let us assume that $\beta(t, s)$ is continuous and symmetric in $I^2$. It will be said that $\beta$ is a non-negative definite kernel (in the sense of integral equations) if and only if for any continuous $f(t)$ we have

$$\iint_{I^2} \beta(t, s) f(t) f(s)\, dt\, ds \geq 0.$$
THEOREM A (see [3]). Let $\beta(t, s)$ be a continuous, symmetric and non-negative definite kernel. Then there exists a sequence of eigenvalues $0 < \lambda_1 \leq \lambda_2 \leq \ldots$ and a corresponding sequence of eigenfunctions $\varphi_1, \varphi_2, \ldots$ such that

$$\varphi_k(t) = \lambda_k \int_I \beta(t, s)\, \varphi_k(s)\, ds \quad \text{and} \quad \int_I \varphi_k(t)\, \varphi_l(t)\, dt = \delta_{kl}.$$
THEOREM B (Mercer, see [3]). Let $\beta(t, s)$ be as in Theorem A. Then

$$\beta(t, s) = \sum_{k=1}^{\infty} \frac{1}{\lambda_k}\, \varphi_k(t)\, \varphi_k(s)$$

and the series converges absolutely and uniformly in $I^2$.
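For the kernel $\beta(t,s) = \min(t,s)$, which reappears below as the covariance of the Wiener process, the eigenvalues and eigenfunctions of (9) are classical: $\varphi_k(t) = \sqrt{2}\,\sin((k - \tfrac12)\pi t)$ and $\lambda_k = ((k - \tfrac12)\pi)^2$. The following sketch (our own illustration) sums a truncation of Mercer's series and checks it against the kernel:

```python
import math

# Classical eigenpairs of (9) for beta(t, s) = min(t, s) on I = [0, 1].
def phi(k, t):
    return math.sqrt(2.0) * math.sin((k - 0.5) * math.pi * t)

def lam(k):
    return ((k - 0.5) * math.pi) ** 2

def mercer_sum(t, s, K):
    """Partial sum of Mercer's series: sum_k phi_k(t) phi_k(s) / lambda_k."""
    return sum(phi(k, t) * phi(k, s) / lam(k) for k in range(1, K + 1))

# Uniform error of the truncated series at a few sample points.
err = max(abs(mercer_sum(t, s, 5000) - min(t, s))
          for t in (0.1, 0.3, 0.7, 1.0)
          for s in (0.2, 0.5, 0.9))
```

The tail of the series is of order $1/K$, so a few thousand terms already reproduce $\min(t,s)$ to three decimals.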
For our purpose we would like to have another representation of $\beta(t, s)$ than Mercer's one. Let $C(I)$ be the space of all real continuous functions on $I$, and let $\|x\| = \max_{t \in I} |x(t)|$. A set $\{\psi_n\} \subset C(I)$ is said to be a Schauder basis of the Banach space $\langle C(I), \|\cdot\|\rangle$ if and only if for every $x \in C(I)$ we have the unique expansion

$$x(t) = \sum_{n=1}^{\infty} \xi_n(x)\, \psi_n(t)$$
convergent in the norm $\|\cdot\|$ to $x$. It is well known that the $\xi_n$ are continuous linear functionals on $\langle C(I), \|\cdot\|\rangle$. By the Riesz theorem there exists a Radon measure $\mu_n$ on $I$ such that

$$\xi_n(x) = \int_I x\, d\mu_n.$$
The system $\{\xi_n, \psi_n\}$ is called biorthogonal because of the following property:

$$\xi_n(\psi_m) = \begin{cases} 0 & \text{for } n \neq m, \\ 1 & \text{for } n = m. \end{cases}$$
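The classical example of a Schauder basis of $C(I)$ is the Faber–Schauder system of "tent" functions over dyadic intervals, for which the functional $\xi_n$ takes a second difference of $x$ at dyadic points and the partial sums are piecewise linear interpolations of $x$. The sketch below (our own illustration; the test function $x$ is arbitrary) verifies the uniform convergence of the expansion numerically:

```python
import math

def x(t):                       # a sample continuous function on [0, 1]
    return math.sin(3 * t) + t * t

def tent(j, k, t):              # Schauder tent supported on [k/2^j, (k+1)/2^j]
    return max(0.0, 1.0 - abs(2.0 ** (j + 1) * t - (2 * k + 1)))

def schauder_partial_sum(t, J):
    """Sum of the Schauder expansion of x through dyadic level J - 1.

    The two affine terms correspond to psi_1 = 1, psi_2(t) = t with
    xi_1(x) = x(0), xi_2(x) = x(1) - x(0); the coefficient of tent(j, k)
    is the second difference x(mid) - (x(left) + x(right)) / 2.
    """
    s = x(0.0) * (1.0 - t) + x(1.0) * t
    for j in range(J):
        for k in range(2 ** j):
            left, right = k / 2.0 ** j, (k + 1) / 2.0 ** j
            mid = (left + right) / 2.0
            coeff = x(mid) - 0.5 * (x(left) + x(right))
            s += coeff * tent(j, k, t)
    return s

# Uniform error of the partial sum through level 9:
err = max(abs(schauder_partial_sum(i / 997.0, 10) - x(i / 997.0))
          for i in range(998))
```

The partial sum through level $J-1$ is exactly the linear interpolation of $x$ at the points $k/2^J$, so the error decays like $4^{-J}$ for smooth $x$.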
THEOREM C (see [7]). Suppose $x(\cdot, \cdot) \in C(I^2)$. Then we have the unique expansions

$$x(t, s) = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \xi_{nm}(x)\, \psi_n(t)\, \psi_m(s) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \xi_{nm}(x)\, \psi_n(t)\, \psi_m(s) = \sum_{m,n=1}^{\infty} \xi_{nm}(x)\, \psi_n(t)\, \psi_m(s).$$

Moreover, the convergence of each of the above series is uniform and

$$\xi_{nm}(x) = \iint_{I^2} x(t, s)\, \mu_n(dt)\, \mu_m(ds).$$
According to Theorem C we have for every $x(t, s) = y(t) z(s)$ that

$$\xi_{nm}(x) = \xi_n(y)\, \xi_m(z).$$
Applying Theorem B and the last remark we obtain

(10) $\xi_{nm}(\beta) = \sum_{k=1}^{\infty} \frac{1}{\lambda_k}\, \xi_n(\varphi_k)\, \xi_m(\varphi_k)$.

Let in (10) $n$ be equal to $m$. Then

$$\sum_{k=1}^{\infty} \frac{1}{\lambda_k} [\xi_n(\varphi_k)]^2 < \infty.$$

By the Riesz-Fischer theorem and by Parseval's identity there exists a sequence $(g_n)$ of functions in $L^2(I)$ such that $g_n$ has the sequence $\left\{\frac{1}{\sqrt{\lambda_k}}\, \xi_n(\varphi_k)\right\}$, $k = 1, 2, \ldots$, as its Fourier coefficients (with respect to $\{\varphi_k\}$) and

$$(g_n, g_m) = \sum_{k=1}^{\infty} \frac{1}{\lambda_k}\, \xi_n(\varphi_k)\, \xi_m(\varphi_k) = \xi_{nm}(\beta).$$
Finally we get the second representation of $\beta$:

(11) $\beta(t, s) = \sum_{n,m=1}^{\infty} (g_n, g_m)\, \psi_n(t)\, \psi_m(s)$.

This is a simple consequence of the last result and of Theorem C.
Now, we shall show that a continuous and symmetric $\beta$ is non-negative definite if and only if it is a non-negative definite kernel. Suppose $\beta$ has the last property. Then by Theorem B

$$\sum_{n,m=1}^{N} \beta(t_n, t_m)\, u_n u_m = \sum_{k=1}^{\infty} \frac{1}{\lambda_k} \left[\sum_{n=1}^{N} \varphi_k(t_n)\, u_n\right]^2 \geq 0,$$

hence $\beta$ is non-negative definite. The converse is also easy to prove: the last inequality implies the required inequalities for the Riemann sums of

$$\iint_{I^2} \beta(t, s) f(t) f(s)\, dt\, ds,$$

hence the last integral must be non-negative.
Let $(Y_t, t \in I)$ be the Wiener process defined by means of the Haar functions (the construction is given in the first lectures by M. Kac; see also [1], pp. 406-407):

(12) $Y_t = \sum_{n=1}^{\infty} G_n S_n(t)$,

where the $G_n$'s are independent, normally distributed random variables with mean zero and variance one, and $S_n(t) = \int_0^t \chi_n(\tau)\, d\tau$, where the $\chi_n(t)$ are the Haar functions. The series converges uniformly with probability one. The sequence $\{\chi_n\}$ is an orthonormal and complete set in $L^2(I)$. Suppose $g \in L^2(I)$; then
(13) $g(t) = \sum_{n=1}^{\infty} a_n \chi_n(t)$,

where $a_n = (g, \chi_n)$. Differentiating formally (12) we obtain

(14) $Y_t' = \sum_{n=1}^{\infty} G_n \chi_n(t)$.
Parseval's identity, (13) and (14) imply

(15) $\int_I g\, dY = \sum_{n=1}^{\infty} G_n a_n$.

The series in the last equality converges with probability one because $\sum a_n^2$ is finite (the Kolmogorov three series theorem). Therefore the sum of this series will be used to define the random variable on the left-hand side of (15). This random variable is known as the Stieltjes integral in the sense of Paley, Wiener and Zygmund ([6]).
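Both the construction (12) and the isometry behind (15) can be checked numerically. Since $S_n(t) = (1_{[0,t]}, \chi_n)$, Parseval's identity gives $\sum_n S_n(t) S_n(s) = (1_{[0,t]}, 1_{[0,s]}) = \min(t, s)$, which is exactly $E(Y_t Y_s)$ for the series (12); likewise $E[(\int_I g\, dY)^2] = \sum_n a_n^2 = \int_I g^2$. The sketch below (our own; the indexing of the Haar system and the choice $g(t) = e^t$ are illustrative conventions) verifies both identities:

```python
import math

def S(n, t):
    """S_n(t) = integral from 0 to t of the n-th Haar function chi_n."""
    if n == 1:
        return t                               # chi_1 = 1 on [0, 1]
    j = (n - 1).bit_length() - 1               # write n = 2**j + k, 0 <= k < 2**j
    k = n - 1 - 2 ** j
    # chi_n is +2^(j/2) on the left half and -2^(j/2) on the right half of
    # [k/2^j, (k+1)/2^j], so its primitive is a scaled tent function.
    return 2.0 ** (-j / 2 - 1) * max(0.0, 1.0 - abs(2.0 ** (j + 1) * t - (2 * k + 1)))

N = 2 ** 10                                    # Haar functions through level 9

# E(Y_t Y_s) for the truncated series (12): by Parseval it equals min(t, s)
# exactly at dyadic points of low level.
def cov(t, s):
    return sum(S(n, t) * S(n, s) for n in range(1, N + 1))

cov_err = max(abs(cov(a / 16.0, b / 16.0) - min(a / 16.0, b / 16.0))
              for a in range(17) for b in range(17))

def haar_coefficient(G, n):
    """a_n = (g, chi_n), computed exactly from an antiderivative G of g."""
    if n == 1:
        return G(1.0) - G(0.0)
    j = (n - 1).bit_length() - 1
    k = n - 1 - 2 ** j
    left, right = k / 2.0 ** j, (k + 1) / 2.0 ** j
    mid = (left + right) / 2.0
    return 2.0 ** (j / 2) * (2.0 * G(mid) - G(left) - G(right))

# For g(t) = e^t (its own antiderivative), sum a_n^2 should equal the
# integral of g^2 over I, i.e. (e^2 - 1)/2.
sum_a2 = sum(haar_coefficient(math.exp, n) ** 2 for n in range(1, N + 1))
norm2 = (math.exp(2.0) - 1.0) / 2.0
```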
Finally we shall construct a Gaussian process for a given continuous covariance $\varrho = \beta$. Let the $g_n$'s be as in (11). It is not difficult to show that

(16) $(g_n, g_m) = E\left[\left(\int_I g_n\, dY\right)\left(\int_I g_m\, dY\right)\right].$

We shall define for each $t$ in $I$ a random variable $X_t$ as follows:

$$X_t = \sum_{n=1}^{\infty} \left[\int_I g_n(\tau)\, dY_\tau\right] \psi_n(t),$$

where the series converges to $X_t$ in the $L^2(\Omega, P)$ norm. Now, according to (11) and (16) we can easily check (6). Therefore $(X_t, t \in I)$ is the Gaussian process with the associate covariance $\varrho$.
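A finite-dimensional shadow of this construction is easy to run: on a grid $t_1 < \ldots < t_n$ the vector $(X_{t_1}, \ldots, X_{t_n})$ has covariance matrix $(\varrho(t_k, t_l))$, so any factorization $R = LL^T$ turns independent standard normals into samples of the process. The sketch below (our own illustration, again with $\varrho(t,s) = \min(t,s)$) compares the empirical covariance of such samples with $R$:

```python
import numpy as np

t = np.linspace(1 / 64, 1.0, 64)             # grid 0 < t_1 < ... < t_n <= 1
R = np.minimum.outer(t, t)                   # R_kl = rho(t_k, t_l) = min(t_k, t_l)
L = np.linalg.cholesky(R)                    # R = L L^T (R is positive definite here)

rng = np.random.default_rng(42)
Z = rng.standard_normal((64, 50_000))        # independent N(0, 1) entries
X = L @ Z                                    # each column ~ (X_{t_1}, ..., X_{t_n})

emp = X @ X.T / 50_000                       # empirical covariance matrix
max_err = float(np.abs(emp - R).max())
```

Each column of `X` is a sample path of the Wiener process on the grid, and the empirical covariance matches $\varrho$ up to sampling error.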
N. Wiener was the first to introduce Fourier series into the theory of Brownian motion, e.g. [5]; later M. Kac and A. J. F. Siegert used the equation (9) in investigating stationary Gaussian processes [4]. Some applications of the ideas presented here the reader can find in [1] and [2].
References

[1] Z. Ciesielski, Hölder conditions for realizations of Gaussian processes, Trans. Amer. Math. Soc. 99 (1961), pp. 403-413.
[2] — and H. Kesten, A limit theorem for the fractional parts of the sequence (2^n t), Proc. Amer. Math. Soc. 13 (1962), pp. 596-600.
[3] G. Hamel, Integralgleichungen, Berlin 1949.
[4] M. Kac and A. J. F. Siegert, On the theory of noise in radio receivers with square law detectors, Journ. Appl. Phys. 18 (1947), pp. 383-397.
[5] R. E. A. C. Paley and N. Wiener, Fourier transforms in the complex domain, AMS Coll. Publ. 19, N. Y. 1934.
[6] — — and A. Zygmund, Notes on random functions, Math. Zeitschr. 37 (1933), pp. 647-668.
[7] Z. Semadeni, Schauder bases and approximation with nodes in spaces of continuous functions, Bull. Acad. Polon. Sci. 11 (1963), pp. 391-395.