
Approximation and Complexity

(notes for students of the University of Warsaw)

Leszek Plaskota

Instytut Matematyki Stosowanej i Mechaniki, Uniwersytet Warszawski

January 13, 2020


Contents

I Classical approximation 5

1 Preliminaries 7

2 Uniform approximation 15

3 The Weierstrass theorem 25

4 Fourier and Fejér operators 33

5 Quality of projections 43

6 Bernstein’s ‘lethargy’ theorem 51

7 The Jackson theorems 55

II Information-based approximation 67

8 Information and its radius 69

9 Linear algorithms for linear problems 77

10 Optimality of spline algorithms 85

11 Optimal information 91

12 Adaptive information 101

13 Asymptotic setting 113


III Appendix 119

14 Completeness of the space C(D) 121

15 Banach-Steinhaus theorem 123

16 Hahn-Banach theorem 125


Part I

Classical approximation


Chapter 1

Preliminaries

A general formulation of the (classical) approximation problem is as follows.

Let $X$ be a normed linear space over $\mathbb{K}$ (where $\mathbb{K}=\mathbb{R}$ or $\mathbb{K}=\mathbb{C}$) with norm $\|\cdot\|$. Since we are primarily interested in function spaces, the elements of $X$ will be denoted by $f,g,\ldots$. Let $V$ be a linear subspace of $X$ of finite dimension, i.e.,
\[ \dim(V) = n < +\infty. \]
For $f \in X$ we define
\[ \mathrm{dist}(f,V) := \inf_{v \in V} \|f-v\| \quad\text{and}\quad P_V(f) := \{\, v \in V : \|f-v\| = \mathrm{dist}(f,V) \,\}. \]
$P_V(f)$ is the set of optimal approximations of $f$ with respect to $V$.

The approximation problem expresses a general wish to represent ‘complicated’ objects (the ones in $X$) by ‘simpler’ objects (the ones in $V$).

We first observe that finite dimensionality of $V$ ensures non-emptiness of $P_V(f)$. Indeed, since clearly $\mathrm{dist}(f,V) \le \|f\|$, the set $P_V(f)$ is contained in the ball $B = \{v \in V : \|v\| \le 2\|f\|\}$; otherwise, if $v \in P_V(f)$ and $\|v\| > 2\|f\|$ then, by the triangle inequality,
\[ \|f-v\| \ge \|v\| - \|f\| > \|f\| \ge \mathrm{dist}(f,V). \]
Since any closed and bounded ball in a finite dimensional space is compact and the function $w \mapsto \|f-w\|$ is continuous, it attains its minimal value in $B$.

The assumption $\dim(V) < +\infty$ is crucial for $P_V(f) \ne \emptyset$. To see this, we give an example.

Example 1.1 Let $X = \ell^1$ be the space of all absolutely summable real sequences $x = \{x_n\}_{n \ge 1}$ with norm $\|x\| = \sum_{i=1}^{\infty} |x_i|$. Let $V$ be the subspace consisting of all sequences $v$ with only finitely many nonzero coordinates. Then for any $x \in X \setminus V$ (for instance, for $x_i = i^{-2}$) we have $\mathrm{dist}(x,V) = 0$, but $\|x-v\| > 0$ for every $v \in V$.
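The situation in Example 1.1 is easy to observe numerically: truncating the sequence $x_i = i^{-2}$ after $n$ terms gives elements of $V$ whose distance to $x$ tends to $0$, yet no element of $V$ attains the infimum. A minimal sketch in Python with numpy; the truncation length used to represent the infinite sequence is an arbitrary choice of this illustration, not part of the notes.

\begin{verbatim}
import numpy as np

# Represent x_i = 1/i^2, i = 1..N, as a (truncated) stand-in for the l^1 sequence.
N = 100_000                       # truncation used only for this illustration
i = np.arange(1, N + 1)
x = 1.0 / i**2

# v_n = truncation of x to its first n coordinates lies in V;
# the l^1 distance ||x - v_n|| is the tail sum, which tends to 0 but is never 0.
for n in [1, 10, 100, 1000]:
    tail = x[n:].sum()            # ||x - v_n||_1 (up to the truncation at N)
    print(f"n = {n:5d}   ||x - v_n||_1 ~ {tail:.6f}")
\end{verbatim}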

We list the following more or less obvious, but useful, properties of the map $f \mapsto \mathrm{dist}(f,V)$. (A simple proof is left to the reader.)

Lemma 1.1

(i) $\mathrm{dist}(f+g,V) \le \mathrm{dist}(f,V) + \mathrm{dist}(g,V)$ for all $f,g \in X$;

(ii) $\mathrm{dist}(f+v,V) = \mathrm{dist}(f,V)$ for all $f \in X$, $v \in V$;

(iii) $\mathrm{dist}(\alpha f,V) = |\alpha|\,\mathrm{dist}(f,V)$ for all $f \in X$, $\alpha \in \mathbb{K}$;

(iv) $|\mathrm{dist}(f,V) - \mathrm{dist}(g,V)| \le \|f-g\|$ for all $f,g \in X$.

We now give sufficient conditions for uniqueness of the optimal approximations.

Definition 1.1 A normed space $X$ is called uniformly convex iff for any $\varepsilon > 0$ there is $\delta > 0$ such that for all $f,g \in X$ the following holds:

if $\|f\| = 1 = \|g\|$ and $\|(f+g)/2\| > 1-\delta$ then $\|f-g\| < \varepsilon$.

Definition 1.2 A normed space $X$ is called strictly convex iff for all $f,g \in X$ the following holds:

if $\|f\| = 1 = \|g\|$ and $\|(f+g)/2\| = 1$ then $f = g$.

Then we have the following result.

Theorem 1.1

(i) If $X$ is uniformly convex then $X$ is also strictly convex.

(ii) If $\dim(X) < +\infty$ and $X$ is strictly convex then $X$ is also uniformly convex.

Proof. (i) Let $\|f\| = 1 = \|g\|$ and $\|(f+g)/2\| = 1$. Then uniform convexity implies that $\|f-g\| < \varepsilon$ for every $\varepsilon > 0$, which in turn means that $\|f-g\| = 0$ and $f = g$.

(ii) Suppose $X$ is strictly convex and $\dim(X) < +\infty$. For a given $\varepsilon > 0$, define the set
\[ A := \{\, (f,g) \in X \times X : \|f\| = 1 = \|g\| \text{ and } \|f-g\| \ge \varepsilon \,\}. \]
This set is closed and bounded in $X \times X$, hence it is compact. Furthermore, by strict convexity of $X$, the function
\[ h(f,g) = 1 - \|(f+g)/2\| \]
is positive and continuous on $A$. Then uniform convexity holds with $\delta = \inf_{(f,g) \in A} h(f,g) > 0$. Indeed, if $\|f\| = 1 = \|g\|$ and $\|f-g\| \ge \varepsilon$ then $\|(f+g)/2\| \le 1-\delta$. $\Box$

Theorem 1.2 If $X$ is strictly convex then the optimal approximation is uniquely determined for any $f \in X$.

Proof. Suppose $v', v'' \in V$ are different and both optimal for $f \in X$. Let $d = \mathrm{dist}(f,V)$. Then for $v = (v'+v'')/2$ we have
\[ \|f-v\| = d\,\left\| \frac12\left( \frac{f-v'}{d} + \frac{f-v''}{d} \right) \right\| < \frac12\bigl( \|f-v'\| + \|f-v''\| \bigr) = \mathrm{dist}(f,V), \]
where the strict inequality follows from strict convexity of $X$ applied to the two distinct unit vectors $(f-v')/d$ and $(f-v'')/d$. This contradicts the optimality of $v'$ and $v''$. $\Box$

Examples of uniformly convex, and consequently also strictly convex, spaces are provided by unitary spaces. Recall that $X$ is a unitary space iff its norm is generated by an inner product $\langle\cdot,\cdot\rangle : X \times X \to \mathbb{K}$ (where $\mathbb{K} \in \{\mathbb{R},\mathbb{C}\}$), i.e.,
\[ \|f\| = \sqrt{\langle f, f\rangle}. \]

Theorem 1.3 Any unitary space $X$ is uniformly convex.

Proof. For a given $\varepsilon > 0$ we let $\varepsilon_1 = \min(2,\varepsilon)$ and
\[ \delta = 1 - \sqrt{1 - \varepsilon_1^2/4} > 0. \]
Suppose that $f,g \in X$ with $\|f\| = 1 = \|g\|$ and $\|(f+g)/2\| > 1-\delta$. Due to the parallelogram law $\|f+g\|^2 + \|f-g\|^2 = 2\bigl(\|f\|^2 + \|g\|^2\bigr)$, we have
\[ \|f-g\|^2 < 4\bigl(1 - (1-\delta)^2\bigr) = \varepsilon_1^2 \le \varepsilon^2, \]
as claimed. $\Box$

In a unitary space $X$, the optimal approximation $v^*$ of $f \in X$ with respect to $V$ is just the orthogonal projection of $f$ onto $V$; that is,
\[ \langle f - v^*, v\rangle = 0 \quad\text{for all } v \in V. \]
Having a basis of $V$, the optimal element can be expressed as follows. Let $n = \dim(V)$ and
\[ V = \mathrm{span}(v_1, v_2, \ldots, v_n). \]
The orthogonality condition is equivalent to $\langle v^*, v_i\rangle = \langle f, v_i\rangle$, $1 \le i \le n$. Writing $v^* = \sum_{j=1}^n a_j v_j$, we then have that the unknown $a_j$'s satisfy the following $n \times n$ system of linear equations:
\[ \sum_{j=1}^n a_j \langle v_j, v_i\rangle = \langle f, v_i\rangle, \qquad 1 \le i \le n. \]
In particular, if the basis is orthonormal, i.e., $\langle v_j, v_i\rangle = 0$ for $j \ne i$ and $\|v_i\| = 1$ for all $i$, then
\[ v^* = \sum_{j=1}^n \langle f, v_j\rangle\, v_j \]
and
\[ \mathrm{dist}(f,V)^2 = \|f - v^*\|^2 = \|f\|^2 - \sum_{j=1}^n |\langle f, v_j\rangle|^2 = \sum_{j=n+1}^{+\infty} |\langle f, v_j\rangle|^2, \]
where the last equality holds when $\{v_j\}_{j=1}^n$ is extended to a complete orthonormal system $\{v_j\}_{j\ge1}$ of $X$.
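As a concrete illustration of the normal equations above, the following sketch (Python with numpy) computes the best $L_2(0,1)$ approximation of $f(t)=e^t$ by quadratic polynomials, assembling the Gram matrix $\langle v_j, v_i\rangle$ and the right-hand side $\langle f, v_i\rangle$ by numerical quadrature. The function, interval, basis and grid size are illustrative choices, not taken from the notes.

\begin{verbatim}
import numpy as np

def integrate(y, t):
    """Composite trapezoidal rule on the uniform grid t."""
    dt = t[1] - t[0]
    return dt * (y.sum() - 0.5 * (y[0] + y[-1]))

# Best L2(0,1) approximation of f(t) = exp(t) from V = span(1, t, t^2):
# solve the n x n system  sum_j a_j <v_j, v_i> = <f, v_i>.
f = np.exp
basis = [lambda t: np.ones_like(t), lambda t: t, lambda t: t**2]

t = np.linspace(0.0, 1.0, 20001)
G = np.array([[integrate(vj(t) * vi(t), t) for vj in basis] for vi in basis])
b = np.array([integrate(f(t) * vi(t), t) for vi in basis])
a = np.linalg.solve(G, b)                       # coefficients of v*

vstar = sum(aj * vj(t) for aj, vj in zip(a, basis))
print("coefficients a_j   :", a)
print("L2 error dist(f, V):", np.sqrt(integrate((f(t) - vstar)**2, t)))
\end{verbatim}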

The most popular unitary space is $L_2(a,b)$, where $-\infty \le a < b \le +\infty$. It consists of (Lebesgue) measurable and square integrable (real or complex) functions defined on the interval $(a,b)$, with the inner product
\[ \langle f, g\rangle = \int_a^b f(t)\,\overline{g(t)}\,dt \]
and the corresponding norm
\[ \|f\|_{L_2} = \langle f, f\rangle^{1/2} = \left( \int_a^b |f(t)|^2\,dt \right)^{1/2}. \]
It can be easily checked that the trigonometric polynomials
\[ \frac{1}{\sqrt{2\pi}},\ \frac{\cos t}{\sqrt{\pi}},\ \frac{\sin t}{\sqrt{\pi}},\ \frac{\cos 2t}{\sqrt{\pi}},\ \frac{\sin 2t}{\sqrt{\pi}},\ \ldots \]
form an orthonormal system in $L_2(0,2\pi)$. (Actually, the subspace spanned by all trigonometric polynomials is dense in $L_2(0,2\pi)$.)
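The orthonormality of this system is easy to confirm numerically. A small sketch in Python with numpy (the quadrature grid and the number of functions tested are arbitrary choices of this illustration) computes the matrix of pairwise $L_2(0,2\pi)$ inner products of the first five functions, which should be close to the identity.

\begin{verbatim}
import numpy as np

t = np.linspace(0.0, 2.0 * np.pi, 200001)
dt = t[1] - t[0]

def ip(u, v):
    """Approximate L2(0, 2*pi) inner product by the trapezoidal rule."""
    y = u * v
    return dt * (y.sum() - 0.5 * (y[0] + y[-1]))

# First five elements of the orthonormal trigonometric system.
system = [np.full_like(t, 1.0 / np.sqrt(2.0 * np.pi)),
          np.cos(t) / np.sqrt(np.pi), np.sin(t) / np.sqrt(np.pi),
          np.cos(2 * t) / np.sqrt(np.pi), np.sin(2 * t) / np.sqrt(np.pi)]

gram = np.array([[ip(u, v) for v in system] for u in system])
print(np.round(gram, 6))        # approximately the 5 x 5 identity matrix
\end{verbatim}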

A generalization of $L_2(a,b)$ is provided by the spaces $L_p(a,b)$, where $1 \le p < +\infty$. They consist of all (Lebesgue) measurable functions on $(a,b)$ such that $|f|^p$ is integrable. The norm is defined as
\[ \|f\|_{L_p} = \left( \int_a^b |f(t)|^p\,dt \right)^{1/p}. \]
It is known that the spaces $L_p(a,b)$ are separable Banach spaces.

We also define the space $L_\infty(a,b)$ of all measurable functions for which
\[ \|f\|_{L_\infty} = \operatorname*{ess\,sup}_{a \le t \le b} |f(t)| \]
(which defines the norm) is finite.

Recall the important Hölder inequality: for any measurable functions $f$ and $g$ and any $1 \le p,q \le +\infty$ such that
\[ \frac1p + \frac1q = 1 \]
(where $q = +\infty$ if $p = 1$, and vice versa) we have
\[ \|fg\|_{L_1} \le \|f\|_{L_p}\|g\|_{L_q}. \tag{1.1} \]
For $1 < p,q < +\infty$, Hölder's inequality can be written as
\[ \int_a^b |f(t)g(t)|\,dt \le \left( \int_a^b |f(t)|^p\,dt \right)^{1/p} \left( \int_a^b |g(t)|^q\,dt \right)^{1/q}. \]


Moreover, we have equality above if and only if the functions $|f|^p$ and $|g|^q$ are linearly dependent, meaning in particular that if $f$ is not the zero function then there is $c \ge 0$ such that (note $p/q = p-1$)
\[ |g(t)| = c\,|f(t)|^{p-1} \quad \text{for a.e. } t. \tag{1.2} \]

The spaces $L_p(a,b)$ are uniformly convex for $1 < p < +\infty$, but the proof of this fact is far from trivial. We show only strict convexity which, by Theorem 1.2, is sufficient for uniqueness of the best approximation.

Theorem 1.4 The space $L_p(a,b)$ is strictly convex for $1 < p < +\infty$.

Proof. The triangle inequality (which of course is one of the necessary conditions for $\|\cdot\|_{L_p}$ to be a norm) says that
\[ \|f+g\|_{L_p} \le \|f\|_{L_p} + \|g\|_{L_p}. \tag{1.3} \]
Recall the proof. We have
\begin{align*}
\|f+g\|_{L_p}^p &= \int_a^b |(f+g)(t)|\,|(f+g)(t)|^{p-1}\,dt \\
&\le \int_a^b |f(t)|\,|(f+g)(t)|^{p-1}\,dt + \int_a^b |g(t)|\,|(f+g)(t)|^{p-1}\,dt \\
&\le \|f\|_{L_p}\,\bigl\|(f+g)^{p-1}\bigr\|_{L_q} + \|g\|_{L_p}\,\bigl\|(f+g)^{p-1}\bigr\|_{L_q},
\end{align*}
where the second inequality follows from Hölder's inequality applied to $|f|$ and $|f+g|^{p-1}$, and to $|g|$ and $|f+g|^{p-1}$. Dividing both sides by $\|(f+g)^{p-1}\|_{L_q}$ and using $(p-1)q = p$ we obtain (1.3).

From the proof above it follows (cf. (1.2)) that we have equality in (1.3) only if $|g| = c_1|f+g|^{p-1}$ and $|f| = c_2|f+g|^{p-1}$ a.e. For $\|f\|_{L_p} = \|g\|_{L_p} = \|(f+g)/2\|_{L_p} = 1$, this can happen only when $g = f$ a.e. $\Box$

The spaces $L_1(a,b)$ and $L_\infty(a,b)$ are not strictly convex, and consequently not uniformly convex. To see this, we provide simple examples showing that in these spaces the approximation problem does not have a unique solution.

Example 1.2 Consider the approximation of $f \equiv 1$ with respect to $V = \{\, t \mapsto at : a \in \mathbb{R} \,\}$. Then, for $X = L_\infty(0,1)$ we have $\mathrm{dist}(f,V) = 1$ and any $t \mapsto at$ with $a \in [0,1]$ is optimal, while for $X = L_1(-1,1)$ we have $\mathrm{dist}(f,V) = 2$ and any $t \mapsto at$ with $a \in [-1,1]$ is optimal.
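The non-uniqueness in Example 1.2 can be observed numerically. The sketch below (Python with numpy; the grids and the range of slopes scanned are arbitrary choices of this illustration) evaluates the error of $v(t)=at$ in both norms over many slopes $a$ and counts how many slopes attain the minimal error.

\begin{verbatim}
import numpy as np

slopes = np.linspace(-2.0, 3.0, 501)

# L_infinity(0,1) error of v(t) = a*t:  sup_{t in (0,1)} |1 - a t|
t1 = np.linspace(0.0, 1.0, 10001)
sup_err = np.array([np.abs(1.0 - a * t1).max() for a in slopes])

# L_1(-1,1) error of v(t) = a*t:  integral of |1 - a t| over (-1,1)
t2 = np.linspace(-1.0, 1.0, 20001)
dt = t2[1] - t2[0]
def l1(y):
    return dt * (y.sum() - 0.5 * (y[0] + y[-1]))   # trapezoidal rule
l1_err = np.array([l1(np.abs(1.0 - a * t2)) for a in slopes])

print("min sup-norm error:", sup_err.min())   # approximately 1
print("min L1 error      :", l1_err.min())    # approximately 2
print("slopes attaining the sup-norm minimum:",
      np.sum(np.isclose(sup_err, sup_err.min(), atol=1e-3)))
print("slopes attaining the L1 minimum      :",
      np.sum(np.isclose(l1_err, l1_err.min(), atol=1e-3)))
\end{verbatim}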

Finding optimal approximations with respect to the $L_p$ norm with $p \ne 2$ is in general a difficult problem. However, it is possible to give the following characterization of the optimal elements.

Theorem 1.5 Let $V$ be a finite dimensional subspace of $L_p(a,b)$, where $1 < p < +\infty$. An element $v^* \in V$ is optimal for $f \in L_p(a,b)$ with respect to $V$ if and only if for every $v \in V$
\[ \int_a^b v(t)\, \bigl|f(t) - v^*(t)\bigr|^{p-1} \mathrm{sgn}\bigl(f(t) - v^*(t)\bigr)\,dt = 0. \tag{1.4} \]
For $L_1(a,b)$, the corresponding `if and only if' condition reads
\[ \int_a^b v(t)\, \mathrm{sgn}\bigl(f(t) - v^*(t)\bigr)\,dt = 0 \tag{1.5} \]
(with the convention that $\mathrm{sgn}\,0 = 0$).

Proof. We first show necessity of the condition (1.4). We can assume that $f \notin V$. We claim that there is a linear functional $\ell$ such that $\ell(f - v^*) = \|f - v^*\|_{L_p}$, $\|\ell\| = 1$, and $\ell(v) = 0$ for all $v \in V$. Indeed, we first define the functional $\ell_1$ on the space spanned by $V$ and the function $f - v^*$ as
\[ \ell_1\bigl(\alpha(f - v^*) + v\bigr) = \alpha\,\|f - v^*\|_{L_p}, \qquad \alpha \in \mathbb{K},\ v \in V. \]
Since the optimal element for $f - v^*$ in $V$ equals zero, $v$ is the optimal element for any $g = \alpha(f - v^*) + v$ and $\mathrm{dist}(g, V) = |\alpha|\,\|f - v^*\|_{L_p}$. Hence
\[ |\ell_1(g)| = |\alpha|\,\|f - v^*\|_{L_p} \le \|\alpha(f - v^*) + v\|_{L_p} = \|g\|_{L_p}, \]
which means that $\ell_1$ has norm one. Next, by the Hahn--Banach theorem, this functional can be extended to a functional $\ell$ defined on $L_p(a,b)$ preserving the norm.

It is known that the functional $\ell$ (as any other bounded functional on $L_p(a,b)$ with $1 \le p < +\infty$) has the representation
\[ \ell(f) = \int_a^b f(t)\,h(t)\,dt \]
for some $h \in L_q(a,b)$ such that $\|h\|_{L_q} = \|\ell\| = 1$. By Hölder's inequality,
\[ \ell(f - v^*) = \int_a^b (f - v^*)(t)\,h(t)\,dt \le \int_a^b |(f - v^*)(t)|\,|h(t)|\,dt \le \|f - v^*\|_{L_p}\|h\|_{L_q} = \|f - v^*\|_{L_p} = \ell(f - v^*). \]
This means that we have equalities above, so that for some $c > 0$
\[ |h(t)| = c\,|f(t) - v^*(t)|^{p-1} \quad \text{for a.e. } t. \tag{1.6} \]
It also follows that $(f - v^*)(t)\,h(t) = |(f - v^*)(t)|\,|h(t)|$ a.e. on $(a,b)$, which together with (1.6) means that
\[ \mathrm{sgn}\,h(t) = \mathrm{sgn}\bigl(f(t) - v^*(t)\bigr) \quad \text{for all } t \text{ such that } f(t) \ne v^*(t). \]
Hence
\[ h(t) = |h(t)|\,\mathrm{sgn}\,h(t) = c\,|(f - v^*)(t)|^{p-1}\mathrm{sgn}(f - v^*)(t), \]
which together with the fact that $\ell(v) = \int_a^b v(t)h(t)\,dt = 0$ for $v \in V$ completes the proof of the necessity of (1.4).

We now prove the sufficiency. Let $v^* \in V$ satisfy (1.4). Then for any $v \in V$ we have
\begin{align*}
\|f - v^*\|_{L_p}^p &= \int_a^b (f - v^*)(t)\,|(f - v^*)(t)|^{p-1}\mathrm{sgn}(f - v^*)(t)\,dt \\
&= \int_a^b (f - v + v - v^*)(t)\,|(f - v^*)(t)|^{p-1}\mathrm{sgn}(f - v^*)(t)\,dt \\
&= \int_a^b (f - v)(t)\,|(f - v^*)(t)|^{p-1}\mathrm{sgn}(f - v^*)(t)\,dt \\
&\le \|f - v\|_{L_p}\,\|f - v^*\|_{L_p}^{p-1},
\end{align*}
where the third equality uses (1.4) applied to $v - v^* \in V$, and the last inequality is Hölder's inequality. This implies $\|f - v^*\|_{L_p} \le \|f - v\|_{L_p}$.

The proof for $p = 1$ follows the same lines as for $p > 1$ and is therefore omitted. $\Box$

We add that for $p = 2$ the condition (1.4) means that $f - v^*$ is orthogonal to the subspace $V$.
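To illustrate Theorem 1.5, the sketch below (Python with numpy) finds the best approximation of $f(t)=t$ on $(0,1)$ by constants in the $L_4$ norm via a coarse search and checks that the left-hand side of (1.4), with the test element $v \equiv 1$, is nearly zero at the minimizer. The choices $f(t)=t$, $p=4$ and approximation by constants are assumptions made only for this illustration.

\begin{verbatim}
import numpy as np

p = 4
t = np.linspace(0.0, 1.0, 20001)
dt = t[1] - t[0]

def integral(y):
    return dt * (y.sum() - 0.5 * (y[0] + y[-1]))   # trapezoidal rule

f = t                                   # f(t) = t on (0, 1)
cs = np.linspace(0.0, 1.0, 2001)        # candidate constants v* = c
errs = np.array([integral(np.abs(f - c)**p) for c in cs])
c_star = cs[np.argmin(errs)]            # best constant approximation in L_p

# Left-hand side of condition (1.4) with the test element v = 1:
r = f - c_star
lhs = integral(np.abs(r)**(p - 1) * np.sign(r))
print("best constant c*:", c_star)       # approximately 0.5
print("condition (1.4) :", lhs)          # approximately 0
\end{verbatim}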


Chapter 2

Uniform approximation

In this chapter we deal with approximation in $C(D)$, where $D \subset \mathbb{R}^d$ is a compact set. (Recall that a set $D \subset \mathbb{R}^d$ is compact if and only if $D$ is bounded and closed.) Specifically, $C(D)$ is the space of all continuous real-valued functions $f: D \to \mathbb{R}$ with the (uniform) norm
\[ \|f\| := \max_{t \in D} |f(t)|. \]
It is well known that $C(D)$ is a Banach space.

Our aim is to give a characterization of optimal approximations. We first note that $C(D)$ is not a strictly convex space and therefore optimal approximations are not always unique. (One can produce examples similar to those for $L_\infty$.)

Lemma 2.1 A function $v \in V$ is optimal for $f \in C(D)$ if and only if $0$ is optimal for $f - v$.

Proof. If $0$ is not optimal for $f - v$ then there is $v_1 \in V$ such that
\[ \|f - v - v_1\| < \|f - v\|, \]
but this means that $v + v_1$ is a better approximation of $f$ than $v$. The proof of the reverse implication is similar. $\Box$


Now we characterize functions for which $0$ is optimal; that is, functions $f$ such that
\[ \mathrm{dist}(f, V) = \|f\|. \tag{2.1} \]
Define the critical set as
\[ \mathrm{Crit}(f) = \{\, x \in D : |f(x)| = \|f\| \,\}. \]

Theorem 2.1 The equality (2.1) holds if and only if there is no element $v \in V$ such that $f(x)v(x) > 0$ for all $x \in \mathrm{Crit}(f)$.

Proof. If there is $v \in V$ such that
\[ \|f - v\| < \|f\|, \tag{2.2} \]
then $f(x)$ and $v(x)$ are of equal signs for any $x \in \mathrm{Crit}(f)$. Indeed, if $v(x) \le 0 < f(x)$ (the other cases are similar) then $f(x) - v(x) \ge f(x) = \|f\|$, which contradicts (2.2).

Suppose now that there is $v \in V$ that takes the same signs as $f$ in $\mathrm{Crit}(f)$. We can assume without loss of generality that $\|v\| < \|f\|$. Let
\[ A = \{\, x \in D : f(x)v(x) \le 0 \,\}. \]
If $A$ is empty then we clearly have $\|f - v\| < \|f\|$. Otherwise $A$ is compact and has empty intersection with $\mathrm{Crit}(f)$, so that
\[ m := \max_{x \in A} |f(x)| < \|f\|. \]
Define
\[ v_1 := \left(1 - \frac{m}{\|f\|}\right) v. \]
Then $|f(x) - v_1(x)| < \|f\|$ for all $x \in D$ and consequently $\|f - v_1\| < \|f\|$. Indeed, this is clear for $x \notin A$, and for $x \in A$ we have
\[ |f(x) - v_1(x)| \le |f(x)| + |v_1(x)| < m + \left(1 - \frac{m}{\|f\|}\right)\|f\| = \|f\|. \qquad \Box \]

Corollary 2.1 An element $v \in V$ is optimal for $f \in C(D)$ if and only if there is no element $w \in V$ such that
\[ w(x)\bigl(f(x) - v(x)\bigr) > 0 \quad \text{for all } x \in \mathrm{Crit}(f - v). \]

To proceed further, we need two facts from convex analysis. Recall that the convex hull of a set $S \subset \mathbb{R}^n$ is the set of all convex combinations of points in $S$, i.e.,
\[ \mathrm{conv}(S) = \left\{\, \sum_{i=1}^k \alpha_i s_i : k \in \mathbb{N},\ s_i \in S,\ \alpha_i > 0,\ \sum_{i=1}^k \alpha_i = 1 \,\right\}. \]

Lemma 2.2 Let $S \subset \mathbb{R}^n$ be a compact set. The vector $\vec 0$ does not belong to the convex hull of $S$ if and only if there is $\vec z \in \mathbb{R}^n$ such that the (Euclidean) inner product $\langle \vec z, \vec u\rangle_2 > 0$ for all $\vec u \in S$.

Proof. Suppose first that $\vec 0 \notin \mathrm{conv}(S)$. Let $\vec z$ be the element of $\mathrm{conv}(S)$ with the minimal norm, i.e.,
\[ \|\vec z\|_2 = \min\{\, \|\vec w\|_2 : \vec w \in \mathrm{conv}(S) \,\}. \]
(Such an element exists by compactness of $\mathrm{conv}(S)$, and it is unique by convexity of $\mathrm{conv}(S)$ and strict convexity of the space $\mathbb{R}^n$.) For any $\vec u \in S$ and $0 \le \alpha \le 1$ we have $\alpha\vec u + (1-\alpha)\vec z \in \mathrm{conv}(S)$ and
\[ 0 \le \|\alpha\vec u + (1-\alpha)\vec z\|_2^2 - \|\vec z\|_2^2 = \alpha\bigl(\alpha\|\vec u - \vec z\|_2^2 + 2\langle \vec u - \vec z, \vec z\rangle_2\bigr). \]
This may hold for all such $\alpha$ only if $\langle \vec u - \vec z, \vec z\rangle_2 \ge 0$, or equivalently
\[ \langle \vec u, \vec z\rangle_2 \ge \|\vec z\|_2^2 > 0. \]
Thus $\vec z$ is the wanted element.

Suppose now that $\vec 0 \in \mathrm{conv}(S)$. Then $\vec 0 = \sum_{i=0}^m \lambda_i \vec s_i$ for some $\vec s_i \in S$ and $\lambda_i > 0$ with $\sum_{i=0}^m \lambda_i = 1$. Hence for any $\vec z \in \mathbb{R}^n$ we have $\sum_{i=0}^m \lambda_i \langle \vec s_i, \vec z\rangle_2 = 0$, which means that $\langle \vec s_i, \vec z\rangle_2 \le 0$ for at least one $i$, as claimed. $\Box$
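The first part of the proof is constructive: the minimum-norm element of $\mathrm{conv}(S)$ separates. For a small finite set $S \subset \mathbb{R}^2$, the sketch below (Python with numpy; the three points and the grid resolution are arbitrary choices of this illustration) finds the minimum-norm convex combination by brute force over the weight simplex and verifies that $\langle \vec z, \vec u\rangle_2 > 0$ for every $\vec u \in S$.

\begin{verbatim}
import numpy as np

S = np.array([[1.0, 2.0], [2.0, -1.0], [3.0, 1.0]])   # 0 is not in conv(S)

# Brute-force search over convex weights (a1, a2, a3) on the simplex.
best_z, best_norm = None, np.inf
grid = np.linspace(0.0, 1.0, 201)
for a1 in grid:
    for a2 in grid:
        a3 = 1.0 - a1 - a2
        if a3 < 0.0:
            continue
        z = a1 * S[0] + a2 * S[1] + a3 * S[2]
        nz = np.linalg.norm(z)
        if nz < best_norm:
            best_norm, best_z = nz, z

print("minimum-norm point z of conv(S):", best_z)
print("<z, u> for u in S              :", S @ best_z)   # all positive (Lemma 2.2)
\end{verbatim}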

Lemma 2.3 Every point of the convex hull of a set $S \subset \mathbb{R}^n$ is a convex combination of at most $n+1$ points of $S$.

Proof. Let $\vec x \in \mathrm{conv}(S)$. Then $\vec x = \sum_{i=0}^{m} \alpha_i \vec s_i$ for some $\vec s_i \in S$ and $\alpha_i > 0$ with $\sum_{i=0}^{m} \alpha_i = 1$, where we assume that $m$ is smallest possible. Suppose that $m \ge n+1$. Define $\vec y_i = \vec s_i - \vec x$ for $0 \le i \le m$. Then $\sum_{i=0}^{m} \alpha_i \vec y_i = \vec 0$.

Since $m > n$, the elements $\vec y_i$ for $1 \le i \le m$ are linearly dependent; hence $\sum_{i=1}^{m} \beta_i \vec y_i = \vec 0$ for some $\beta_i$'s, not all zero, where (changing the signs of all $\beta_i$ if necessary) at least one $\beta_i$ is negative. Then for all real $\lambda$ we have
\[ \sum_{i=0}^{m} (\lambda\alpha_i + \beta_i)\vec y_i = \vec 0, \]
where we additionally set $\beta_0 = 0$. Now we set
\[ \lambda := \max_{0 \le j \le m} \frac{-\beta_j}{\alpha_j} > 0. \]
Then $\lambda\alpha_0 + \beta_0 > 0$, for at least one $i$ we have $\lambda\alpha_i + \beta_i = 0$, and all the remaining coefficients are nonnegative. Using again the substitution $\vec y_i = \vec s_i - \vec x$ we have
\[ \left(\sum_{i=0}^{m} (\lambda\alpha_i + \beta_i)\right)\vec x = \sum_{i=0}^{m} (\lambda\alpha_i + \beta_i)\vec s_i, \]
and dividing both sides by $\sum_{i=0}^{m} (\lambda\alpha_i + \beta_i)$ we finally obtain that $\vec x$ can be represented as a convex combination of fewer than $m+1$ points from $S$, contradicting the minimality of $m$. $\Box$

We also introduce some notation. Let $\dim(V) = n$ and let $(v_1, v_2, \ldots, v_n)$ be a fixed basis of $V$. Then
\[ \vec v(x) := (v_1(x), v_2(x), \ldots, v_n(x)). \]
For $x \in D$, we denote by $\hat x$ the linear functional on $C(D)$ given by
\[ \hat x(g) = g(x), \qquad g \in C(D). \]

Theorem 2.2 The following conditions are equivalent:

(i) $\|f\| = \mathrm{dist}(f, V)$.

(ii) No element of $V$ has the same signs as $f$ on the set $\mathrm{Crit}(f)$.

(iii) $\vec 0$ belongs to the convex hull of the set $\{\, f(x)\vec v(x) : x \in \mathrm{Crit}(f) \,\}$.

(iv) There exists a functional of the form $L = \sum_{i=1}^{k} \lambda_i \hat x_i$ with $k \le n+1$, such that $x_i \in \mathrm{Crit}(f)$ and $\lambda_i f(x_i) > 0$ for all $1 \le i \le k$, and $V \subset \ker(L)$.

Proof. Equivalence of (i) and (ii) is proven in Theorem 2.1. To show (ii) $\Rightarrow$ (iii), observe that (ii) implies that there are no numbers $c_i$ such that
\[ f(x)\sum_{i=1}^{n} c_i v_i(x) > 0 \qquad \forall x \in \mathrm{Crit}(f). \]
This condition can be written as $\langle \vec c, f(x)\vec v(x)\rangle_2 > 0$ (with $\vec c = (c_1, \ldots, c_n)$) and means, by Lemma 2.2, that $\vec 0$ is in the convex hull of the set $\{f(x)\vec v(x) : x \in \mathrm{Crit}(f)\}$.

We show (iii) $\Rightarrow$ (iv). From (iii) and Lemma 2.3 it follows that $\vec 0 \in \mathbb{R}^n$ can be written as a convex combination of $k \le n+1$ points from $\{f(x)\vec v(x) : x \in \mathrm{Crit}(f)\}$, i.e.,
\[ \vec 0 = \sum_{i=1}^{k} \alpha_i f(x_i)\vec v(x_i). \]
The required functional is $L(g) = \sum_{i=1}^{k} \lambda_i g(x_i)$ with $\lambda_i = \alpha_i f(x_i)$. Indeed, we clearly have $\lambda_j f(x_j) > 0$ and
\[ 0 = \sum_{i=1}^{k} \lambda_i v_j(x_i) = \left(\sum_{i=1}^{k} \lambda_i \hat x_i\right)(v_j) = L(v_j), \qquad 1 \le j \le n, \]
which means that $L$ vanishes on $V$.

And finally, to show (iv) $\Rightarrow$ (i), we check that (iv) implies that for any $v \in V$
\[ \|f\|\sum_{i=1}^{k} |\lambda_i| = \sum_{i=1}^{k} \lambda_i f(x_i) = \sum_{i=1}^{k} \lambda_i\bigl(f(x_i) - v(x_i)\bigr) \le \|f - v\|\sum_{i=1}^{k} |\lambda_i|, \]
i.e., $\|f\| \le \|f - v\|$. The proof is complete. $\Box$

Now we deal with uniqueness of optimal approximations.

Definition 2.1 An $n$-dimensional linear space $V \subset C(D)$ is a Haar space iff any nonzero function $f \in V$ vanishes at at most $n-1$ points of $D$. Any basis of a Haar space is called a Haar system.

Let us note that an equivalent condition for $V$ to be a Haar space is that the interpolation problem: for $n$ distinct points $x_i \in D$ and numbers $y_i$, find $v \in V$ such that
\[ v(x_i) = y_i, \qquad 1 \le i \le n, \tag{2.3} \]
has a unique solution. Indeed, writing $v = \sum_{j=1}^n a_j v_j$, where $\{v_j\}_{j=1}^n$ is a basis of $V$, we have that (2.3) is equivalent to the $n \times n$ system of linear equations
\[ \sum_{j=1}^n a_j v_j(x_i) = y_i, \qquad 1 \le i \le n. \]
It has a unique solution if and only if the zero function is the unique solution of the homogeneous system. The latter is ensured by the Haar condition.
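This interpolation characterization is easy to test numerically. The sketch below (Python with numpy; the monomial basis, the interval, the points and the data are arbitrary choices of this illustration) forms the matrix $(v_j(x_i))$ for the polynomial Haar space, which is a Vandermonde matrix, and solves the interpolation problem (2.3).

\begin{verbatim}
import numpy as np

# Haar space span(1, x, ..., x^n) on [-1, 1]: the matrix (v_j(x_i)) is the
# Vandermonde matrix, nonsingular for any distinct points x_1 < ... < x_{n+1}.
n = 3
x = np.array([-1.0, -0.2, 0.5, 1.0])        # n + 1 = 4 distinct points
y = np.sin(x)                               # arbitrary data to interpolate

M = np.vander(x, N=n + 1, increasing=True)  # M[i, j] = x_i**j = v_j(x_i)
print("det(M) =", np.linalg.det(M))         # nonzero: the Haar/interpolation condition

a = np.linalg.solve(M, y)                   # coefficients of the interpolant v
print("interpolation residual:", np.max(np.abs(M @ a - y)))   # ~ 0
\end{verbatim}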

We now give some examples.

Probably the most natural Haar space is provided by the algebraic polynomials of degree at most $n$,
\[ \mathcal{P}_{n+1} := \mathrm{span}(1, x, x^2, \ldots, x^n) \subset C([a,b]) \]
for any $-\infty < a < b < +\infty$. Obviously $\dim \mathcal{P}_{n+1} = n+1$. Another important Haar space is formed by the trigonometric polynomials,
\[ V_{2n+1} := \mathrm{span}\bigl(1, \cos t, \sin t, \ldots, \cos nt, \sin nt\bigr) \subset C([a,b]) \tag{2.4} \]
for any $a, b$ with $0 < b-a < 2\pi$. Indeed, using $e^{i\varphi} = \cos\varphi + i\sin\varphi$ ($i = \sqrt{-1}$), any nontrivial function $t \mapsto h(t) := \sum_{k=0}^n (a_k\cos kt + b_k\sin kt) \in V_{2n+1}$ can be written as
\[ h(t) = \sum_{|k|\le n} c_k e^{ikt} = z^{-n}\sum_{k=0}^{2n} c_{k-n} z^k, \tag{2.5} \]
where $z = e^{it}$ and $c_0 = a_0$, $c_{\pm k} = (a_k \mp i b_k)/2$, $1 \le k \le n$. If $c_{\pm k} = 0$ for all $k$ then also $a_k = b_k = 0$. This yields $\dim V_{2n+1} = 2n+1$. Furthermore, $h$ vanishes at no more than $2n$ points in $[a,b]$, since the algebraic polynomial on the right-hand side of (2.5) has at most $2n$ different zeros and the function $t \mapsto e^{it}$ is one-to-one on any real interval of length smaller than $2\pi$.

Remark 2.1 Some important subspaces of $V_{2n+1}$ are also Haar spaces. These include
\[ \widehat{V}_{n+1} := \mathrm{span}(1, \cos t, \cos 2t, \ldots, \cos nt) \subset C([0,\pi]), \]
and
\[ \widetilde{V}_{n} := \mathrm{span}(\sin t, \sin 2t, \ldots, \sin nt) \subset C([\varepsilon, \pi-\varepsilon]) \quad\text{for any } \varepsilon \in (0, \pi/2). \]
Indeed, any nontrivial function $t \mapsto h(t) := \sum_{k=0}^n a_k\cos kt \in \widehat{V}_{n+1}$ can be written as $h(t) = \sum_{k=0}^n a_k T_k(x)$, where $x = \cos t$, $t \in [0,\pi]$, and $T_k$ is the $k$th Chebyshev polynomial (of the first kind). Since the change of variable is one-to-one, $h$ has at most $n$ zeros in $[0,\pi]$.

To see that $\widetilde{V}_n$ is a Haar space on $[\varepsilon, \pi-\varepsilon]$ it suffices to notice that if an odd function $t \mapsto h(t) := \sum_{k=1}^n b_k\sin kt$ has $n$ zeros $\varepsilon \le t_1 < \cdots < t_n \le \pi-\varepsilon$, then it has $2n+1$ zeros in $[-\pi+\varepsilon, \pi-\varepsilon]$, namely $0$ and $\pm t_j$, $1 \le j \le n$. This is impossible since, as shown above, $V_{2n+1}$ is a Haar space on the interval $[-\pi+\varepsilon, \pi-\varepsilon]$.

Theorem 2.3 Let $V \subset C(D)$ be a Haar space of dimension $n$, and let $f \in C(D)$ be such that $\|f\| = \mathrm{dist}(f, V)$. Then there exists $\gamma > 0$ (depending on $f$) such that for any $v \in V$
\[ \|f - v\| \ge \|f\| + \gamma\,\|v\|. \]

Proof. We first observe that if $V$ is an $n$-dimensional Haar space then $k$ in the definition of the functional $L$ of Theorem 2.2(iv) equals $n+1$; otherwise we could choose $v \in V$ such that $v(x_i) = \lambda_i$, $1 \le i \le k$, and then $0 = L(v) = \sum_{i=1}^k \lambda_i v(x_i) = \sum_{i=1}^k \lambda_i^2 > 0$.

We write $L$ as
\[ L(g) = \sum_{i=0}^{n} \theta_i \sigma_i g(x_i), \qquad \sigma_i := \mathrm{sgn} f(x_i), \]
so that all $\theta_i$'s are positive. Let $w \in V$ with $\|w\| = 1$. Since $L(w) = 0$ and, by the Haar condition, $w$ does not vanish at all the points $x_i$, we have $\max_{0 \le i \le n} \sigma_i w(x_i) > 0$. Define
\[ \gamma := \inf_{w \in V,\,\|w\|=1}\ \max_{0 \le i \le n} \sigma_i w(x_i) > 0, \]
where positivity of the infimum follows from compactness of the unit sphere of $V$ and continuity of $w \mapsto \max_i \sigma_i w(x_i)$. Now, if $v \in V$ and $v \ne 0$ then, applying this to $w = -v/\|v\|$, we get $-\sigma_i v(x_i) \ge \gamma\,\|v\|$ for some $i$, and hence
\[ \|f - v\| \ge \sigma_i\bigl(f(x_i) - v(x_i)\bigr) = \|f\| - \sigma_i v(x_i) \ge \|f\| + \gamma\,\|v\|, \]
as claimed (for $v = 0$ the claim is trivial). $\Box$

Corollary 2.2 If $V \subset C(D)$ is a Haar space then any $f \in C(D)$ has a unique optimal approximation with respect to $V$.

Proof. If $v'$ and $v''$ are both optimal for $f$ then by Theorem 2.3 (applied to $f - v'$, for which $0$ is optimal) we have
\[ \mathrm{dist}(f,V) = \|f - v''\| = \|(f - v') + (v' - v'')\| \ge \|f - v'\| + \gamma\,\|v' - v''\| = \mathrm{dist}(f,V) + \gamma\,\|v' - v''\|, \]
which forces $\|v' - v''\| = 0$ and $v' = v''$. $\Box$

Another consequence of Theorem 2.3 is continuity of the best approximation.

Theorem 2.4 Let $V$ be a Haar subspace of $C(D)$, and let $A: C(D) \to V$ be the mapping that associates with any element of $C(D)$ its optimal approximation with respect to $V$. Then for any $f \in C(D)$ there is $\kappa$ such that for any other $g \in C(D)$ we have
\[ \|A(f) - A(g)\| \le \kappa\,\|f - g\|. \]

Proof. By Theorem 2.3, there is $\gamma > 0$ such that for any $w \in V$
\[ \|(f - A(f)) - w\| \ge \|f - A(f)\| + \gamma\,\|w\|, \]
which is equivalent to $\|f - v\| \ge \|f - A(f)\| + \gamma\,\|A(f) - v\|$ for all $v \in V$. Hence, taking $v = A(g)$, we obtain
\begin{align*}
\gamma\,\|A(f) - A(g)\| &\le \|f - A(g)\| - \|f - A(f)\| \\
&\le \|f - g\| + \|g - A(g)\| - \|f - A(f)\| \\
&\le \|f - g\| + \|g - A(f)\| - \|f - A(f)\| \\
&\le \|f - g\| + \|g - f\| + \|f - A(f)\| - \|f - A(f)\| \\
&= 2\,\|f - g\|,
\end{align*}
and the theorem holds with $\kappa = 2/\gamma$. $\Box$

Finally, we arrive at the (Chebyshev) alternation theorem for the domain D = [a, b].

Theorem 2.5 Let $V$ be an $n$-dimensional Haar subspace of $C([a,b])$. An element $v$ is optimal for $f \in C([a,b])$ with respect to $V$ if and only if there exist $\sigma \in \{-1, +1\}$ and points $a \le x_0 < x_1 < \cdots < x_n \le b$ such that
\[ f(x_i) - v(x_i) = \sigma(-1)^i \|f - v\|, \qquad 0 \le i \le n. \]

Proof. Let $v$ be optimal for $f$. Then, by Theorem 2.2(iv), there are points $x_0 < x_1 < \cdots < x_n$ and numbers $\lambda_i$ such that $|f(x_i) - v(x_i)| = \|f-v\|$ and $\lambda_i\bigl(f(x_i) - v(x_i)\bigr) > 0$, and $\sum_{i=0}^n \lambda_i w(x_i) = 0$ for all $w \in V$. We show that the $x_i$ are the alternation points. To that end, it suffices to show that $\lambda_{j-1}\lambda_j < 0$. Indeed, for $1 \le j \le n$, we choose $w_j \in V$ interpolating the data $w_j(x_j) = 1$ and $w_j(x_i) = 0$ for $i \ne j-1, j$. Then
\[ 0 = \sum_{i=0}^{n} \lambda_i w_j(x_i) = \lambda_{j-1} w_j(x_{j-1}) + \lambda_j. \]
This implies $w_j(x_{j-1}) > 0$, since otherwise $w_j$ would have an $n$th zero in the interval $(x_{j-1}, x_j)$, and consequently $\lambda_{j-1}\lambda_j < 0$.

Suppose now that the $n+1$ alternation points $x_i$ exist. If there were a $w \in V$ such that $\|f-w\| < \|f-v\|$, then the function $(f-v) - (f-w) = w - v \in V$ would assume alternately positive and negative values at successive $x_i$. Hence the function $w - v$ would have at least $n$ different zeros. Since $V$ is a Haar space, $w - v = 0$ and $w = v$, a contradiction. $\Box$

Remark 2.2 If $V = \mathcal{P}_{n+1}$ is the space of algebraic polynomials of degree at most $n$, then existence of the alternation points can be shown straightforwardly.

Define points $\{x_i\}$ as follows. The first point $x_1$ is the smallest point in $[a,b]$ such that $|(f-v)(x_1)| = \|f-v\|$. We can assume without loss of generality that $(f-v)(x_1) = -\|f-v\|$. Then $x_2$ is the smallest point in $[x_1, b]$ such that $(f-v)(x_2) = \|f-v\|$, and in general $x_i$ is the smallest point in $[x_{i-1}, b]$ such that $(f-v)(x_i) = (-1)^i\|f-v\|$.

If in this way we can choose at least $n+2$ points, then $\{x_i\}_{i=1}^{n+2}$ are the alternation points. Suppose that we can choose only $k \le n+1$ points. Then we define $y_i$, $1 \le i \le k-1$, as the largest point in $[x_i, x_{i+1}]$ such that $(f-v)(y_i) = (-1)^i\|f-v\|$. We obviously have $y_i < x_{i+1}$. Then the polynomial
\[ x \mapsto w(x) = (-1)^k (x - z_2)(x - z_3)\cdots(x - z_k), \qquad z_i = (y_{i-1} + x_i)/2, \]
is in $\mathcal{P}_{n+1}$ and has the property that $w(x)(f-v)(x) > 0$ for all $x$ such that $|(f-v)(x)| = \|f-v\|$. Now it suffices to use Corollary 2.1.

Remark 2.3 Consider the space $V_{2n+1}$ of trigonometric polynomials defined in (2.4). This is a subspace of $C([0, 2\pi])$, but not a Haar space, for any $n \ge 1$, since the function $t \mapsto \sin nt$ has $2n+1$ different zeros $\pi k/n$, $0 \le k \le 2n$. As a consequence, there are functions $f \in C([0, 2\pi])$ for which the optimal approximation with respect to $V_{2n+1}$ is not unique. (A simple example is $n = 1$, where $t \mapsto v(t) = -a\sin t$ is optimal for $t \mapsto f(t) = t/\pi - 1$, for all $0 \le a \le 1$.) However, if we assume, in addition to $f \in C([0, 2\pi])$, that $f(0) = f(2\pi)$, then all the results of this chapter that follow from $V$ being a Haar space remain valid.

To see this, observe that for such $f$ the points $x_i$ defining the functional $L$ in Theorem 2.2(iv) can be chosen so that
\[ -\pi \le x_1 < x_2 < \cdots < x_k < \pi. \]
(Indeed, the condition $f(-\pi) = f(\pi)$ implies that the point $\pi$ can be identified with $-\pi$.) Moreover, since $V_{2n+1}$ is already a Haar subspace of $C([-\pi, x_k])$, proceeding as in the proof of Theorem 2.3 we get $k = 2n+2$, and this theorem follows. Consequently, we also have uniqueness of the optimal approximation, and $\{x_i\}_{i=1}^{2n+2}$ are the alternation points.

Remark 2.4 If $V \subset C(D)$ is not a Haar space then one can construct a function $f$ that possesses more than one optimal element with respect to $V$. The construction goes as follows.

We first choose a basis $(v_1, \ldots, v_n)$ of $V$ and points $x_1, \ldots, x_n \in D$ such that the matrix $\{v_i(x_j)\}_{i,j=1}^{n}$ is singular. Let $\vec a = (a_1, \ldots, a_n)$ and $\vec b = (b_1, \ldots, b_n)$ be nonzero vectors that are, correspondingly, orthogonal to the columns and rows of this matrix, i.e.,
\[ \sum_{i=1}^{n} a_i v_i(x_j) = 0, \quad 1 \le j \le n, \qquad\text{and}\qquad \sum_{j=1}^{n} b_j v_i(x_j) = 0, \quad 1 \le i \le n. \]
(Then obviously $\sum_{j=1}^{n} b_j v(x_j) = 0$ for all $v \in V$.) Let $p = \sum_{i=1}^{n} a_i v_i$. We can assume that $\|p\| < 1$. Let $g \in C(D)$ be such that $\|g\| = 1$ and $g(x_j) = \mathrm{sgn}\,b_j$ for $1 \le j \le n$, and let
\[ f(x) = g(x)\bigl(1 - |p(x)|\bigr). \]
Then $f(x_j) = g(x_j) = \mathrm{sgn}\,b_j$. We also have $\|f - v\| \ge 1$ for all $v \in V$, since otherwise $\mathrm{sgn}\,v(x_j) = \mathrm{sgn}\,f(x_j) = \mathrm{sgn}\,b_j$ whenever $b_j \ne 0$, which contradicts $\sum_{j=1}^{n} b_j v(x_j) = 0$. We show that $\lambda p$ is optimal for $f$ for all $0 \le \lambda \le 1$. Indeed, for any $x \in D$ we have
\[ |f(x) - \lambda p(x)| \le |f(x)| + \lambda|p(x)| = |g(x)|\bigl(1 - |p(x)|\bigr) + \lambda|p(x)| \le 1 - |p(x)| + \lambda|p(x)| \le 1, \]
so that $\|f - \lambda p\| \le 1 \le \mathrm{dist}(f, V)$ and each $\lambda p$, $0 \le \lambda \le 1$, is optimal.


Chapter 3

The Weierstrass theorem

This chapter is devoted to the well-known Weierstrass theorem, which establishes density of algebraic polynomials in the space $C([a,b])$. Among several proofs of this fact we choose the one that uses properties of positive operators.

For $f, g \in C([a,b])$ we write $f \ge g$ (or $f \le g$) iff $f(x) \ge g(x)$ (or $f(x) \le g(x)$) for all $x \in [a,b]$. By $|f|$ we mean the function $x \mapsto |f(x)|$, $x \in [a,b]$.

Definition 3.1 A linear operator $L: C([a,b]) \to C([a,b])$ is positive iff for all $f \in C([a,b])$ the condition $f \ge 0$ implies that $Lf \ge 0$.

Sometimes the term monotone operator is used instead of positive operator, since Definition 3.1 is obviously equivalent to the following: for any $f, g \in C([a,b])$, if $f \le g$ then $Lf \le Lg$.

For positive operators we have in particular that $|Lf| \le L(|f|)$.

Theorem 3.1 Let the functions $h_i$ be defined as $h_i(x) = x^i$. Let $\{L_n\}_{n \ge 1}$ be a sequence of positive linear operators, $L_n : C([a,b]) \to C([a,b])$. If
\[ \lim_{n\to\infty} \|h_i - L_n h_i\| = 0 \quad \text{for } i = 0, 1, 2, \]
then
\[ \lim_{n\to\infty} \|f - L_n f\| = 0 \quad \text{for all } f \in C([a,b]). \]

Proof. Let $f \in C([a,b])$ and let $\varepsilon > 0$. Since continuity of $f$ implies its uniform continuity, there is $\delta > 0$ such that $|f(x) - f(y)| < \varepsilon$ if $|x-y| < \delta$. On the other hand, if $|x-y| \ge \delta$ then
\[ |f(x) - f(y)| \le 2\|f\| \le 2\|f\|\,\frac{(x-y)^2}{\delta^2}. \]
Hence, for $c := 2\|f\|/\delta^2$ we have $|f(x) - f(y)| \le \varepsilon + c(x-y)^2$, which can be written in terms of the $h_i$ as
\[ |f - f(y)h_0| \le \varepsilon h_0 + c\,(h_2 - 2y h_1 + y^2 h_0), \tag{3.1} \]
where we treat both sides of (3.1) as functions of $x$. Applying the positive operator $L_n$ we get
\[ |L_n f - f(y) L_n h_0| \le \varepsilon L_n h_0 + c\,(L_n h_2 - 2y L_n h_1 + y^2 L_n h_0). \]
Then, denoting $e_n^{(i)} := L_n h_i - h_i$ and taking $x = y$, we obtain
\begin{align*}
|(L_n f)(y) - f(y)(L_n h_0)(y)|
&\le \varepsilon (L_n h_0)(y) + c\bigl( (L_n h_2)(y) - 2y (L_n h_1)(y) + y^2 (L_n h_0)(y) \bigr) \\
&= \varepsilon\bigl(1 + e_n^{(0)}(y)\bigr) + c\Bigl( \bigl(y^2 + e_n^{(2)}(y)\bigr) - 2y\bigl(y + e_n^{(1)}(y)\bigr) + y^2\bigl(1 + e_n^{(0)}(y)\bigr) \Bigr) \\
&= \varepsilon + \varepsilon e_n^{(0)}(y) + c\,e_n^{(2)}(y) - 2cy\,e_n^{(1)}(y) + cy^2 e_n^{(0)}(y) \\
&\le \varepsilon + \varepsilon\|e_n^{(0)}\| + c\|e_n^{(2)}\| + 2c\|h_1\|\,\|e_n^{(1)}\| + c\|h_2\|\,\|e_n^{(0)}\|,
\end{align*}
which is smaller than $2\varepsilon$ for $n$ sufficiently large, say $n \ge m$ with $m$ independent of $y$. The proof is completed by the observation that
\[ \|L_n f - f\| \le \|L_n f - f L_n h_0\| + \|f L_n h_0 - f h_0\| \le 2\varepsilon + \|f\|\,\|e_n^{(0)}\|, \]
which is smaller than $3\varepsilon$ for sufficiently large $n$. $\Box$

Theorem 3.1 yields, in particular, the Weierstrass theorem.

Theorem 3.2 For any function $f \in C([a,b])$ and any $\varepsilon > 0$, there exists an algebraic polynomial $p$ such that $\|f - p\| < \varepsilon$.

Proof. Without loss of generality we can restrict ourselves to the interval $[a,b] = [0,1]$. For a given $f \in C([0,1])$ we define the Bernstein polynomials as
\[ (B_n f)(x) := \sum_{k=0}^{n} f\!\left(\frac{k}{n}\right) \binom{n}{k} x^k (1-x)^{n-k}. \]

It is clear that for all $n$, $B_n f$ is a polynomial of degree at most $n$, and the operator $f \mapsto B_n f$ is linear and positive. Hence it is enough to show convergence of $B_n h_i$ to $h_i$ for $i = 0, 1, 2$. For $i = 0$,
\[ (B_n h_0)(x) = \sum_{k=0}^{n} \binom{n}{k} x^k (1-x)^{n-k} = 1, \]
i.e., $B_n h_0 = h_0$. For $i = 1$,
\[ (B_n h_1)(x) = \sum_{k=0}^{n} \frac{k}{n} \binom{n}{k} x^k (1-x)^{n-k} = \sum_{k=1}^{n} \binom{n-1}{k-1} x^k (1-x)^{n-k} = x \sum_{k=0}^{n-1} \binom{n-1}{k} x^k (1-x)^{n-1-k} = x, \]
i.e., $B_n h_1 = h_1$. And finally, for $i = 2$,
\begin{align*}
(B_n h_2)(x) &= \sum_{k=0}^{n} \left(\frac{k}{n}\right)^2 \binom{n}{k} x^k (1-x)^{n-k}
 = \sum_{k=1}^{n} \frac{k}{n} \binom{n-1}{k-1} x^k (1-x)^{n-k} \\
&= \sum_{k=1}^{n} \left( \frac{n-1}{n}\,\frac{k-1}{n-1} + \frac{1}{n} \right) \binom{n-1}{k-1} x^k (1-x)^{n-k} \\
&= \frac{n-1}{n}\, x^2 \sum_{k=2}^{n} \binom{n-2}{k-2} x^{k-2} (1-x)^{n-k} + \frac{x}{n}
 = \frac{n-1}{n}\, x^2 + \frac{x}{n},
\end{align*}
which yields $|(B_n h_2)(x) - h_2(x)| = x(1-x)/n \le 1/(4n)$ and convergence (in norm) of $B_n h_2$ to $h_2$. $\Box$
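The Bernstein operators from the proof are straightforward to implement. The sketch below (Python with numpy; the test function $f(x)=|x-1/2|$ and the grid are arbitrary choices of this illustration) evaluates $B_n f$ and reports the uniform error for several $n$, illustrating the (slow) convergence guaranteed by Theorem 3.1.

\begin{verbatim}
import numpy as np
from math import comb

def bernstein(f, n, x):
    """Evaluate (B_n f)(x) = sum_k f(k/n) C(n,k) x^k (1-x)^(n-k) on a grid x."""
    k = np.arange(n + 1)
    coeffs = np.array([comb(n, j) for j in k], dtype=float) * f(k / n)
    # Powers x^k and (1-x)^(n-k), assembled column-wise.
    B = (x[:, None] ** k) * ((1.0 - x[:, None]) ** (n - k))
    return B @ coeffs

f = lambda t: np.abs(t - 0.5)
x = np.linspace(0.0, 1.0, 2001)

for n in [4, 16, 64, 256]:
    err = np.max(np.abs(f(x) - bernstein(f, n, x)))
    print(f"n = {n:4d}   ||f - B_n f|| ~ {err:.4f}")
\end{verbatim}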

Corollary 3.1 Let $f \in C([a,b])$. For any $n \ge 1$ there exist points
\[ a \le x_0^{(n)} < x_1^{(n)} < \cdots < x_n^{(n)} \le b \]
such that the algebraic polynomial $p_n \in \mathcal{P}_{n+1}$ interpolating $f$ at the points $x_i^{(n)}$, $0 \le i \le n$, is optimal. Moreover, $\lim_{n\to\infty} \|f - p_n\| = 0$.

Proof. This is a direct consequence of the Weierstrass Theorem 3.2 and the Chebyshev alternation Theorem 2.5. Indeed, let $v_n$ be the optimal polynomial for $f$ with respect to $\mathcal{P}_{n+1}$. By the alternation theorem, $f - v_n$ vanishes between any two consecutive alternation points. Since there are $n+2$ alternation points, $v_n$ interpolates $f$ at (at least) $n+1$ points. Moreover, optimality of $v_n$ and the Weierstrass theorem yield $\lim_{n\to\infty} \|f - v_n\| = 0$. Hence the corollary holds with $p_n = v_n$. $\Box$

The ‘problem’ with Corollary 3.1 is that the points $x_i^{(n)}$ depend on the particular function $f$. A natural question is whether it is possible to choose
