
Approximation and Complexity

(notes for students of the University of Warsaw)

Leszek Plaskota

Instytut Matematyki Stosowanej i Mechaniki Uniwersytet Warszawski

October 17, 2021


Contents

I Classical approximation

1 Preliminaries

2 Uniform approximation

3 The Weierstrass theorem

4 Fourier and Fejér operators

5 Quality of projections

6 Bernstein's 'lethargy' theorem

7 The Jackson theorems

II Information-based approximation

8 Information and its radius

9 Linear algorithms for linear problems

10 Optimality of spline algorithms

11 Optimal information

12 Adaptive information

13 Asymptotic setting

III Appendix

14 Completeness of the space C(D)

15 Banach-Steinhaus theorem

16 Hahn-Banach theorem


Part I

Classical approximation


Chapter 1

Preliminaries

A general formulation of the (classical) approximation problem is as follows. Let X be a normed linear space over K (where K = R or K = C) with a norm ‖·‖. Since we are primarily interested in function spaces, the elements of X will be denoted by f, g, . . . . Let V be a linear subspace of X with finite dimension, i.e.,

dim(V ) = n < +∞.

For f ∈ X we define

dist(f, V) := inf_{v∈V} ‖f − v‖   and   P_V(f) := {v ∈ V : ‖f − v‖ = dist(f, V)}.

P_V(f) is the set of optimal approximations for f with respect to V.

The approximation problem expresses a general wish to represent 'complicated' objects (the ones in X) with 'simpler' objects (the ones in V).

We first make the observation that finite dimensionality of V ensures non-emptiness of P_V(f).

Indeed, since we clearly have dist(f, V) ≤ ‖f‖, the set P_V(f) is contained in the ball B = {v ∈ V : ‖v‖ ≤ 2‖f‖}; otherwise, if v ∈ P_V(f) and ‖v‖ > 2‖f‖ then, by the triangle inequality,

‖f − v‖ ≥ ‖v‖ − ‖f‖ > ‖f‖ ≥ dist(f, V).

Since any closed and bounded ball in a finite-dimensional space is compact and the function w ↦ ‖f − w‖ is continuous, this function attains its minimal value on B.

The assumption dim(V) < +∞ is crucial for P_V(f) ≠ ∅. To see this, we give an example.

Example 1.1 Let X = ℓ1 be the space of all infinite, absolutely summable real sequences x = {x_n}_{n≥1} with norm ‖x‖ = ∑_{i=1}^∞ |x_i|. Let V be the subspace consisting of all sequences v for which only finitely many coefficients are nonzero. Then for any x ∈ X \ V (for instance, for x_i = i^{−2}) we have dist(x, V) = 0, but ‖x − v‖ > 0 for every v ∈ V; hence P_V(x) = ∅.
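A quick numerical illustration of this example (a sketch; the truncations v_N below are just one convenient choice of approximants from V):

    import numpy as np

    # x_i = i**(-2) is absolutely summable but has infinitely many nonzero terms,
    # so x belongs to l^1 but not to V (the finitely supported sequences).
    i = np.arange(1, 10**6 + 1, dtype=float)
    x = i**(-2)

    for N in [1, 10, 100, 1000]:
        # v_N keeps the first N coordinates of x and zeroes the rest; v_N is in V
        tail = x[N:].sum()          # ||x - v_N||_1 = sum_{i > N} i^(-2)
        print(f"N = {N:5d}   ||x - v_N||_1 ≈ {tail:.6f}")

    # The tail tends to 0, so dist(x, V) = 0, yet ||x - v||_1 > 0 for every v in V:
    # the infimum is not attained and P_V(x) is empty.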

We list the following more or less obvious, but useful, properties of the map f ↦ dist(f, V). (A simple proof is left to the reader.)

Lemma 1.1

(i) dist(f + g, V) ≤ dist(f, V) + dist(g, V), f, g ∈ X,

(ii) dist(f + v, V) = dist(f, V), f ∈ X, v ∈ V,

(iii) dist(αf, V) = |α| dist(f, V), f ∈ X, α ∈ K,

(iv) |dist(f, V) − dist(g, V)| ≤ ‖f − g‖, f, g ∈ X.

We now give sufficient conditions for uniqueness of the optimal approximations.

Definition 1.1 A normed space X is called uniformly convex iff for any ε > 0 there is δ > 0 such that for all f, g ∈ X the following holds:

if ‖f‖ = 1 = ‖g‖ and ‖(f + g)/2‖ > 1 − δ then ‖f − g‖ < ε.

Definition 1.2 A normed space X is called strictly convex iff for any f, g ∈ X the following holds:

if ‖f‖ = 1 = ‖g‖ and ‖(f + g)/2‖ = 1 then f = g.

Then we have the following result.

Theorem 1.1

(i) If X is uniformly convex then X is also strictly convex.

(ii) If dim(X) < +∞ and X is strictly convex then X is also uniformly convex.

Proof. (i) Let ‖f‖ = 1 = ‖g‖ and ‖(f + g)/2‖ = 1. Then uniform convexity implies that ‖f − g‖ < ε for any ε > 0, which in turn means that ‖f − g‖ = 0 and f = g.

(ii) Suppose X is strictly convex and dim(X) < +∞. For a given ε > 0, define the set A := {(f, g) ∈ X × X : ‖f‖ = 1 = ‖g‖ and ‖f − g‖ ≥ ε}.

This set is closed and bounded in X × X, hence it is compact. Furthermore, by strict convexity of X, the function

h(f, g) = 1 − ‖(f + g)/2‖

is positive and continuous on A. Then uniform convexity holds with δ = inf_{(f,g)∈A} h(f, g) > 0. Indeed, if ‖f‖ = 1 = ‖g‖ and ‖f − g‖ ≥ ε then we have ‖(f + g)/2‖ ≤ 1 − δ. □

Theorem 1.2 If X is strictly convex then the optimal approximation is uniquely determined for any f ∈ X.

Proof. Suppose v′, v″ ∈ V are different and both optimal for f ∈ X. Let d = dist(f, V). Then for v = (v′ + v″)/2 we have

‖f − v‖ = d ‖ (1/2) ( (f − v′)/d + (f − v″)/d ) ‖ < (1/2) ( ‖f − v′‖ + ‖f − v″‖ ) = dist(f, V),

where the inequality '<' follows from strict convexity of X. This contradicts the optimality of v′ and v″. □

Examples of uniformly convex, and consequently also strictly convex, spaces are provided by unitary spaces. Recall that X is a unitary space iff its norm is generated by an inner product ⟨·, ·⟩ : X × X → K (where K ∈ {R, C}), i.e.,

‖f‖ = √⟨f, f⟩.

Theorem 1.3 Any unitary space X is uniformly convex.

Proof. For a given ε > 0 we let ε₁ = min(2, ε) and

δ = 1 − √(1 − ε₁²/4) > 0.

Suppose that f, g ∈ X with ‖f‖ = 1 = ‖g‖ and ‖(f + g)/2‖ > 1 − δ. By the parallelogram identity ‖f + g‖² + ‖f − g‖² = 2(‖f‖² + ‖g‖²), we have

‖f − g‖² < 4(1 − (1 − δ)²) = ε₁² ≤ ε²,

as claimed. □

In a unitary space X, the optimal approximation v of f ∈ X with respect to V is just the orthogonal projection of f onto V; that is,

⟨f − v, w⟩ = 0 for all w ∈ V.

Having a basis of V, the optimal element can be expressed as follows. Let n = dim(V) and V = span(v_1, v_2, . . . , v_n). The orthogonality condition is equivalent to ⟨v, v_i⟩ = ⟨f, v_i⟩, 1 ≤ i ≤ n. Writing v = ∑_{j=1}^n a_j v_j, we then have that the unknown a_j's satisfy the following n × n system of linear equations:

∑_{j=1}^n a_j ⟨v_j, v_i⟩ = ⟨f, v_i⟩,   1 ≤ i ≤ n.

In particular, if the basis is orthonormal, i.e., ⟨v_j, v_i⟩ = 0 for j ≠ i and ‖v_i‖ = 1 for all i, then

v = ∑_{j=1}^n ⟨f, v_j⟩ v_j

and

dist(f, V)² = ‖f − v‖² = ‖f‖² − ∑_{j=1}^n |⟨f, v_j⟩|² = ∑_{j=n+1}^{+∞} |⟨f, v_j⟩|²,

where the last equality holds when {v_j}_{j≥1} extends the basis of V to a complete orthonormal system of X.
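The above Gram system is easy to solve numerically. Below is a small sketch (the function f(t) = eᵗ, the basis 1, t, t², and the crude Riemann-sum quadrature are all illustrative choices) computing a best L2(0, 1) approximation from a three-dimensional polynomial subspace:

    import numpy as np

    t = np.linspace(0.0, 1.0, 20001)
    dt = t[1] - t[0]

    def inner(u, w):
        # crude Riemann-sum approximation of the L2(0,1) inner product
        return np.sum(u * w) * dt

    f = np.exp(t)
    basis = [np.ones_like(t), t, t**2]          # V = span(1, t, t^2)

    # Gram system: sum_j a_j <v_j, v_i> = <f, v_i>
    G = np.array([[inner(vj, vi) for vj in basis] for vi in basis])
    b = np.array([inner(f, vi) for vi in basis])
    a = np.linalg.solve(G, b)

    v = sum(aj * vj for aj, vj in zip(a, basis))
    print("coefficients:", np.round(a, 4))
    print("dist(f, V) ≈", np.sqrt(inner(f - v, f - v)))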

The most popular unitary space is L2(a, b), where −∞ ≤ a < b ≤ +∞. It consists of (Lebesgue) measurable and square integrable (real or complex) functions defined on the interval (a, b), where the inner product is defined as

⟨f, g⟩ = ∫_a^b f(t) g(t) dt,

and the corresponding norm is

‖f‖_{L2} = ⟨f, f⟩^{1/2} = ( ∫_a^b |f(t)|² dt )^{1/2}.

It can be easily checked that the trigonometric polynomials

1/√(2π),  (1/√π) cos t,  (1/√π) sin t,  (1/√π) cos 2t,  (1/√π) sin 2t,  . . .

form an orthonormal system in L2(0, 2π). (Actually, the subspace spanned by all trigonometric polynomials is dense in L2(0, 2π).)
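As a quick numerical check of the orthonormal expansion (a sketch; the sample function and the grid-based inner products are only illustrative and approximate):

    import numpy as np

    t = np.linspace(0.0, 2.0 * np.pi, 40001)
    dt = t[1] - t[0]
    f = t * (2.0 * np.pi - t)              # a sample continuous function on (0, 2*pi)

    def inner(u, w):
        return np.sum(u * w) * dt          # Riemann-sum approximation of the L2(0, 2*pi) inner product

    def onb(n):
        # the orthonormal system 1/sqrt(2*pi), cos(kt)/sqrt(pi), sin(kt)/sqrt(pi), k = 1, ..., n
        fns = [np.full_like(t, 1.0 / np.sqrt(2.0 * np.pi))]
        for k in range(1, n + 1):
            fns.append(np.cos(k * t) / np.sqrt(np.pi))
            fns.append(np.sin(k * t) / np.sqrt(np.pi))
        return fns

    for n in [1, 2, 4, 8]:
        v = sum(inner(f, e) * e for e in onb(n))    # orthogonal projection onto V_{2n+1}
        print(f"n = {n}:  dist(f, V) ≈ {np.sqrt(inner(f - v, f - v)):.5f}")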

A generalization of L2(a, b) is provided by the Lp(a, b) spaces, where 1 ≤ p < +∞. They consist of all (Lebesgue) measurable functions on (a, b) such that |f|^p is integrable. The norm is defined as

‖f‖_{Lp} = ( ∫_a^b |f(t)|^p dt )^{1/p}.

It is known that the spaces Lp(a, b) are separable Banach spaces.

We also define the space L∞(a, b) of all measurable functions for which

‖f‖_{L∞} = ess sup_{a≤t≤b} |f(t)|

(which defines the norm) is finite.

Recall the important Hölder inequality: for any measurable functions f and g and any 1 ≤ p, q ≤ +∞ such that

1/p + 1/q = 1

(where q = +∞ if p = 1, and vice versa) we have

‖fg‖_{L1} ≤ ‖f‖_{Lp} ‖g‖_{Lq}.   (1.1)

For 1 < p, q < +∞, Hölder's inequality can be written as

∫_a^b |f(t) g(t)| dt ≤ ( ∫_a^b |f(t)|^p dt )^{1/p} ( ∫_a^b |g(t)|^q dt )^{1/q}.

Moreover, we have equality above if and only if the functions |f|^p and |g|^q are linearly dependent, meaning in particular that if f is not the zero function then there is c such that (note p/q = p − 1)

|g(t)| = c |f(t)|^{p−1} for a.e. t.   (1.2)

The spaces Lp(a, b) are uniformly convex for 1 < p < +∞, but the proof of this fact is far from trivial. We show only strict convexity which, by Theorem 1.2, is sufficient for uniqueness of the best approximation.

Theorem 1.4 The space Lp(a, b) is strictly convex for 1 < p < +∞.

Proof. The triangle inequality (which of course is one of the necessary conditions for ‖·‖_{Lp} to be a norm) says that

‖f + g‖_{Lp} ≤ ‖f‖_{Lp} + ‖g‖_{Lp}.   (1.3)

Recall the proof. We have

‖f + g‖_{Lp}^p = ∫_a^b |(f + g)(t)| |(f + g)(t)|^{p−1} dt
≤ ∫_a^b |f(t)| |(f + g)(t)|^{p−1} dt + ∫_a^b |g(t)| |(f + g)(t)|^{p−1} dt
≤ ‖f‖_{Lp} ‖(f + g)^{p−1}‖_{Lq} + ‖g‖_{Lp} ‖(f + g)^{p−1}‖_{Lq},

where the second inequality follows from Hölder's inequality applied to |f| and |f + g|^{p−1}, and to |g| and |f + g|^{p−1}. Dividing both sides by ‖(f + g)^{p−1}‖_{Lq} and using (p − 1)q = p we obtain (1.3).

From the proof above it follows (cf. (1.2)) that we have equality in (1.3) only if |g| = c₁|f + g|^{p−1} and |f| = c₂|f + g|^{p−1} a.e. For ‖f‖_{Lp} = ‖g‖_{Lp} = ‖(f + g)/2‖_{Lp} = 1, this can happen only when g = f a.e. □

The spaces L1(a, b) and L∞(a, b) are not strictly convex, and consequently not uniformly convex. To see this, we provide simple examples showing that in these spaces the approximation problem does not have a unique solution.

Example 1.2 Consider the approximation of f ≡ 1 with respect to V = {t ↦ at : a ∈ R}. Then, for X = L∞(0, 1) we have dist(f, V) = 1 and any t ↦ at with a ∈ [0, 1] is optimal, while for X = L1(−1, 1) we have dist(f, V) = 2 and any t ↦ at with a ∈ [−1, 1] is optimal.
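A crude numerical illustration of the L1 case (a sketch; the integral is approximated by a Riemann sum on a grid):

    import numpy as np

    t = np.linspace(-1.0, 1.0, 20001)
    dt = t[1] - t[0]

    def l1_error(a):
        # approximate || 1 - a*t ||_{L1(-1,1)} by a Riemann sum
        return np.sum(np.abs(1.0 - a * t)) * dt

    for a in [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]:
        print(f"a = {a:+.1f}   error ≈ {l1_error(a):.4f}")

    # every a in [-1, 1] gives (up to discretization) the same minimal error 2,
    # while a outside [-1, 1] gives a strictly larger error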

Finding optimal approximations with respect to the Lp norm with p ≠ 2 is in general a difficult problem. However, it is possible to give the following characterization of the optimal elements.

Theorem 1.5 Let V be a finite dimensional subspace of Lp(a, b), where 1 < p < +∞. An element v ∈ V is optimal for f ∈ Lp(a, b) with respect to V if and only if for every w ∈ V it holds that

∫_a^b w(t) |f(t) − v(t)|^{p−1} sgn(f(t) − v(t)) dt = 0.   (1.4)

For L1(a, b), the corresponding 'if and only if' condition reads

∫_a^b w(t) sgn(f(t) − v(t)) dt = 0 for every w ∈ V   (1.5)

(with the convention that sgn 0 = 0).

Proof. We first show necessity of the condition (1.4). We can assume that f ∉ V. We claim that there is a linear functional ℓ such that ℓ(f − v) = ‖f − v‖_{Lp}, ‖ℓ‖ = 1, and ℓ(w) = 0 for all w ∈ V.

Indeed, we first define the functional ℓ₁ on the space spanned by V and the function f − v as

ℓ₁(α(f − v) + w) = α ‖f − v‖_{Lp},   α ∈ K, w ∈ V.

Since the optimal approximation of f − v in V equals zero, we have dist(g, V) = |α| ‖f − v‖_{Lp} for any g = α(f − v) + w. Hence

|ℓ₁(g)| = |α| ‖f − v‖_{Lp} ≤ ‖α(f − v) + w‖_{Lp} = ‖g‖_{Lp},

which means that ℓ₁ has norm one. Next, by the Hahn-Banach theorem, cf. Chapter 16, this functional can be extended to a functional ℓ defined on Lp(a, b) preserving the norm.

It is known that the functional ℓ (as any other bounded functional on Lp(a, b) with 1 ≤ p < +∞) has the representation

ℓ(g) = ∫_a^b g(t) h(t) dt

for some h ∈ Lq(a, b) such that ‖h‖_{Lq} = ‖ℓ‖ = 1. By Hölder's inequality,

ℓ(f − v) = ∫_a^b (f − v)(t) h(t) dt ≤ ∫_a^b |(f − v)(t)| |h(t)| dt ≤ ‖f − v‖_{Lp} ‖h‖_{Lq} = ‖f − v‖_{Lp} = ℓ(f − v).

This means that we have equalities above, and so (cf. (1.2)) for some c > 0

|h(t)| = c |f(t) − v(t)|^{p−1} for a.e. t.   (1.6)

It also follows that (f − v)(t) h(t) = |(f − v)(t)| |h(t)| a.e. on (a, b), which together with (1.6) means that

sgn h(t) = sgn(f(t) − v(t)) for all t such that f(t) ≠ v(t).

Hence

h(t) = |h(t)| sgn h(t) = c |(f − v)(t)|^{p−1} sgn(f − v)(t),

which together with the fact that ℓ(w) = ∫_a^b w(t) h(t) dt = 0 for all w ∈ V completes the proof of the necessity of (1.4).

We now prove the sufficiency. Let v ∈ V satisfy (1.4). Then for any w ∈ V we have

‖f − v‖_{Lp}^p = ∫_a^b (f − v)(t) |(f − v)(t)|^{p−1} sgn(f − v)(t) dt
= ∫_a^b (f − w + w − v)(t) |(f − v)(t)|^{p−1} sgn(f − v)(t) dt
= ∫_a^b (f − w)(t) |(f − v)(t)|^{p−1} sgn(f − v)(t) dt
≤ ‖f − w‖_{Lp} ‖f − v‖_{Lp}^{p−1},

which implies ‖f − v‖_{Lp} ≤ ‖f − w‖_{Lp}.

The proof for p = 1 follows the same lines as for p > 1 and is therefore omitted. □

We add that for p = 2 the condition (1.4) means that f − v is orthogonal to the subspace V.
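A small numerical check of the characterization (1.4) for a one-dimensional subspace (a sketch; the function f, the subspace V = span{t}, the exponent p = 4, and the brute-force search are all illustrative assumptions):

    import numpy as np

    p = 4.0
    t = np.linspace(0.0, 1.0, 20001)
    dt = t[1] - t[0]
    f = np.exp(t)                      # function to approximate in L_p(0, 1)
    w = t                              # V = span{w}

    def lp_error(a):
        return (np.sum(np.abs(f - a * w) ** p) * dt) ** (1.0 / p)

    # brute-force search for the (approximately) optimal coefficient
    grid = np.linspace(0.0, 4.0, 4001)
    a_star = grid[np.argmin([lp_error(a) for a in grid])]

    r = f - a_star * w
    lhs = np.sum(w * np.abs(r) ** (p - 1.0) * np.sign(r)) * dt   # left-hand side of (1.4)
    print(f"a* ≈ {a_star:.4f},  condition (1.4) ≈ {lhs:.2e}  (close to 0)")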


Chapter 2

Uniform approximation

In this chapter we deal with approximation in C(D), where D ⊂ R^d is a compact set. (Recall that a set D ⊂ R^d is compact if and only if D is bounded and closed.) Specifically, C(D) is the space of all continuous and real-valued functions f : D → R with the (uniform) norm

‖f‖ := max_{t∈D} |f(t)|.

It is well known that C(D) is a Banach space, cf. Chapter 14.

Our aim is to give a characterization of optimal approximations. We first note that C(D) is not a strictly convex space and therefore optimal approximations are not always unique. (One can produce examples similar to those for L∞.)

Lemma 2.1 A function v ∈ V is optimal for f ∈ C(D) if and only if 0 is optimal for f − v.

Proof. If 0 is not optimal for f − v then there is v₁ ∈ V such that

‖f − v − v₁‖ < ‖f − v‖,

but this means that v + v₁ is a better approximation for f than v. The proof of the reverse implication is similar. □

Now we characterize functions for which 0 is optimal; that is, functions f such that

dist(f, V) = ‖f‖.   (2.1)

Define the critical set as

Crit(f) = {x ∈ D : |f(x)| = ‖f‖}.

Theorem 2.1 The equality (2.1) holds if and only if there is no element v ∈ V such that f(x)v(x) > 0 for all x ∈ Crit(f).

Proof. If there is v ∈ V such that

‖f − v‖ < ‖f‖,   (2.2)

then f(x) and v(x) are of equal signs for any x ∈ Crit(f). Indeed, if v(x) ≤ 0 < f(x) (the other cases are similar) then f(x) − v(x) ≥ f(x) = ‖f‖, which contradicts (2.2).


Suppose now that there is v ∈ V that takes the same signs as f on Crit(f). We can assume without loss of generality that ‖v‖ < ‖f‖. Let

A = {x ∈ D : f(x)v(x) ≤ 0}.

If A is an empty set then we clearly have ‖f − v‖ < ‖f‖. Otherwise A is compact and has empty intersection with Crit(f), so that

m := max_{x∈A} |f(x)| < ‖f‖.

Define

v₁ := (1 − m/‖f‖) v.

Then |f(x) − v₁(x)| < ‖f‖ for all x ∈ D and consequently ‖f − v₁‖ < ‖f‖. Indeed, this is clear for x ∉ A, and for x ∈ A we have

|f(x) − v₁(x)| ≤ |f(x)| + |v₁(x)| < m + (1 − m/‖f‖) ‖f‖ = ‖f‖. □



Corollary 2.1 An element v ∈ V is optimal for f ∈ C(D) if and only if there is no element w ∈ V such that

w(x)(f(x) − v(x)) > 0 for all x ∈ Crit(f − v).

To proceed further, we need two facts from convex analysis. Recall that the convex hull of a set S ⊂ Rⁿ is the set of all convex linear combinations of points in S, i.e.,

conv(S) = { ∑_{i=1}^k α_i s_i : k ∈ N, s_i ∈ S, α_i > 0, ∑_{i=1}^k α_i = 1 }.

Lemma 2.2 Let S ⊂ Rⁿ be a compact set. The vector 0⃗ does not belong to the convex hull of S if and only if there is z⃗ ∈ Rⁿ such that the inner product ⟨z⃗, u⃗⟩₂ > 0 for all u⃗ ∈ S.

Proof. Suppose first that 0⃗ ∉ conv(S). Let z⃗ be the element of conv(S) with minimal norm, i.e.,

‖z⃗‖₂ = min{‖w⃗‖₂ : w⃗ ∈ conv(S)}.

(Such an element exists by compactness of conv(S), and it is unique by convexity of conv(S) and strict convexity of the space Rⁿ.) For any u⃗ ∈ S and 0 ≤ α ≤ 1 we have αu⃗ + (1 − α)z⃗ ∈ conv(S) and

0 ≤ ‖αu⃗ + (1 − α)z⃗‖₂² − ‖z⃗‖₂² = α ( α‖u⃗ − z⃗‖₂² + 2⟨u⃗ − z⃗, z⃗⟩₂ ).

This may hold for all such α only if ⟨u⃗ − z⃗, z⃗⟩₂ ≥ 0, or equivalently

⟨u⃗, z⃗⟩₂ ≥ ‖z⃗‖₂² > 0.

Thus z⃗ is the desired element.

Suppose now that 0⃗ ∈ conv(S). Then 0⃗ = ∑_{i=0}^m λ_i s⃗_i for some s⃗_i ∈ S and λ_i > 0 with ∑_{i=0}^m λ_i = 1. Hence for any z⃗ ∈ Rⁿ we have ∑_{i=0}^m λ_i ⟨s⃗_i, z⃗⟩₂ = 0, which means that ⟨s⃗_i, z⃗⟩₂ ≤ 0 for at least one i, as claimed. □

Lemma 2.3 Every point of the convex hull of a set S ⊂ Rⁿ is a convex combination of at most n + 1 points of S.

Proof. Let x⃗ ∈ conv(S). Then x⃗ = ∑_{i=0}^m α_i s⃗_i for some s⃗_i ∈ S and α_i > 0 with ∑_{i=0}^m α_i = 1, where we assume that m is smallest possible. Suppose that m ≥ n + 1. Define y⃗_i = s⃗_i − x⃗ for 0 ≤ i ≤ m. Then ∑_{i=0}^m α_i y⃗_i = 0⃗. Since m > n, the elements y⃗_i for 1 ≤ i ≤ m are linearly dependent; hence ∑_{i=1}^m β_i y⃗_i = 0⃗ for some β_i's not all zero, where (replacing β by −β if necessary) at least one β_i is negative. Then for all real λ we have

∑_{i=0}^m (λα_i + β_i) y⃗_i = 0⃗,

where we additionally set β₀ = 0. Now, we set

λ := max_{0≤j≤m} (−β_j/α_j) > 0.

Then λα₀ + β₀ > 0, λα_i + β_i = 0 for at least one i, and these coefficients are nonnegative for all the remaining indices i. Using again the substitution y⃗_i = s⃗_i − x⃗ we have

( ∑_{i=0}^m (λα_i + β_i) ) x⃗ = ∑_{i=0}^m (λα_i + β_i) s⃗_i,

and dividing both sides by ∑_{i=0}^m (λα_i + β_i) we finally obtain that x⃗ can be represented as a convex combination of fewer than m + 1 points of S, which contradicts the minimality of m. □

We also introduce some notation. Let dim(V) = n and let (v_1, v_2, . . . , v_n) be a fixed basis of V. Then

v⃗(x) := (v_1(x), v_2(x), . . . , v_n(x)).

For x ∈ D, we denote by x̂ the linear functional on C(D) given by

x̂(g) = g(x),   g ∈ C(D).

Theorem 2.2 The following conditions are equivalent:

(i) ‖f‖ = dist(f, V).

(ii) No element of V has the same signs as f in the set Crit(f).

(iii) 0⃗ belongs to the convex hull of the set {f(x)v⃗(x) : x ∈ Crit(f)}.

(iv) There exists a functional of the form L = ∑_{i=1}^k λ_i x̂_i with k ≤ n + 1, such that x_i ∈ Crit(f) and λ_i f(x_i) > 0 for all 1 ≤ i ≤ k, and V ⊂ ker(L).

Proof. Equivalence of (i) and (ii) is proven in Theorem 2.1. To show (ii) ⇒ (iii), observe that (ii) implies that there are no numbers c_i such that

f(x) ∑_{i=1}^n c_i v_i(x) > 0 for all x ∈ Crit(f).

This condition can be written as ⟨c⃗, f(x)v⃗(x)⟩₂ > 0 (with c⃗ = (c_1, . . . , c_n)) and means, by Lemma 2.2, that 0⃗ is in the convex hull of the set {f(x)v⃗(x) : x ∈ Crit(f)}.

We show (iii) ⇒ (iv). From (iii) and Lemma 2.3 it follows that 0⃗ ∈ Rⁿ can be written as a convex combination of k ≤ n + 1 points from {f(x)v⃗(x) : x ∈ Crit(f)}, i.e.,

0⃗ = ∑_{i=1}^k α_i f(x_i) v⃗(x_i).

Then L(g) = ∑_{i=1}^k λ_i g(x_i), where λ_i = α_i f(x_i). Indeed, we clearly have λ_j f(x_j) > 0 and

0 = ∑_{i=1}^k λ_i v_j(x_i) = ( ∑_{i=1}^k λ_i x̂_i )(v_j) = L(v_j),   1 ≤ j ≤ n,

which means that L vanishes on V.

And finally, to show (iv) ⇒ (i), we check that (iv) implies that for any v ∈ V

‖f‖ ∑_{i=1}^k |λ_i| = ∑_{i=1}^k λ_i f(x_i) = ∑_{i=1}^k λ_i (f(x_i) − v(x_i)) ≤ ‖f − v‖ ∑_{i=1}^k |λ_i|,

i.e., ‖f‖ ≤ ‖f − v‖. The proof is complete. □

Now we deal with uniqueness of optimal approximations.

Definition 2.1 An n-dimensional linear space V ⊂ C(D) is a Haar space iff any nonzero function f ∈ V vanishes at at most n − 1 points of D. Any basis of a Haar space is called a Haar system.

Let us note that an equivalent condition for V to be a Haar space is that the interpolation problem:

for n distinct points x_i ∈ D and arbitrary numbers y_i, find v ∈ V such that

v(x_i) = y_i,   1 ≤ i ≤ n,   (2.3)

has a unique solution. Indeed, writing v = ∑_{j=1}^n a_j v_j, where {v_j}_{j=1}^n is a basis of V, we have that (2.3) is equivalent to the n × n system of linear equations

∑_{j=1}^n a_j v_j(x_i) = y_i,   1 ≤ i ≤ n.

It has a unique solution if and only if the zero function is the unique solution of the homogeneous system; the latter is ensured by the Haar condition.
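For the monomial basis, which forms a Haar system on any interval (see the first example below), this is the familiar Vandermonde system; a minimal sketch (the nodes and data are arbitrary choices):

    import numpy as np

    # Interpolate data at n distinct points from span(1, x, ..., x^{n-1}),
    # i.e. solve sum_j a_j v_j(x_i) = y_i with v_j(x) = x^(j-1).
    x = np.array([0.0, 0.3, 0.7, 1.0])        # distinct nodes
    y = np.exp(x)                              # data to interpolate
    n = len(x)

    V = np.vander(x, n, increasing=True)       # matrix {v_j(x_i)} = {x_i^(j-1)}
    a = np.linalg.solve(V, y)                  # unique solution by the Haar property

    residual = np.polynomial.polynomial.polyval(x, a) - y
    print("max interpolation residual:", np.max(np.abs(residual)))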

We now give some examples.

Probably the most natural Haar space is provided by the algebraic polynomials of degree at most n,

P_{n+1} := span(1, x, x², . . . , xⁿ) ⊂ C([a, b])

for any −∞ < a < b < +∞. Obviously dim P_{n+1} = n + 1. Another important Haar space is formed by the trigonometric polynomials,

V_{2n+1} := span(1, cos t, sin t, . . . , cos nt, sin nt) ⊂ C([a, b])   (2.4)

for any a, b with 0 < b − a < 2π. Indeed, using e^{iφ} = cos φ + i sin φ (i = √−1), any nontrivial function t ↦ h(t) := ∑_{k=0}^n (a_k cos kt + b_k sin kt) ∈ V_{2n+1} can be written as

h(t) = ∑_{|k|≤n} c_k e^{ikt} = z^{−n} ∑_{k=0}^{2n} c_{k−n} z^k,   (2.5)

where z = e^{it} and c₀ = a₀, c_{±k} = (a_k ∓ i b_k)/2, 1 ≤ k ≤ n. If c_{±k} = 0 for all k then also a_k = b_k = 0. This yields dim V_{2n+1} = 2n + 1. Furthermore, h vanishes at no more than 2n points in [a, b], since the algebraic polynomial on the right-hand side of (2.5) has at most 2n different zeros and the function t ↦ e^{it} is one-to-one on any real interval of length smaller than 2π.

Remark 2.1 Some important subspaces of V_{2n+1} are also Haar spaces. These include

V̂_{n+1} := span(1, cos t, cos 2t, . . . , cos nt) ⊂ C([0, π]),

and

Ṽ_n := span(sin t, sin 2t, . . . , sin nt) ⊂ C([ε, π − ε]) for any ε ∈ (0, π/2).

Indeed, any nontrivial function t ↦ h(t) := ∑_{k=0}^n a_k cos kt ∈ V̂_{n+1} can be written as h(t) = ∑_{k=0}^n a_k T_k(x), where x = cos t, t ∈ [0, π], and T_k is the kth Chebyshev polynomial (of the first kind). Since the change of variable is one-to-one, h has at most n zeros in [0, π].

To see that Ṽ_n is a Haar space on [ε, π − ε] it suffices to notice that if an odd function t ↦ h(t) := ∑_{k=1}^n b_k sin kt has n zeros ε ≤ t₁ < . . . < t_n ≤ π − ε, then it has 2n + 1 zeros in [−π + ε, π − ε], namely 0 and ±t_j, 1 ≤ j ≤ n. This is impossible since, as shown above, V_{2n+1} is a Haar space on the interval [−π + ε, π − ε].

Theorem 2.3 Let V ⊂ C(D) be a Haar space of dimension n, and let f ∈ C(D) be such that ‖f‖ = dist(f, V). Then there exists γ > 0 (depending on f) such that for any v ∈ V

‖f − v‖ ≥ ‖f‖ + γ ‖v‖.

Proof. We first observe that if V is an n-dimensional Haar space then k in the definition of the functional L of Theorem 2.2(iv) equals n + 1; otherwise we could choose v ∈ V such that v(x_i) = λ_i, 1 ≤ i ≤ k, and then 0 = L(v) = ∑_{i=1}^k λ_i v(x_i) = ∑_{i=1}^k λ_i² > 0.

We write L as

L(g) = ∑_{i=0}^n θ_i σ_i g(x_i),   σ_i := sgn f(x_i),

so that all θ_i's are positive. Let w ∈ V with ‖w‖ = 1. Since L(w) = 0 and, by the Haar condition, w does not vanish at all the points x_i, we have max_{0≤i≤n} σ_i w(x_i) > 0. Define

γ := inf_{‖w‖=1} max_{0≤i≤n} σ_i w(x_i) > 0

(the infimum is attained and positive by compactness of the unit sphere of V). Now let v ∈ V, v ≠ 0. Applying the above to w = −v/‖v‖, we find an index i with σ_i v(x_i) ≤ −γ ‖v‖, and then

‖f − v‖ ≥ σ_i (f(x_i) − v(x_i)) = ‖f‖ − σ_i v(x_i) ≥ ‖f‖ + γ ‖v‖,

as claimed. □

Corollary 2.2 If V ⊂ C(D) is a Haar space then any f ∈ C(D) has a unique optimal approximation with respect to V.

Proof. If v′ and v″ are both optimal for f then, applying Theorem 2.3 to f − v′ (for which 0 is an optimal approximation), we have

dist(f, V) = ‖f − v″‖ = ‖(f − v′) + (v′ − v″)‖ ≥ ‖f − v′‖ + γ ‖v′ − v″‖ = dist(f, V) + γ ‖v′ − v″‖,

which forces ‖v′ − v″‖ = 0 and v′ = v″. □

Another consequence of Theorem 2.3 is continuity of the best approximation.

Theorem 2.4 Let V ⊂ C(D) be a Haar space. Let A : C(D) → V be the mapping that associates with any element of C(D) its optimal approximation with respect to V. Then for any f ∈ C(D) there is κ such that for any other g ∈ C(D) we have

‖A(f) − A(g)‖ ≤ κ ‖f − g‖.

Proof. By Theorem 2.3, there is γ > 0 such that for any w ∈ V

‖(f − A(f)) − w‖ ≥ ‖f − A(f)‖ + γ ‖w‖,

which is equivalent to ‖f − v‖ ≥ ‖f − A(f)‖ + γ ‖A(f) − v‖ for all v ∈ V. Hence, taking v = A(g), we obtain

γ ‖A(f) − A(g)‖ ≤ ‖f − A(g)‖ − ‖f − A(f)‖
≤ ‖f − g‖ + ‖g − A(g)‖ − ‖f − A(f)‖
≤ ‖f − g‖ + ‖g − A(f)‖ − ‖f − A(f)‖
≤ ‖f − g‖ + ‖g − f‖ + ‖f − A(f)‖ − ‖f − A(f)‖
= 2 ‖f − g‖,

and the theorem holds with κ = 2/γ. □

Finally, we arrive at the (Chebyshev) alternation theorem for the domain D = [a, b].

Theorem 2.5 Let V be an n-dimensional Haar subspace of C([a, b]). An element v is optimal for f ∈ C([a, b]) with respect to V if and only if there exist σ ∈ {−1, +1} and points a ≤ x₀ < x₁ < · · · < x_n ≤ b such that

f(x_i) − v(x_i) = σ(−1)^i ‖f − v‖,   0 ≤ i ≤ n.

Proof. Let v be optimal for f. Then, by Theorem 2.2(iv) (applied to f − v), there are points x₀ < x₁ < · · · < x_n and numbers λ_i such that |f(x_i) − v(x_i)| = ‖f − v‖ and λ_i(f(x_i) − v(x_i)) > 0, and ∑_{i=0}^n λ_i w(x_i) = 0 for all w ∈ V. We show that the x_i are the alternation points. To that end, it suffices to show that λ_{j−1}λ_j < 0 for 1 ≤ j ≤ n. Indeed, for 1 ≤ j ≤ n, we choose w_j ∈ V interpolating the data w_j(x_j) = 1 and w_j(x_i) = 0, i ≠ j − 1, j. Then

0 = ∑_{i=0}^n λ_i w_j(x_i) = λ_{j−1} w_j(x_{j−1}) + λ_j.

This implies w_j(x_{j−1}) > 0, since otherwise w_j would have an nth zero in the interval [x_{j−1}, x_j), and consequently λ_{j−1}λ_j = −λ_{j−1}² w_j(x_{j−1}) < 0.

Suppose now that the n + 1 alternation points x_i exist. If there were a w ∈ V such that ‖f − w‖ < ‖f − v‖, then the function (f − v) − (f − w) = w − v ∈ V would assume alternately positive and negative values at the successive x_i. Hence the function w − v would have at least n different zeros, and since V is a Haar space, w − v = 0 and w = v, contradicting ‖f − w‖ < ‖f − v‖. Thus v is optimal. □

Remark 2.2 If V = P_{n+1} is the space of algebraic polynomials of degree at most n, then existence of the alternation points can be shown straightforwardly.

Indeed, let f ∈ C([a, b]) and let v be optimal for f in P_{n+1}. Define points {x_i} as follows. The first point x₁ is the smallest one in [a, b] such that |(f − v)(x₁)| = ‖f − v‖. We can assume without loss of generality that (f − v)(x₁) = −‖f − v‖. Then x₂ is the smallest point in [x₁, b] such that (f − v)(x₂) = ‖f − v‖, and generally, x_i is the smallest point in [x_{i−1}, b] such that (f − v)(x_i) = (−1)^i ‖f − v‖.

If we can choose in this way at least n + 2 points then {x_i}_{i=1}^{n+2} are the alternation points. Suppose that we can choose only k ≤ n + 1 points. Then we define z_i, for 2 ≤ i ≤ k, as the largest zero of f − v in the interval [x_{i−1}, x_i]. We obviously have z_i < x_i. Observe that the polynomial

x ↦ w(x) = (−1)^k (x − z₂)(x − z₃) · · · (x − z_k)

is in P_{n+1} and has the property that w(x)(f − v)(x) > 0 for all x such that |(f − v)(x)| = ‖f − v‖.

Now it suffices to use Corollary 2.1.
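The alternation pattern is easy to observe numerically. The sketch below (the test function f(x) = eˣ on [0, 1], the coefficient grids, and the brute-force minimax search are all illustrative assumptions, not the Remez algorithm) finds an approximately best line and prints the error at the three points where it should equioscillate:

    import numpy as np
    from itertools import product

    x = np.linspace(0.0, 1.0, 2001)
    f = np.exp(x)

    def sup_error(a0, a1):
        return np.max(np.abs(f - (a0 + a1 * x)))

    # crude brute-force minimax search over a grid of line coefficients
    err, a0, a1 = min(
        ((sup_error(a0, a1), a0, a1)
         for a0, a1 in product(np.linspace(0.5, 1.5, 201), np.linspace(1.0, 2.5, 301))),
        key=lambda triple: triple[0],
    )
    r = f - (a0 + a1 * x)
    i_mid = np.argmin(r)                     # interior extremum of the error
    print(f"best line ≈ {a0:.3f} + {a1:.3f} x,  sup error ≈ {err:.4f}")
    print("error at x=0, interior extremum, x=1:",
          round(r[0], 4), round(r[i_mid], 4), round(r[-1], 4))
    # the three values should have (nearly) equal magnitude and alternating signs,
    # i.e. n + 2 = 3 alternation points for n = 1, as Theorem 2.5 predicts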

Remark 2.3 Consider the space V_{2n+1} of trigonometric polynomials defined in (2.4). This is a subspace of C([0, 2π]), but not a Haar space, for any n ≥ 1, since the function x ↦ sin nx has 2n + 1 different zeros πk/n, 0 ≤ k ≤ 2n. As a consequence, there are functions f ∈ C([0, 2π]) for which the optimal approximation with respect to V_{2n+1} is not unique. (A simple example is n = 1, where t ↦ v(t) = −a sin t is optimal for t ↦ f(t) = t/π − 1, for all 0 ≤ a ≤ 1.) However, if we assume, in addition to f ∈ C([0, 2π]), that f(0) = f(2π), then all the results of this chapter that follow from V being a Haar space remain valid.

To see this, observe that for such f the points x_i defining the functional L in Theorem 2.2(iv) can be chosen such that

−π ≤ x₁ < x₂ < · · · < x_k < π.

(Indeed, the condition f(−π) = f(π) implies that the point π can be identified with −π.) Moreover, since V_{2n+1} is already a Haar subspace of C([−π, x_k]), proceeding as in the proof of Theorem 2.3 we have k = 2n + 2, and this theorem follows. Consequently, we also have uniqueness of the optimal approximation, and {x_i}_{i=1}^{2n+2} are the alternation points.

Remark 2.4 If V ⊂ C(D) is not a Haar space then one can construct a function f that possesses more than one optimal element with respect to V. The construction goes as follows.

We first choose a basis (v_1, . . . , v_n) of V and points x_1, . . . , x_n ∈ D such that the matrix {v_i(x_j)}_{i,j=1}^n is singular. Let a⃗ = (a_1, . . . , a_n) and b⃗ = (b_1, . . . , b_n) be nonzero vectors that are, respectively, orthogonal to the columns and rows of this matrix, i.e.,

∑_{i=1}^n a_i v_i(x_j) = 0, 1 ≤ j ≤ n,   and   ∑_{j=1}^n b_j v_i(x_j) = 0, 1 ≤ i ≤ n.

(Then obviously ∑_{j=1}^n b_j v(x_j) = 0 for all v ∈ V.) Let p = ∑_{i=1}^n a_i v_i. We can assume that ‖p‖ < 1. Let g ∈ C(D) be such that ‖g‖ = 1 and g(x_j) = sgn b_j for 1 ≤ j ≤ n, and let

f(x) = g(x)(1 − |p(x)|).

Then f(x_j) = g(x_j) = sgn b_j. We also have ‖f − v‖ ≥ 1 for all v ∈ V, since otherwise sgn v(x_j) = sgn f(x_j) = sgn b_j for all j with b_j ≠ 0, which contradicts ∑_{j=1}^n b_j v(x_j) = 0. We show that λp is optimal for f for all 0 ≤ λ ≤ 1. Indeed, for any x ∈ D we have

|f(x) − λp(x)| ≤ |f(x)| + λ|p(x)| = |g(x)|(1 − |p(x)|) + λ|p(x)| ≤ 1 − |p(x)| + λ|p(x)| ≤ 1.

Since dist(f, V) ≥ 1, this shows that λp is optimal for f for every λ ∈ [0, 1].


Chapter 3

The Weierstrass theorem

This chapter is devoted to the well-known Weierstrass theorem which establishes density of algebraic polynomials in the space C([a, b]). Among several proofs of this fact we choose the one that uses properties of positive operators.

For f, g ∈ C([a, b]) we write f ≥ g (or f ≤ g) iff f(x) ≥ g(x) (or f(x) ≤ g(x)) for all x ∈ [a, b]. By |f| we mean the function x ↦ |f(x)|, x ∈ [a, b].

Definition 3.1 A linear operator L : C([a, b]) → C([a, b]) is positive iff for all f ∈ C([a, b]) the condition f ≥ 0 implies that Lf ≥ 0.

Sometimes the term monotone operator is used instead of positive operator, since Definition 3.1 is obviously equivalent to the following: for any f, g ∈ C([a, b]), if f ≤ g then Lf ≤ Lg.

For positive operators we have in particular that |Lf | ≤ L(|f |).

Theorem 3.1 Let the functions h_i be defined as h_i(x) = x^i. Let {L_n}_{n≥1} be a sequence of positive linear operators, L_n : C([a, b]) → C([a, b]). If

lim_{n→∞} ‖h_i − L_n h_i‖ = 0 for i = 0, 1, 2,

then

lim_{n→∞} ‖f − L_n f‖ = 0 for all f ∈ C([a, b]).

Proof. Let f ∈ C([a, b]) and let ε > 0. Since continuity of f implies its uniform continuity, there is δ > 0 such that |f(x) − f(y)| < ε if |x − y| < δ. On the other hand, if |x − y| ≥ δ then

|f(x) − f(y)| ≤ 2‖f‖ ≤ 2‖f‖(x − y)²/δ².

Hence, for c := 2‖f‖/δ² we have |f(x) − f(y)| ≤ ε + c(x − y)², which can be written in terms of the h_i as

|f − f(y)h₀| ≤ εh₀ + c(h₂ − 2yh₁ + y²h₀),   (3.1)

where we treat both sides of (3.1) as functions of x. Applying the positive operator L_n we get

|L_n f − f(y) L_n h₀| ≤ ε L_n h₀ + c(L_n h₂ − 2y L_n h₁ + y² L_n h₀).


Then, denoting e_n^{(i)} := L_n h_i − h_i and taking x = y, we obtain

|(L_n f)(y) − f(y)(L_n h₀)(y)|
≤ ε(L_n h₀)(y) + c( (L_n h₂)(y) − 2y(L_n h₁)(y) + y²(L_n h₀)(y) )
= ε(1 + e_n^{(0)}(y)) + c( (y² + e_n^{(2)}(y)) − 2y(y + e_n^{(1)}(y)) + y²(1 + e_n^{(0)}(y)) )
= ε + ε e_n^{(0)}(y) + c e_n^{(2)}(y) − 2cy e_n^{(1)}(y) + cy² e_n^{(0)}(y)
≤ ε + ε‖e_n^{(0)}‖ + c‖e_n^{(2)}‖ + 2c‖h₁‖‖e_n^{(1)}‖ + c‖h₂‖‖e_n^{(0)}‖,

which is smaller than 2ε for n sufficiently large, say n ≥ m, with m independent of y. The proof is completed by the observation that

‖L_n f − f‖ ≤ ‖L_n f − f L_n h₀‖ + ‖f L_n h₀ − f h₀‖ ≤ 2ε + ‖f‖‖e_n^{(0)}‖,

which is smaller than 3ε for sufficiently large n. □

Theorem 3.1 yields, in particular, the Weierstrass theorem.

Theorem 3.2 For any function f ∈ C([a, b]) and any ε > 0, there exists an algebraic polynomial p such that ‖f − p‖ < ε.

Proof. Without loss of generality we can restrict ourselves to the interval [a, b] = [0, 1]. For a given f ∈ C([0, 1]) we define the Bernstein polynomials as

(B_n f)(x) := ∑_{k=0}^n f(k/n) \binom{n}{k} x^k (1 − x)^{n−k}.

It is clear that for all n, B_n f is a polynomial of degree at most n, and the operator f ↦ B_n f is linear and positive. Hence it is enough to show convergence of B_n h_i to h_i for i = 0, 1, 2. For i = 0,

(B_n h₀)(x) = ∑_{k=0}^n \binom{n}{k} x^k (1 − x)^{n−k} = 1,

i.e., B_n h₀ = h₀. For i = 1,

(B_n h₁)(x) = ∑_{k=0}^n (k/n) \binom{n}{k} x^k (1 − x)^{n−k} = ∑_{k=1}^n \binom{n−1}{k−1} x^k (1 − x)^{n−k}
= x ∑_{k=0}^{n−1} \binom{n−1}{k} x^k (1 − x)^{n−1−k} = x,

i.e., B_n h₁ = h₁. And finally, for i = 2,

(B_n h₂)(x) = ∑_{k=0}^n (k/n)² \binom{n}{k} x^k (1 − x)^{n−k} = ∑_{k=1}^n (k/n) \binom{n−1}{k−1} x^k (1 − x)^{n−k}
= ∑_{k=1}^n ( (n−1)/n · (k−1)/(n−1) + 1/n ) \binom{n−1}{k−1} x^k (1 − x)^{n−k}
= (n−1)/n · x² ∑_{k=2}^n \binom{n−2}{k−2} x^{k−2} (1 − x)^{n−k} + x/n
= (n−1)/n · x² + x/n,

which yields |(B_n h₂)(x) − h₂(x)| = x(1 − x)/n ≤ 1/(4n) and convergence (in norm) of B_n h₂ to h₂. □
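A short numerical illustration of how the Bernstein operators converge (a sketch; the test function and the degrees are arbitrary choices):

    import numpy as np
    from math import comb

    def bernstein(f, n, x):
        # (B_n f)(x) = sum_{k=0}^n f(k/n) * binom(n, k) * x^k * (1 - x)^(n - k)
        x = np.asarray(x, dtype=float)
        return sum(f(k / n) * comb(n, k) * x**k * (1.0 - x)**(n - k) for k in range(n + 1))

    f = lambda t: np.abs(t - 0.3)          # a continuous (non-smooth) test function on [0, 1]
    x = np.linspace(0.0, 1.0, 1001)

    for n in [4, 16, 64, 256]:
        err = np.max(np.abs(f(x) - bernstein(f, n, x)))
        print(f"n = {n:4d}   ||f - B_n f|| ≈ {err:.4f}")

The convergence is slow, but it holds for every continuous f, which is exactly what the Weierstrass theorem needs.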



Corollary 3.1 Let f ∈ C([a, b]). For any n ≥ 1 there exist points

a ≤ x₀^{(n)} < x₁^{(n)} < · · · < x_n^{(n)} ≤ b

such that the algebraic polynomial p_n ∈ P_{n+1} interpolating f at the points x_i^{(n)}, 0 ≤ i ≤ n, is optimal. Moreover, lim_{n→∞} ‖f − p_n‖ = 0.

Proof. This is a direct consequence of the Weierstrass Theorem 3.2 and the Chebyshev alternation Theorem 2.5. Indeed, let v_n be the optimal polynomial for f with respect to P_{n+1}. By the alternation theorem, f − v_n vanishes at some point between any two consecutive alternation points. Since there are n + 2 alternation points, v_n interpolates f at n + 1 points. Moreover, optimality of v_n and the Weierstrass theorem yield lim_{n→∞} ‖f − v_n‖ = 0. Hence the corollary holds with p_n = v_n. □

The 'problem' with Corollary 3.1 is that the points x_i^{(n)} depend on the particular function f. A natural question is whether it is possible to choose the points independently of f so that the corresponding interpolating polynomials converge to f for every f ∈ C([a, b]). Unfortunately, the answer is negative, as a consequence of the more general considerations in the following Chapters 4 and 5.

Theorem 3.1 has its counterpart in C(R). This space consists of all functions f : R → R that are continuous and 2π-periodic, i.e.,

f(x) = f(x + 2π) for all x ∈ R,

and the norm is

‖f‖ := max_{x∈R} |f(x)|.

Theorem 3.3 Let h₀(x) = 1, h₁(x) = cos x, h₂(x) = sin x. Let {L_n}_{n≥1} be a sequence of positive linear operators, L_n : C(R) → C(R). If

lim_{n→∞} L_n(h_i) = h_i for i = 0, 1, 2,

then lim_{n→∞} L_n(f) = f for all f ∈ C(R).

Proof. We proceed similarly to the proof of Theorem 3.1. Let f ∈ C(R) and M := ‖f‖. Choose an arbitrary ε > 0. By uniform continuity of f there is δ > 0 such that |x − y| ≤ δ implies |f(x) − f(y)| ≤ ε. We claim that for any α − δ < x ≤ 2π + α − δ we have

|f(x) − f(α)| ≤ ε + (2M / sin²(δ/2)) ψ(x),   where ψ(x) = sin²((x − α)/2).   (3.2)

Indeed, if |x − α| < δ then |f(x) − f(α)| ≤ ε; otherwise δ/2 ≤ (x − α)/2 ≤ π − δ/2, which implies sin²((x − α)/2) ≥ sin²(δ/2) and

|f(x) − f(α)| ≤ 2M ≤ (2M / sin²(δ/2)) ψ(x).
