ROCZNIKI POLSKIEGO TOWARZYSTWA MATEMATYCZNEGO
Seria I: PRACE MATEMATYCZNE XIX (1977)
Wiesław Sobieszek (Gliwice)
On the functional equations of the dynamic programming with a singular point
1. In this paper we introduce the notion of a singular point of the fundamental functional equation of dynamic programming. The singularity of this point is such that an equation with a singular point is neither of the first type nor of the second type according to the classification given by R. Bellman in [1]. Therefore the theory of equations of the first and of the second type cannot be applied to equations with a singular point. Some theorems on the fundamental functional equation of dynamic programming with a singular point are proved in this paper.
2. Let $S$ be a mapping which assigns to each point $p$ of some set $D$ of the Euclidean space $R^m$ some set $S(p)$ of the Euclidean space $R^n$.
Real-valued functions $g$ and $h$, and a function $T$ with values in the set $D$, are defined on the graph of this mapping, i.e., on the set
$$G_S = \{ (p,q) \mid q \in S(p) \}.$$
Symbolically,
$$g\colon G_S \to R^1, \qquad h\colon G_S \to R^1, \qquad T\colon G_S \to D.$$
We shall consider the following functional equation of dynamic programming (¹):
$$f(p) = \sup_{q \in S(p)} \bigl[ g(p,q) + h(p,q)\, f(T(p,q)) \bigr], \qquad f(\theta) = 0, \tag{1}$$
with the singular point defined as follows.
A vector $\vartheta$ such that $\vartheta \in S(p)$ for every $p \in D$ is called the singular point of equation (1) if
$$g(p,\vartheta) = 0, \qquad h(p,\vartheta) = 1, \qquad T(p,\vartheta) = p \qquad \text{for each } p \in D. \tag{2}$$
(¹) R. Bellman (see [1]) calls this equation the fundamental equation, whereas the authors of papers [2] and [4] give this name to a more general form of equation (1). In [1], [2] and [4] the set-images $S(p)$ of the mapping $S$ are constant, i.e., they do not depend on $p \in D$; moreover, in [2] and [4] $S(p) \subset R^1$, in [3] $S(p) \subset$
The essential conditions under which equation (1) satisfies the requirements of the equations of the first type or of the second type are, respectively (see [1]),
(*) $\|T(p,q)\| \le a\|p\|$ for some number $a < 1$ and for each $q \in S(p)$,
or
(**) $|h(p,q)| \le a < 1$ for some number $a$ and for each $(p,q) \in G_S$.
It is easy to see that at the singular point $\vartheta$ neither condition (*) nor condition (**) is fulfilled, since by (2) we have $\|T(p,\vartheta)\| = \|p\|$ and $h(p,\vartheta) = 1$. On account of this, equation (1) with the singular point is neither of the first type nor of the second type.
Example 1. A certain problem of dynamic programming (see [7], [1]) leads to the equation
$$f(x) = \sup_{\substack{y_1 + y_2 \le x \\ y_1,\, y_2 \ge 0}} \bigl[ g(y_1) + h(y_2) + f(a y_1 + b y_2 + x - y_1 - y_2) \bigr], \qquad f(0) = 0,$$
where $g, h\colon [0,+\infty) \to R^1$, $a$ and $b$ are numbers from the interval $(0,1)$ and, moreover, $g(0) = h(0) = 0$.
It is easily seen that the vector $\vartheta = (0,0)$ belongs to every set-image $S(x) = \{ q = (y_1, y_2) \mid y_1 + y_2 \le x,\ y_1, y_2 \ge 0 \}$ of the mapping $S$ and fulfils conditions (2); in other words, it is the singular point.
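As a quick numerical illustration of conditions (2) for Example 1, the following Python sketch evaluates the three ingredients of equation (1) at $\vartheta = (0,0)$. The reward functions `g_reward` and `h_reward` and the constants `A`, `B` are hypothetical choices, assumed only to satisfy $g(0) = h(0) = 0$ and $a, b \in (0,1)$; they are not taken from the paper.

```python
import math

# Hypothetical data for Example 1 (not from the paper): any g, h with
# g(0) = h(0) = 0 and any a, b in (0, 1) would serve equally well.
A, B = 0.6, 0.3

def g_reward(y):
    return math.sqrt(y)

def h_reward(y):
    return 2.0 * y

def g_eq(x, q):
    """The term g(p, q) of equation (1); in Example 1 it is g(y1) + h(y2)."""
    y1, y2 = q
    return g_reward(y1) + h_reward(y2)

def h_eq(x, q):
    """The coefficient h(p, q) of equation (1); identically 1 in Example 1."""
    return 1.0

def t_eq(x, q):
    """The transformation T(p, q) = a*y1 + b*y2 + x - y1 - y2."""
    y1, y2 = q
    return A * y1 + B * y2 + x - y1 - y2

def is_singular_point(q, states):
    """Check conditions (2) at the decision q for every sampled state x."""
    return all(
        g_eq(x, q) == 0.0 and h_eq(x, q) == 1.0 and t_eq(x, q) == x
        for x in states
    )

states = [0.0, 0.5, 1.0, 7.25]
print(is_singular_point((0.0, 0.0), states))  # the zero decision
print(is_singular_point((0.1, 0.0), [1.0]))   # a generic non-zero decision
```

The check is only a sample over a few states; the general statement follows directly from substituting $y_1 = y_2 = 0$ into the equation.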
Example 2. The equation
$$f(x) = \sup_{0 \le y \le x} \bigl[ a(y) + (1 - a(y))\, f(x - y) \bigr], \qquad f(0) = 0,$$
where $a\colon [0,+\infty) \to R^1$, $a(0) = 0$, has the singular point $\vartheta = 0$.
We assume that equation (1) has the singular point $\vartheta$ and we assume, without loss of generality, that it is the zero element of the space $R^n$.
The zero element of the space $R^m$ we shall denote by the letter $\theta$. We assume that $\theta \in D$. Moreover, the coordinates of a vector we shall denote by the same letter as the vector, but with lower indices. If $a$ is a vector and $\alpha$ is a number, then $a^\alpha$ means the vector with the coordinates $a_i^\alpha$.
If $a$ and $b$ are vectors, then their scalar product we shall denote by $ab$.
The unit vector, i.e., the vector all of whose coordinates are equal to one, we shall denote by $I$.
By the norm of the vector $p \in R^m$ we shall understand the number $\|p\| = (\ldots$
An inequality between vectors we shall understand as the inequality between their respective coordinates.
For every $p \in D$ we define the following mapping $S_0$:
$$S_0(p) = \{ q \in R^n \mid Iq \le \|p\|,\ q \ge \vartheta \}. \tag{3}$$
In this paper we shall use the following two groups of assumptions.
Assumptions H.
1° $S(p) \subset S_0(p)$ for every $p \in D$.
2° There exist a vector $b \ge \vartheta$ and a number $\alpha \ge 1$ such that $g(p,q) \le b q^\alpha$ for every $(p,q) \in G_S$.
3° There exists a vector $a > \vartheta$, $a < I$, such that $\|T(p,q)\| \le (a - I)q + \|p\|$ for every $(p,q) \in G_S$.
4° $0 \le h(p,q) \le 1$ for every $(p,q) \in G_S$.
Assumptions H*.
1° $S(p) \supset S_0(p)$ for every $p \in D$.
2° There exist a vector $b \ge \vartheta$, $b \ne \vartheta$, and a number $\alpha \in (0,1)$ such that $g(p,q) \ge b q^\alpha$ for every $(p,q) \in G_S$.
3° There exists a vector $a > \vartheta$, $a < I$, such that $\|T(p,q)\| \ge (a - I)q + \|p\|$ for every $(p,q) \in G_S$.
4° $h(p,q) \ge 1$ for every $(p,q) \in G_S$.
The following functional sequence is closely connected with equation (1):
$$f_k(p) = \sup_{q \in S(p)} \bigl[ g(p,q) + h(p,q)\, f_{k-1}(T(p,q)) \bigr], \qquad k = 2, 3, \ldots, \tag{4}$$
where
$$f_1(p) = \sup_{q \in S(p)} g(p,q).$$
Sequence (4) we shall call the sequence of successive approximations for equation (1). The limit of this sequence, if it exists, we shall call the value of the problem of dynamic planning (²) and we shall denote it by $w(p)$. This name has a practical justification, because solving problems of dynamic planning leads to the determination of the limit of sequence (4). In the case of an equation of the first or of the second type this limit is also the unique solution of the equation.
In our case, i.e., if equation (1) has a singular point, we cannot identify the function $w(p)$ with the solution of equation (1) because, as will be shown below, equation (1) has infinitely many solutions, one of which is $w(p)$.
In the next sections of this paper we shall use the following three lemmas.
Lemma 1. The sequence
$$D_k = \max(b_1 + a_1 D_{k-1}, \ldots, b_n + a_n D_{k-1}), \qquad k = 2, \ldots,$$
where
$$D_1 = \max(b_1, \ldots, b_n) \quad \text{and} \quad \vartheta < a = (a_1, \ldots, a_n) < I,$$
is non-decreasing and has the limit $\max\bigl(b_1/(1 - a_1), \ldots, b_n/(1 - a_n)\bigr)$.
(²) This name is derived from [5].
The proof of this lemma can be found in [6].
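The statement of Lemma 1 can be illustrated numerically. The Python sketch below iterates the recursion for $D_k$ and compares the last term with $\max_i b_i/(1 - a_i)$. The vectors `a` and `b` are hypothetical sample data, assumed only to satisfy $0 < a_i < 1$ and $b_i \ge 0$.

```python
def lemma1_sequence(a, b, steps):
    """Return [D_1, ..., D_steps], where D_1 = max_i b_i and
    D_k = max_i (b_i + a_i * D_{k-1}) for k >= 2."""
    d = max(b)
    seq = [d]
    for _ in range(steps - 1):
        d = max(bi + ai * d for ai, bi in zip(a, b))
        seq.append(d)
    return seq

# Hypothetical sample data (not from the paper).
a = [0.5, 0.9, 0.2]
b = [1.0, 0.3, 2.0]

seq = lemma1_sequence(a, b, 200)
limit = max(bi / (1.0 - ai) for ai, bi in zip(a, b))

print(seq[:4])          # first few terms of the sequence
print(seq[-1], limit)   # last computed term and the limit stated in Lemma 1
```

The experiment is an illustration for one data set, not a substitute for the proof in [6].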
Lemma 2. The sequence of successive approximations (4) is non-decreasing and its terms are non-negative.
Proof. From the singularity of the point $\vartheta$ (see (2)) it follows that
$$f_k(p) = \sup_{q \in S(p)} \bigl[ g(p,q) + h(p,q)\, f_{k-1}(T(p,q)) \bigr] \ge g(p,\vartheta) + h(p,\vartheta)\, f_{k-1}(T(p,\vartheta)) = f_{k-1}(p), \qquad k = 2, \ldots$$
From the monotonicity of sequence (4) and from the inequality
$$f_1(p) = \sup_{q \in S(p)} g(p,q) \ge g(p,\vartheta) = 0$$
it follows that the terms of sequence (4) are non-negative.
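The monotonicity asserted in Lemma 2 can be observed on a discretized version of Example 2. The Python sketch below is only an illustration under stated assumptions: the state and the decision are restricted to an integer grid (so that $S(x) = \{0, 1, \ldots, x\}$, which still contains the singular point $0$), and the function `a_fn` is a hypothetical choice with $a(0) = 0$.

```python
def successive_approximations(a_fn, n_states, n_iter):
    """Iterate sequence (4) for Example 2 on the grid x = 0, ..., n_states - 1.

    f_1(x) = max over 0 <= y <= x of a(y),
    f_k(x) = max over 0 <= y <= x of [a(y) + (1 - a(y)) * f_{k-1}(x - y)].
    Returns the list [f_1, ..., f_{n_iter}], each f_k being a list of values.
    """
    a_vals = [a_fn(y) for y in range(n_states)]
    f = [max(a_vals[: x + 1]) for x in range(n_states)]
    history = [f]
    for _ in range(n_iter - 1):
        # The comprehension reads the previous iterate f before rebinding it.
        f = [
            max(a_vals[y] + (1.0 - a_vals[y]) * f[x - y] for y in range(x + 1))
            for x in range(n_states)
        ]
        history.append(f)
    return history

def a_fn(y):
    # Hypothetical choice (not from the paper): a(0) = 0 and 0 <= a(y) < 1.
    return y / (1.0 + y)

history = successive_approximations(a_fn, n_states=8, n_iter=6)
for k, f_k in enumerate(history, start=1):
    print(k, [round(v, 4) for v in f_k])
```

Because the decision $y = 0$ is always admissible and reproduces $f_{k-1}(x)$, each printed row dominates the previous one coordinate by coordinate, exactly as in the proof above.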
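Lemma 3. If $\alpha$ is a number from the interval $(0,1)$, $a$ and $b$ are vectors of $R^n$ and if $b \ge \vartheta$, $b \ne \vartheta$, $\vartheta < a < I$, then the sequence
$$M_k = \max_{\bar q \in Q} \bigl[ b \bar q^\alpha + M_{k-1} \bigl( (a - I)\bar q + 1 \bigr)^\alpha \bigr], \qquad k = 2, 3, \ldots, \qquad M_1 = \max_{\bar q \in Q} b \bar q^\alpha,$$
where $Q = \{ \bar q \in R^n \mid I\bar q \le 1,\ \bar q \ge \vartheta \}$, is increasing and is not bounded above.
Proof. The monotonicity of the sequence $\{M_k\}$ follows from the fact that the vector $\vartheta$ belongs to the set $Q$. Let us suppose that the sequence $\{M_k\}$ is bounded above. Then its limit $M$ fulfils the equation
$$M = \max_{\bar q \in Q} \bigl[ b \bar q^\alpha + M \bigl( (a - I)\bar q + 1 \bigr)^\alpha \bigr].$$
Let $b_i$ be a positive coordinate of the vector $b$. The existence of a positive $b_i$ follows from the assumptions $b \ge \vartheta$, $b \ne \vartheta$. Thus, if we denote $F(\bar q_i) = b_i \bar q_i^\alpha + M \bigl( (a_i - 1)\bar q_i + 1 \bigr)^\alpha$, we state that
$$\lim_{\bar q_i \to 0+} F'(\bar q_i) = +\infty, \qquad F(0) = M \qquad \text{and} \qquad \max_{0 \le \bar q_i \le 1} F(\bar q_i) > M.$$
Hence, since
$$M = \max_{\bar q \in Q} \bigl[ b \bar q^\alpha + M \bigl( (a - I)\bar q + 1 \bigr)^\alpha \bigr] \ge \max_{0 \le \bar q_i \le 1} F(\bar q_i),$$
we obtain a contradiction. This completes the proof.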
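The behaviour described in Lemma 3 can be observed numerically in the one-dimensional case $n = 1$, where $Q = [0,1]$. The Python sketch below approximates each maximum by a search over a uniform grid; the values of `a`, `b` and `alpha` are hypothetical, and a finite number of iterations can of course only suggest, not prove, that the sequence is unbounded.

```python
def lemma3_sequence(a, b, alpha, steps, grid=2001):
    """Approximate [M_1, ..., M_steps] for n = 1, where
    M_1 = max over q in [0, 1] of b * q**alpha and
    M_k = max over q in [0, 1] of [b * q**alpha + M_{k-1} * ((a - 1) * q + 1)**alpha].
    Each maximum is taken over a uniform grid that contains q = 0 and q = 1."""
    qs = [i / (grid - 1) for i in range(grid)]
    m = max(b * q ** alpha for q in qs)
    seq = [m]
    for _ in range(steps - 1):
        m = max(b * q ** alpha + m * ((a - 1.0) * q + 1.0) ** alpha for q in qs)
        seq.append(m)
    return seq

# Hypothetical parameters (not from the paper): 0 < a < 1, b > 0, 0 < alpha < 1.
ms = lemma3_sequence(a=0.5, b=1.0, alpha=0.5, steps=50)
print([round(m, 4) for m in ms[:5]])
print(round(ms[-1], 4))
```

The grid point $\bar q = 0$ reproduces $M_{k-1}$, which mirrors the monotonicity argument in the proof; the strict growth comes from grid points close to zero, where the derivative of $F$ is large.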
3. Now we shall prove two theorems on sufficient conditions for the convergence and for the divergence of sequence (4).
Theorem 1. If assumptions H are fulfilled, then sequence (4) is convergent for every $p \in D$ such that $\|p\| \le C_1$, where $C_1$ is an arbitrary positive constant.
Proof. Let us observe that if the vector $b$ defined in H, 2° is the zero vector $\vartheta$, then the function $w(p) = 0$ is the limit of sequence (4).
So let us assume that $b \ne \vartheta$ and let us define the sequence
$$d_k = \max_i \bigl( b_i + a_i^\alpha d_{k-1} \bigr), \qquad k = 2, 3, \ldots, \qquad d_1 = \max_i b_i,$$
where $a_i$, $b_i$ ($i = 1, \ldots, n$) are the coordinates of the vectors $a$, $b$ defined in H, 3°, 2°, respectively.
Thus, in virtue of assumptions H, 1°, 2° and by the convexity of the function $b q^\alpha$ on the simplex $S_0(p)$ (see (3)), we obtain
$$f_1(p) = \sup_{q \in S(p)} g(p,q) \le \sup_{q \in S(p)} b q^\alpha \le \max_{q \in S_0(p)} b q^\alpha = \max\bigl(0, b_1\|p\|^\alpha, \ldots, b_n\|p\|^\alpha\bigr) = \|p\|^\alpha \max_i b_i = d_1 \|p\|^\alpha.$$
If we assume that the inequality $f_N(p) \le d_N \|p\|^\alpha$ holds for some $N \ge 1$, then by assumptions H, 2°, 3°, 1° and by the convexity of the function $b q^\alpha + d_N \bigl( (a - I)q + \|p\| \bigr)^\alpha$ with respect to the variable $q \in S_0(p)$, we obtain
$$\begin{aligned}
f_{N+1}(p) &= \sup_{q \in S(p)} \bigl[ g(p,q) + h(p,q)\, f_N(T(p,q)) \bigr] \le \sup_{q \in S(p)} \bigl[ b q^\alpha + d_N \|T(p,q)\|^\alpha \bigr] \\
&\le \sup_{q \in S(p)} \bigl[ b q^\alpha + d_N \bigl( (a - I)q + \|p\| \bigr)^\alpha \bigr] \le \max_{q \in S_0(p)} \bigl[ b q^\alpha + d_N \bigl( (a - I)q + \|p\| \bigr)^\alpha \bigr] \\
&= \max\bigl( d_N \|p\|^\alpha,\ b_1 \|p\|^\alpha + d_N a_1^\alpha \|p\|^\alpha,\ \ldots,\ b_n \|p\|^\alpha + d_N a_n^\alpha \|p\|^\alpha \bigr) \\
&= \|p\|^\alpha \max\bigl( d_N,\ \max_i ( b_i + a_i^\alpha d_N ) \bigr) = \|p\|^\alpha \max( d_N, d_{N+1} ).
\end{aligned}$$
Hence and by the monotonicity of the sequence $\{d_k\}$ (see Lemma 1), we obtain the inequality
$$f_{N+1}(p) \le d_{N+1} \|p\|^\alpha.$$
By virtue of the principle of mathematical induction and by Lemma 1, we have the inequality
$$f_k(p) \le d_k \|p\|^\alpha \le C_1^\alpha \max_i \bigl( b_i/(1 - a_i^\alpha) \bigr), \qquad k = 1, 2, \ldots,$$
for $p \in D$ and $\|p\| \le C_1$.
Taking into account the boundedness of sequence (4) and Lemma 2, we obtain the thesis of the theorem.
Theorem 2. If assumptions H* are fulfilled, then sequence (4) is divergent for every $p \in D$, $p \ne \theta$.
Proof. Let $\{M_k\}$ be the sequence defined in Lemma 3. Using assumptions H*, 2°, 1° and substituting $q = \bar q \|p\|$, $p \ne \theta$, we have
$$f_1(p) = \sup_{q \in S(p)} g(p,q) \ge \sup_{q \in S(p)} b q^\alpha \ge \max_{q \in S_0(p)} b q^\alpha = \max_{\bar q \in Q} b \bar q^\alpha \|p\|^\alpha = M_1 \|p\|^\alpha.$$
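If we assume that the inequality $f_N(p) \ge M_N \|p\|^\alpha$ holds for some $N \ge 1$, then according to assumptions H*, 2°, 4°, 1°, 3°, we obtain
$$\begin{aligned}
f_{N+1}(p) &= \sup_{q \in S(p)} \bigl[ g(p,q) + h(p,q)\, f_N(T(p,q)) \bigr] \ge \sup_{q \in S(p)} \bigl[ b q^\alpha + f_N(T(p,q)) \bigr] \\
&\ge \sup_{q \in S_0(p)} \bigl[ b q^\alpha + f_N(T(p,q)) \bigr] \ge \sup_{q \in S_0(p)} \bigl[ b q^\alpha + M_N \|T(p,q)\|^\alpha \bigr] \\
&\ge \max_{q \in S_0(p)} \bigl[ b q^\alpha + M_N \bigl( (a - I)q + \|p\| \bigr)^\alpha \bigr] = \max_{\bar q \in Q} \bigl[ b \bar q^\alpha + M_N \bigl( (a - I)\bar q + 1 \bigr)^\alpha \bigr] \|p\|^\alpha = M_{N+1} \|p\|^\alpha,
\end{aligned}$$
where $q = \bar q \|p\|$.
By the principle of mathematical induction we have $f_k(p) \ge M_k \|p\|^\alpha$ for $k = 1, 2, \ldots$ Hence and by Lemmas 2 and 3, the thesis of the theorem follows.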
4. The two theorems which we shall prove in this section deal with the solutions of equation (1).
Theorem 3. If assumptions H are fulfilled and if $M$ is an arbitrary real number which fulfils the inequality $M \ge \max_i \bigl( b_i/(1 - a_i^\alpha) \bigr)$, where the numbers $a_i$ and $b_i$ are the coordinates of the vectors $a$ and $b$ defined in assumptions H, 3°, 2°, respectively, then each function $f(p) = M\|p\|^\alpha$ is a solution of equation (1).
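Proof. By conditions (2) for the singular point $\vartheta$ we obtain
$$\sup_{q \in S(p)} \bigl[ g(p,q) + M h(p,q) \|T(p,q)\|^\alpha \bigr] \ge g(p,\vartheta) + M h(p,\vartheta) \|T(p,\vartheta)\|^\alpha = M\|p\|^\alpha.$$
According to assumptions H, 1°, 2°, 3°, 4°, from the definition of the number $M$ and by the convexity of the function $b q^\alpha + M \bigl( (a - I)q + \|p\| \bigr)^\alpha$ with respect to the variable $q \in S_0(p)$, we obtain
$$\begin{aligned}
\sup_{q \in S(p)} \bigl[ g(p,q) + M h(p,q) \|T(p,q)\|^\alpha \bigr] &\le \sup_{q \in S(p)} \bigl[ b q^\alpha + M \bigl( (a - I)q + \|p\| \bigr)^\alpha \bigr] \\
&\le \max_{q \in S_0(p)} \bigl[ b q^\alpha + M \bigl( (a - I)q + \|p\| \bigr)^\alpha \bigr] \\
&= \max\bigl( M\|p\|^\alpha,\ b_1\|p\|^\alpha + M a_1^\alpha \|p\|^\alpha,\ \ldots,\ b_n\|p\|^\alpha + M a_n^\alpha \|p\|^\alpha \bigr) \\
&= \|p\|^\alpha \max\bigl( M,\ b_1 + M a_1^\alpha,\ \ldots,\ b_n + M a_n^\alpha \bigr) = M\|p\|^\alpha.
\end{aligned}$$
Hence, taking into account the previous inequality, we obtain the equality
$$M\|p\|^\alpha = \sup_{q \in S(p)} \bigl[ g(p,q) + h(p,q)\, M \|T(p,q)\|^\alpha \bigr],$$
which completes the proof of the theorem.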
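Theorem 3 can be checked numerically on a simple one-dimensional instance. The Python sketch below assumes $n = m = 1$, $S(p) = S_0(p) = [0, p]$, $g(p,q) = b q^\alpha$, $h \equiv 1$ and $T(p,q) = p - (1 - a)q$, so that assumptions H hold with equality. This instance, the parameter values and the helper names `bellman_rhs` and `solves_equation` are hypothetical and serve only as an illustration; the supremum is approximated on a uniform grid.

```python
import math

def bellman_rhs(f, p, a, b, alpha, grid=2001):
    """Right-hand side of equation (1) for the assumed 1-D instance:
    sup over q in [0, p] of  b * q**alpha + f(p - (1 - a) * q),
    approximated on a uniform grid containing q = 0 and q = p."""
    qs = [p * i / (grid - 1) for i in range(grid)]
    return max(b * q ** alpha + f(p - (1.0 - a) * q) for q in qs)

def solves_equation(m_const, a, b, alpha, states):
    """Check whether f(p) = m_const * p**alpha satisfies (1) at the sampled states."""
    f = lambda x: m_const * x ** alpha
    return all(
        math.isclose(bellman_rhs(f, p, a, b, alpha), f(p), rel_tol=1e-9)
        for p in states
    )

# Hypothetical parameters (not from the paper): 0 < a < 1, b > 0, alpha >= 1.
a, b, alpha = 0.5, 1.0, 2.0
threshold = b / (1.0 - a ** alpha)   # the bound on M in Theorem 3 for n = 1
states = [0.5, 1.0, 3.0]

for m_const in (threshold, 2.0, 10.0, 1.0):
    print(m_const, solves_equation(m_const, a, b, alpha, states))
```

In this instance every sampled constant at or above the threshold passes the check, while the constant $1.0$, which lies below it, fails; this is consistent with equation (1) having a whole family of solutions of the form $M\|p\|^\alpha$.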
Theorem 4. If the assumptions of Theorem 1 are fulfilled, then the value of the problem of dynamic planning is a solution of equation (1).
Proof. Let, as above, $w(p)$ denote the value of the problem of dynamic planning. Obviously $w(p) = \lim_{k \to \infty} f_k(p)$, where $\{f_k(p)\}$ is the sequence of successive approximations (4).
By the monotonicity of sequence (4) (see Lemma 2), we obtain the inequality
$$f_k(p) \le \sup_{q \in S(p)} \bigl[ g(p,q) + h(p,q)\, w(T(p,q)) \bigr].$$
Hence, passing to the limit, we have
$$w(p) \le \sup_{q \in S(p)} \bigl[ g(p,q) + h(p,q)\, w(T(p,q)) \bigr].$$
By virtue of the properties of the operation “sup”,
$$f_k(p) \ge g(p,q) + h(p,q)\, f_{k-1}(T(p,q)) \qquad \text{for every } q \in S(p).$$
Thus, in the limit we obtain
$$w(p) \ge g(p,q) + h(p,q)\, w(T(p,q)) \qquad \text{for every } q \in S(p).$$
Hence follows the inequality
$$w(p) \ge \sup_{q \in S(p)} \bigl[ g(p,q) + h(p,q)\, w(T(p,q)) \bigr],$$
which, together with the previous one, gives the thesis of the theorem.
From Theorems 3 and 4 it follows that equation (1) with the singular point has, under assumptions H, infinitely many solutions. One of them is $w(p)$, the value of the problem of dynamic planning. Therefore equation (1) is of little use for investigating the structure of the value of the problem of dynamic planning and for determining it. In connection with this fact the following question arises: is it possible to “eliminate” the singular point $\vartheta$ so as to obtain a “good” situation? The answer is positive. In this paper we do not deal with this problem, but we shall indicate how to solve it. Namely, one can “cut out” the singular point $\vartheta$ in a suitable manner. This “cutting out” depends upon a parameter $t \in (0,1]$. Thus we obtain an equation which differs from equation (1) in that $S(p)$ is replaced by $S_t(p)$, i.e. the set-image of the mapping $S$ with the singular point “cut out”. It can be seen that for every $t \in (0,1]$ this new equation has a unique solution $f_t(p)$, convergent to the value of the problem of dynamic planning, when
References
[1] R. Bellman, Dynamic programming, Princeton 1957.
[2] T. Jankowski and M. Kwapisz, On the convergence of approximate solutions of a dynamic programming equation, Colloq. Math. 21 (1970), p. 149-160.
[3] T. Jankowski and M. Kwapisz, O przybliżonych iteracjach dla układów równań programowania dynamicznego [On approximate iterations for systems of dynamic programming equations], Zeszyty Naukowe P. G., Matematyka 6 (1971), p. 3-22.