Bayesian Control of a Discrete-Time Linear System with Uniformly Distributed Disturbances

Academic year: 2021

Dariusz Walczak (Houston)

Bayesian Control of a Discrete-Time Linear System with Uniformly Distributed Disturbances

Abstract The main objective of this article is to develop Bayesian optimal control for a class of linear stochastic discrete-time systems. Taking into consideration that the disturbances in the system are given by a random variable having a uniform distribution with a natural parameter, we prove that the control in the sense of Bayes is the solution of a linear system of algebraic equations for the conjugate priors.

2010 Mathematics Subject Classification: 60G40, 62L15.

Key words and phrases: Bayes control, optimal, singular system, disturbances, Pareto distribution.

1. Introduction Linear stochastic discrete-time systems are systems in which the variables take their values at instantaneous time points. Discrete-time systems differ from continuous-time ones in that their signals are in the form of sampled data. In real systems, a discrete-time system often appears as the result of sampling a continuous-time system, or when only discrete data are available for use. With the development of the digital computer, stochastic discrete-time system theory plays an important role in general control theory.

When considering such systems, the performance measure and the information available at the moments of control specification are two very important factors. Small deviations of the parameters can be treated as disturbances. Once random disturbances are admitted, the performance measure becomes the mean value of the deviation of the system state from the required behavior. When all the parameters of the system are known and the distribution of the disturbances is well defined, the optimal control can be determined. The extension of this model to an adaptive one means that the disturbances are uncertain. Adaptive control is the control method used by a controller which must adapt to a controlled system whose parameters vary or are initially uncertain (see Tesfatsion [14] for the history of adaptive control).

Based on the behavior of the system we can learn the details of the disturbances. It is assumed that the disturbance has a fixed probabilistic description determined by our modeling assumptions. In this paper it is assumed that the distribution function is known up to its parameters, and that the disturbances additively change the state of the system. This resembles the statistical problem of estimation.

The background of modern statistical decision theory was established in the seminal papers by Wald [19], [20]. The statistical decision theory approach to control problems was applied some years later (see the books by Sworder [12], Aoki [1], and Sage and Melsa [10]). The new class of control systems under uncertainty was called adaptive (see Tesfatsion [14]).

In these adaptive control problems Bayesian systems play an important role.

In this class of control models it is assumed that the preliminary knowledge of the disturbances is given by a priori distributions of their parameters.

In this work we find the optimal feedback control of a dynamic linear system with discrete time and additive disturbances. The disturbances are assumed to be independent and identically distributed (i.i.d.), with the distribution specified up to a set of parameters. The loss function is a positive semi-definite quadratic form that depends on the system state and the control applied. The control horizon is a bounded random variable with a known distribution, independent of the random disturbances in the system. In the Bayesian approach we assume knowledge of the prior distribution. Problems of this kind are classified in the literature as adaptive control problems. For the particular case of i.i.d. disturbances belonging to the exponential family of distributions, and using a dynamic programming approach that we also utilize here, optimal controls can be found in [16]; other results in related settings, including minimax control, are available in [16], [17], and [15].

The form of feedback controls considered here is straightforward to compute due to its recursive nature. We also allow incomplete information about the distributions involved by explicitly modeling uncertainty in the parameter, which is the setting often found in practice. Our solution approach, based on Bayes' theorem, is intuitive and explicitly handles this uncertainty via the theory of conjugate distributions (cf. [3], [13]). Bayesian methods are experiencing a resurgence in interest due to their applicability to pattern recognition and, more generally, to machine learning.

By implementing the dynamic programming approach we determine the analytical form of the optimal controls in the closed feedback loop: in Section 3 for disturbances distributed uniformly on [0, λ] and in Section 4 for disturbances uniformly distributed on [λ1, λ2].

2. Model Formulation The control system under consideration is formulated as follows:
$$x_{n+1} = \alpha_n x_n + u_n + \gamma_n v_n, \tag{1}$$
where $x_0 = e$, $n = 0, 1, \ldots, M$, and $u_n \in (-\infty, +\infty)$, where $n$ is the time index. Here $x_n$ is the state variable, $u_n$ is the control, and $v_0, \ldots, v_M$ are the i.i.d. random variables modeling system disturbances; $\alpha_n$ and $\gamma_n$ are given constants, and we assume that $\gamma_n \neq 0$. The control horizon $N$ is a random variable bounded by $M$ and independent of the disturbances and the controls; it is distributed with the given probabilities $p_k$, so that:
$$\Pr(N = k) = p_k, \quad k = 0, 1, \ldots, M, \qquad \sum_{k=0}^{M} p_k = 1, \quad p_M > 0.$$
We will use the following notation:
$$X_n = (x_0, x_1, \ldots, x_n), \qquad U_n = (u_0, u_1, \ldots, u_n).$$
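As a quick illustration of the notation above, the following Python sketch (my own, not from the paper) simulates one trajectory of System (1) under an arbitrary placeholder feedback law; all constants and names are made up:

```python
# Simulate x_{n+1} = alpha_n x_n + u_n + gamma_n v_n with v_n ~ U[0, lambda].
# The feedback u_n = -alpha_n x_n is only a placeholder, not the Bayes control.
import random

def simulate(M, alpha, gamma, lam, x0, seed=0):
    rng = random.Random(seed)
    x = [x0]
    for n in range(M):
        v = rng.uniform(0.0, lam)          # disturbance v_n ~ U[0, lambda]
        u = -alpha[n] * x[n]               # placeholder feedback law
        x.append(alpha[n] * x[n] + u + gamma[n] * v)
    return x

xs = simulate(M=5, alpha=[1.0] * 5, gamma=[1.0] * 5, lam=2.0, x0=3.0)
print(xs)
```

With this placeholder law the state collapses to the pure disturbance, so every state after the initial one lies in [0, λ].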

Our control policy is $U = U_M$, and given the policy the loss function is defined as:
$$L(U, X_N) = \sum_{i=0}^{N} \left( s_i x_i^2 + k_i u_i^2 \right),$$
where $s_i$ and $k_i$ are given positive numbers. We assume that $u_n$ depends on $X_n$ and $U_{n-1}$; because $\gamma_n \neq 0$, $u_n$ is then a function of $v_0, \ldots, v_{n-1}$. Let the distribution of the disturbances be parameterized by $\lambda$; then for a given initial state $x_0$ the risk $R(\lambda, U)$ under policy $U$ is defined as:
$$R(\lambda, U) = E_N \left[ E_\lambda L(U, X_N) \right] = E_N \left[ E_\lambda \left( \sum_{i=0}^{N} \left( s_i x_i^2 + k_i u_i^2 \right) \Big| X_0 \right) \right].$$
The expectations in the above formula are with respect to the distribution of the disturbances and the random horizon, respectively. For a given prior distribution $\pi$ of the parameters the corresponding risk has the following form:
$$H(\pi, U) = E_\pi R(\lambda, U) = E_N E_\pi \left[ \sum_{i=0}^{N} \left( s_i x_i^2 + k_i u_i^2 \right) \Big| X_0 \right].$$
We will call a control policy $U^{*}$ Bayesian if it satisfies this condition:
$$H(\pi, U^{*}) = \inf_{U \in \mathcal{U}_\Gamma} H(\pi, U),$$
where $\mathcal{U}_\Gamma$ is the set of policies for which the risk $H(\pi, U)$ exists.

We approach the problem by considering sub-problems obtained by conditioning on subsequent times (decision epochs) $n$; that is, for time $n$, we assume knowledge of $X_n$, $U_{n-1}$ and seek the optimal $(u_n, \ldots, u_M)$ from that time on. The corresponding expectation of risk is then determined as:
$$H_n(\pi, U) = E_N \left[ E_\pi \left( \sum_{i=n}^{N} \left( s_i x_i^2 + k_i u_i^2 \right) \Big| X_n, U_{n-1} \right) \Big| N \geq n \right].$$
The control policy which minimizes the expected risk is called the Bayes control (cf. [7], [8], [11], [18]).

3. Bayesian Control with Disturbances Uniformly Distributed on [0, λ] The distribution of the disturbance is absolutely continuous with respect to the Lebesgue measure with density of the form (where for any set $A$, $\mathbf{1}_A$ is its indicator function):
$$p(v, \lambda) = \frac{1}{\lambda} \mathbf{1}_{[0,\lambda]}(v), \qquad \Lambda = (0, +\infty), \quad \lambda \in \Lambda.$$
The parameter $\lambda$ is unknown but we assume that its prior distribution is Pareto:
$$g(\lambda; \beta, r) = \frac{\beta r^{\beta}}{\lambda^{\beta+1}} \mathbf{1}_{(r,+\infty)}(\lambda), \qquad r > 0, \ \beta > 2.$$

The family of Pareto distributions with parameters $\beta$ and $r$ (also known as the family of one-sided Pareto distributions) constitutes a conjugate family with respect to the uniform distributions of the type considered here (cf. [3]). A convenient property of a conjugate family is that the posterior distribution is of the same type as the prior, only with different parameters. Using this property we recover, for given $X_1$ and $U_0$, the disturbance $v_0$ and obtain via Bayes' rule that:
$$f(\lambda \mid X_1, U_0) = f(\lambda \mid v_0 = v) = \frac{p(v, \lambda)\, g(\lambda; \beta, r)}{\int_\Lambda p(v, \lambda)\, g(\lambda; \beta, r)\, d\lambda} = \frac{(\beta + 1)(r \vee v)^{\beta+1}}{\lambda^{\beta+2}}\, \mathbf{1}_{(r \vee v,\, +\infty)}(\lambda) =: g(\lambda; \beta_1, r_1),$$
where $r \vee v := \max\{r, v\}$. We thus see that the posterior distribution is also of one-sided Pareto type with new parameters:
$$\beta_1 = \beta + 1, \qquad r_1 = r \vee v.$$
Analogously, after observing $X_n$ and knowing $U_{n-1}$, the controls applied through time $n-1$, we obtain:
$$f(\lambda \mid X_n, U_{n-1}) = f(\lambda \mid v_{n-1} = v) = g(\lambda; \beta_n, r_n),$$
with $\beta_n = \beta_{n-1} + 1$, $r_n = r_{n-1} \vee v$. The conditional distribution of the random variable $v_n$ after observing $X_n$ has density

$$h(v \mid X_n, U_{n-1}) = \int_0^{+\infty} p(v, \lambda)\, g(\lambda; \beta_n, r_n)\, d\lambda = \int_0^{+\infty} \frac{1}{\lambda} \mathbf{1}_{(0,\lambda)}(v)\, \frac{\beta_n r_n^{\beta_n}}{\lambda^{\beta_n+1}} \mathbf{1}_{(r_n,\infty)}(\lambda)\, d\lambda = \frac{\beta_n r_n^{\beta_n}}{\beta_n + 1} \cdot \frac{1}{(r_n \vee v)^{\beta_n+1}}\, \mathbf{1}_{(0,+\infty)}(v).$$
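The conjugate update above is a one-liner in code. The following Python sketch (my own illustration; the true parameter and prior hyperparameters are made up) applies it repeatedly to simulated uniform observations:

```python
# Conjugate update of Section 3: v ~ U[0, lambda], prior lambda ~ Pareto(beta, r).
# Posterior after observing v: Pareto(beta + 1, max(r, v)).
import random

def pareto_update(beta, r, v):
    """One Bayes step for a uniform observation v under a Pareto(beta, r) prior."""
    return beta + 1, max(r, v)

def posterior_mean_v(beta, r):
    """Predictive mean E(v_n | data) = Q_n r_n with Q_n = beta_n / (2 (beta_n - 1))."""
    return beta / (2.0 * (beta - 1.0)) * r

random.seed(0)
lam = 2.0                       # true (unknown) parameter, for simulation only
beta, r = 3.0, 0.5              # prior hyperparameters (beta > 2, r > 0)
for _ in range(1000):
    v = random.uniform(0.0, lam)
    beta, r = pareto_update(beta, r, v)

# r tracks the running sample maximum (-> lam), so the one-step
# predictive mean approaches lam / 2.
print(beta, r, posterior_mean_v(beta, r))
```

Note that $r_n$ is exactly the maximum of the prior $r$ and all observed disturbances, so learning here amounts to tracking a running maximum.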

Lemma 3.1 The following equalities hold:
$$E(v_n \mid X_n, U_{n-1}) = \frac{1}{2} \frac{\beta_n}{\beta_n - 1} r_n = Q_n r_n, \qquad Q_n = \frac{\beta_n}{2(\beta_n - 1)},$$
$$E(v_n^2 \mid X_n, U_{n-1}) = Q_{n1} r_n^2, \qquad Q_{n1} = \frac{\beta_n}{3(\beta_n - 2)},$$
$$E(r_{n+1} \mid X_n, U_{n-1}) = Q_{n2} r_n, \qquad Q_{n2} = \frac{\beta_n^2}{\beta_n^2 - 1},$$
$$E(r_{n+1}^2 \mid X_n, U_{n-1}) = Q_{n3} r_n^2, \qquad Q_{n3} = \frac{\beta_n(\beta_n - 1)}{(\beta_n + 1)(\beta_n - 2)},$$
$$E(x_{n+1} \mid X_n, U_{n-1}) = \alpha_n x_n + u_n + \gamma_n Q_n r_n,$$
$$E(x_{n+1}^2 \mid X_n, U_{n-1}) = (\alpha_n x_n + u_n)^2 + 2(\alpha_n x_n + u_n)\gamma_n Q_n r_n + \gamma_n^2 Q_{n1} r_n^2,$$
$$E(x_{n+1} r_{n+1} \mid X_n, U_{n-1}) = (\alpha_n x_n + u_n) Q_{n2} r_n + \gamma_n Q_{n4} r_n^2, \qquad Q_{n4} = \frac{\beta_n^2}{2(\beta_n + 1)(\beta_n - 2)}.$$

Proof We show the explicit derivation for only the last two of the above equalities, as the remaining ones can be demonstrated analogously.
$$E(x_{n+1}^2 \mid X_n, U_{n-1}) = E\big( (\alpha_n x_n + u_n + \gamma_n v_n)^2 \mid X_n, U_{n-1} \big)$$
$$= (\alpha_n x_n + u_n)^2 + 2(\alpha_n x_n + u_n)\gamma_n E(v_n \mid X_n, U_{n-1}) + \gamma_n^2 E(v_n^2 \mid X_n, U_{n-1})$$
$$= (\alpha_n x_n + u_n)^2 + 2(\alpha_n x_n + u_n)\gamma_n Q_n r_n + \gamma_n^2 Q_{n1} r_n^2,$$
$$E(x_{n+1} r_{n+1} \mid X_n, U_{n-1}) = E\big( (\alpha_n x_n + u_n + \gamma_n v_n)(r_n \vee v_n) \mid X_n, U_{n-1} \big)$$
$$= (\alpha_n x_n + u_n) Q_{n2} r_n + \gamma_n \frac{\beta_n}{\beta_n + 1} \left( \frac{1}{2} + \frac{1}{\beta_n - 2} \right) r_n^2$$
$$= (\alpha_n x_n + u_n) Q_{n2} r_n + \gamma_n \frac{\beta_n^2}{2(\beta_n + 1)(\beta_n - 2)} r_n^2 = (\alpha_n x_n + u_n) Q_{n2} r_n + \gamma_n Q_{n4} r_n^2. \qquad \blacksquare$$
We are now ready for the main result of this section.
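The constants in Lemma 3.1 are easy to sanity-check numerically. The following Monte Carlo sketch (my own illustration; the sampler and all names are assumptions, not from the paper) draws $\lambda$ from the Pareto posterior, then $v \mid \lambda \sim U[0, \lambda]$, and compares sample means against $Q_n$ and $Q_{n2}$:

```python
# Monte Carlo check of E(v) = Q_n r and E(r OR v) = Q_{n2} r from Lemma 3.1.
import random

def sample_pareto(beta, r, u):
    # Inverse-CDF sampling: F(x) = 1 - (r/x)^beta for x > r.
    return r * (1.0 - u) ** (-1.0 / beta)

random.seed(1)
beta, r, n = 4.0, 1.5, 200_000
vs = []
for _ in range(n):
    lam = sample_pareto(beta, r, random.random())
    vs.append(random.uniform(0.0, lam))

q_n  = beta / (2.0 * (beta - 1.0))       # predicted E(v)     / r
q_n2 = beta ** 2 / (beta ** 2 - 1.0)     # predicted E(r v v) / r
mean_v  = sum(vs) / n
mean_rv = sum(max(r, v) for v in vs) / n
print(mean_v / r, q_n, mean_rv / r, q_n2)
```

With 200,000 draws both sample ratios agree with the predicted constants to well within Monte Carlo error.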

Theorem 3.2 Under the assumptions of this section concerning the disturbances and the prior distribution $\pi$ of the parameter, the optimal Bayesian control of System (1) takes the form
$$u_n = -\frac{\frac{\pi_{n+1}}{\pi_n} \alpha_n A_{n+1}}{k_n + \frac{\pi_{n+1}}{\pi_n} A_{n+1}}\, x_n - \frac{\frac{\pi_{n+1}}{\pi_n} \left( \gamma_n A_{n+1} Q_n + B_{n+1} Q_{n2} \right)}{k_n + \frac{\pi_{n+1}}{\pi_n} A_{n+1}}\, r_n, \tag{2}$$
with its corresponding risk of the form
$$W_n = A_n x_n^2 + 2 B_n x_n r_n + C_n r_n^2. \tag{3}$$
The $A_n$, $B_n$, and $C_n$ are functions of $\beta_n$, do not depend on $r_n$, and satisfy the recursive relationships shown below:
$$A_n = s_n + \frac{\frac{\pi_{n+1}}{\pi_n} k_n \alpha_n^2 A_{n+1}}{k_n + \frac{\pi_{n+1}}{\pi_n} A_{n+1}}, \tag{4}$$
$$B_n = k_n\, \frac{\frac{\pi_{n+1}}{\pi_n} \alpha_n \left( \gamma_n A_{n+1} Q_n + B_{n+1} Q_{n2} \right)}{k_n + \frac{\pi_{n+1}}{\pi_n} A_{n+1}}, \tag{5}$$
$$C_n = \frac{\pi_{n+1}}{\pi_n} \left( \gamma_n^2 Q_{n1} A_{n+1} + 2 B_{n+1} \gamma_n Q_{n4} + C_{n+1} Q_{n3} \right) - \frac{k_n + \frac{\pi_{n+1}}{\pi_n} A_{n+1}}{k_n^2 \alpha_n^2} \cdot B_n^2, \tag{6}$$
$$A_M = s_M, \qquad B_M = C_M = 0. \tag{7}$$
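The recursion (4)–(7) is a finite backward pass. The following Python sketch (my own; the function and argument names are hypothetical, and it assumes $\beta_n = \beta_0 + n$ with $\beta_0 > 2$ and $\alpha_n \neq 0$ so all constants are defined) computes $A_n, B_n, C_n$ and the feedback gains of (2):

```python
# Backward recursion of Theorem 3.2 (a sketch under the stated assumptions).
def bayes_gains(s, k, alpha, gamma, p, beta0):
    M = len(p) - 1
    pi = [sum(p[n:]) for n in range(M + 1)]          # pi_n = sum_{i>=n} p_i
    beta = [beta0 + n for n in range(M + 1)]         # beta_n = beta_0 + n
    A = [0.0] * (M + 1); B = [0.0] * (M + 1); C = [0.0] * (M + 1)
    A[M] = s[M]                                      # A_M = s_M, B_M = C_M = 0
    gains = [None] * M                               # (g_x, g_r): u_n = g_x x_n + g_r r_n
    for n in range(M - 1, -1, -1):
        rho = pi[n + 1] / pi[n]
        b = beta[n]
        Q  = b / (2.0 * (b - 1.0))
        Q1 = b / (3.0 * (b - 2.0))
        Q2 = b * b / (b * b - 1.0)
        Q3 = b * (b - 1.0) / ((b + 1.0) * (b - 2.0))
        Q4 = b * b / (2.0 * (b + 1.0) * (b - 2.0))
        D = k[n] + rho * A[n + 1]
        A[n] = s[n] + rho * k[n] * alpha[n] ** 2 * A[n + 1] / D
        G = gamma[n] * A[n + 1] * Q + B[n + 1] * Q2
        B[n] = k[n] * rho * alpha[n] * G / D
        C[n] = rho * (gamma[n] ** 2 * Q1 * A[n + 1] + 2.0 * B[n + 1] * gamma[n] * Q4
                      + C[n + 1] * Q3) - D * B[n] ** 2 / (k[n] ** 2 * alpha[n] ** 2)
        gains[n] = (-rho * alpha[n] * A[n + 1] / D, -rho * G / D)
    return A, B, C, gains
```

The pass costs $O(M)$ arithmetic (plus the $\pi_n$ prefix sums), and the gains can be stored offline; at run time only $r_n = r_{n-1} \vee v_{n-1}$ needs to be tracked.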

Proof Let $\pi_k = \sum_{i=k}^{M} p_i$. With this notation we can write the risk $H_n$ as
$$H_n = E_N \left[ E\left( \sum_{i=n}^{N} \left( s_i x_i^2 + k_i u_i^2 \right) \Big| X_n, U_{n-1} \right) \Big| N \geq n \right] \tag{8}$$
$$= \sum_{k=n}^{M} E\left[ \sum_{i=n}^{k} \left( s_i x_i^2 + k_i u_i^2 \right) \Big| X_n, U_{n-1} \right] \frac{p_k}{\pi_n} = E\left[ \sum_{i=n}^{M} \frac{\pi_i}{\pi_n} \left( s_i x_i^2 + k_i u_i^2 \right) \Big| X_n, U_{n-1} \right].$$
We now derive the Bayesian control $u_n$ and the risk associated with it:
$$W_n = \min_{u_n} H_n = \min_{u_i,\ n \leq i \leq M} E\left[ \sum_{i=n}^{M} \frac{\pi_i}{\pi_n} \left( s_i x_i^2 + k_i u_i^2 \right) \Big| X_n, U_{n-1} \right]. \tag{9}$$
From Bellman's Optimality Principle we obtain
$$W_n = \min_{u_n} \left\{ s_n x_n^2 + k_n u_n^2 + \min_{u_i,\ n+1 \leq i \leq M} \frac{\pi_{n+1}}{\pi_n}\, E\left[ E\left( \sum_{i=n+1}^{M} \frac{\pi_i}{\pi_{n+1}} \left( s_i x_i^2 + k_i u_i^2 \right) \Big| X_{n+1}, U_n \right) \Big| X_n, U_{n-1} \right] \right\}$$
and thus it follows that
$$W_n = \min_{u_n} \left\{ s_n x_n^2 + k_n u_n^2 + \frac{\pi_{n+1}}{\pi_n}\, E\left[ W_{n+1} \mid X_n, U_{n-1} \right] \right\}. \tag{10}$$
Since the integrand is bounded from below we can move the minimum inside the integral, which results in an equation that the Bayesian control $u_n$ has to satisfy:
$$2 k_n u_n + \frac{\partial}{\partial u_n} \frac{\pi_{n+1}}{\pi_n}\, E\left[ W_{n+1} \mid X_n, U_{n-1} \right] = 0, \tag{11}$$
and together with Equation (1) we further obtain
$$2 k_n u_n + \frac{\pi_{n+1}}{\pi_n}\, E\left[ \frac{\partial}{\partial x_{n+1}} W_{n+1} \Big| X_n, U_{n-1} \right] = 0. \tag{12}$$
Now, by means of backward induction, we show that $W_n$ has the desired form (3).
1. For $n = M$, $W_M = s_M x_M^2$ and thus $A_M = s_M$, $B_M = C_M = 0$.
2. Assume (inductively) that $W_{n+1}$ has the form as in (3). We obtain that
$$\frac{\partial}{\partial x_{n+1}} W_{n+1} = 2 A_{n+1} x_{n+1} + 2 B_{n+1} r_{n+1},$$
and also that
$$E\left[ \frac{\partial}{\partial x_{n+1}} W_{n+1} \Big| X_n, U_{n-1} \right] = 2 A_{n+1} (\alpha_n x_n + u_n) + 2 A_{n+1} \gamma_n Q_n r_n + 2 B_{n+1} Q_{n2} r_n.$$
Equation (12) gives us
$$2 k_n u_n + \frac{\pi_{n+1}}{\pi_n} \left( 2 A_{n+1} (\alpha_n x_n + u_n) + 2 A_{n+1} \gamma_n Q_n r_n + 2 B_{n+1} Q_{n2} r_n \right) = 0,$$
which can be converted into the expression (2) for the optimal control that we have been seeking.
Substituting into Equation (10) the expression for $E\left[ W_{n+1} \mid X_n, U_{n-1} \right]$ previously obtained in Lemma 3.1, as well as the just obtained expression for $u_n$, and equating the coefficients of $x_n^2$, $x_n r_n$, and $r_n^2$, we verify relationships (4)–(6). $\blacksquare$

4. Bayesian Control with Disturbances Uniformly Distributed on [λ1, λ2] In this section we consider a more complex model with two unknown parameters for the distribution of disturbances. Namely, the disturbances are uniformly distributed over $[\lambda_1, \lambda_2]$, and both $\lambda_1$ and $\lambda_2$ are unknown; for the sake of notation we will sometimes use $\lambda$ to denote the vector $(\lambda_1, \lambda_2)$. These i.i.d. disturbances have the following density:
$$p(v, \lambda) = \frac{1}{\lambda_2 - \lambda_1} \mathbf{1}_{[\lambda_1, \lambda_2]}(v), \qquad \lambda_2 > \lambda_1.$$
The prior distribution is assumed to be the two-sided Pareto distribution whose density with respect to the Lebesgue measure on the plane is
$$g(\lambda; \alpha, \beta, \gamma) = \frac{\gamma (\gamma + 1) (\beta - \alpha)^{\gamma}}{(\lambda_2 - \lambda_1)^{\gamma+2}} \cdot \mathbf{1}_{(-\infty,\alpha)}(\lambda_1)\, \mathbf{1}_{(\beta,+\infty)}(\lambda_2), \qquad \beta > \alpha, \ \gamma > 2. \tag{13}$$
We expand the notation slightly to accommodate the additional parameters and operators needed:
$$x \vee y := \max\{x, y\}, \qquad x \wedge y := \min\{x, y\}.$$
We write the system equation now as
$$x_{n+1} = a_n x_n + u_n + c_n v_n, \qquad n = 0, 1, \ldots, M. \tag{14}$$
In order to determine the optimal control in the Bayes sense we will follow the same approach as in the previous section. For given $X_1$, $U_0$ we thus have

$$f(\lambda \mid X_1, U_0) = f(\lambda \mid v_0 = v) = \frac{p(v, \lambda)\, g(\lambda; \alpha, \beta, \gamma)}{\int_\Lambda p(v, \lambda)\, g(\lambda; \alpha, \beta, \gamma)\, d\lambda}, \qquad \Lambda = \{(x, y) : y > x\}.$$
Integrating out $\lambda$ in the denominator gives us
$$\int_\Lambda p(v, \lambda)\, g(\lambda; \alpha, \beta, \gamma)\, d\lambda = \frac{\gamma (\beta - \alpha)^{\gamma}}{(\gamma + 2)(\beta \vee v - \alpha \wedge v)^{\gamma+1}}$$
and ultimately
$$f(\lambda \mid X_1, U_0) = \frac{(\gamma + 1)(\gamma + 2)}{(\lambda_2 - \lambda_1)^{\gamma+3}} \cdot \mathbf{1}_{(-\infty,\, \alpha \wedge v)}(\lambda_1) \cdot \mathbf{1}_{(\beta \vee v,\, +\infty)}(\lambda_2) \cdot (\beta \vee v - \alpha \wedge v)^{\gamma+1} = g(\lambda; \alpha_1, \beta_1, \gamma_1).$$
We can see that the posterior distribution of $\lambda$ has indeed the same form as the prior, with updated parameters
$$\alpha_1 = \alpha \wedge v, \qquad \beta_1 = \beta \vee v, \qquad \gamma_1 = \gamma + 1.$$
In an analogous manner, at the $n$-th stage, given $X_n$ and $U_{n-1}$, we obtain the posterior with parameters
$$\alpha_n = \alpha_{n-1} \wedge v, \qquad \beta_n = \beta_{n-1} \vee v, \qquad \gamma_n = \gamma_{n-1} + 1,$$
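As in Section 3, the two-parameter conjugate update reduces to tracking running extremes. A minimal sketch (mine, with made-up observations):

```python
# Conjugate update of Section 4: v ~ U[lambda1, lambda2], prior (lambda1, lambda2)
# ~ two-sided Pareto(alpha, beta, gamma). Posterior: (alpha ^ v, beta v v, gamma + 1).
def two_sided_update(alpha, beta, gamma, v):
    return min(alpha, v), max(beta, v), gamma + 1

alpha, beta, gamma = 0.0, 1.0, 3.0
for v in (0.4, -0.7, 1.9, 0.2):
    alpha, beta, gamma = two_sided_update(alpha, beta, gamma, v)
print(alpha, beta, gamma)
```

Here $\alpha_n$ is the running minimum of the prior $\alpha$ and the observations, $\beta_n$ the running maximum, and $\gamma_n$ simply counts the observations.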

and the conditional distribution of $v_n$ with density
$$h(v \mid X_n, U_{n-1}) = \frac{\gamma_n}{\gamma_n + 2} \cdot \frac{(\beta_n - \alpha_n)^{\gamma_n}}{(\beta_n \vee v - \alpha_n \wedge v)^{\gamma_n+1}}.$$
To help with the formalism we introduce some notation. Let
$$S_{1n} = \frac{1}{\gamma_n + 2}, \qquad S_{2n} = \frac{\gamma_n}{\gamma_n + 2}, \qquad S_{3n} = \frac{1}{(\gamma_n - 1)(\gamma_n + 2)}, \qquad S_{4n} = \frac{1}{(\gamma_n - 1)(\gamma_n + 2)(\gamma_n - 2)}.$$
We then write

$$E_0 = \int_{\alpha_n}^{\beta_n} h(v \mid X_n, U_{n-1})\, dv = S_{2n},$$
$$E_1 = \int_{-\infty}^{\alpha_n} v\, h(v \mid X_n, U_{n-1})\, dv = E_{1\alpha} \alpha_n + E_{1\beta} \beta_n, \qquad E_{1\alpha} = S_{1n} + S_{3n}, \quad E_{1\beta} = -S_{3n},$$
$$E_2 = \int_{\alpha_n}^{\beta_n} v\, h(v \mid X_n, U_{n-1})\, dv = E_{2\alpha} \alpha_n + E_{2\beta} \beta_n, \qquad E_{2\alpha} = E_{2\beta} = \tfrac{1}{2} S_{2n},$$
$$E_3 = \int_{\beta_n}^{+\infty} v\, h(v \mid X_n, U_{n-1})\, dv = E_{3\alpha} \alpha_n + E_{3\beta} \beta_n, \qquad E_{3\alpha} = -S_{3n}, \quad E_{3\beta} = S_{1n} + S_{3n},$$
$$E_4 = \int_{-\infty}^{\alpha_n} v^2\, h(v \mid X_n, U_{n-1})\, dv = E_{4\alpha^2} \alpha_n^2 + E_{4\alpha\beta} \alpha_n \beta_n + E_{4\beta^2} \beta_n^2,$$
$$E_{4\alpha^2} = S_{1n} + 2 S_{3n} + 2 S_{4n}, \qquad E_{4\alpha\beta} = -2 S_{3n} - 4 S_{4n}, \qquad E_{4\beta^2} = 2 S_{4n},$$
$$E_5 = \int_{\alpha_n}^{\beta_n} v^2\, h(v \mid X_n, U_{n-1})\, dv = \tfrac{1}{3} S_{2n} (\alpha_n^2 + \alpha_n \beta_n + \beta_n^2), \qquad E_{5\alpha^2} = E_{5\alpha\beta} = E_{5\beta^2} = \tfrac{1}{3} S_{2n},$$
$$E_6 = \int_{\beta_n}^{+\infty} v^2\, h(v \mid X_n, U_{n-1})\, dv = E_{6\alpha^2} \alpha_n^2 + E_{6\alpha\beta} \alpha_n \beta_n + E_{6\beta^2} \beta_n^2,$$
$$E_{6\alpha^2} = 2 S_{4n}, \qquad E_{6\alpha\beta} = -2 S_{3n} - 4 S_{4n}, \qquad E_{6\beta^2} = S_{1n} + 2 S_{3n} + 2 S_{4n},$$
$$E\left( v \mid X_n, U_{n-1} \right) = E_1 + E_2 + E_3 = \tfrac{1}{2} (\alpha_n + \beta_n),$$
$$E\left( v^2 \mid X_n, U_{n-1} \right) = E_4 + E_5 + E_6 = Q_{n1} (\alpha_n^2 + \beta_n^2) + Q_{n2} \alpha_n \beta_n,$$
$$Q_{n1} = S_{1n} + 2 S_{3n} + 4 S_{4n} + \tfrac{1}{3} S_{2n}, \qquad Q_{n2} = -4 S_{3n} - 8 S_{4n} + \tfrac{1}{3} S_{2n}.$$

Lemma 4.1 Under the assumptions of this section and utilizing the above notation, the following relationships hold:
$$E\left( \alpha_{n+1} \beta_{n+1} \mid X_n, U_{n-1} \right) = E_{3\alpha} (\alpha_n^2 + \beta_n^2) + (2 E_{1\alpha} + S_{2n})\, \alpha_n \beta_n,$$
$$E\left( \alpha_{n+1}^2 + \beta_{n+1}^2 \mid X_n, U_{n-1} \right) = \left( E_{4\alpha^2} + E_{6\alpha^2} + S_{1n} + S_{2n} \right)(\alpha_n^2 + \beta_n^2) + \left( E_{4\alpha\beta} + E_{6\alpha\beta} \right) \alpha_n \beta_n,$$
$$E\left( x_{n+1} (\alpha_{n+1} + \beta_{n+1}) \mid X_n, U_{n-1} \right) = (a_n x_n + u_n)(2 S_{1n} + S_{2n})(\alpha_n + \beta_n)$$
$$+\; c_n \left[ \left( E_{4\alpha^2} + E_{2\alpha} + E_{3\alpha} + E_{6\alpha^2} \right)(\alpha_n^2 + \beta_n^2) + \left( E_{4\alpha\beta} + E_{2\beta} + E_{3\beta} + E_{6\alpha\beta} + E_{1\alpha} + E_{2\alpha} \right) \alpha_n \beta_n \right],$$
$$E\left( x_{n+1}^2 \mid X_n, U_{n-1} \right) = (a_n x_n + u_n)^2 + 2 c_n (a_n x_n + u_n)\, \tfrac{1}{2} (\alpha_n + \beta_n) + c_n^2 \left[ Q_{n1} (\alpha_n^2 + \beta_n^2) + Q_{n2}\, \alpha_n \beta_n \right],$$
$$E\left( x_{n+1} \mid X_n, U_{n-1} \right) = a_n x_n + u_n + c_n\, \tfrac{1}{2} (\alpha_n + \beta_n). \tag{15}$$
Proof We explicitly show the derivation of only one of the equalities, as the remaining ones can be verified in an analogous manner.

$$E\left( x_{n+1} (\alpha_{n+1} + \beta_{n+1}) \mid X_n, U_{n-1} \right) = E\left( (a_n x_n + u_n + c_n v_n)(\alpha_n \wedge v_n + \beta_n \vee v_n) \mid X_n, U_{n-1} \right)$$
$$= (a_n x_n + u_n)(2 S_{1n} + S_{2n})(\alpha_n + \beta_n) + c_n \Big[ \left( E_{4\alpha^2} + E_{2\alpha} + E_{3\alpha} + E_{6\alpha^2} \right) \alpha_n^2$$
$$+ \left( E_{4\alpha\beta} + E_{2\beta} + E_{3\beta} + E_{6\alpha\beta} + E_{1\alpha} + E_{2\alpha} \right) \alpha_n \beta_n + \left( E_{4\beta^2} + E_{6\beta^2} + E_{1\beta} + E_{2\beta} \right) \beta_n^2 \Big]$$
$$= (a_n x_n + u_n)(\alpha_n + \beta_n) + c_n \left[ \left( E_{4\alpha^2} + E_{2\alpha} + E_{3\alpha} + E_{6\alpha^2} \right)(\alpha_n^2 + \beta_n^2) + \left( E_{4\alpha\beta} + E_{2\beta} + E_{3\beta} + E_{6\alpha\beta} + E_{1\alpha} + E_{2\alpha} \right) \alpha_n \beta_n \right],$$
since $2 S_{1n} + S_{2n} = 1$, $E_{4\alpha^2} = E_{6\beta^2}$, $E_{4\beta^2} = E_{6\alpha^2}$, $E_{2\beta} = E_{2\alpha}$, and $E_{3\alpha} = E_{1\beta}$. $\blacksquare$
In full analogy to the proof of Theorem 3.2 one can show a similar result under the distributional assumptions of this section, i.e. for disturbances that are i.i.d. uniform on $[\lambda_1, \lambda_2]$.

Theorem 4.2 Under the above assumptions on the distribution of the disturbances and the prior, the optimal Bayesian control of System (14) takes the form
$$u_n = -\frac{\frac{\pi_{n+1}}{\pi_n} A_{n+1} a_n}{k_n + \frac{\pi_{n+1}}{\pi_n} A_{n+1}}\, x_n - \frac{\frac{\pi_{n+1}}{\pi_n} \left( \tfrac{1}{2} c_n A_{n+1} + B_{n+1} \right)}{k_n + \frac{\pi_{n+1}}{\pi_n} A_{n+1}}\, (\alpha_n + \beta_n), \tag{16}$$
with its corresponding risk of the form
$$W_n = A_n x_n^2 + 2 B_n x_n (\alpha_n + \beta_n) + C_n (\alpha_n^2 + \beta_n^2) + 2 D_n \alpha_n \beta_n, \tag{17}$$
where $A_n$, $B_n$, $C_n$, and $D_n$ do not depend on $\alpha_n$ and $\beta_n$ and satisfy the following recursive relationships:
$$A_n = s_n + \frac{\frac{\pi_{n+1}}{\pi_n} A_{n+1} k_n a_n^2}{k_n + \frac{\pi_{n+1}}{\pi_n} A_{n+1}},$$
$$B_n = k_n\, \frac{\frac{\pi_{n+1}}{\pi_n} a_n \left( \tfrac{1}{2} c_n A_{n+1} + B_{n+1} \right)}{k_n + \frac{\pi_{n+1}}{\pi_n} A_{n+1}},$$
$$C_n = \frac{\pi_{n+1}}{\pi_n} \Big[ c_n^2 Q_{n1} A_{n+1} + 2 c_n B_{n+1} \left( E_{4\alpha^2} + E_{2\alpha} + E_{3\alpha} + E_{6\alpha^2} \right) + C_{n+1} \left( E_{4\alpha^2} + E_{6\alpha^2} + S_{1n} + S_{2n} \right) + 2 D_{n+1} E_{3\alpha} \Big] - \frac{k_n + \frac{\pi_{n+1}}{\pi_n} A_{n+1}}{k_n^2 a_n^2}\, B_n^2,$$
$$D_n = \frac{\pi_{n+1}}{\pi_n} \Big[ \tfrac{1}{2} c_n^2 Q_{n2} A_{n+1} + c_n B_{n+1} \left( E_{4\alpha\beta} + E_{2\beta} + E_{3\beta} + E_{6\alpha\beta} + E_{1\alpha} + E_{2\alpha} \right) + \tfrac{1}{2} C_{n+1} \left( E_{4\alpha\beta} + E_{6\alpha\beta} \right) + D_{n+1} \left( 2 E_{1\alpha} + S_{2n} \right) \Big] - \frac{k_n + \frac{\pi_{n+1}}{\pi_n} A_{n+1}}{k_n^2 a_n^2}\, B_n^2,$$
with $A_M = s_M$, $B_M = C_M = D_M = 0$.

It seems that one cannot obtain the results of Theorem 3.2 directly from Theorem 4.2, even though the model in this section is the most general as far as uniform distributions on the line are concerned. When we try to specialize the results of the latter to the uniform distribution on [0, λ] considered in Theorem 3.2, by setting $\alpha_n = 0$, $\gamma_n = \beta_n$, and $\beta_n = r_n$, we obtain a slightly different form of control. Namely, instead of $Q_n = \frac{\beta_n}{2(\beta_n - 1)}$ we have $\frac{1}{2}$, and instead of $Q_{n2} = \frac{\beta_n^2}{\beta_n^2 - 1}$ we have $1$; the general form of the risk is the same, but with different coefficients.

One can also use this general model to control a system with disturbances that are i.i.d. on $[\lambda - a, \lambda + a]$ with a known constant $a$ and unknown $\lambda$, but the optimality cannot be guaranteed a priori.

5. Conclusion By utilizing certain properties of conjugate distributions we have obtained analytical expressions for the adaptive feedback control in the sense of Bayes for a linear model with discrete time and a finite random horizon with additive i.i.d. disturbances. Two types of uniform distributions were considered, and for each type we show that the controls can be easily calculated numerically using a finite recursion.

The model can be further researched to determine conditions for a convenient form of feedback control under additive and independent disturbances that are distributed uniformly on [λ − a, λ + a] with unknown a and λ. One can also analyze the structure of the control when the number of disturbances is random, e.g. when for each time point we introduce an independent binary random variable that turns the disturbance on or off.

Extending the model to multiple dimensions is interesting in itself but seems to require considerably more effort. However, such an extension should also be more relevant in practice given various potential applications (see e.g. [4], [2], [5], [6], [9]).

6. Acknowledgment I would like to thank Professor Krzysztof Szajowski for the initial inspiration as well as discussions and support throughout this research project.

References

[1] M. Aoki. Optimization of stochastic systems. Topics in discrete-time systems. Mathematics in Science and Engineering, Vol. 32. Academic Press, New York-London, 1967. MR 0234749; Zbl 0168.15802.

[2] W. S. Black, P. Haghi, and K. B. Ariyur. Adaptive systems: History, techniques, problems, and perspectives. Systems, 2:606–660, 2014. doi: 10.3390/systems2040606.

[3] M. DeGroot. Optimal Statistical Decisions. McGraw-Hill Book Comp., New York, 1970. Zbl 1136.62011.

[4] T. E. Duncan, B. Pasik-Duncan, and L. Stettner. Adaptive control of a partially observed discrete time Markov process. Appl. Math. Optim., 37(3):269–293, 1998. doi: 10.1007/s002459900077; MR 1610799.

[5] J. I. González-Trejo, O. Hernández-Lerma, and L. F. Hoyos-Reyes. Minimax control of discrete-time stochastic systems. SIAM J. Control Optim., 41(5):1626–1659 (electronic), 2002. doi: 10.1137/S0363012901383837; MR 1971966.

[6] A. Grzybowski. Minimax control of a system with actuation errors. Zastos. Mat., 21(2):235–252, 1991. MR 1145478; Zbl 0756.93088.

[7] H. Kushner. Introduction to stochastic control. Holt, Rinehart and Winston, Inc., New York-Montreal, Que.-London, 1971. MR 0280248; Zbl 0293.93018.

[8] Z. Porosiński, K. Szajowski, and S. Trybuła. Bayes control for a multidimensional stochastic system. Systems Sci., 11(2):51–64 (1987), 1985. MR 919393; Zbl 0629.93073.

[9] W. J. Runggaldier. Concepts and methods for discrete and continuous time control under uncertainty. Insurance Math. Econom., 22(1):25–39, 1998. The interplay between insurance, finance and control (Aarhus, 1997). doi: 10.1016/S0167-6687(98)00006-7; MR 1625819; Zbl 0916.93085.


[10] A. P. Sage and J. L. Melsa. Estimation theory with applications to communications and control. McGraw-Hill Series in Systems Science. McGraw-Hill Book Co., New York-Düsseldorf-London, 1971. MR 0501447; Zbl 0255.62005.

[11] G. Sawitzki. Exact filtering in exponential families: discrete time. Math. Operationsforsch. Statist. Ser. Statist., 12(3):393–401, 1981. doi: 10.1080/02331888108801598; MR 640558.

[12] D. Sworder. Optimal adaptive control systems. Mathematics in Science and Engineering, Vol. 25. Academic Press, New York-London, 1966. MR 0211801; Zbl 0168.15801.

[13] K. Szajowski and S. Trybuła. Bayes control of a discrete time linear system with random disturbances. Random horizon case. Podstawy Sterowania, 14:109–115, 1984. Zbl 0552.93066.

[14] L. Tesfatsion. A dual approach to Bayesian inference and adaptive control. Theory and Decision, 14(2):177–194, 1982. doi: 10.1007/BF00133976; MR 665583; Zbl 0489.93059.

[15] S. Trybuła. Sterowanie dualne przy samoreprodukujących się rozkładach. In Prace V Krajowej Konferencji Automatyki, pages 163–169, Gdańsk, 1971. Sekcja 1. Teoria sterowania.

[16] S. Trybuła and K. Szajowski. Decision making in an incompletely known stochastic system. I. Zastos. Matem., 19:31–41, 1987. MR 897512; Zbl 0645.62008.

[17] S. Trybuła and K. Szajowski. Decision making in an incompletely known stochastic system. II. Zastos. Matem., 19:43–56, 1987. MR 897512; Zbl 0645.62009.

[18] D. Walczak. Bayes and minimax control of discrete time linear dynamical systems. Technical report, Wrocław University of Technology, Faculty of Fundamental Problems of Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland, 1986. Master's Thesis (in Polish).

[19] A. Wald. Contributions to the theory of statistical estimation and testing hypotheses. Ann. Math. Statistics, 10:299–326, 1939. doi: 10.1214/aoms/1177732144; MR 0000932; Zbl 65.0585.03.

[20] A. Wald. Statistical Decision Functions. John Wiley & Sons, Inc., New York; Chapman & Hall, Ltd., London, 1950.


Bayesian control of a discrete-time linear system with uniform disturbances

Dariusz Walczak

Summary This paper considers the problem of optimal control of a linear dynamic system with discrete time and additive disturbances. The disturbances are independent, identically distributed random variables whose distribution is given up to a parameter. The control operates in a closed loop. The loss function is a positive semi-definite quadratic form depending on the system state and the applied control. The control horizon is a bounded random variable with a known distribution, independent of the disturbances, and the state measurements are error-free. Using the dynamic programming method, the analytical form of the Bayesian optimal closed-loop control algorithm is derived: for disturbances uniformly distributed on [0, λ] and for disturbances uniformly distributed on [λ1, λ2].

2010 AMS Mathematics Subject Classification: 60G40, 62L15.

Key words and phrases: Bayesian control, disturbances, uniform distribution, Pareto distribution, conjugate distributions.

Dariusz Walczak holds a PhD degree in Business Administration (Operations & Logistics) from the Sauder School of Business, University of British Columbia in Vancouver, along with an MSc degree in Mathematics from UBC and an MEng in Applied Mathematics/Engineering from Wrocław University of Technology.

He is a Principal Research Scientist at PROS Inc. in Houston, Texas, where he is involved in the design and deployment of revenue management (RM) and pricing optimization applications across a variety of industries. Dariusz currently chairs the Revenue Management and Pricing (RMP) Section of INFORMS.

Dariusz Walczak
PROS Inc.
3100 Main Street, Suite 900
Houston, TX 77002, U.S.A.

E-mail: dwalczak@pros.com URL: http://www.pros.com

Communicated by: Krzysztof Szajowski

(Received: 23rd of November 2015; revised: 21st of December 2015)
