
AGH University of Science and Technology

Faculty of Computer Science, Electronics and Telecommunications

Department of Computer Science

Doctoral dissertation

Adaptive population-based algorithms for solving single- and multiobjective inverse problems

Ewa Gajda-Zagórska

Supervisor: Prof. dr hab. inż. Robert Schaefer


Akademia Górniczo-Hutnicza im. Stanisława Staszica

Wydział Informatyki, Elektroniki i Telekomunikacji

Katedra Informatyki

Rozprawa doktorska

Adaptacyjne algorytmy populacyjne rozwiązywania jedno- i wielokryterialnych problemów analizy odwrotnej

Ewa Gajda-Zagórska

Promotor: Prof. dr hab. inż. Robert Schaefer


Acknowledgements

I would like to express my gratitude to my supervisor, Professor Robert Schaefer, for his extremely helpful remarks, patience, and assistance in carrying out the research described in this dissertation.

I would like to thank my husband, Marcin, for support, patience, and many fruitful discussions.

I acknowledge that during the work on my thesis I have been a scholarship fellow of the Doctus - Małopolski fundusz stypendialny dla doktorantów project, co-funded by the EU within the European Social Fund.

The work has been partially supported by Polish National Science Centre grants DEC-2012/05/N/ST6/03433 and DEC-2011/03/B/ST6/01393.


Contents

List of Figures  xi
List of Tables  xv
Glossary  xvii

1 Introduction  1
  1.1 Motivation  1
  1.2 Objectives  2
  1.3 Contribution  3
  1.4 Dissertation outline  6

2 State-of-the-art  7
  2.1 Forward and inverse problems  7
    2.1.1 Forward problem definition  7
    2.1.2 Forward numerical solvers  8
    2.1.3 Inverse problem definition  9
    2.1.4 Inverse numerical solvers  9
  2.2 Stochastic algorithms for global optimization  11
    2.2.1 Stochastic sampling with simple adaptation  11
    2.2.2 Single-objective genetic algorithms  12
    2.2.3 Detecting basins of attraction  18
    2.2.4 Theoretical results  21
  2.3 Multiobjective optimization techniques  25
    2.3.1 Traditional techniques  26
    2.3.3 Theoretical results  28
  2.4 Discussion  29

3 Single-objective analysis  33
  3.1 Inversion of resistivity logging measurements for direct current  33
    3.1.1 Forward problem definition  35
    3.1.2 Inverse problem definition  41
    3.1.3 Relation between approximate forward and inverse solution errors  42
  3.2 Inversion of resistivity logging measurements for alternate current  44
    3.2.1 Forward problem definition  44
    3.2.2 Inverse problem definition  49
    3.2.3 Relation between approximate forward and inverse solution errors  50
  3.3 Restoring mechanical parameters of elastic body  51
    3.3.1 Case study  51
    3.3.2 Forward and inverse problem  54
    3.3.3 Relation between approximate forward and inverse solution errors  56
  3.4 Algorithms and strategies  56
    3.4.1 Hierarchic Genetic Strategy  57
    3.4.2 hp-Hierarchic Genetic Strategy  60
    3.4.3 Local methods  64
    3.4.4 Hybrid strategy  65
  3.5 Simulations  66
    3.5.1 Inversion of resistivity logging measurements for direct current  66
    3.5.2 Inversion of resistivity logging measurements for alternate current  74
    3.5.3 Parameter identification in elasticity  79

4 Multiobjective inverse problems  95
  4.1 General motivation for multiobjective inverse problems  95
  4.2 Pareto problem  96
  4.3 Hierarchic Genetic Strategy for multiobjective optimization  97
    4.3.1 Multiobjective Clustered Evolutionary Strategy  99
  4.4 Rank modification based on incidence  99
  4.5 Theoretical analysis  101
    4.5.2 Well-tuning  104
    4.5.3 Well-filtering  105
  4.6 Simulations  106
    4.6.1 MCES benchmark example  106
    4.6.2 MO-HGS benchmark example  110

5 Conclusions and future research  119
  5.1 Summary  119
  5.2 Detailed conclusions  120
  5.3 Future research  123


List of Figures

3.1 3D geometry of a logging instrument in a vertical borehole penetrating three dipping layers in the Ω domain. Figure originally published in [60].  35
3.2 DC forward problem. The computation of the sample logging curve consists of solving a sequence of multiple forward problems (3.18) over a domain composed of a borehole and five formation layers with assumed resistivities. This set of layers is utilized in the experimental Section 3.5.1. Figure originally published in [60].  40
3.3 DC inverse problem. Resistivities of three formation layers are sought based on a given reference logging curve (for details refer to Section 3.5.1). Figure originally published in [60].  42
3.4 3D geometry of a logging instrument in a vertical borehole penetrating two dipping layers in the Ω domain.  45
3.5 AC forward problem. The computation of the sample logging curve consists of solving a sequence of multiple forward problems (3.48) over a domain composed of a borehole and four formation layers with assumed resistivities. This set of layers is utilized in the experimental Section 3.5.2.  49
3.6 AC inverse problem. Resistivities of two formation layers are sought based on a given reference logging curve (for details of this example refer to Section 3.5.2).  50
3.7 Steps of the SFIL process.  53
3.8 Exemplary shrinkage of the object measured after removing the template
3.9 (a) HGS tree with three levels. (b) Corresponding two-dimensional meshes in the binary version of HGS. (c) Corresponding three-dimensional genetic spaces in the real-number version of HGS.  57
3.10 Simulations of the DC inverse problem, metaepoch 1 in the pure genetic case. Left: total execution time for the entire population. Right: average execution time for a single individual from the particular population.  68
3.11 Simulations of the DC inverse problem, metaepoch 2 in the pure genetic case. Left: total execution time for the entire population. Right: average execution time for a single individual from the particular population.  69
3.12 Simulations of the DC inverse problem, pure genetic case. Scatter plots of ω0 vs. ω1 (top left), ω0 vs. ω2 (top right), and ω1 vs. ω2 (bottom left). Different colours correspond to different fitness values. Figure originally published in [58].  70
3.13 Simulations of the DC inverse problem, pure genetic case. Total execution time for the heaviest population in the particular epoch. Each metaepoch consists of three epochs. Figure originally published in [58].  71
3.14 Simulations of the DC inverse problem, pure genetic case. Execution time for the heaviest individual in the particular epoch. Each metaepoch consists of three epochs. Figure originally published in [58].  72
3.15 Simulations of the AC inverse problem, results of the global phase. Green points represent individuals from the first level and red points represent individuals from the second level of the HGS tree.  77
3.16 SFIL problem domain, two-parameter case. (a) Location of the slice with Young modulus values being the subject of inverse analysis. (b) Location of parameters within the slice.  80
3.17 SFIL problem domain, nine-parameter case. (a) Location of the slice with Young modulus values being the subject of inverse analysis. (b) Location of parameters within the slice.  81
3.18 Simulations of parameter identification in elasticity. Displacements obtained by the molecular statics model, used to compute the reference energy. Figure originally published in [13].  82
3.19 Simulations of parameter identification in elasticity. Computation results of stage 1. Green points represent solutions from level 1, red from level 2, and blue from level 3.  83
3.20 Simulations of parameter identification in elasticity. Computation results of stage 2. Red points represent solutions from level 2 and blue from level 3.  84
3.21 Simulations of parameter identification in elasticity. Computation results of stage 3. Points represent solutions from level 3.  86
3.22 Simulations of parameter identification in elasticity. Young modulus values for the best six points found in the global phase of hybrid computations.  89
3.23 Simulations of parameter identification in elasticity, expected solutions. Black squares represent the undamaged material, and white squares represent the damaged material.  92
3.24 Simulations of parameter identification in elasticity, the best solutions after the local phase. Dark colours represent high Young modulus values (undamaged material), and light colours represent low Young modulus values (damaged material).  92
4.1 MCES example objective function f2 (4.22).  107
4.2 MCES exemplary run: root individuals in the decision space.  108
4.3 MCES exemplary run: root individuals in the objective space.  108
4.4 MCES exemplary run: individuals created in leaves, decision space. Black dots represent centres of the recognized clusters.  109
4.5 Cluster centres in 10 runs of the MCES example. Different colours denote different runs.  110
4.6 MO-HGS benchmark simulations, the original problem. Rank calculated numerically for a mesh of sample points. Areas with rank < 0.1 are denoted with brown. The best 1% of sample points is shown as black dots.  111
4.7 MO-HGS benchmark simulations, the modified problem incorporating incidence between objectives. Modified ranks calculated numerically for a mesh of sample points. Areas with rank < 0.1 are denoted with brown. The best 1% of sample points is shown as black dots.  111
4.8 MO-HGS benchmark simulations. The best numerically calculated sample points for the original problem (black dots) and for the problem incorporating incidence between objectives (red dots).  112
4.9 MO-HGS results, no rank modification. Dark colour denotes regions with the highest density of leaf individuals.  114
4.10 MO-HGS results, with rank modification. Dark colour denotes regions with the highest density of leaf individuals.  114
4.11 Average numbers of branches and leaves for 100 runs of MO-HGS benchmark simulations


List of Tables

3.1 Simulations of the DC inverse problem, parameters of hp-HGS in the pure genetic case. Accuracy corresponds to the maximum relative error decrement in a single hp-FEM step applied to the solution of a forward problem (threshold value of errorFEM).  68
3.2 Simulations of the DC inverse problem, pure genetic case. Selected results of finding resistivities of three ground layers, ω0, ω1 and ω2.  69
3.3 Comparison of parallelization options for pure genetic DC computations.  71
3.4 Simulations of the DC inverse problem, hybrid case. Starting points found by the genetic algorithm.  72
3.5 Comparison of parallelization options for the global phase of hybrid DC computations.  73
3.6 Simulations of the AC inverse problem, parameters of hp-HGS. Accuracy corresponds to the maximum relative error decrement in a single hp-FEM step applied to the solution of a forward problem (threshold value of errorFEM).  76
3.7 Simulations of the AC inverse problem. Well-fitted individuals found by the genetic algorithm, used as starting points for the second-phase algorithm. All presented points were calculated with accuracy = 10.  76
3.8 Comparison of parallelization options of the global phase of the simulation for the AC inverse problem.  78
3.9 Simulations of the AC inverse problem, results of the local BFGS method for particular points. All presented points were calculated with accuracy = 1%.  79
3.10 Simulations of parameter identification in elasticity. Parameters of hp-HGS at stage 1
3.11 Simulations of parameter identification in elasticity. Comparison of parallelization options of the first stage of computations.  84
3.12 Simulations of parameter identification in elasticity. Parameters of hp-HGS at stage 2.  84
3.13 Simulations of parameter identification in elasticity. Comparison of parallelization options of the second stage of computations.  85
3.14 Simulations of parameter identification in elasticity. Comparison of parallelization options of the third stage of computations.  86
3.15 Simulations of parameter identification in elasticity. Comparison of the experiments for three stages of computations.  86
3.16 Simulations of parameter identification in elasticity, parameters of hp-HGS in the hybrid case.  88
3.17 Simulations of parameter identification in elasticity, summary of the results of the first and second phase of hybrid computations. The first column presents fitness values of the best six points found by the genetic algorithm, the second one the number of solver calls during the convergence in the local phase of computations, and the third shows the final fitness values after the local phase.  89
3.18 Simulations of parameter identification in elasticity. Comparison of parallelization options of the global phase of the simulation.  90
3.19 Simulations of parameter identification in elasticity, final points resulting from the local BFGS search.  90
4.1 Parameters of MO-HGS benchmark simulations.  115
4.2 Average numbers of branches and leaves for 100 runs of MO-HGS benchmark simulations.  116
4.3 Average distance to the Pareto set DP for 100 runs of MO-HGS without and with rank modification. Pareto sets are calculated numerically for the original problem (see Fig. 4.6) and the modified problem incorporating incidence between objective functions (see Fig. 4.7).  117


Glossary

D        Admissible set of solutions of a global optimization problem
V        Solution space for a global optimization problem
Ω        Binary genetic universum in Sections 2.2 and 3.4; continuous domain of PDEs in Sections 2.1, 3.1, 3.2, and 3.3
f        Fitness function
U        Genetic universum (set of genotypes)
Dr ⊂ D   Grid of encoded points (set of phenotypes)
r = #Dr  Cardinality of the set of phenotypes
code     Bijective encoding function, code : U → Dr
dcode    Partial function of inverse encoding (partial decoding function), dcode : D ↦ U
l        Binary code length
P        Population
η        Occurrence function, η : U → Z+ ∪ {0}
µ        Population cardinality (number of individuals)
λ        Offspring population cardinality
x∗       Local minimizer of a global optimization problem
Bx∗      Basin of attraction of a local minimizer x∗
Λr       Vose's simplex in R^r
M        Probabilistic measure
Θ̂        Mapping Θ̂ : Λr → M(Dr)
H        Heuristic operator
F        Selection operator in Chapters 2 and 4
M        Mixing function
Q        Probability transition matrix
K        Set of fixed points of H
ρω       Density measure over D
errorFEM Error of the forward hp-FEM method
P        Pareto set

AC Alternate current

BFGS Broyden-Fletcher-Goldfarb-Shanno algorithm

BV Boundary Value problem

DC Direct current

FEM Finite Element Method

GA Genetic Algorithm

GOP Global optimization problem

HGS Hierarchic Genetic Strategy

MCES Multiobjective Clustered Evolutionary Strategy

MO-HGS Hierarchic Genetic Strategy for multiobjective optimization

MOEA Multiobjective Optimization Evolutionary Algorithm

MOGA Fonseca and Fleming Multiobjective Genetic Algorithm

PDE Partial differential equation

RM Rank modification

SEA Simple Evolutionary Algorithm

SFIL Step-and-Flash Imprint Lithography


Chapter 1

Introduction

This chapter begins with the motivation behind the research presented in the dissertation. The motivation is followed by the objective and research tasks. Afterwards, an outline of the individual contribution of the author is given. The chapter ends with an overview of the text organization.

1.1 Motivation

Solving inverse problems formulated as global optimization problems is often a difficult and costly task. At the same time, these types of problems are very important, because they appear in key areas of technology, business and medicine (e.g. design of chemical compounds, optimum design, flaw detection, recognition of cancer tissues, and search for and exploitation of natural resources).

Global optimization tasks generated by the inverse analysis may exhibit multiple difficulties, mainly caused by their bad conditioning. These obstacles manifest themselves in the multimodality of the objective functional and the low sensitivity of functionals in the vicinity of solutions. In general, ill-posedness results mainly from model symmetries, inaccuracies of solving methods (measurement, calculation and arithmetic errors), insufficient knowledge about the problem, and errors in the data measurement programme.

Most of the known strategies and algorithms for solving the above problems have both advantages and drawbacks. On the one hand, many convex analysis methods are relatively fast, but do not offer a possibility of finding multiple extrema, whereas global stochastic search methods which possess an asymptotic guarantee of success usually have a low accuracy and very high computational cost.

The proposed algorithms, computer system architectures and their implementations, as well as the mathematical theories, aim to improve the accuracy, conditioning, and complexity of solving global optimization problems by applying:

• coupled accuracy adaptation in solving forward and inverse problems to obtain both low cost and low error,

• multi-population stochastic strategy for parallel search of multiple extrema,

• multiobjective approach based on multiple physics to improve problem conditioning, and

• results of partial theoretical analysis of the proposed algorithms to verify assumed goals.

1.2 Objectives

The main objective is to verify the following thesis:

It is possible to solve difficult inverse problems formulated as single- and multiobjective global optimization problems by using adaptive genetic strategies performing evolution of the hierarchy of populations.

To prove the above claim, the following research tasks were performed:

• Literature analysis of the state-of-the-art methods for single- and multiobjective global optimization applicable to inverse analysis.

• Development of algorithms and strategies capable of solving multimodal poorly-conditioned tasks.

• Showing the algorithms' properties by using the concept of well-tuning and the construction of the heuristic.

• Implementation and verification of the proposed methods by simulations performed for benchmark and real-world engineering problems.


1.3 Contribution

The main contributions of the dissertation are as follows:

• Development of algorithmics for the real-number encoding hp-HGS and its application to inverse analysis (Sections 3.4.2.1–3.4.2.3 without error estimations for elasticity; Section 3.5.2.1, and the global phase in 3.5.3.2).

• Error estimation (obtained collectively with Robert Schaefer and Maciej Smołka, Section 3.1.3) and simulation analysis for inversion of borehole resistivity logging measurements for the direct current case. Pure genetic case simulations, global phase in hybrid simulations, and time evaluation for parallelization options (Section 3.5.1 without local phase simulations).

• Simulation analysis for inversion of borehole resistivity logging measurements for the alternate current case. Global phase in hybrid simulations and time evaluation for parallelization options (Section 3.5.2 without 3.5.2.2).

• Simulation analysis for parameter identification in elasticity (results not published yet). Pure genetic case simulations, global phase in hybrid simulations, and time evaluation for parallelization options (Section 3.5.3 without local phase simulations).

• Development of MO-HGS with rank selection (Section 4.3). Creation of a rank modification strategy to solve multi-physics problems formulated as multiobjective problems (Section 4.4). Implementation and benchmark studies of the strategy (Section 4.6.2). (Results not published yet.)

• Simulation analysis of the multiobjective hierarchic genetic strategy incorporating clustering to recognize connected parts of Pareto-optimal sets (Sections 4.3.1 and 4.6.1).

• Theoretical results for a particular class of MOEA, including a construction of selection and heuristic operators that leads to asymptotic results (Section 4.5.1). Well-tuning definition (Section 4.5.2, results not published yet) and well-filtering proposition (Section 4.5.3, results not published yet).


Some ideas, figures and portions of text presented in this dissertation have appeared previously in the following peer-reviewed publications:

1. Gajda E., Schaefer R., Smołka M., Evolutionary multiobjective optimization algorithm as a Markov system, Lecture Notes in Computer Science 6238:617–626, 2010.

2. Gajda-Zagórska E., Recognizing sets in evolutionary multiobjective optimization, Journal of Telecommunications and Information Technology 1:74–82, 2012.

3. Gajda-Zagórska E., Paszyński M., Schaefer R., Pardo D., Calo V., hp-HGS strategy for inverse 3D DC resistivity logging measurement simulations, Procedia Computer Science 9:927–936, 2012.

4. Gajda-Zagórska E., Multiobjective evolutionary strategy for finding neighbourhoods of Pareto-optimal solutions, Lecture Notes in Computer Science 7835:112–121, 2013.

5. Gajda-Zagórska E., Paszyński M., Schaefer R., Pardo D., hp-HGS strategy for inverse AC/DC resistivity logging measurement simulations, Computer Science 14(4):629–644, 2014.

6. Barabasz B., Gajda-Zagórska E., Migórski S., Paszyński M., Schaefer R., Smołka M., Inverse problems in elasticity by hierarchic genetic search, International Journal of Applied Mathematics and Computer Science, 24(4), 2014.

7. Gajda-Zagórska E., Schaefer R., Smołka M., Paszyński M., Pardo D., A hybrid method for inversion of 3D DC resistivity logging measurements, Natural Computing, 2014, DOI 10.1007/s11047-014-9440-y.

8. Smołka M., Gajda-Zagórska E., Schaefer R., Paszyński M., Pardo D., A hybrid method for inversion of 3D AC resistivity logging measurements, submitted to Applied Soft Computing.


The following conference publications are also related to the dissertation:

1. Barabasz B., Gajda E., Migórski S., Paszyński M., Schaefer R., Studying inverse problems in elasticity by hierarchic genetic search, In: ECCOMAS thematic conference on inverse problems in mechanics of structures and materials, pp. 9–10, 2011.

2. Barabasz B., Gajda E., Pardo D., Paszyński M., Schaefer R., Szeliga D., hp-HGS twin adaptive strategy for inverse resistivity logging measurements, In: 19th international conference on Computer Methods in Mechanics, pp. 121–122, 2011.

3. Paszyński M., Gajda E., Schaefer R., Pardo D., hp-HGS twin adaptive strategy for inverse DC/AC resistivity logging measurement simulations, In: 10th World congress on Computational mechanics, pp. 15–16, 2012.

4. Gajda-Zagórska E., Schaefer R., Smołka M., Paszyński M., Pardo D., Inversion of Resistivity Logging Measurements Using a Hierarchic Genetic Strategy, In: ECCOMAS international conference on inverse problems in mechanics of structures and materials, pp. 19–20, 2013.

5. Gajda-Zagórska E., Schaefer R., Multiobjective hierarchic strategy for solving inverse problems, In: ECCOMAS international conference on inverse problems in mechanics of structures and materials, pp. 17–18, 2013.

6. Paszyński M., Gajda-Zagórska E., Smołka M., Schaefer R., Pardo D., Hybrid algorithm for inverse DC/AC resistivity logging measurement simulations, In: APCOM & ISCM, pp. 1–8, 2013.

7. Schaefer R., Smołka M., Gajda-Zagórska E., Paszyński M., Pardo D., Solving Inverse Problems Using Computing Agents: An Attempt to a Dedicated Hierarchic Memetic Strategy, In: ECCOMAS international conference on inverse problems in mechanics of structures and materials, pp. 53–54, 2013.

8. Schaefer R., Smołka M., Paszyński M., Gajda-Zagórska E., Faliszewski P., Essential features of inverse solvers inspired by nature, In: ECCOMAS international conference on inverse problems in mechanics of structures and materials, pp. 55–56, 2013.


1.4 Dissertation outline

Chapter 2 presents background related to the content of the dissertation. It starts with a general definition of inverse parametric problems formulated as global optimization problems and a brief characterization of their difficulties. Next, state-of-the-art algorithms suited to solve the above problems are presented. Particular attention is given to stochastic algorithms solving single- and multiobjective global optimization problems. Chapter 3 is devoted to single-objective inverse analysis. Firstly, Sections 3.1, 3.2 and 3.3 introduce the three main inverse problems studied in the dissertation: inversion of resistivity logging measurements for the direct and alternate current cases, and restoration of mechanical parameters of an elastic body. Secondly, Section 3.4 shows algorithms and strategies for solving difficult inverse problems. Finally, Section 3.5 describes the simulation plan and results.

In Chapter 4, the inverse analysis of multiobjective global optimization problems is considered. After the formal definition and motivation for multiobjective inverse analysis (see Sections 4.2 and 4.1), the Hierarchic Genetic Strategy for multiobjective rank-based optimization as well as the incidence-based rank modification are introduced (see Sections 4.3 and 4.4). The main theoretical results concerning Markov modelling of Evolutionary Algorithms for Multiobjective Optimization, their heuristic operators, and asymptotic behaviour are gathered in Section 4.5.1. Chapter 4 is closed by the presentation of benchmark results partially verifying and illustrating the proposed multiobjective approach (see Section 4.6).

Finally, Chapter 5 concludes the dissertation and outlines the directions of future research.


Chapter 2

State-of-the-art

This chapter presents background related to the content of the dissertation. It begins with a definition of inverse parametric problems formulated as global optimization problems. Next, state-of-the-art algorithms suited to solve the above problems are presented, with particular attention given to stochastic algorithms solving single- and multiobjective global optimization problems. The chapter is concluded with a discussion of existing methods.

2.1 Forward and inverse problems

2.1.1 Forward problem definition

We will consider Boundary Value problems (BVs) for partial differential equations (PDEs) defined in a continuous domain Ω ⊂ R^n being a C^1 manifold with Lipschitz boundary [154]. The unknown function u : Ω → R^k usually belongs to a proper Sobolev space V(Ω) in the case of weak formulations [46].

Problems under consideration can be written in residual form:

⟨A(u), v⟩ = 0,  ∀v ∈ V(Ω),   (2.1)

where A : V(Ω) → V′(Ω) and ⟨·, ·⟩ : V(Ω) × V′(Ω) → R denotes the duality pairing between V(Ω) and its conjugate space V′(Ω). The residual problem can also be denoted for convenience as a(u, v) = f(v), or as the equation in V′(Ω):

A(u) = 0.   (2.2)


The operator A depends on the parameter defined as a function ρ ∈ U (Ω → D), where U is the admissible set of functions, and D ⊂ R^p is a parameter domain. One can also consider parameters being vector or tensor fields. The function ρ includes coefficients of the operator A, such as material constants or Neumann boundary conditions (details will be given later on).

Now we define a forward problem as the following: given ρ ∈ U find u ∈ V (Ω) such that:

A(ρ; u) = 0. (2.3)

Generally, the properties (solvability and regularity) and features of the solutions of (2.3) result from the properties of the operator A (e.g. continuity, Lipschitz continuity, strong convexity, coercivity, strong coercivity with a constant) and from the features of the domain Ω. Mathematical results of the forward variational problem analysis can be found in [46, 112, 133]. Properties of A may affect the construction and properties of the solving algorithms. More details concerning solvability and regularity of the particular forward problems analysed later will be discussed in the following paragraphs (see Sections 3.1.1, 3.2.1, and 3.3.2.1).
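As an illustration of the abstract setting above, consider the following hypothetical instance (not one of the problems studied later in the dissertation): a scalar diffusion problem in which the parameter ρ plays the role of a material coefficient and g is a given source. Its residual (weak) form, written in LaTeX notation, reads

\langle A(\rho; u), v \rangle
  = \int_\Omega \rho\, \nabla u \cdot \nabla v \, dx
  - \int_\Omega g\, v \, dx = 0,
  \qquad \forall v \in V(\Omega) = H^1_0(\Omega),

so that the forward problem (2.3) amounts to finding the field u produced by a given coefficient ρ and source g.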

2.1.2 Forward numerical solvers

Only a few BVs may be solved analytically, delivering a formula for u. Usually, it is necessary to apply other methods (two examples are given below). One group of methods for solving BVs is projection, in which the space of solutions is a separable space and the solutions are expanded with respect to a dense kernel in V (see e.g. [94, 150]). In some cases, the parameters of the expansion can be found by Fourier or Laplace methods, but generally one has to seek an approximation. The efficiency of these methods depends on the construction of the dense kernel of V; traditional kernels are often not numerically effective enough.

The alternative group of methods incorporating projection are Galerkin methods, where the main idea is to create a sequence of finite-dimensional subspaces {Vn}, ⋃ Vn ⊂ V, such that the closure of ⋃ Vn in the topology of V equals V. For each of these spaces the solution of the approximated Galerkin problem can be found (see


In the dissertation we will focus on iterative methods with feedback, where the consecutive subspaces are updated based on the previous step of the method. In particular, we refer to hp-mesh refinement methods (the finite element method, FEM, by Demkowicz and Oden [42, 43, 113, 132]).

There exists a vast literature concerning convergence and complexity analysis for different types of equations (see e.g. [31, 128, 151]). These results are given with regard to Galerkin equations and for specific definitions of approximations (e.g. FEM). Properties of the particular instance of hp-FEM will be discussed later in the dissertation (see Section 3.1.1.6).

2.1.3 Inverse problem definition

Inverse problems related to the forward problems (2.3) are defined as follows:

given u ∈ V(Ω), find ρ̂ ∈ U (Ω → D) such that (2.3) is fulfilled.   (2.4)

Tasks of this type are difficult to solve; one of the methods is to invert the operator A with respect to ρ, which is often analytically impossible. Another way is to formulate a global optimization problem (GOP) whose solution is the solution of the original problem (2.4). The problem is formulated as: given u ∈ V(Ω), find ρ̂ ∈ U (Ω → D) such that

ρ̂ = arg min_{ρ∈U} M(ρ; u),   (2.5)

where ρ, u fulfil (2.3) and M is a misfit functional. The definition of the misfit functional depends strongly on the inverse problem to be solved and on the available information about u (details will be given in Sections 3.1.2, 3.2.2, and 3.3.2.2).

It is important to notice that inverse problems formulated as above are often ill-posed, which means, among other things, that their solution is unstable under data perturbations.
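A commonly used generic form of the misfit, shown here only for illustration (the misfit functionals actually employed are defined in Sections 3.1.2, 3.2.2, and 3.3.2.2), is the least-squares distance between the measured data d_obs and an observation operator S applied to the forward solution:

M(\rho; u(\rho)) = \| S\, u(\rho) - d_{\mathrm{obs}} \|^2,
\qquad \text{where } u(\rho) \text{ solves } A(\rho; u) = 0.

Minimizing such a functional over the admissible set U is an instance of the GOP (2.5); its multimodality and flatness near the minimizers are the sources of difficulty discussed in Chapter 1.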

2.1.4 Inverse numerical solvers

There are several methods of solving inverse problems formulated as GOPs. We will distinguish three groups of them: convex analysis methods, methods with regularization, and stochastic methods.

Convex analysis methods can be successfully used to find a minimum of a convex function over a convex set (see e.g. [71, 135]). In the case of multimodal problems, these methods are capable of finding one minimum, which is not necessarily one of the global minimizers. However, these methods can be very useful when applied within the basin of attraction of a particular minimizer, as well as when started multiple times to find multiple (local) extrema [159].

One group of simple convex optimization methods are gradient methods [116], in which each step is proportional to the negative of the gradient of the function at the current point. The function has to be differentiable at that point, therefore the method can be used when the derivatives exist and can be calculated or approximated at reasonable cost. The steepest descent method is a modification of gradient descent in which, in each step, after a direction is set, the lowest value of the function along that direction is found. This algorithm is not the best choice for solving inverse problems, because it combines some of the disadvantages of both gradient and stochastic methods (e.g. it is more expensive than a simple gradient method and can get stuck in a local minimum), and its convergence rate can be very slow for ill-conditioned systems. Another iterative technique, the conjugate gradient (CG) method, with an acceleration scheme called preconditioning, can solve symmetric positive-definite linear systems efficiently, even if they are large and ill-conditioned [167]. The CG method can also be generalized to solve non-symmetric optimization problems [52, 164].
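The following minimal Python sketch illustrates the steepest descent scheme described above; the quadratic objective, its gradient, and the crude discrete line search are illustrative assumptions only, not part of the solvers used later in the dissertation.

import numpy as np

def steepest_descent(f, grad, x0, n_iter=100,
                     step_candidates=np.geomspace(1e-4, 1.0, 20)):
    # At each iteration the direction is set to -grad(x); then the step giving
    # the lowest objective value along that direction is chosen (line search).
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        d = -grad(x)
        if np.linalg.norm(d) < 1e-12:   # (nearly) stationary point reached
            break
        values = [(f(x + a * d), a) for a in step_candidates]
        _, best_step = min(values)
        x = x + best_step * d
    return x

# usage on a simple convex quadratic (illustration only)
f = lambda x: (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2
grad = lambda x: np.array([2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)])
print(steepest_descent(f, grad, [0.0, 0.0]))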

Due to their low computational cost, gradient methods are widely used to solve well-conditioned inverse problems (with a globally convex misfit functional) [11, 69, 70, 75, 83, 167].

Regularization is a technique that introduces some additional knowledge to prevent overfitting or, most importantly, to solve ill-posed problems (for regularization in the context of inverse problems refer to [51]). One of the popular regularization methods is Tikhonov regularization [161]. Here, a term depending on the step of the iteration is added to the misfit in order to balance fitting the data versus reducing a norm of the solution. The penalty usually decreases with the consecutive iterations.
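In its simplest generic form (an illustration of the idea rather than a specific scheme used later in the dissertation), the regularized functional replaces the misfit M by

M_{\alpha_k}(\rho; u) = M(\rho; u) + \alpha_k\, \| \rho - \rho_0 \|^2,
\qquad \alpha_k \ge 0,

where ρ0 is a reference (prior) parameter and the weight α_k, which may decrease with the iteration number k, balances fitting the data against reducing the norm of the solution.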

Regularization is useful for irregular problems that have one global minimum (with possible disturbances). Used with a convex method, it slightly increases the computational cost and is capable of eliminating some artefact solutions. Nevertheless, it may also cause other artefacts to be more likely to be found. One can find many examples of Tikhonov regularization applied in practice [51, 63, 72, 80, 166].


To sum up this section, the presented techniques are suitable for solving inverse problems formulated as single-objective GOPs, where the objective is regular and where there is one global minimizer. The research described in the dissertation is focused on finding all of the solutions in the case of irregular multimodal objectives, therefore different methods must be used. In the next section, we will focus on stochastic algorithms, as some of them are capable of meeting this demand.

2.2 Stochastic algorithms for global optimization

In this section we discuss selected stochastic algorithms for global optimization which are related to the research presented in the dissertation. We begin with simple stochastic sampling methods with little or no adaptation. Secondly, we focus on genetic methods with adaptation. We also discuss some more complex genetic strategies, including algorithms incorporating multiple populations and multiple strategies in one method. Next, we introduce methods for detecting basins of attraction. Finally, we give a formal description of several concepts from the theory of genetic algorithms, which will be used later in the dissertation.

We consider global optimization problems with an objective function

f : D → [0, M ] ⊂ R, M < +∞, (2.6)

defined over a finite decision space D ⊂ R^n.

2.2.1 Stochastic sampling with simple adaptation

Monte Carlo methods are a broad class of computational algorithms which perform repeated random sampling to obtain numerical results. Most often, they are used in physical and mathematical problems when it is impossible to obtain a formula or to apply a deterministic algorithm. A basic scheme of the Monte Carlo method incorporates a definition of the domain of possible inputs, random generation of inputs from a probability distribution over this domain, computation on the inputs, and aggregation of the results, although particular methods may vary [66, 134].

Nonlinear inverse problems with no analytical expression for the forward relation between data and model parameters are often approached by Monte Carlo methods (see


data may be calculated for any given model based on the formulation of the forward relation. Monte Carlo methods can be divided into two groups: the sampling methods and the optimization methods. Monte Carlo sampling can be used to explore the space of feasible solutions and to measure the resolution and uncertainty of solutions. The most widely used Monte Carlo samplers in this category are the Metropolis algorithm [95] and the Gibbs sampler (see e.g. [27]), and modifications of these methods (e.g. the neighbourhood algorithm, used in particular in geophysical inversion [139, 140]). The simplest case, Pure Random Search, produces a uniform sequential sample drawn from the space of feasible solutions and returns the best found function value as an estimate of the global optimum. Monte Carlo optimization methods are powerful tools when global optimal solutions are sought amongst numerous local optima.

The simulated annealing algorithm [82] is an adaptation of the Metropolis algorithm. It is based on annealing in metallurgy, where metals are slowly cooled to reach a state of low energy. This idea of slow cooling is applied in simulated annealing as a slow decrease in the probability of accepting worse solutions during search space exploration. In other words, the exploration starts with high randomness and ends as a pure greedy descent leading to a local minimum. Simulated annealing is discussed in the context of inverse problems in [157].
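A minimal Python sketch of the acceptance rule behind simulated annealing is given below; the neighbourhood move, the geometric cooling schedule, and the one-dimensional test objective are illustrative assumptions, not choices made in the dissertation.

import math
import random

def simulated_annealing(f, x0, neighbour, t0=1.0, cooling=0.999, n_iter=10000):
    # Metropolis-style search: a worse solution is accepted with probability
    # exp(-delta / T); as the temperature T slowly decreases, the run turns
    # into a nearly greedy descent towards a (local) minimum.
    x, fx = x0, f(x0)
    best, fbest = x, fx
    t = t0
    for _ in range(n_iter):
        y = neighbour(x)
        fy = f(y)
        delta = fy - fx
        if delta <= 0 or random.random() < math.exp(-delta / t):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
        t *= cooling    # slow "cooling" of the temperature
    return best, fbest

# usage on a one-dimensional multimodal function (illustration only)
f = lambda x: math.sin(5.0 * x) + 0.1 * x * x
neighbour = lambda x: x + random.gauss(0.0, 0.5)
print(simulated_annealing(f, 0.0, neighbour))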

2.2.2 Single-objective genetic algorithms

Genetic algorithms (GAs) are search heuristics inspired by natural selection mechanisms. One of their characteristic features is that they operate on individuals, which are representatives of a random sample. This representation is obtained by encoding points from the admissible domain D, which makes it possible to select a subset of admissible points from D that can be checked during the optimization process and to perform genetic operations (see e.g. [146]).

Let us introduce a set of points Dr ⊆ D which can be effectively explored. Elements of Dr will be called phenotypes. The cardinality of the set of phenotypes will be denoted by r, and by U we will denote a genetic universum. Elements of U will be called genotypes. It is important that for both finite and infinite sets, their cardinalities fulfil


2.2.2.1 Encoding and decoding operations

Definition 1. (see Def. 3.1 in [146]) Encoding is a bijection

code : U → Dr.   (2.7)

Definition 2. (see Def. 3.2 in [146]) Decoding associated with the encoding function (2.7) is a partial mapping

dcode : D ↦ U   (2.8)

that satisfies the following conditions:

1. dcode(code(x)) = x, ∀x ∈ U,

2. ∀x ∈ U, ∀y ∈ D \ Dr, ∀z ∈ U \ {x}: d(y, code(x)) < d(y, code(z)) ⇒ dcode(y) = x,

where d is a metric in the space of solutions.

Among multiple encoding techniques, one can name e.g. binary affine encoding, introduced by Holland in [73] (see also [146]), Gray encoding, and phenotypic encoding, used in many evolutionary algorithms and in the case of an infinite and uncountable genetic universum [149]. In binary encoding, the genetic universum is usually denoted by Ω and is composed of binary strings of length l ∈ N.
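A minimal Python sketch of binary affine encoding and the associated partial decoding from Definitions 1 and 2 is shown below; the one-dimensional box [a, b] and the code length are illustrative assumptions.

def code(genotype, a=0.0, b=1.0):
    # Affine binary encoding: map a binary string of length l onto a node of
    # the uniform grid Dr = {a + k (b - a) / (2^l - 1)} inside [a, b].
    l = len(genotype)
    k = int(genotype, 2)
    return a + k * (b - a) / (2 ** l - 1)

def dcode(x, l=8, a=0.0, b=1.0):
    # Partial decoding: snap a point of D to the nearest grid node and return
    # the corresponding genotype, so that dcode(code(x)) = x for every x in U.
    k = round((x - a) / (b - a) * (2 ** l - 1))
    k = min(max(k, 0), 2 ** l - 1)
    return format(k, "0{}b".format(l))

assert dcode(code("10110101")) == "10110101"   # round trip over the grid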

In the dissertation we will focus on genetic algorithms that operate on populations, which are multisets P = (U, η) of individuals. The occurrence function η : U → Z+ ∪ {0} returns η(i), the number of individuals with the genotype i ∈ U, and µ = Σ_{i∈U} η(i) < +∞ stands for the population cardinality. GAs produce a sequence of populations {Pt} in the consecutive genetic epochs t = 1, 2, . . ., starting from the population P0 uniformly sampled from U. We will consider the (µ, λ) succession scheme, where µ = λ, and λ is the offspring population cardinality (for a detailed explanation of Schwefel's notation and different succession schemes, refer e.g. to [7, 146]).

2.2.2.2 Selection

There are several operations that are used in order to transform population Pt into Pt+1. The first of them, selection, is applied to choose individuals that will mate in the following phases of the algorithm. Usually, selection is a random operation that may be modelled as multiple sampling from Pt, although other implementations are also possible. What is more, one can apply elitist selection, in which only the best individuals are selected.


Examples of selection mechanisms (for a comparison and complexity analysis, refer to [62]):

• Fitness proportional selection (roulette-wheel selection). In this type of selection we perform multiple sampling where the probability of sampling an individual x is proportional to its fitness multiplied by the number of individuals with genotype x (for the exact formula refer to [146]; a minimal code sketch is given after this list). In the standard implementation, the fitness values are calculated for all individuals and normalized. Next, individuals are sorted according to descending fitness values and accumulated normalized fitness values are computed. After that, we repeat choosing a number between 0 and 1 and selecting the first individual whose accumulated value is greater than this number, until we get a sufficient number of individuals.

• Stochastic universal sampling. This method differs from the previous one by using one random value to sample all solutions at evenly spaced intervals instead of repeated random sampling. The main advantage of stochastic universal sampling is that it protects the set of chosen individuals from being dominated by several best-fitted individuals [10].

• Tournament selection. A single sampling in tournament selection involves selecting a tournament mate, a sub-multiset (see e.g. Def. 2.10 in [146]) of cardinality k (typically k = 2) from the population Pt, and selecting the best individual from the tournament mate. To obtain a tournament mate, one can apply e.g. k-time sampling without replacement according to the uniform probability distribution on Pt. Tournament selection was studied, among others, in [19], [115], and [97].

• Elitist selection [160], [12]. One of the main features of elitist selection is that it ensures that the best individuals from the current population will proceed to the next phase of GA. The best individuals from a current population are selected in a deterministic way, whereas other individuals are chosen according to other (probabilistic) selection methods.

• Rank selection. The main idea of rank selection (introduced for GAs in [9]) is that it defines ranks that designate sampling probabilities for individuals in Pt. For example, the population is sorted from the best to the worst individuals, each individual is given a number according to a non-increasing assignment function, and a proportional selection is performed according to these assignments (for more details refer to [62]).
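A minimal Python sketch of fitness proportional (roulette-wheel) selection, referred to in the first item of the list above, is given below; the toy population and fitness function are illustrative assumptions.

import random

def roulette_wheel(population, fitness, k):
    # Sample k individuals with replacement, each with probability
    # proportional to its (positive) fitness value.
    fits = [fitness(ind) for ind in population]
    total = sum(fits)
    cumulative, acc = [], 0.0
    for f in fits:                        # accumulated normalized fitness
        acc += f / total
        cumulative.append(acc)
    selected = []
    for _ in range(k):
        u = random.random()
        # the first individual whose accumulated value exceeds u is selected
        selected.append(next(ind for ind, c in zip(population, cumulative) if c >= u))
    return selected

# usage on a toy population of binary genotypes (illustration only)
pop = ["0001", "0110", "1011", "1111"]
print(roulette_wheel(pop, fitness=lambda g: int(g, 2) + 1, k=4))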

2.2.2.3 Binary genetic operations

In order to produce new individuals based on the individuals from the current population, genetic operations are used. In particular, binary genetic operations are applied to operate on binary genotypes from Ω:

• Binary multipoint mutation. In the binary multipoint mutation, for each parental code, a binary mutation mask of length l is sampled and added to that code. For details, refer e.g. to [146].

• Binary multipoint crossover [40]. Similarly to binary mutation, the binary multipoint crossover operation consists of sampling a binary crossover mask of length l. The mask determines whether a particular bit of the offspring should be inherited from the first or the second parent. One of the most widely used examples of binary crossover techniques is single-point crossover (introduced in [73]), where one crossover point is selected: bits from the beginning of the code to the crossover point are copied from one parent, while the rest is copied from the second parent.

If mutation is independent (i.e. the mutation mask is sampled according to the Bernoulli model of independent trials, see [168], Theorem 4.1), then a mixing operation can be defined as a composition of binary mutation and crossover. The order of the component operations does not affect the probabilistic effect of mixing.
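A minimal Python sketch of the binary operations described above, composed into a mixing operation, is given below; the mutation rate and the example codes are illustrative assumptions.

import random

def mutate(code, rate=0.01):
    # Multipoint binary mutation: sample a Bernoulli mutation mask of length l
    # and add it (bitwise XOR) to the parental code.
    mask = [1 if random.random() < rate else 0 for _ in code]
    return "".join(str(int(bit) ^ m) for bit, m in zip(code, mask))

def single_point_crossover(parent1, parent2):
    # Bits up to the crossover point are copied from the first parent,
    # the remaining bits from the second parent.
    point = random.randrange(1, len(parent1))
    return parent1[:point] + parent2[point:]

def mixing(parent1, parent2, rate=0.01):
    # Mixing = mutation of both parents followed by crossover; with independent
    # mutation the order of the component operations does not matter.
    return single_point_crossover(mutate(parent1, rate), mutate(parent2, rate))

print(mixing("10101010", "01010101"))   # one offspring code (illustration only)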

2.2.2.4 Simple Genetic Algorithm

Next, we will introduce the Simple Genetic Algorithm (SGA) by Vose [168], which is one of the few examples of genetic computations for which the probability distribution for sampling to the next epoch can be delivered explicitly. SGA transforms Pt into Pt+1, where both populations are multisets of binary genotypes from Ω. The parameters of SGA are: the population size µ ∈ N, the fitness function f, the mutation and crossover rates, and the crossover type.

We present a pseudo-code for SGA (Algorithm 1) with a succession scheme (a scheme for obtaining the next population from the current one) introduced by Vose. In this succession scheme, the next generation is constructed one-by-one by choosing two parents from the previous generation, mutating them and adding an offspring created by crossover to the next generation. This succession scheme differs from the standard one, in which an intermediate population of size µ is created and the next-epoch population is obtained by performing genetic operations on the intermediate population (refer to [144]).

Algorithm 1 Pseudo-code of SGA.

1: t ← 0;

2: create an initial population P0;

3: evaluate P0;

4: while stopping condition not satisfied do

5: create an empty population Pt+1;

6: while Pt+1 contains less than µ individuals do

7: select two individuals x, y from Pt;

8: produce binary code z from x and y by performing mixing;
9: add z to population Pt+1;
10: end while
11: evaluate Pt+1;
12: t ← t + 1;
13: end while

2.2.2.5 Evolutionary algorithms

Evolutionary algorithms (including the Simple Evolutionary Algorithm, SEA [168]) are genetic algorithms with phenotypic encoding, where the genetic universum is a subset of R^N. In that case, the following genetic operations are used:

• Phenotypic mutation. One of the most popular types of phenotypic mutation is the normal phenotypic mutation in which the offspring is created by adding a perturbation sampled according to the normal probability distribution to the parent (for details refer e.g. to [6,17]). For improved perturbation distributions, refer to [108,109,110,111].

• Phenotypic crossover. A simple crossover operation, the arithmetic crossover described e.g. in [178], produces an offspring in the N-dimensional interval between two parents with uniform distribution (see the sketch after this list). For other phenotypic crossover operations, as well as for the discussion of phenotypic operations in constrained domains, refer to [146].
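A minimal Python sketch of the two phenotypic operations listed above is given below; the mutation standard deviation and the example parents are illustrative assumptions.

import random

def normal_mutation(parent, sigma=0.1):
    # Normal phenotypic mutation: add a Gaussian perturbation to each coordinate.
    return [x + random.gauss(0.0, sigma) for x in parent]

def arithmetic_crossover(parent1, parent2):
    # Arithmetic crossover: sample the offspring with uniform distribution in the
    # N-dimensional interval (box) spanned by the two parents.
    return [x + random.random() * (y - x) for x, y in zip(parent1, parent2)]

offspring = arithmetic_crossover(normal_mutation([0.0, 1.0]),
                                 normal_mutation([1.0, 0.0]))
print(offspring)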

2.2.2.6 A brief survey of complex genetic strategies

In this section we will briefly discuss selected strategies that aim to improve efficiency of the genetic algorithms, e.g. by modifying genetic mechanisms during computations and by applying multiple populations for concurrent search in the whole admissible domain.

Delta Coding Delta coding [173] is a strategy related to the affine binary encoding in which the search accuracy is consecutively increased during evolution. It consists of two phases – conventional genetic operations that are repeated until the evolution process stagnates (the population is sufficiently concentrated) and the delta phase. In the delta phase a more accurate search is performed in the neighbourhood of the best-fitted individual, which is updated after each iteration of the phase.

Multi-deme strategies Genetic strategies consisting of multiple populations (called demes) are notably competitive with single-population strategies when applied to solve multimodal GOPs. They exhibit lower computational cost (they need fewer evaluations of the fitness function), despite their higher formal complexity. The difference is especially significant for difficult multimodal problems in which the basins of attraction of local extrema are separated by large areas in which the fitness function does not change much [146].

An example of a multi-deme strategy is the island genetic strategy, where several populations (called islands) solve the same GOP. The genetic universum, encoding and fitness are the same for all islands whereas the starting states and genetic algorithms applied for particular islands may differ. An important mechanism utilized in island models is migration which allows for exchange of the genetic material between the islands. For representatives of island genetic strategies, refer e.g. to [64,144,174].

Another example of a multi-deme algorithm is the Hierarchic Genetic Strategy (HGS) introduced in [145]. HGS will be discussed in detail in Section 3.4.1.


Two-phase strategies In two-phase strategies for global optimization (see e.g. [117]), the first, global phase aims at thorough exploration of the search space, and the second, local phase aims at locally improving the approximation to a local minimum. These two phases can be separated or incorporated in one algorithm that switches between exploration and refinement automatically. Some of the previously mentioned methods may be classified as two-phase strategies if they incorporate some kind of a local improvement, or they can be extended to such by adding a second phase (e.g. the clustering discussed in the following section).

One of the simplest two-phase strategies is the Single Start method, where Pure Random search is followed by local improvement of the best found solution. Its popular extension is the Multistart method (see e.g. [117]) repeating steps of the Single Start method until a stopping condition is fulfilled.

Depending on the definition, memetic algorithms, introduced in [100], can be considered as an extension of two-phase strategies. They incorporate concepts from different metaheuristics, combining population-based search with local improvement or separate individual learning. Memetic algorithms are discussed e.g. in [101] and [87]. A spatial genetic algorithm combining genetic computations with a cellular automata approach is presented in [23].

For more details and examples of two- and more-phase strategies, refer to [117] and [146].

2.2.3 Detecting basins of attraction

In this section we will consider basins of attraction of local minimizers in GOPs and methods designed to detect central parts of these sets. These techniques are useful when some information about the number of global or local optima is sought, preferably at low cost. What is more, separating particular minimizers in multimodal problems can be utilized to determine starting points of local methods in two-phase strategies.

Before presenting the definition of basins of attraction, let us introduce several concepts. Let L(y) = {x ∈ D : f(x) ≤ y} and L̂(y) = {x ∈ D : f(x) < y} stand for two types of level sets of the function f. The connected parts of L(y) and L̂(y) that contain x will be denoted by Lx(y) and L̂x(y), respectively. For a local minimizer x∗, let the cutting level y(x∗) ∈ R (written shortly as y) be defined as follows:

y = inf { y : ∃ z∗ a local minimizer of f, z∗ ≠ x∗, z∗ ∈ Lx∗(y) }   if such z∗ exists,
y = max_{x∈D} f(x)   otherwise.   (2.9)

Definition 3. The basin of attraction Bx∗ of a local minimizer x∗ is the connected part of L̂x∗(y) that contains x∗.

The definition of basins of attraction was introduced in [47]. For minimization and maximization problems, compare with [146].

One of the most common techniques for separating basins of attraction in population-based stochastic search is sample clustering. It is an unsupervised technique of recognizing patterns and dividing data sets into groups (see e.g. [78]). Formally, during clustering, a discrete data set X = {x1, . . . , xm} is partitioned into non-empty, exclusive subsets X1, . . . , Xk, k ≤ m, called clusters. For clustering in continuous global optimization, refer to [117].

In global optimization, clusters can be interpreted as sets of points belonging to approximated basins of attraction of global minimizers and, importantly, clustering methods can be used to approximate these basins. Often, the approximation is performed to reduce the number of starting points of the local method to one in each basin of attraction (see e.g. [81]). The difference in computing time may be especially significant with objectives that are expensive to evaluate.

A common schema for clustering methods in GOPs that appears in the literature (see e.g. [146]) is the following. Firstly, generate m random points from the admissible set D according to the uniform distribution. Secondly, transform the set of random points to facilitate cluster recognition. Finally, recognize clusters (clustering). These steps can be repeated until a global stopping condition is met. A local method can be started in each of the recognized clusters.
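A minimal Python sketch of the sample-reduce-cluster schema described above is given below; uniform sampling, a quantile-based reduction threshold, scikit-learn's DBSCAN as the clustering routine, and the two-well test objective are illustrative assumptions, not the methods analysed later in the dissertation.

import numpy as np
from sklearn.cluster import DBSCAN

def approximate_basins(f, bounds, m=2000, keep=0.1, eps=0.05):
    # 1) generate m random points uniformly in the box D,
    # 2) reduction: keep the fraction `keep` of points with the lowest objective
    #    values (an approximation of a level set of f),
    # 3) cluster the remaining points; each cluster approximates the central
    #    part of a basin of attraction and yields one local-method start point.
    lo = np.array([b[0] for b in bounds], dtype=float)
    hi = np.array([b[1] for b in bounds], dtype=float)
    points = lo + np.random.rand(m, len(bounds)) * (hi - lo)
    values = np.apply_along_axis(f, 1, points)
    kept = points[values <= np.quantile(values, keep)]
    labels = DBSCAN(eps=eps, min_samples=5).fit_predict(kept)
    return [kept[labels == c].mean(axis=0) for c in set(labels) if c != -1]

# usage on a two-well objective over the unit square (illustration only)
f = lambda x: min(np.sum((x - 0.25) ** 2), np.sum((x - 0.75) ** 2))
print(approximate_basins(f, [(0.0, 1.0), (0.0, 1.0)]))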

According to [117], in global optimization clustering methods, random points can be transformed in two ways: by reduction and by concentration. In the former, generated random points that have a fitness value above a certain threshold are reduced. The set of remaining points (often non-connected) approximates the level set of the objective function. This approach was utilized by Becker and Lago [15] in the first (to the author's knowledge) clustering technique used to aid global optimization. Another early approach using a clustering method to aid global optimization is due to Törn [162]. Here, after a few steps of a simple local method (e.g. the gradient method), the results are clustered and only a reduced number of processes in each cluster is resumed.

In GAs, the first two phases (sampling and transformation) are performed by the genetic algorithm. When it is well-tuned (Def. 1), genetic samples in succeeding epochs concentrate in the central parts of basins of attraction, and therefore no further reduction is needed and the samples can be clustered immediately. The concept of well-tuning was presented in [145]. One of the algorithms possessing this property is HGS, which was used in the context of clustering e.g. by Adamska in [3]. HGS will be discussed later in the dissertation (see Section 3.4.1).

Clustering in the context of inverse problems was studied e.g. by Telega in [158]. He introduced Genetic Clustering (later called Clustered Genetic Search, CGS), which is a parallel global optimization algorithm designed to solve parametric inverse problems. In particular, the strategy allows one to detect central parts of basins of attraction (approximations of some level sets). For a comparison of two versions of CGS, genetic-based and density-based, refer to [141]. What is more, in [146] one can find a definition of cluster extensions that are sought in CGS. Briefly, a cluster extension for an isolated local minimizer x∗ is a closed positive-measured set located in the central part of Bx∗ that contains all cluster points.

An example of clustering in continuous global optimization is described in [3]. In this method, called FMM, clusters correspond to level sets of the estimated measure density. Their representation as ellipsoids makes it possible to approximate central parts of the basins of attraction of local extrema. A hybrid approach for supporting niching strategies, CSFD, was introduced in [177].

Apart from clustering, we would like to mention a different method for finding basins of attraction, the Topological Multimodality Estimator (TME), introduced in [156]. Here, the authors present an EA-based strategy which aims at estimating the number of optima at low cost. This goal is reached by using a variable-size population, where the number of individuals is steered by the number of found attraction basins. To determine whether two individuals belong to the same basin of attraction (which is particularly difficult for basins with complicated shapes), the hill-valley method (see e.g. [21]) is used. The authors also consider a hybridization of TME with one of the clustering methods.

In the dissertation we will focus on clustering, because in well-tuned SGA-utilizing strategies genetic samples concentrate in the central parts of basins of attraction and clustering can be directly performed on these samples.

2.2.4 Theoretical results

In this section we will present selected theoretical results that are used to develop genetic strategies presented in the dissertation.

Let us consider a genetic algorithm that operates on a population represented as the multiset P = (U, η), which may be identified with its frequency vector x = {(1/µ) η(p)}, p ∈ U. All frequency vectors belong to the finite subset Xµ of the Vose simplex

Λr = { x = {xp} : 0 ≤ xp ≤ 1, p ∈ U, Σ_{p∈U} xp = 1 }.   (2.10)

Although only finite populations (µ < +∞) can be represented unambiguously by frequency vectors, it is also possible to represent populations of infinite size. In such a case, a modified occurrence function is defined based on the multiset representation (for details, refer to [146]). We will identify the population with its frequency vector if it does not lead to ambiguity.

Notice that each population frequency vector x ∈ Λ^r belongs to M(U), the set of probabilistic measures on U. Therefore, we define Θ : Λ^r → M(U) and extend it to the mapping Θ̂ : Λ^r → M(D_r) by using the assumed one-to-one coding between the genetic space and the set of phenotypes.
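As an illustration of the frequency-vector representation, the following minimal sketch maps a finite population of binary genotypes onto a vector in the Vose simplex; it assumes binary codes of length l (so r = 2^l), and all names are illustrative.

```python
import numpy as np
from collections import Counter

def frequency_vector(population, genome_length):
    """Map a finite multiset of binary genotypes onto its frequency vector x."""
    r = 2 ** genome_length                 # cardinality of the genetic universum U
    mu = len(population)                   # population cardinality
    counts = Counter(int("".join(map(str, p)), 2) for p in population)
    x = np.zeros(r)
    for code, eta in counts.items():       # eta(p): number of copies of genotype p
        x[code] = eta / mu
    return x

population = [(0, 0), (0, 1), (0, 1), (1, 1)]       # mu = 4 individuals, l = 2
x = frequency_vector(population, genome_length=2)
print(x)                                            # [0.25 0.5  0.   0.25]
assert np.isclose(x.sum(), 1.0) and np.all(x >= 0)  # x lies in the Vose simplex
```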

Next, let us define a heuristic operator (introduced by Vose, see e.g. [168]).

Definition 4. The mapping H ∈ C(Λ^r → Λ^r) will be called the heuristic of the particular class of genetic algorithms if:

1. Each coordinate (H(x))_p is equal to the sampling probability of the individual with the genotype p ∈ U in the epoch that immediately follows the epoch in which the population x ∈ Λ^r appears.

2. The value H(x) is the expected population in the epoch that immediately follows the epoch in which the population x ∈ Λ^r appeared, for all algorithms from the considered class.


3. It stands for the law of evolution of the abstract, deterministic, infinite-population algorithm (we assume that it exists in the considered class). In other words, the infinite-population algorithm is the dynamic system that starts from a particular initial population x^0 ∈ Λ^r and then passes consecutively through H(x^0), H^2(x^0), H^3(x^0), . . . .

A class of genetic algorithms that admit a heuristic operator will now be called genetic algorithms with heuristic.

We consider evolutionary algorithms in which the next-epoch population x^{t+1} ∈ Λ^r is obtained by µ-fold sampling with return according to the generalized Bernoulli scheme, see e.g. [18]. In the case of a GA with heuristic, the probability distribution assumed on the set of genotypes is H(x^t).
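This sampling scheme can be sketched as follows; the heuristic H used here is an illustrative stand-in (proportional selection for a toy fitness vector), and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def next_population(x, H, mu):
    """Draw the next-epoch frequency vector by mu-fold sampling with return
    from the distribution H(x) over the r genotypes (generalized Bernoulli scheme)."""
    counts = rng.multinomial(mu, H(x))   # one multinomial trial of size mu
    return counts / mu                   # back to a frequency vector from X_mu

# Illustrative heuristic: proportional selection only, for a toy fitness vector.
f = np.array([1.0, 4.0, 2.0, 3.0])
H = lambda x: (f * x) / np.dot(f, x)

x = np.full(4, 0.25)                     # initial population frequency vector
for _ in range(3):
    x = next_population(x, H, mu=100)
print(x)                                 # samples concentrate on well-fitted genotypes
```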

2.2.4.1 SGA heuristic

Let us consider SGA as defined in Section 2.2.2.4. The proportional selection operator utilized in SGA is a mapping F : Λ^r → Λ^r such that:

F(x) = diag(f) x / (f, x),   (2.11)

where diag(f) denotes the r × r diagonal matrix with the diagonal f, and the scalar product (f, x) is the mean fitness of the population represented by x ∈ Λ^r.
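A direct transcription of formula (2.11) might look as follows; this is a minimal sketch with illustrative fitness values.

```python
import numpy as np

def proportional_selection(x, f):
    """F(x) = diag(f) x / (f, x): the expected effect of proportional selection."""
    mean_fitness = np.dot(f, x)          # the scalar product (f, x)
    return (f * x) / mean_fitness        # diag(f) x divided by the mean fitness

x = np.array([0.5, 0.25, 0.125, 0.125])  # population frequency vector
f = np.array([1.0, 2.0, 4.0, 8.0])       # fitness values of the r = 4 genotypes
print(proportional_selection(x, f))      # [0.2 0.2 0.2 0.4]
```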

In each epoch of SGA, selection is followed by genetic operations (e.g. mutation, crossover). They can be represented by the mixing operator M ∈ C^1(Λ^r → Λ^r). The mixing operator with binary mutation and positional crossover utilized in SGA was introduced in [168]:

(M(x))_i = (σ_i x)^T M (σ_i x),   ∀ x ∈ Λ^r, i ∈ Ω,   (2.12)

where σ_i stands for the r × r permutation matrix with entries (σ_i)_{jk} = [j ⊕ k = i], i, j, k ∈ Ω. The entries M_{ij} of the symmetric r × r matrix M express the probability of obtaining the genotype 0 ∈ Ω (the string of zeros) from parents i, j ∈ Ω by crossover and mutation.
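The action of the mixing operator (2.12) can be sketched as below; the mixing matrix M used in the example corresponds to bitwise mutation only (no crossover) and is constructed here merely for illustration.

```python
import numpy as np

def mixing(x, M):
    """(M(x))_i = (sigma_i x)^T M (sigma_i x), where (sigma_i x)_j = x_{j XOR i}."""
    r = len(x)
    out = np.empty(r)
    for i in range(r):
        sigma_i_x = np.array([x[j ^ i] for j in range(r)])  # permuted frequency vector
        out[i] = sigma_i_x @ M @ sigma_i_x
    return out

# Toy mixing matrix for r = 4 (2-bit genotypes) and bitwise mutation only, rate pm:
# M[j, k] is the probability that parents j, k produce genotype 0 (one parent is
# copied with probability 1/2 and every one of its set bits must mutate to 0).
pm, l = 0.1, 2
ones = lambda a: bin(a).count("1")
p0 = lambda a: pm ** ones(a) * (1 - pm) ** (l - ones(a))
M = np.array([[0.5 * (p0(j) + p0(k)) for k in range(4)] for j in range(4)])

x = np.array([0.4, 0.3, 0.2, 0.1])
mx = mixing(x, M)
print(mx, mx.sum())          # a probability distribution over the 4 genotypes
```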

The genetic operator (heuristic) for SGA is a mapping H : Λ^r → Λ^r composed of the proportional selection and mixing operators:

H(x) = M(F(x)),   ∀ x ∈ Λ^r.   (2.13)


For a heuristic operator for MOEA, refer to Section 4.5.1.

The relation between the heuristic operator and the Markov transition function (represented by the transition probability matrix) modelling SGA was explicitly given by Nix and Vose:

Theorem 1. (Theorem 1 in [105]) If mutation and crossover parameters are constant, SGA can be modelled as the stationary Markov chain with the finite space of states X_µ and with the transition probability matrix Q,

Q = {Q_{xy}}_{x,y ∈ X_µ},   Q_{xy} = µ! ∏_{i=1}^{r} ((H(x))_i)^{µ y_i} / (µ y_i)! .   (2.14)

2.2.4.2 Asymptotic features

The probability distribution of the population P_t associated with a frequency vector x^t ∈ X_µ in epoch t will be denoted by π^t_µ ∈ M(X_µ). Moreover, π^{t+1}_µ = Q π^t_µ, t = 0, 1, 2, . . . .
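The following sketch enumerates the finite state space X_µ for a toy problem, builds the transition matrix Q according to formula (2.14), and iterates the distribution update. The heuristic is a simplified stand-in (selection only, no mutation), so the resulting chain is not ergodic, but the mechanics of (2.14) remain the same; all names are illustrative.

```python
import numpy as np
from itertools import product
from math import factorial

mu, r = 3, 2                                   # toy sizes: 3 individuals, 2 genotypes
f = np.array([1.0, 3.0])
H = lambda x: (f * x) / np.dot(f, x)           # simplified heuristic: selection only

# The finite state space X_mu: frequency vectors with denominator mu.
states = [np.array(c) / mu for c in product(range(mu + 1), repeat=r) if sum(c) == mu]

def transition_prob(x, y):
    """Q_{xy} = mu! * prod_i (H(x)_i)^(mu y_i) / (mu y_i)!  (formula (2.14))."""
    h, p = H(x), float(factorial(mu))
    for i in range(r):
        k = int(round(mu * y[i]))
        p *= h[i] ** k / factorial(k)
    return p

Q = np.array([[transition_prob(x, y) for y in states] for x in states])
assert np.allclose(Q.sum(axis=1), 1.0)         # each row of Q is a distribution

pi = np.full(len(states), 1.0 / len(states))   # initial distribution pi^0 over X_mu
for _ in range(50):
    pi = Q.T @ pi                              # pi^{t+1} = Q pi^t (Q stored row-wise)
print({tuple(s): round(p, 3) for s, p in zip(states, pi)})
```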

Theorem 2. (Theorem 4.31 in [146]) If the mutation rate is strictly positive, then the Markov chain describing the finite population SGA is ergodic and there exists a weak limit π_µ ∈ M(X_µ):

lim_{t→+∞} π^t_µ = lim_{t→+∞} Q^t π^0_µ = π_µ   (2.15)

for an arbitrary initial measure π^0_µ ∈ M(X_µ).

One of the important implications of the Markov chain ergodicity is that SGA will visit all states x ∈ X_µ; therefore, it will search the whole set of genotypes regardless of the initial population P_0.

Theorem 3. (see [105, 168]) If the population cardinality grows to infinity, the sequence of limit measures {π_µ} ⊂ M(X_µ) (defined in Theorem 2) contains a subsequence {π_{µ_k}} ⊂ {π_µ} that weakly converges to some measure π* ∈ M(Λ^r).

The proof of this feature is based on the Prokhorov theorem (see Theorem 29.3 in [18]).

Definition 5. (see [146]) We say that the heuristic H is focusing if there exists a nonempty set K ⊂ Λ^r of fixed points of H such that for all x ∈ Λ^r the sequence {H^t(x)} converges to a point in K.

Next, we present a theorem determining the value of the limit measure π* on the set of fixed points K.

Theorem 4. (Theorem 3 in [105]) If the genetic operator H : Λ^r → Λ^r associated with the class of Simple Genetic Algorithms is focusing and K ⊂ Λ^r is the set of its fixed points, then π*(K) = 1.

In other words, if H is focusing, then a sufficiently large SGA population can be concentrated arbitrarily close to the set of fixed points K with a fixed probability after a number of evolutionary epochs.

2.2.4.3 Well-tuning

Let us assume that the fixed points of H represent populations containing the maximum information about the global optimization problem to be solved that can be gathered by the SGA spanned by H. Let us denote by W the finite set of local minimizers of the objective function f, and by {B_{x*}}, x* ∈ W, the family of their basins of attraction.

To construct the measure density ρ_ω over D corresponding to a population ω ∈ Λ^r, we partition D into r polyhedra ϑ_i, i ∈ Ω, similarly to a Voronoi partitioning (for details, refer to [146]):

ρ_ω(x) = (Θ̂(ω))_i / meas(ϑ_i),   where i ∈ Ω is such that x ∈ ϑ_i.   (2.16)

The above density ρ_ω is an L^p(D) mapping defined almost everywhere on D; in particular, it is not defined on the intersections ϑ_i ∩ ϑ_j, i ≠ j, which have zero measure.

Condition 1. (Def. 4.63 in [146]) The class of SGA spanned by the genetic operator H is well-tuned to the set of local minimizers W if:

1. H is focusing and the set of its fixed points K is finite,

2. for every x* ∈ W there exists C(x*), a closed set in D, such that x* ∈ C(x*) ⊂ B_{x*}, meas(C(x*)) > 0 and

ρ_ω ≥ threshold, almost everywhere on C(x*),   (2.17)

ρ_ω < threshold, almost everywhere on D \ ∪_{x*∈W} C(x*),   (2.18)

where ω ∈ K is a fixed point of H, and the positive constant threshold stands for the definition parameter.
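The practical meaning of conditions (2.17)-(2.18) can be illustrated by thresholding the density ρ_ω computed cell-wise for a fixed-point population. The minimal sketch below assumes the cell measures meas(ϑ_i) are known; all names are illustrative.

```python
import numpy as np

def density(omega_hat, cell_measures):
    """rho_omega evaluated cell-wise: phenotype measure of a cell over its Lebesgue measure."""
    return np.asarray(omega_hat) / np.asarray(cell_measures)

def central_parts(omega_hat, cell_measures, threshold):
    """Indices of cells on which rho_omega >= threshold; their union plays the role
    of the sets C(x*) covering the central parts of the basins of attraction."""
    return np.where(density(omega_hat, cell_measures) >= threshold)[0]

# Toy 1-D example: five cells of equal measure, a fixed point concentrated over cells 1 and 3.
omega_hat = [0.05, 0.40, 0.05, 0.45, 0.05]   # (Theta_hat(omega))_i, sums to 1
cell_measures = [0.2] * 5                    # meas(vartheta_i)
print(central_parts(omega_hat, cell_measures, threshold=1.0))   # [1 3]
```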


2.3 Multiobjective optimization techniques

There exists a plethora of algorithms for solving multiobjective optimization problems and, as a natural consequence, there have been multiple attempts to classify these methods. In general, we can identify two distinct stages of solving a multiobjective problem: search (the optimization of the objective functions) and decision making, which is often connected with selecting compromise solutions (deciding what kind of trade-offs are appropriate from the decision maker's perspective).

One of the classifications, based on how the optimization and decision-making processes are combined, was proposed by Cohon and Marks in [36] and has appeared in the literature ever since (see e.g. [35, 182]). It divides multiobjective optimization methods into three groups, utilizing:

1. Decision making before search (so-called a priori methods).

2. Search before decision making (so-called a posteriori methods).

3. Integrated search and decision making (so-called interactive methods).

The first group includes approaches in which the decision maker can perform a pre-ordering of the objectives or aggregate them into a single objective. In that case, one can use classical single-objective techniques, but some initial knowledge about the domain is required.

In the second group, search before decision making, profound knowledge about the problem is not necessary and the decision maker does not introduce any prior preference information into the system. A drawback of this approach is the lack of preference information, which could significantly reduce the search space.

Techniques in the third group normally operate in three stages [36]: finding a nondominated solution; showing this solution to the decision maker and modifying the preferences of the objectives according to their decision; and repeating the two previous steps until the decision maker is satisfied or no further improvement is possible. The integration of search and decision making combines the advantages of the two other approaches, but it is not free of some of their difficulties, e.g. difficulties with the presentation of nondominated sets for higher-dimensional multiobjective problems, which appear in strategies from the second group.
