Heuristics in dynamic scheduling: A practical framework with a case study in elevator dispatching


Jeroen de Jong


Invitation

To attend the defense of my dissertation

Wednesday 28 November 2012 at 15:00 in the Aula of the Technische Universiteit Delft, Mekelweg 5, Delft

Prior to the defense, at 14:30, I will give a short explanation of my doctoral research

A reception will follow the defense

Jeroen de Jong

Paranymphs (paranimfen)

Ineke de Jong

ineke_de_jong@hetnet.nl 015-2137841

Bram Kranenburg

a.a.kranenburg@gmail.com 0492-552025


Dissertation

for the purpose of obtaining the degree of doctor at the Technische Universiteit Delft,

by the authority of the Rector Magnificus prof. ir. K.Ch.A.M. Luyben, chairman of the Board for Doctorates,

to be defended publicly on 28 November 2012 at 15:00

by Jeroen Leonard DE JONG, Master in Artificial Intelligence


Composition of the doctoral committee:

Rector Magnificus, chairman

Prof. dr. C. Witteveen Technische Universiteit Delft, promotor

Prof. dr. F.M.T. Brazier Technische Universiteit Delft

Prof. dr. C.M. Jonker Technische Universiteit Delft

Prof. dr. J. Köhler Lucerne University of Applied Sciences and Art

Prof. dr. A. Nowé Vrije Universiteit Brussel

Dr. H.P. Lopuhaä Technische Universiteit Delft

Dr. C.H.M. Nieuwenhuis Thales Research & Technology

Prof. dr. K.G. Langendoen Technische Universiteit Delft, reserve member

The research described in this dissertation was financially supported by Thales Nederland B.V. and by the Dutch Ministry of Economic Affairs as part of the BSIK-ICIS project, grant nr: BSIK03024.

Cover design by Hester van der Scheer

Preface

The book you hold in your hands could not have been created without the help of numerous people, in numerous ways, whom I would like to thank here.

Cees Witteveen, for accepting me as a PhD student although I was somewhat out of your normal scope of pupils and very application driven. You have greatly helped me shape this dissertation, and have spent countless hours reading and re-reading stuff in the iterative process that is finished now.

Kees Nieuwenhuis, for believing in me successfully working on a project of “hybrid metaheuristics”, and for maintaining trust in the following years, while the subject drifted off to heuristics in dynamic scheduling. Thanks to Jimmy Troost for making it possible to finish this study.

Rik Lopuhaä, for the hours you spent helping me to get the statistics right, and the numerous fun discussions that we had, trying to understand each other’s professional language.

The other members of the promotion committee, Jana Köhler, Catholijn Jonker, Frances Brazier, Ann Nowé, Koen Langendoen, for your thorough reading and valuable comments.

Lukas Finschi and Paul Friedli of Schindler elevators, for providing me with a realistic elevator case, and valuable comments.

Colleagues at D-CIS and TU Delft for discussions and chit chat, and, for a large part of you, for willingly or unwillingly bestowing your precious computers on me to run my experiments on, for weeks uninterrupted.

Hester, for your silent acceptance of me dragging away your husband as much as I can (mentally, if not physically), and for the many hours you spent in having me over yourself, creating the lovely cover that people will more strongly remember than its contents! Roelof, for the never-ending joy of the dragging-away part...

My friends and family, who have supported me in the previous years, never getting tired of having to ask after my progress.

My paranimfen. Bram, not only because I enjoy the long friendship that we have, but for the professional discussions as well. I still believe you would do a better job defending this book than I could. It was an honor serving you as paranimf, it is a complete pleasure having you as one.

Ineke. For your love, for knowing me like no one else, for sharing the parenthood of the most delightful people we will ever meet. Unstinting belief in your loved one is the one true door-knob test. You never fail it.

Abstract

This dissertation investigates and analyzes methods to improve the performance of schedulers in the context of (highly) dynamic scheduling, with a case study in elevator dispatching.

Chapter 1 introduces the problem of dynamic scheduling, and highlights the pitfalls in dealing with it. A number of generic research questions are identified, which are outlined in more detail in later chapters.

Chapter 2 outlines the scheduling problem. Scheduling consists of putting tasks on a timeline, subject to various constraints and optimization criteria. Scheduling problems can be complex, depending on the nature of the problem or the size of the problem instance. For a large number of problems, heuristic problem solvers are employed. These approaches aim for solutions that are as good as possible, without guarantees on optimality, or even on the distance to the optimal result. The chapter discusses heuristic approaches based on constructive algorithms and local search methods, as well as various extensions, such as metaheuristics and hybrids.

Chapter 3 introduces dynamicity: problems that change over time. The dynamicity is mostly found in the tasks, which become known or are retracted over time. The level of dynamicity is an important property. Static problems do not encounter change events within their scheduling horizon. Lowly dynamic problems have some such changes occurring, but the changes are regarded as disruptions, and the original solutions are only changed slightly. In (highly) dynamic problems, the focus of this dissertation, change is a given: multiple change events occur within the scheduling horizon, and the scheduler is driven by those changes. A specific feature of dynamic scheduling problems is that an optimal solution cannot be calculated while the system is running: since the future is unknown, current solutions to the problem become obsolete before they are finished executing, thereby rendering any notion of optimality void. The difficulty of solving such problems is further increased by decision timing constraints: solutions have to be calculated on short notice. These two aspects in particular call for heuristics when solving the problem.

Problems can be dynamic in two senses. Short-term dynamicity concerns the change events themselves. Second-order dynamicity means that the nature of the presented problem changes over time. For example, a junction with traffic lights may face a different kind of dynamic problem during the morning rush hour and the evening rush hour, since traffic streams and congestion risks may be different.

Dynamic scheduling problem solvers usually take the form of either ad-hoc algorithms or full reschedulers. Ad-hoc algorithms make decisions on what to do next, each time a production unit becomes available. These decisions are based on the current system state and available problem information. Full reschedulers produce complete schedules, adjusting or overhauling them when necessary. Full reschedulers are increasingly used to cope with short-term and second-order dynamicity, often applying heuristics and hybrid approaches. The application of heuristics and hybrids in dynamic scheduling is the focus of this dissertation.
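The difference between the two solver forms can be illustrated with a minimal sketch. It is not taken from the dissertation: the task encoding, the function names and the shortest-task-first rule are all assumptions chosen only to keep the example small.

```python
from typing import List, Tuple

Task = Tuple[str, int]  # (name, duration); a hypothetical task encoding


def ad_hoc_next(waiting: List[Task]) -> Task:
    """Ad-hoc rule: when a unit becomes free, decide only on the next task."""
    return min(waiting, key=lambda task: task[1])


def full_reschedule(known: List[Task]) -> List[Task]:
    """Full rescheduler: build a complete schedule for all known tasks."""
    return sorted(known, key=lambda task: task[1])


def on_change_event(schedule: List[Task], arrived: Task) -> List[Task]:
    """A new task arrives, so the complete schedule is rebuilt."""
    return full_reschedule(schedule + [arrived])
```

The ad-hoc rule never commits to more than one decision, whereas the rescheduler holds a complete plan that each change event may overturn.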

Chapter 4 focuses on a number of issues in dynamic scheduling. There are issues related to the problem solver, to the problem itself, and to experimenting in this domain.

Looking more closely at the problem solver, we see that dynamic scheduling applications that apply metaheuristics tend to focus most on the chosen algorithm. A metaheuristic problem solver, however, incorporates a triangle of algorithm (how to investigate the space of local alternative solutions?), neighborhood (how to modify a solution to create local alternatives?) and objective function (how to assess the quality of a candidate solution?). Focusing on the algorithm alone may result in underexposure of the neighborhood and the objective function. To compensate for that, complex metaheuristics such as evolutionary algorithms are often used. However, this comes at the price of a less transparent problem solver, and possibly suboptimality. The first Research Question (RQ) is:

RQ 1: Can improved neighborhoods and objective functions reduce the need for a complex metaheuristic while maintaining the same overall quality?
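The triangle of algorithm, neighborhood and objective function can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration and not the setup used in the dissertation: a schedule is encoded as a list of task durations on a single machine, the neighborhood swaps two positions, the objective is the total completion time, and the algorithm is plain hill climbing.

```python
from itertools import combinations
from typing import Callable, Iterable, List, Sequence

Schedule = List[int]  # task durations in execution order (hypothetical encoding)


def total_completion_time(schedule: Sequence[int]) -> int:
    """Objective function: sum of the completion times of all tasks."""
    elapsed = 0
    total = 0
    for duration in schedule:
        elapsed += duration
        total += elapsed
    return total


def swap_neighborhood(schedule: Sequence[int]) -> Iterable[Schedule]:
    """Neighborhood: every schedule reachable by swapping two positions."""
    for i, j in combinations(range(len(schedule)), 2):
        neighbor = list(schedule)
        neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
        yield neighbor


def hill_climb(
    start: Sequence[int],
    neighborhood: Callable[[Sequence[int]], Iterable[Schedule]],
    objective: Callable[[Sequence[int]], int],
) -> Schedule:
    """Algorithm: move to the best neighbor until no neighbor improves."""
    current = list(start)
    while True:
        best = min(neighborhood(current), key=objective, default=current)
        if objective(best) >= objective(current):
            return current
        current = best
```

Because the three corners are passed in separately, the neighborhood or the objective function can be replaced without touching the search algorithm, which is the kind of trade-off RQ 1 asks about.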

Since a large amount of research in this domain is devoted to evolutionary algorithms, a specific research question is:

RQ 2: What are the advantages and pitfalls when applying evolutionary algorithms to dynamic scheduling?

Different hybridizations are possible to enhance the performance of a single algorithm. However, this comes at the cost of increased computational complexity. Furthermore, when focusing on a single algorithm, with fixed parameter settings, a dynamic scheduler may remain static in itself, and does not adapt itself to cope with second-order dynamicity. By making the parameterization flexible, or by adaptively mixing problem solving approaches, schedulers that effectively deal with second-order dynamicity come within reach.

RQ 3: How could a simple hybrid approach increase the performance of a dynamic scheduler, both in the case of short-term and second-order dynamicity?

Looking more closely at the problem of dynamic scheduling itself, we observe a tendency in the literature to focus on the scheduler alone. This may result in underexposure of the environment surrounding the scheduler. The environment provides the information needed by the scheduler. Availability and timing of that information are important aspects in the schedulability of a dynamic problem. A system design that does not acknowledge this may lead to suboptimal performance, regardless of the scheduling technique applied. Since influencing the environment is often possible, it is worthwhile to investigate such opportunities.

RQ 4: To what extent could modifications in the environment increase the performance of dynamic schedulers?

Dynamic scheduling problems ask for schedules that are flexible or robust. Flexible schedules can easily accommodate future tasks. Robust schedules have little need for rescheduling of currently scheduled tasks when change events occur. Flexibility and robustness are two sides of the same coin. The scheduling approach used should reward schedules that are likely to exhibit flexible and/or robust qualities.

RQ 5: How could flexibility and robustness measures increase the performance of a dynamic scheduler?

Looking more broadly at studies conducted in dynamic scheduling - especially in elevator dispatching - the use of benchmarks or self-generated test data is often observed. The similarity of such data with real world data is questionable, and so is the applicability of research findings obtained from such studies.

RQ 6: What is the external validity of research findings obtained by benchmarks or self-generated test data?

Chapter 4 goes on to provide a set of building blocks that are aimed at answering these questions. The building blocks are also used in the remaining chapters, when they are applied to improve elevator dispatching.

Chapters 5-7 consist of an elevator dispatching case study. The elevator dispatching problem is first detailed in Chapter 5. It is a good example of a complex dynamic scheduling problem. The combinatorial complexity is high, and even when treated as a static problem it is proven to be NP-hard. It has a high short-term dynamicity, where schedules become outdated long before execution is finished. It also has a strong second-order dynamicity that is often described in the literature. For example, the traffic characteristics of morning up-peak, lunchtime inter-floor-peak, and afternoon down-peak differ greatly. The problem has a strong link to routing and other logistic problems. The chapter presents a number of Research Hypotheses that translate the generic questions asked in Chapter 4 to the elevator dispatching domain.

In Chapter 6, the simulation framework is described that was used to perform the elevator dispatching experiments, as well as the statistical methods that were used to assess the quality of various elevator dispatching approaches. The chapter also focuses on the question of self-generated test data (RQ 6). The simulated elevators are taken from an actual building, and the passenger data were actual traffic data from that same building. The chapter compares results obtained by using these realistic data with results from self-generated data such as found in various articles.

Chapter 7 discusses the experiments that have been run, in four sections. The first section investigates the influence of the environment of the scheduler (RQ 4). In the case of elevator dispatching, conventional control and destination control present two existing environments with different properties. Conventional control provides the scheduler with information both late and inaccurately, whereas destination control provides early and accurate information. Conversely, conventional control has relaxed decision demands, whereas in destination control, decisions have to be made on short notice. A novel mode that is not currently manufactured, ‘cheating destination control’, is investigated with expected ideal qualities: early and accurate information, and relaxed decision timing.

The second section in this chapter investigates the issue of complex metaheuristics, neighborhoods and objective functions. Since they are closely related, RQ 1 and 2 are treated in this section. Different neighborhood designs are investigated, and a wide range of objective function parameters is investigated that balances individual passenger needs with overall performance, reducing the need for a complex problem solver (RQ 1). Evolutionary algorithms are compared to more straightforward k-opt search (RQ 2).

The third section investigates the search space & objective function relation even further by exploring various methods to increase robustness and flexibility (RQ 5). For low traffic periods, parking is an obvious, and well-known, method to increase flexibility. For medium to high traffic, a number of methods to create flexible and robust schedules are investigated.

The fourth section investigates schedulers that adapt themselves to cope with second-order dynamicity, using hybrid heuristics (RQ 3).

Chapter 8 presents a discussion on the dissertation. RQ 1 could be positively answered, as abstract solution encodings and neighborhood optimization enabled the use of a straightforward algorithm, leading to a powerful combination. The answer to RQ 2 is ‘limited’, since evolutionary algorithms have profound problems that appear to be a disadvantage in the area of dynamic scheduling, and did indeed perform badly on the elevator case study. RQ 3 could be positively answered by the use of a building block on hybridization, which led to pragmatic hybridizations that outperformed all other approaches in this study. The use of hybridizations to cope with second-order dynamicity, however, appeared to be of limited use in the case study, but is still likely to be useful in other domains. Modifications in the scheduling environment (RQ 4) played a large role in scheduling performance, showing that obtaining as much information as possible, as early as possible, and delaying decisions as much as possible are important virtues in dynamic scheduling systems. RQ 5 could not be answered in general, since attempts at increasing robustness and flexibility are problem-dependent. However, calculating the risk of task-postponement and incorporating this risk into objective functions appeared to be a successful factor in elevator dispatching, and is likely usable in other domains as well. The use of self-generated test data (RQ 6) is limited, since it was shown to lead to different performance results than when using actual data. The use of benchmarks is shown to be limited as well for specific algorithmic setups, since even within the single case study, different results were found using the same algorithmic setup under a slightly modified scheduling regime.
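The idea of incorporating postponement risk into an objective function can be sketched as follows. This is only an illustration under stated assumptions: the additive form, the `risk_weight` parameter and the candidate tuples are hypothetical and do not reproduce the objective functions studied in the dissertation.

```python
from typing import List, Tuple

Candidate = Tuple[str, float, float]  # (name, waiting time, postponement risk)


def risk_adjusted_cost(waiting_time: float, postponement_risk: float,
                       risk_weight: float = 30.0) -> float:
    """Objective value: waiting time plus a weighted postponement risk.

    `postponement_risk` is a probability in [0, 1]; `risk_weight` is a
    hypothetical tuning parameter, here read as seconds of penalty.
    """
    if not 0.0 <= postponement_risk <= 1.0:
        raise ValueError("postponement_risk must lie in [0, 1]")
    return waiting_time + risk_weight * postponement_risk


def pick_assignment(candidates: List[Candidate]) -> str:
    """Return the name of the candidate with the lowest risk-adjusted cost."""
    return min(candidates, key=lambda c: risk_adjusted_cost(c[1], c[2]))[0]
```

With such an objective, a candidate that promises a slightly longer wait can still be preferred when it is much less likely to be postponed by later change events.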

Samenvatting (Summary)

This dissertation investigates and analyzes methods to improve the performance of schedulers for highly dynamic scheduling problems. It also includes a case study in elevator dispatching.

Chapter 1 is an introduction to dynamic scheduling and shows where the difficulties in dealing with it lie. A number of general research questions are posed, which are treated in detail in later chapters.

Chapter 2 describes the scheduling problem. In scheduling, tasks are placed on a timeline, subject to certain requirements, while optimality is pursued. These problems can be very complex, depending on the type of problem and the size of the specific problem instance. For many problems, heuristic problem solvers are used. Such approaches try to generate solutions that are as good as possible, but without any guarantee of optimality (or even of how close to optimality they will come). The chapter describes heuristic methods based on constructive algorithms and local search algorithms, as well as extensions on this theme, such as metaheuristics and hybrids.

Chapter 3 describes dynamic problems, which change over time. The dynamicity usually lies in the tasks, which are added or retracted during execution. The degree of dynamicity is an important aspect. Static problems are not affected by changes within their scheduling horizon. Lowly dynamic problems encounter such a change now and then, but these changes are seen as disruptions, and the solution devised earlier hardly needs to be adjusted. In highly dynamic problems (the subject of this dissertation), change is a given: many changes will occur within the scheduling horizon, and the search for solutions is driven by those changes. A particular aspect of dynamic problems is that, while the system is running, one can no longer speak of optimality: the future is unknown, so solutions devised now become outdated during execution. The difficulty of solving this kind of problem is further increased by the fact that decisions usually have to be made very quickly. These two aspects in particular are the reason that heuristics are used to solve dynamic problems.

Problems can be dynamic in two ways. Short-term dynamicity concerns the changes that occur, as just described. Second-order dynamicity means that the character of the problem changes over time. A junction with traffic lights, for example, faces a different kind of problem in the morning (during the morning rush hour) than in the evening, because the traffic flows and the risk of congestion differ.

Problem solvers for dynamic scheduling usually work with either ad-hoc algorithms or full reschedulers. Ad-hoc algorithms immediately decide what should happen next, each time a machine becomes available or a new task arrives. Full reschedulers produce complete schedules, adjusting or completely overhauling them when necessary. Such full reschedulers are increasingly used to solve problems with short-term and second-order dynamicity, and mostly make use of heuristics and hybrid methods, the main subject of this dissertation.

Chapter 4 zooms in on a number of issues within dynamic scheduling. These may relate to the problem solver, to the problem itself, and to experimenting within this problem domain.

Looking at the problem solver, we see that dynamic schedulers that use metaheuristics often focus on the chosen algorithm. A metaheuristic problem solver, however, is based on a triangle of the actual algorithm (how do we search the space of local solutions?), the search space (how do we modify a solution to generate local solutions?), and an objective function (how do we determine the quality of a candidate solution?). If one looks mainly at the algorithm, the other two aspects remain underexposed. To compensate, very complex metaheuristics, such as evolutionary algorithms, are sometimes resorted to. This comes at a price, however: a less transparent problem solver, and possibly also lower performance. The first general research question (Research Question, RQ) is therefore:

RQ 1: Can improved search spaces and objective functions remove the need for more complex metaheuristics, without sacrificing the quality of the solutions?

Since so much research within this domain focuses on evolutionary algorithms, the second research question is:

RQ 2: What are the advantages and disadvantages of applying evolutionary algorithms to dynamic scheduling problems?

All kinds of hybridizations are possible to improve the performance of a single algorithm. However, this leads to a more complicated problem solver. On the other hand, working with a single, fixed algorithm takes no account of changes in the character of the problem (second-order dynamicity). An algorithm whose parameters can be adjusted while running, or a variable mix of algorithms, may lead to a problem solver that provides an answer to second-order dynamicity.

RQ 3: How can a simple hybrid approach improve the performance of a dynamic problem solver, both in the short and in the long term?

Looking at the dynamic problem itself, we see that much of the literature focuses on the problem solver, and not on the environment in which the problem occurs. That problem environment supplies the information the scheduler needs. Its availability, and the timing with which that information becomes available, are important factors in the ease with which the problem can be solved. A system design that does not take this into account could lead to mediocre performance, however good the techniques used may be. Since it is often possible to influence the environment, it is worthwhile to explore those possibilities.

RQ 4: To what extent can changes to the environment improve the performance of dynamic schedulers?

Dynamic problems call for solutions that are flexible and robust. Flexible solutions easily accommodate future tasks that are as yet unknown. Robust solutions need little overhauling when something changes in the future. Flexibility and robustness are two sides of the same coin. A problem solver should therefore prefer solutions that are flexible and/or robust.

RQ 5: How can measures for flexibility and robustness improve the performance of a dynamic problem solver?

Looking more broadly at the literature (certainly in elevator dispatching), we see that experiments are often run on benchmark problems or self-generated test data. To what extent these correspond to situations in the real world is very much open to question, and so is whether the results of such research can also be used in practice.

RQ 6: What is the external validity of research results that are based on benchmarks and self-generated test data?

Chapter 4 goes on to describe a number of building blocks that can provide answers to these questions. These building blocks are also used in the following chapters, applied in particular to elevator dispatching.

Chapters 5-7 contain the case study on elevator dispatching. The problem itself is described in Chapter 5. Elevator dispatching is a good example of a complex, dynamic scheduling problem. Even when the problem is regarded as a static problem, it is proven to be NP-hard. It has a high short-term dynamicity, and also a strong second-order dynamicity that has often been described in the literature. For example, passenger traffic during the morning peak, lunch peak and afternoon peak differs strongly in character. The problem is also closely related to other routing and logistics problems. The chapter gives a number of research hypotheses (Research Hypotheses) that translate the general questions from the previous chapter to the elevator domain.

Chapter 6 describes the simulation environment used for the elevator dispatching experiments, as well as ways to measure the quality of a given dispatching method. This chapter also addresses the problem of self-generated test data (RQ 6). The passenger data used in this dissertation were generated on the basis of real building data supplied by an elevator manufacturer. The chapter compares results obtained with these data with results based on self-generated data, as is common in the literature.

Chapter 7 describes the experiments that were carried out, in four sections. The first section addresses the influence of the environment on the scheduler (RQ 4). In the case of elevator dispatching there are two existing environments: a conventional control system and a destination control system. These two systems have different properties. The conventional system provides the scheduler with late and incomplete information, whereas destination control supplies information early and in detail. On the other hand, a conventional system does not require fast decisions, whereas with destination control decisions must be made very quickly. A new environment that is not currently manufactured, ‘cheating destination control’, is also investigated, and is expected to have ideal qualities: early and detailed information, while decisions can be made at leisure.

The second section deals with the problem of complex metaheuristics, search spaces and objective functions. Because they are closely related, RQ 1 and 2 are treated in this section. Different search spaces are analyzed, and a broad range of objective function parameters is investigated, so that the need for a complex problem solver is reduced (RQ 1). Evolutionary algorithms are compared with the more straightforward k-opt algorithm (RQ 2).

The third section goes even deeper into the relation between search space and objective function, by investigating various methods to increase robustness and flexibility (RQ 5). In low passenger traffic, smart parking of elevators is an obvious (and well-known) method. For medium to heavy passenger traffic, a number of specific methods for increasing flexibility and robustness are investigated.

The fourth and final section addresses schedulers that adapt themselves to fit second-order dynamicity, using hybrid techniques (RQ 3).

Chapter 8 concludes the dissertation with a discussion. RQ 1 could be answered positively. Abstract solution encodings and optimization of the search space proved to be a powerful combination, which made it possible to use a simple algorithm. The answer to RQ 2 is ‘limited’, because evolutionary algorithms are by nature a poor fit for dynamic problems, and they did indeed perform very badly in elevator dispatching. RQ 3 was answered positively through the use of a building block on hybridization, which led to very pragmatic hybridizations that outperformed every other method used in this study. The use of hybridizations to deal with second-order dynamicity, however, did not lead to improvement (nor to deterioration), at least within this case study. This could be well explained from the specific elevator dispatching problem itself, and such hybridizations remain promising in other problem domains. Modifications to the problem environment (RQ 4) played an enormous role in the performance of the scheduler. Obtaining information that is as detailed as possible, as early as possible, while postponing decisions as long as possible, proves to be an important virtue in dynamic scheduling systems. RQ 5 could not be answered in general, since attempts to increase robustness and flexibility are highly problem-dependent. It did turn out that factoring in the risk that a task will be postponed in the future was an important success factor in elevator dispatching, and is probably usable elsewhere as well. The usefulness of self-generated test data (RQ 6) is limited, and it led to different results than when real building data were used. The usefulness of benchmarks outside their own context is also limited, and even within this single case study the same algorithm was found to produce different results when the scheduling environment was modified slightly.

Contents

Preface
Abstract
Samenvatting

1 Introduction
  1.1 Overview
  1.2 Pitfalls
  1.3 Research questions
  1.4 Dissertation outline

I General discourse on dynamic scheduling problems

2 Scheduling and heuristics
  2.1 Generic scheduling problem
    2.1.1 Terminology
    2.1.2 Constraints and optimality criteria
    2.1.3 Problem complexity
  2.2 Solution strategies
    2.2.1 Basics of solution searchers
      2.2.1.1 Building a solution from scratch: constructive approach
      2.2.1.2 Refining a solution: Local search
      2.2.2.1 Heuristic functions
      2.2.2.2 Heuristic algorithms and heuristic approaches
    2.2.3 Metaheuristics
  2.3 Hybrid heuristics
    2.3.1 Low level vs. High level integration
    2.3.2 Relay vs. Teamwork
    2.3.3 Homogeneous vs. Heterogeneous
    2.3.4 Global vs. Partial
    2.3.5 Specialist vs. General
    2.3.6 Static vs. Adaptive
  2.4 Discussion

3 Dynamic Scheduling
  3.1 Dynamic aspects
    3.1.1 Direct change events
    3.1.2 Second-order dynamicity
    3.1.3 Issues and challenges
  3.2 Solution approaches for dynamic scheduling
    3.2.1 Different problem solving modes
      3.2.1.1 Schedule repair
      3.2.1.2 Full rescheduling
      3.2.1.3 Ad hoc scheduling
      3.2.1.4 Mixed modes
    3.2.2 Anticipation
    3.2.3 Adaptivity
    3.2.4 Reducing nervousness
  3.3 Discussion

4 Designing heuristics for dynamic scheduling
  4.1 Research questions
    4.1.1 Issues related to the problem solver
      4.1.1.1 Solution over-complexity
      4.1.1.2 Hybrid possibilities in dynamic scheduling
    4.1.2 Issues related to the dynamic scheduling problem
      4.1.2.1 Informedness of the dynamic scheduler
      4.1.2.2 Flexibility and robustness
    4.1.3 Issues related to experimenting
      4.1.3.1 Experimental validity
    4.1.4 Conclusion
    4.2.1 BB 1: System design enhancements
    4.2.2 BB 2: Realism measures
    4.2.3 BB 3: Abstract solution encoding
    4.2.4 BB 4: Hybridization
    4.2.5 BB 5: Heuristic interaction
      4.2.5.1 Algorithm and neighborhood
      4.2.5.2 Objective function
    4.2.6 BB 6: Evolutionary algorithms
  4.3 Elevator dispatching case study

II Case study: Elevator dispatching

5 Introduction to elevator dispatching
  5.1 Background
    5.1.1 Elevator system setups
    5.1.2 Constraints
    5.1.3 Optimality criteria
    5.1.4 Classification within generic scheduling
    5.1.5 Complexity
    5.1.6 Existing studies
    5.1.7 Discussion
  5.2 Research hypotheses
    5.2.1 Scheduling environment
    5.2.2 Heuristic interaction
    5.2.3 Flexibility and robustness
    5.2.4 Hybridization

6 Experimental setup
  6.1 Elevator simulation setup
    6.1.1 Office building
    6.1.2 Elevator car
    6.1.3 Elevator doors
    6.1.4 Passenger
    6.1.5 Scheduler
  6.2 Data and analysis
    6.2.1 Input data
    6.2.2 Performance evaluation
      6.2.2.1 Pairwise comparison
    6.2.3 Statistical significance

7 Experiments
  7.1 Alternative control paradigms in elevator dispatching
    7.1.1 Operational hypotheses
      7.1.1.1 Destination control vs. conventional control
      7.1.1.2 Cheating destination control vs. standard destination control
    7.1.2 Experimental setup
      7.1.2.1 Independent variables
      7.1.2.2 Dependent variables
      7.1.2.3 Experimental design
      7.1.2.4 Testing
    7.1.3 Results
      7.1.3.1 Conventional control vs. destination control
      7.1.3.2 Destination control vs. cheating destination control
    7.1.4 Discussion
  7.2 Heuristic triangles in elevator dispatching
    7.2.1 Hypotheses
      7.2.1.1 On algorithms and search space structure
      7.2.1.2 On objective functions
    7.2.2 Experimental setup
      7.2.2.1 Independent variables
      7.2.2.2 Dependent variables
      7.2.2.3 Experimental design
      7.2.2.4 Testing
    7.2.3 Results
      7.2.3.1 Different neighborhoods
      7.2.3.2 Objective functions
    7.2.4 Discussion
      7.2.4.1 Different neighborhoods
      7.2.4.2 Objective functions
      7.2.4.3 Conclusion

7.3 Robustness and flexibility in elevator dispatching . . . 161

7.3.1 Hypotheses . . . 162

7.3.1.1 Low traffic: Parking (flexibility) . . . 162

7.3.1.2 Medium and high traffic: elevator spreading

(flexibil-ity) . . . 162

7.3.1.3 Medium and high traffic: Future risk estimation

(25)

7.3.2 Experimental setup . . . 163 7.3.2.1 Independent variables . . . 164 7.3.2.2 Dependent variables . . . 164 7.3.2.3 Experimental design . . . 164 7.3.2.4 Testing . . . 170 7.3.3 Results . . . 171

7.3.3.2 Medium and high traffic: Elevator spreading

(flexi-bility) . . . 173

(ro-bustness) . . . 173 7.3.4 Discussion . . . 175

7.3.4.2 Medium and high traffic: Elevator spreading

(flexi-bility) . . . 176

(ro-bustness) . . . 176

7.3.4.4 Conclusion . . . 177

7.4 Hybridization in elevator dispatching . . . 177

7.4.1 Hypothesis . . . 177 7.4.2 Experimental setup . . . 177 7.4.2.1 Independent variables . . . 178 7.4.2.2 Dependent variables . . . 178 7.4.2.3 Experimental design . . . 178 7.4.2.4 Testing . . . 179 7.4.3 Results . . . 179 7.4.4 Discussion . . . 179

III

Discussion

183

8 Discussion 185

8.1 Improving neighborhood and objective function . . . 186

8.1.1 BB 3: Abstract solution encoding . . . 186

8.1.2 BB 5: Heuristic interaction . . . 186 8.1.3 Lessons learned . . . 187 8.2 Evolutionary algorithms . . . 188 8.2.1 BB 6: evolutionary algorithms . . . 188 8.2.2 Lessons learned . . . 189 8.3 Hybridization . . . 190

(26)

8.3.1 BB 4: hybridization . . . 190

8.3.2 Lessons learned . . . 191

8.4 Scheduling environment . . . 192

8.4.1 BB 1: System design considerations . . . 192

8.5 Flexibility and robustness . . . 193

8.5.1 BB 5: Heuristic interaction . . . 194

8.6 Self-generated test data . . . 195

8.6.1 BB 2 Realism measures . . . 195 8.6.2 Lessons learned . . . 196 8.7 Conclusions . . . 196 8.7.1 Concluding statements . . . 197 8.7.2 Future work . . . 198 Bibliography 201 A Algorithms 211

A.1 Creating a new data set . . . 211 A.2 Conventional control (CC) . . . 213 A.3 Destination control (DC) . . . 214 A.4 Fine-grained neighborhoods . . . 215 A.5 Elevator parking and flexibility . . . 216

(27)

1 Introduction

1.1 Overview

Dynamic scheduling enjoys increasing research attention. In dynamic scheduling problems, changes occur while the schedule is carried out. Various approaches from different research fields exist to cope with dynamic scheduling problems. One branch of problem solvers is ad hoc scheduling. A simple example of this can be seen at the check-in desks of an airport. Every time a desk becomes available, the ‘next in line’ is called forward. This scheme is better known as ‘first in first out’ (FIFO). Other problem solvers, instead of taking ad hoc decisions, create complete schedules based on the current knowledge. Upon change events, this schedule will have to be revised. This is known as ‘full rescheduling’, the main focus of this dissertation.

Techniques to perform full rescheduling are rooted in generic search algorithms. For dynamic scheduling, no ‘exact solutions’ exist, since the usability of a proposed schedule depends on an unknown future. Therefore, these techniques always involve the use of heuristics. Heuristics can be described as rules of thumb: approaches that try to deliver the best possible solution in a problem environment in which no exact optimum can be calculated. In the field of heuristics, there is particular interest in so-called metaheuristics, of which evolutionary algorithms are a well-known example. Furthermore, there is a growing interest in the application of hybridizations: putting multiple algorithmic approaches together. The driving force behind hybridizations is the assumption that the whole is better than the sum of the parts. Metaheuristics and hybrids are also increasingly used for dynamic scheduling.

In the area of dynamic scheduling, hybrid solutions may yield a specific advantage. Many dynamic scheduling problems have both a short-term and a second-order dynamicity to them. Short-term dynamicity is described above: a problem changes before its solution is carried out completely. Second-order dynamicity, however, means that the nature of the problem can change. For example, consider a traffic junction with traffic lights. Incoming and leaving vehicles change the situation continuously, all day long. But the traffic characteristics change as well: there are morning and afternoon rush hours, the more relaxed periods in between, and the tranquility of the night. For each of these situations, a different algorithmic approach may perform best. Hybridization could be used to pick approaches on the fly: choose the best scheduling algorithm to cope with varying circumstances, while the process continues to run.

Applying heuristics and hybrids to dynamic scheduling may incur a number of pitfalls. This dissertation tries to identify those, offering approaches and directions to steer away from these pitfalls. The dynamic scheduling problem will be discussed from different angles, and the dissertation more closely investigates a particular instance of a dynamic scheduling problem: elevator dispatching. Elevator dispatching is a highly dynamic problem, having both short-term dynamicity (e.g., each time someone registers a hall call) and second-order dynamicity (e.g., the difference in traffic characteristics between the morning rush hour and the lunch peak). Solution approaches in elevator dispatching suffer from the same vulnerabilities that were identified for dynamic scheduling in general, and could therefore benefit from the proposed approaches. Although investigating multiple instances of dynamic scheduling problems would have showcased the benefits even more, elevator dispatching is an ideal test case to investigate dynamic scheduling mechanisms.

1.2 Pitfalls

• Heuristics as used in full rescheduling involve the local search paradigm. In local search, a solution is compared to modifications of that solution. The latter are ‘local’ solutions: they are near the original. By continually hopping to local solutions, the search algorithm hopes to find a better solution than the original starting point. In such approaches, the algorithm is only one of three aspects. The others are the neighborhood (how to modify a solution to create a local alternative?) and the objective function (how to assess the quality of a candidate solution?). In literature describing metaheuristics, the focus is strongly on the algorithm. When such metaheuristics are chosen to solve difficult problems, more attention is often paid to the chosen algorithm than to the neighborhood and objective function. In hybridizations, this tendency of overestimating the importance of the algorithm is even stronger, since the hybridization usually involves only the algorithms. However, all three aspects are important in local search. Therefore, the pitfall here might be underestimating the importance of good neighborhood and objective function design.

• Dynamic scheduling problems are often difficult problems. In such problems, engineers rely on problem solving methods like metaheuristics and hybridizations, since these methods are known for coping with complex problems. One particular example of a metaheuristic concerns evolutionary algorithms. These are often applied in a black-box manner. However, their inner working (mixing different solutions in order to create new solutions) may not work for solutions that have a strong internal coherence, such as schedules (Goldberg, 1989). Furthermore, these methods involve a (large) number of extra parameters to be set, which is a difficult problem in itself (Eiben et al., 1999). If the sheer difficulty of applying such metaheuristics draws attention away from the neighborhood and objective function, the resulting scheduler will likely produce suboptimal scheduling results. Lastly, the short time frame in which solutions are required in dynamic scheduling might rule out algorithms with too high computational complexity.

• In dynamic scheduling there is an important tradeoff between available and necessary time for computation. A problem environment that provides information late, and also demands an immediate solution, makes it difficult for the dynamic scheduler to provide good results. In environments where these factors can be modified, such that information is provided earlier and decisions can be made later, a performance increase is very well possible using a scheduler that exploits these factors.

• In dynamic scheduling, lack of knowledge of the environment, such as expected events in the near future, makes it difficult for the scheduler to produce desirable results. Dynamic scheduling problems are often presented as a direct copy of their static counterpart, and literature sometimes even describes dynamic problems being solved as if they were static. However, if the dynamicity of the problem is not taken into account in the search for a solution, the value of that solution may be short-lived. When a change event occurs, the change may be difficult to incorporate into the schedule (the schedule is ‘inflexible’), or the objective value for the modified schedule would differ greatly from the previous one (the schedule is not ‘robust’).

• Benchmark problems and self-generated test data are often used to assess the performance of scheduling techniques. However, the correspondence between a benchmark problem and a real world problem may be disputable. In dynamic scheduling, the moving peaks problem has been proposed and used as a benchmark. However, this benchmark has little or no connection to any actual problem. This raises the question to what extent the results using that benchmark are applicable to problems in the real world. To a lesser extent, this is also true for simulations of real world problems in which self-generated random test data are used to analyze the performance. When these test data are not modeled in accordance with realistic scenarios, the test results may be disputable as well.
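The three aspects of local search named in the first pitfall (algorithm, neighborhood, objective function) can be kept apart explicitly in code. The following is a minimal sketch, not the scheduler used in this dissertation: the single-machine job order, the adjacent-swap neighborhood, the total-completion-time objective and all function names are illustrative assumptions.

```python
from typing import Callable, Iterable, Sequence, Tuple

Solution = Tuple[int, ...]  # assumed encoding: an order of job indices on one machine


def adjacent_swaps(solution: Solution) -> Iterable[Solution]:
    """Neighborhood: every solution reachable by swapping two adjacent jobs."""
    for i in range(len(solution) - 1):
        neighbor = list(solution)
        neighbor[i], neighbor[i + 1] = neighbor[i + 1], neighbor[i]
        yield tuple(neighbor)


def total_completion_time(durations: Sequence[int]) -> Callable[[Solution], int]:
    """Objective function: sum of the completion times of all jobs (lower is better)."""
    def objective(solution: Solution) -> int:
        clock = 0
        total = 0
        for job in solution:
            clock += durations[job]
            total += clock
        return total
    return objective


def hill_climb(start: Solution,
               neighborhood: Callable[[Solution], Iterable[Solution]],
               objective: Callable[[Solution], int],
               max_steps: int = 1000) -> Tuple[Solution, int]:
    """Algorithm: repeatedly hop to the best neighbor until none is better."""
    current, current_value = start, objective(start)
    for _ in range(max_steps):
        best = min(neighborhood(current), key=objective, default=None)
        if best is None or objective(best) >= current_value:
            break  # local optimum with respect to this neighborhood
        current, current_value = best, objective(best)
    return current, current_value


durations = [4, 1, 3, 2]
order, value = hill_climb((0, 1, 2, 3), adjacent_swaps, total_completion_time(durations))
```

Because the neighborhood and the objective are passed in as arguments, either can be replaced without touching `hill_climb`, which is the kind of design attention the pitfall argues is often missing.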

1.3 Research questions

Deriving from the pitfalls, this dissertation aims to answer a number of research questions and to provide opportunities to increase the performance of dynamic schedulers:

1. Can improved neighborhoods and objective functions reduce the need for a complex metaheuristic while maintaining the same overall quality?

2. What are the advantages and pitfalls when applying evolutionary algorithms to dynamic scheduling?

3. How could a simple hybrid approach increase the performance of a dynamic scheduler, both in the case of short-term and second-order dynamicity?

4. To what extent could modifications in the environment increase the performance of dynamic schedulers?

5. How could flexibility and robustness measures increase the performance of a dynamic scheduler?

6. What is the external validity of research findings obtained by benchmarks or self-generated test data?

In order to answer these questions, a number of building blocks are proposed for the design of practical dynamic schedulers. To check the viability of these building blocks in practice, elevator dispatching was used as a typical, and challenging, example of dynamic scheduling. A simulation environment was created that closely mimics an actual building in Paris. Using real world passenger data from that building, realistic experiments were run and analyzed. The general research questions above were translated into hypotheses that posed specific questions regarding elevator dispatching. Finally, in the discussion chapter, specific attention is paid to the broader applicability of research findings from this specific case study.


1.4 Dissertation outline

In addressing the issues and challenges, Chapter 2 first charts the area of scheduling and the use of heuristic approaches in general. Chapter 3 then focuses on the particularities of dynamic scheduling, both from a problem and a solution perspective. The chapter shows that the complexity of dynamic problems is of a different nature than the complexity of static problems. On the solution side, it discusses how applicable the techniques from Chapter 2 are to dynamic scheduling problems. We will see that no exact solution can exist for a dynamic scheduling problem, and heuristics are always needed in some form or another. Chapter 4 discusses the previous two chapters, culminating in the aforementioned research questions. Then it proposes building blocks to alleviate the problems mentioned before.

The second part of the dissertation presents the elevator dispatching case study. Chapter 5 outlines the problem itself, and states four research hypotheses that are specific to the elevator dispatching problem. Chapter 6 discusses the experimental design, focusing on the simulation environment, the input data and the proposed methods for analysis. Chapter 7 provides operational hypotheses for each of the more generic research hypotheses. The chapter includes the specific experiments to investigate these operational hypotheses, the results and the conclusions.

Finally, in the third part, Chapter 8 discusses the research questions and the building blocks from Chapter 4, both in view of the case study and in terms of their general applicability.

General discourse on dynamic scheduling problems

2 Scheduling and heuristics

This chapter presents a general background on scheduling problems, and approaches to solve them. The first section outlines the generic scheduling problem, followed by a section on solution strategies that are commonly applied to scheduling problems. So-called metaheuristics and hybrid algorithms are investigated more closely there. The next chapter (3) discusses the focal point of this dissertation, dynamic scheduling.

2.1 Generic scheduling problem

Pinedo (2008) gives the following definition of scheduling: “Scheduling concerns the allocation of limited resources to tasks over time. It is a decision-making process that has as a goal the optimization of one or more objectives”. Within research fields such as Artificial Intelligence and Operations Research, much attention is devoted to developing solutions to scheduling problems. This section discusses aspects of the scheduling problem, whereas the next section focuses on solution strategies.

2.1.1 Terminology

Tasks and Resources are the basic entities in all scheduling problems (T’kindt and Billaut, 2002; Cottet et al., 2002). A task set $T = \{\tau_0, \ldots, \tau_{n-1}\}$ with $n$ tasks has a number of properties, also depicted in Fig. 2.1.

• $r_i$ is the release time of task $\tau_i$, the earliest possible starting time for that task.

• $p_i$ is the task period of task $\tau_i$. If $\tau_i$ is a periodic task, $p_i$ is the time interval after which $\tau_i$ is repeated.

• $c_i$ is the completion time of task $\tau_i$, the time it takes for task $\tau_i$ to complete. In some contexts where the completion time is not fixed, $c_i$ denotes the worst-case task duration, i.e., the maximum possible time it takes $\tau_i$ to complete.

• $d_i$ is the deadline of task $\tau_i$, i.e., the latest time at which task $\tau_i$ has to be completed.

• $D_i$ is the relative deadline of task $\tau_i$, the time interval within which task $\tau_i$ is to be scheduled.

• $s_i$ is the start time of task $\tau_i$, the absolute time at which the execution of task $\tau_i$ is started.

• $e_i$ is the end time of task $\tau_i$, the absolute time at which the execution of task $\tau_i$ is finished.

Figure 2.1: Timing diagram of the task model

A number of other properties of a task can be distinguished. Of those, Preemptiveness and Task Dependency are the most important ones. A preemptive task can be stopped once execution has been started. This may be done in order to free resources currently in use by the execution of that task. A non-preemptive task has to be carried out completely after its execution has been started. Task dependency means that a task can only be executed after a specific set of one or more other tasks have been completed. The task dependencies within T must induce a partial order on T; otherwise, a deadlock situation occurs.
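The task properties and the partial-order requirement on dependencies can be sketched as a small data structure. This is an illustrative sketch only: the class and field names are hypothetical, and the relation $D_i = d_i - r_i$ used for the relative deadline is an assumption that follows common real-time scheduling usage rather than a definition given in the text.

```python
from dataclasses import dataclass
from graphlib import CycleError, TopologicalSorter  # Python 3.9+
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Task:
    name: str
    release: float    # r_i: earliest possible start
    duration: float   # c_i: (worst-case) time needed to complete
    deadline: float   # d_i: latest time of completion
    preemptive: bool = False
    depends_on: FrozenSet[str] = frozenset()  # names of tasks that must finish first

    @property
    def relative_deadline(self) -> float:
        """D_i, assumed here to be d_i - r_i."""
        return self.deadline - self.release

    def fits_its_window(self) -> bool:
        """True if the task can finish in time when started at its release time."""
        return self.duration <= self.relative_deadline


def dependency_order(tasks: Iterable[Task]) -> List[str]:
    """Return task names in an order that respects all dependencies.

    Raises graphlib.CycleError when the dependencies are cyclic, i.e. when
    they do not form a partial order and a deadlock would occur.
    """
    graph = {task.name: set(task.depends_on) for task in tasks}
    return list(TopologicalSorter(graph).static_order())


a = Task("a", release=0, duration=2, deadline=5)
b = Task("b", release=0, duration=1, deadline=4, depends_on=frozenset({"a"}))
c = Task("c", release=1, duration=1, deadline=6, depends_on=frozenset({"b"}))
order = dependency_order([c, b, a])
```

A topological sort succeeds exactly when the dependency graph is acyclic, which is why it doubles as the deadlock check here.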

Resources are the means by which tasks can be executed. There are two types, renewables and consumables. Renewable resources, like machines, can be used again. Consumable resources, like fuel, can only be used once. In most scheduling problems, the resource environment is primarily described in terms of renewable resources, mostly machines. The number of machines is denoted by m, with subscript [...] of job j on machine i. If the processing time does not depend on the actual machine used, the subscript i is omitted.

Example 2.1. As an illustration of scheduling properties, consider a taxi scheduling example in which multiple taxis are driving around in the same city, picking up and delivering passengers. In this example, the tasks consist of passenger trips. The tasks are not preemptive, since passengers are always delivered before new passengers are picked up. The resources are taxis, taxi drivers and fuel. The release time for the task is the moment a passenger calls for a taxi (or hails one).

Shop scheduling problems

A specific family of scheduling problems is called Shop Scheduling Problems (SSP). Members of this family that are often treated in the literature include Job Shop Scheduling Problems (JSSP) and Flow Shop Scheduling Problems (FSSP). Within SSPs, tasks are grouped in jobs. Each job may consist of a number of tasks, which are called operations. The number of jobs is usually referred to as $n$, with the subscript $j$ used to refer to a specific job. Properties like release date ($r_j$) and due date ($d_j$) are analogous to what is mentioned for tasks alone (Pinedo, 2008). SSPs are classified using the triplet α | β | γ (Graham et al., 1979):

• The α part describes the machine environment. For example, ‘1’ refers to a single machine problem. ‘$P_m$’ refers to an environment with $m$ machines in parallel.

• The β part denotes the constraints that have to be met. For example, ‘$s_{jk}$’ refers to sequence dependent setup times, when a machine has a setup time after job $j$ before it can handle job $k$. ‘nwt’ refers to a no-wait environment, when no waiting time is allowed between the execution of different operations in a job.

• The γ part refers to the scheduling objective, the goal to be reached. ‘$C_{\max}$’ is a typical example, denoting makespan: minimize the completion time of the last job in the system.

Some practical problems can, with some modifications, be categorized as, or reduced to, Shop Scheduling Problems (Landa Silva et al., 2004). However, practical problems often have specific properties that make a translation into this framework hard or impossible (Pinedo, 2008).


2.1.2 Constraints and optimality criteria

As noted before, a schedule is a mapping of tasks to resources at specific times. In producing a schedule, the scheduler has to find a solution that does not violate constraints. Such a solution is called a feasible solution. Constraints are demands on a schedule, for which a violation leads to an infeasible schedule. Examples of constraints are “jobs that are scheduled on the same machine cannot have overlapping time intervals”, or “all jobs must be scheduled”. When a problem is under-constrained, multiple solutions to the problem exist. In this case, the solution provider can choose from multiple feasible solutions the one that meets his objectives best. When a problem is over-constrained, different constraints are counteracting each other to such an extent that no feasible solution exists (Beaumont et al., 2001).

Definition 2.1. If $S$ is a set of solutions, an objective function is a function $f : S \to \mathbb{R}$ that provides an objective value to any solution $s \in S$.

In addition to satisfying constraints, the scheduler mostly wants to find an optimal solution. Using optimality criteria, the scheduler can compare different schedules and choose the best. Optimality criteria are encompassed in an objective function, which is used to calculate an objective value¹. The objective value is a measure to quantify the performance of a solution. A typical example of an optimality criterion is “minimize the time needed to complete all jobs”.
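The distinction between constraints (feasibility) and optimality criteria (objective value) can be made concrete with the two example constraints and the example criterion quoted above. The sketch below assumes a hypothetical schedule encoding in which each job is assigned to one machine for one time interval; the names are illustrative.

```python
from typing import Dict, List, NamedTuple, Sequence


class Assignment(NamedTuple):
    job: str
    machine: int
    start: float
    end: float


def is_feasible(schedule: Sequence[Assignment], jobs: Sequence[str]) -> bool:
    """Check the two example constraints: all jobs scheduled, no overlap per machine."""
    # Constraint: every job is scheduled (exactly once in this encoding).
    if sorted(a.job for a in schedule) != sorted(jobs):
        return False
    # Constraint: jobs on the same machine have non-overlapping time intervals.
    per_machine: Dict[int, List[Assignment]] = {}
    for assignment in schedule:
        per_machine.setdefault(assignment.machine, []).append(assignment)
    for assigned in per_machine.values():
        assigned.sort(key=lambda a: a.start)
        for earlier, later in zip(assigned, assigned[1:]):
            if later.start < earlier.end:
                return False
    return True


def makespan(schedule: Sequence[Assignment]) -> float:
    """Objective function f: S -> R for 'minimize the time needed to complete all jobs'."""
    return max((a.end for a in schedule), default=0.0)


jobs = ["j1", "j2", "j3"]
good = [Assignment("j1", 0, 0, 3), Assignment("j2", 0, 3, 5), Assignment("j3", 1, 0, 4)]
overlapping = [Assignment("j1", 0, 0, 3), Assignment("j2", 0, 2, 4), Assignment("j3", 1, 0, 4)]
```

Note that `overlapping` has the lower makespan (4 against 5) but is infeasible, so the objective value only becomes meaningful once the constraints are satisfied.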

In the literature, some make the distinction between constraints and optimality criteria as discussed before (Pinedo, 2008). Others make the distinction between hard constraints and soft constraints (Cohen et al., 2006; Landa Silva et al., 2004). The hard constraints are the constraints we have seen before. The soft constraints, however, can be violated without rendering the solution infeasible. The scheduler will try to minimize the soft constraint violations. For example, the hard constraint “all jobs must be scheduled” might be relaxed into a soft constraint: “maximize the number of scheduled jobs”. Soft constraints are therefore optimality criteria by themselves, and are often treated as such. In the remainder of this dissertation, we will not use this hard/soft distinction, but refer to constraints and optimality criteria instead.

Apart from constraints related to the actual problem and the environment, another type of constraint encountered in scheduling is the quality of service (QoS) constraint. A QoS constraint is a demand on the problem solver, defining the performance levels to which the problem solver has to adhere. An example of a QoS constraint is “a solution has to be provided within 0.1 seconds”. In real world scenarios, especially in real-time dynamic scheduling which is the main topic of this dissertation, available scheduling time may be extremely short.

¹ Objective functions and objective values are sometimes called fitness functions and fitness values, or, in short, the fitness of a candidate solution. The term ‘fitness’ is derived from the field of evolutionary algorithms, in which the search for an optimal solution follows the Darwinian concept of the survival of the fittest (genetic algorithms are more closely discussed in Section 2.2.3). Nevertheless, the term fitness is often used in heuristics literature next to the term objective (value). In some cases, like ‘fitness landscape’, fitness is the only term used. In this dissertation, both terms are used interchangeably.

Figure 2.2: Pareto optimality: The circles form a Pareto-optimal boundary in a situation where two optimality criteria play a role ($f_0$ and $f_1$; higher values are better). They dominate other solutions (like the rectangles), by improving performance on at least one criterion, while being at least as good on all other criteria.

Example 2.2. Taxi driver scheduling is restricted by constraints, such as labor time laws for the driver, speed limits, or the maximum number of passengers in a taxi. The optimality criteria can be diverse, for example, minimizing the waiting time for passengers or minimizing the distance travelled by the pool of taxis.

Multi-criteria optimization

In real world problems, multiple optimality criteria may be present. When a candidate schedule is valued higher than another schedule according to one of the optimality criteria, without losing its value on the other criteria, this candidate dominates the other schedule. Alternatively, this candidate is called a Pareto improvement over the other schedule.

Definition 2.2. Pareto dominance: An objective vector $u = (u_0, \ldots, u_{n-1})$ dominates $v = (v_0, \ldots, v_{n-1})$, denoted by $u \prec v$, iff $u_i \le v_i \;\forall i < n$ and $\exists i : u_i < v_i$.

A schedule that is not dominated by any other possible schedule is called a Pareto optimum.

Definition 2.3. A solution $s \in S$ is Pareto optimal if $F(x) \nprec F(s) \;\forall x \in S$, where $F(s)$ denotes the objective vector of $s$.

In problems with multiple optimality criteria, a set of Pareto optimal solutions may exist that may be improvable for some of the optimality criteria, but only by sacrificing other optimality criteria (Fig. 2.2).
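Definitions 2.2 and 2.3 translate almost literally into code. The sketch below follows Definition 2.2 as written, i.e. lower values are better in every component; for criteria where higher is better (as in Fig. 2.2) the values would have to be negated first. The function names are illustrative.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """u dominates v: u is nowhere worse and somewhere strictly better (minimization)."""
    if len(u) != len(v):
        raise ValueError("objective vectors must have the same length")
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def pareto_optimal(vectors: Sequence[Vector]) -> List[Vector]:
    """Return the vectors that are not dominated by any other vector in the set."""
    return [u for u in vectors if not any(dominates(v, u) for v in vectors)]


candidates = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
front = pareto_optimal(candidates)
```

In this small set, `(3, 3)` and `(4, 4)` are dominated by `(2, 2)`, while the three remaining vectors can only be improved on one criterion by giving up on the other.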


Example 2.3. Continuing the taxi driver scheduling example from Section 2.1.1, a new passenger calls for a taxi somewhere in the outskirts of town. There are two options: sending an empty taxi to the passenger, or assigning another taxi that coincidentally has to deliver another passenger in that area. The vacant taxi will arrive earlier than the taxi that has a delivery first. On the other hand, sending the empty taxi will increase the overall distance travelled for all taxis. The choice is whether to let the passenger wait five minutes longer, or have all taxis drive five kilometers extra. When the scheduler assigns equal values to an extra minute waiting time and an extra kilometer driven, the choice becomes arbitrary, since no solution is a Pareto improvement over the other.

Several schemes for multi-criteria optimization can be applied (Landa Silva et al., 2004):

• a priori: when different values are, via some calculation, joined into one single objective value. The weighting of values (adding? multiplying? using one value as an exponent for another?) is a difficult issue. However, when the engineer aims at a fully automated system, this is the method of choice as it produces one number that can easily be used in any optimizing algorithm, and it does not require user intervention.

• mixed initiative: when the (human) user can intervene in the process, guiding the search for a preferable solution. As synchronizing a human user and a computerized scheduler is a research topic in itself, this option is not used much.

• a posteriori: where the (human) user is presented with a choice, after the search for solutions has been done. The user can make a decision based on his personal preferences and experience combined with the specific situation at hand.

T’kindt and Billaut (2002) give an excellent overview of multi-criteria scheduling. For a more concise overview, the reader is referred to (Landa Silva et al., 2004).
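The a priori scheme can be illustrated with the numbers from Example 2.3. The sketch below assumes a weighted sum as the joining calculation, which is only one of the possibilities mentioned above, and encodes each option as (extra waiting time in minutes, extra distance in kilometers); the option names and weights are illustrative.

```python
from typing import Dict, List, Sequence, Tuple


def weighted_sum(objectives: Sequence[float], weights: Sequence[float]) -> float:
    """Join several objective values into one number (lower is better)."""
    return sum(w * x for w, x in zip(weights, objectives))


def best_options(options: Dict[str, Tuple[float, float]],
                 weights: Sequence[float]) -> List[str]:
    """Return the names of all options with the lowest weighted sum."""
    scores = {name: weighted_sum(values, weights) for name, values in options.items()}
    lowest = min(scores.values())
    return sorted(name for name, score in scores.items() if score == lowest)


# (extra waiting minutes, extra kilometers) for the two choices in Example 2.3
options = {
    "vacant_taxi": (0.0, 5.0),  # arrives sooner, but drives five kilometers extra
    "busy_taxi": (5.0, 0.0),    # no extra distance, passenger waits five minutes longer
}

tie = best_options(options, weights=(1.0, 1.0))             # minute and kilometer valued equally
waiting_matters = best_options(options, weights=(2.0, 1.0))  # a minute counts double
```

With equal weights both options score the same and the choice is arbitrary, as the example states; any other weighting settles it, which shows how much the outcome depends on the chosen weights.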

2.1.3 Problem complexity

In scheduling, the number of tasks and resources is referred to as the size of a problem instance, problem size for short. The size of the solution space, on the other hand, reflects the number of scheduling possibilities. An increase in problem size typically leads to an increase of the size of the solution space. Although the size of the solution space and its growth rate relative to the problem size seem an indication of the problem complexity, problem complexity is actually defined by the best algorithm available to solve it. For example, the problem of finding the shortest path between two cities yields a solution space that grows exponentially with the size of the road network (i.e., the number of roads). Yet, the A* algorithm solves this problem in a time span that is a polynomial function of the problem size (not the solution space size). Such an algorithm is called an efficient algorithm. Hence, the shortest path problem is regarded an ‘easy’ problem. A problem is considered ‘hard’ when no efficient algorithm is known for it. The notion of (non) polynomial time complexity of algorithms is formalized in the theory of NP-completeness (Cook, 1971; Garey and Johnson, 1979). An important aspect of NP-hard optimization problems, such as most scheduling problems that are treated in this dissertation, is that no efficient algorithm is known to solve them.
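As an illustration of an efficient algorithm for the shortest path problem, the following is a minimal A* sketch on a small hypothetical road network. The graph, the node names and the default zero heuristic are assumptions; with a zero heuristic A* behaves like Dijkstra's algorithm, and the returned path is shortest provided the heuristic never overestimates the remaining distance.

```python
import heapq
from typing import Callable, Dict, List, Optional, Tuple

Graph = Dict[str, Dict[str, float]]  # node -> {neighbor: road length}


def a_star(graph: Graph, start: str, goal: str,
           heuristic: Callable[[str], float] = lambda node: 0.0
           ) -> Optional[Tuple[float, List[str]]]:
    """Return (length, path) of a shortest path, or None if the goal is unreachable."""
    # Each frontier entry: (cost so far + heuristic, cost so far, node, path to node).
    frontier = [(heuristic(start), 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found later
        for neighbor, length in graph.get(node, {}).items():
            new_cost = cost + length
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost + heuristic(neighbor), new_cost, neighbor, path + [neighbor]),
                )
    return None


roads: Graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
}
result = a_star(roads, "A", "D")
```

The search keeps only the cheapest known cost per node instead of enumerating all routes, which is why the running time depends on the size of the network rather than on the number of possible paths.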

More aspects than the time complexity can play a role in describing the complexity of a problem. For example, the spatial complexity of an algorithm (not of a problem) deals with the fact that the memory requirements of an algorithm can be explosive. Both spatial and time complexity can lead to a situation in which the practically available processing capacity is limited below the bounds of efficient calculability, so that even an ‘easy’ problem becomes ‘hard’. Furthermore, the information provided to the problem solver may be uncertain or incomplete. In such situations, finding a solution may become more difficult, and reaching optimality may not even be possible.

2.2 Solution strategies

The scheduling problem as introduced in the previous section lends itself to be solved by a diverse range of problem solvers. For some problems, solutions can be calculated efficiently. As we have seen in Section 2.1.3, a large number of problems cannot be solved efficiently. In that case, heuristics are used. Heuristics are problem solving tools that help to solve the problem, aiming for solutions that are as good as possible: optimality cannot be guaranteed. Since heuristic problem solving approaches are strongly rooted in generic problem solving techniques, those are treated first in this section. The next subsection focuses on heuristics. In the last subsection, metaheuristics are introduced as a more generic means to solve problems.

2.2.1 Basics of solution searchers

Producing a solution to a given problem basically takes one of two forms: constructing a solution from scratch, or refining a given solution by investigating related solutions. The former, the constructive approach, builds up a solution by concatenating atomic solution elements. The latter, called local search, starts with a given complete solution and improves it until no further improvement can be found, or time is running out. There are problems for which finding a solution at all is the major challenge. In that case, using a constructive algorithm is the appropriate approach. In other problems, finding a solution in itself is easy, but the real challenge lies in optimization. In that case, constructive algorithms may be helpful in quickly creating a starting solution, but local search methods will be used to optimize the solution.

Figure 2.3: Depth first search and breadth first search

2.2.1.1 Building a solution from scratch: constructive approach

Typically, a constructive approach takes the shape of a tree: initially, the solution is empty (the root of the tree), and at each iteration atomic elements are added, thereby branching the tree. The number of possibilities to do so defines the branching factor. The nodes within the tree represent incomplete solutions, whereas the leaves at the end of the tree are the complete solutions. The walk from root to leaf defines how the solution is created. Some branches may lead to dead ends: the solution is not yet complete, but due to constraint violations, no more elements can be added. This may happen at a great distance from the root of the tree. In Depth First Search (DFS), a solution is created by consecutively adding elements to an incomplete solution, and backtracking each time a dead end is found. As only the path to the current (incomplete) solution is stored, this method is not memory intensive. However, it can easily run into long, unpromising branches. In Breadth First Search (BFS), as long as no solution is found, the complete next layer of (incomplete) solutions is examined. This method is memory intensive as all nodes up until the current layer are stored, but it guarantees finding a solution with the shortest path from the root of the tree to the final leaf (Fig. 2.3).
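The difference between the two traversal orders can be illustrated with a minimal Python sketch. The tree below is a hypothetical toy example (it is not the tree of Fig. 2.3), and the names TREE, COMPLETE, extend and is_complete are assumptions introduced only for this illustration. A partial solution is represented by the path from the root.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple

Path = Tuple[str, ...]

# Hypothetical search tree: each node maps to the nodes reachable by
# adding one atomic element. "e" is a dead end, "d" and "b" are complete.
TREE: Dict[str, List[str]] = {
    "root": ["a", "b"],
    "a": ["e", "c"],
    "e": [],
    "c": ["d"],
    "d": [],
    "b": [],
}
COMPLETE = {"d", "b"}


def extend(path: Path) -> List[Path]:
    """All partial solutions obtained by adding one element to `path`."""
    return [path + (child,) for child in TREE[path[-1]]]


def is_complete(path: Path) -> bool:
    return path[-1] in COMPLETE


def depth_first(path: Path) -> Optional[Path]:
    """Follow one branch as deep as possible; backtrack on dead ends."""
    if is_complete(path):
        return path
    for child in extend(path):
        found = depth_first(child)
        if found is not None:
            return found
    return None  # dead end: the caller tries its next branch


def breadth_first(root: Path) -> Optional[Path]:
    """Examine the tree layer by layer; the queue holds the open paths."""
    frontier: Deque[Path] = deque([root])
    while frontier:
        path = frontier.popleft()
        if is_complete(path):
            return path
        frontier.extend(extend(path))
    return None


dfs_solution = depth_first(("root",))
bfs_solution = breadth_first(("root",))
```

On this toy tree, the depth first variant first runs into the dead end "e" and backtracks before reaching "d", whereas the breadth first variant returns the shallower solution "b".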

In order to find an optimal solution, a complete search has to be carried out. Of course, incomplete branches to which no element can be added anymore limit the search space. Furthermore, branches from which it can be determined that they cannot produce better results than a solution that was already found, can also be discarded (‘Branch and Bound’, Land and Doig (1960)).
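As an illustration of this pruning idea, the following sketch applies a simple branch and bound scheme to a small 0/1 knapsack instance. The knapsack problem is only a stand-in chosen for brevity, and the bound used (current value plus the value of all undecided items) is an assumption of this sketch, not a prescription from Land and Doig (1960).

```python
from typing import List, Tuple


def branch_and_bound(values: List[int], weights: List[int],
                     capacity: int) -> Tuple[int, List[int]]:
    """Return (best value, chosen item indices) for a 0/1 knapsack."""
    n = len(values)
    # remaining[i]: total value of items i..n-1, an optimistic estimate
    # of what can still be gained below a node at depth i.
    remaining = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        remaining[i] = remaining[i + 1] + values[i]

    best_value = 0
    best_items: List[int] = []

    def visit(i: int, value: int, weight: int, chosen: List[int]) -> None:
        nonlocal best_value, best_items
        if weight > capacity:
            return  # constraint violated: dead end
        if value > best_value:
            best_value, best_items = value, list(chosen)
        if i == n:
            return  # leaf: all items decided
        if value + remaining[i] <= best_value:
            return  # bound: this branch cannot beat the incumbent
        chosen.append(i)  # branch 1: include item i
        visit(i + 1, value + values[i], weight + weights[i], chosen)
        chosen.pop()  # branch 2: exclude item i
        visit(i + 1, value, weight, chosen)

    visit(0, 0, 0, [])
    return best_value, best_items


best_value, best_items = branch_and_bound([6, 10, 12], [1, 2, 3], capacity=5)
```

The search remains complete in the sense that every discarded branch is provably unable to improve on the best solution found so far.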

In scheduling problems, constructive approaches take a somewhat different form. Time is continuous, but tasks have to be placed at specific points in time. A task $\tau_i$ with a shorter completion time $c_i$ (Fig. 2.1) than its relative deadline $D_i$ has an infinite number of possible start times. The placement options are therefore limited: a task is placed to start at the earliest possible time, or to end at the deadline, or directly adjacent to another task (Fig. 2.4). Limiting the placement options in this manner facilitates tree-based constructive approaches.

Figure 2.4: Constructive approach applied to scheduling. Tasks $\tau_0$, $\tau_1$ and $\tau_2$ with their respective earliest possible starting times and deadlines are shown above the timeline. Below, schedules $S_0$..$S_4$ result from placing tasks on the timeline. For example, $S_0$ starts with putting $\tau_0$ at the earliest possible time; after that, there is no room left for $\tau_1$, and finally $\tau_2$ is put on its earliest possible start time. If the scheduler does not allow for schedules in which tasks remain unscheduled, the scheduler would backtrack when such a dead end is found. Finally, by starting with $\tau_2$, a schedule containing all tasks can be created.
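A minimal sketch of such a restricted placement rule is given below. It assumes a single resource, non-preemptive tasks, and a fixed task order. The Task fields, the function names and the task data are hypothetical and do not correspond to the values in Fig. 2.4.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

Interval = Tuple[int, int]  # (start, end) on the timeline


class Task(NamedTuple):
    name: str
    release: int   # earliest possible start time
    deadline: int  # latest allowed end time
    duration: int


def candidate_starts(task: Task, placed: List[Interval]) -> List[int]:
    """Start at release, end at deadline, or sit adjacent to a placed task."""
    candidates = {task.release, task.deadline - task.duration}
    for start, end in placed:
        candidates.add(end)                    # directly after a placed task
        candidates.add(start - task.duration)  # directly before a placed task
    feasible = []
    for s in sorted(candidates):
        e = s + task.duration
        inside_window = task.release <= s and e <= task.deadline
        overlaps = any(s < p_end and p_start < e for p_start, p_end in placed)
        if inside_window and not overlaps:
            feasible.append(s)
    return feasible


def schedule_all(tasks: List[Task],
                 placed: Optional[Dict[str, Interval]] = None
                 ) -> Optional[Dict[str, Interval]]:
    """Depth first placement; backtracks when a task cannot be placed."""
    placed = {} if placed is None else placed
    if not tasks:
        return dict(placed)
    task, rest = tasks[0], tasks[1:]
    for s in candidate_starts(task, list(placed.values())):
        placed[task.name] = (s, s + task.duration)
        result = schedule_all(rest, placed)
        if result is not None:
            return result
        del placed[task.name]  # dead end further down: undo and retry
    return None


tasks = [Task("t0", 0, 4, 2), Task("t1", 0, 2, 2), Task("t2", 2, 6, 2)]
schedule = schedule_all(tasks)
```

In this sketch, placing t0 at its earliest start leaves no room for t1, so the search backtracks and places t0 to end at its deadline instead. Unlike Fig. 2.4, the sketch does not vary the order in which tasks are considered; doing so would add a further branching level.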

2.2.1.2 Refining a solution: Local search

Local search comprises a few key notions that can be formalized as follows (following Michiels et al. (2006)):

Definition 2.4. An instance of an optimization problem is a pair $(S, f)$, where the solution space $S$ is a set of solutions and $f$ is a cost function. $f$ is a mapping $f : S \to \mathbb{R}$ that assigns a real value to each solution in $S$. This value is called the objective value, or alternatively, the cost of the solution.

Definition 2.5. An optimization problem is, given an instance $(S, f)$, to find a solution $s' \in S$ that is globally optimal. The problem is either a maximization or a minimization problem. For a minimization problem, $s' \in S$ is a global optimum if $f(s') \le f(s)$ for all $s \in S$. Correspondingly, for a maximization problem, $s' \in S$ is a global optimum if $f(s') \ge f(s)$ for all $s \in S$.

Optimization means to either maximize or minimize a cost function. For the remainder of this dissertation, without loss of generality a maximization problem is assumed. Obviously, a minimization problem can be turned into a maximization problem by changing the sign of the cost function.

Definition 2.6. A neighborhood function is a mapping $N : S \to 2^S$, with $2^S$ denoting the power set $\{V \mid V \subseteq S\}$. The neighborhood of a solution $s \in S$ is a set $N(s) \subseteq S$. $s'$ is a neighbor of $s$ if $s' \in N(s)$.

Example 2.4. Traveling salesman problem neighborhood

Consider a traveling salesman problem (TSP) with five cities (a..e). A tour has to be created that visits each city once. The optimization problem strives for the shortest possible tour.

In the picture, on the left, an initial tour is given. Writing this solution as a string, we get "abedc". A neighborhood function creates alternative solutions by swapping the cities at two positions in the original string. Swapping the first two positions of the string ("ab") leads to a new tour "baedc". Doing this for all pairs of positions leads to the 10 neighboring tours depicted on the right. The two thick-lined neighbors are the obvious shortest routes.
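The swap neighborhood of this example can be sketched in a few lines of Python. The function name swap_neighborhood is an assumption made for illustration; the tour is represented as a string, as in the example.

```python
from itertools import combinations
from typing import List


def swap_neighborhood(tour: str) -> List[str]:
    """All tours obtained by swapping the cities at two positions."""
    neighbors = []
    for i, j in combinations(range(len(tour)), 2):
        cities = list(tour)
        cities[i], cities[j] = cities[j], cities[i]
        neighbors.append("".join(cities))
    return neighbors


neighbors = swap_neighborhood("abedc")
```

For a tour of five cities there are 5 · 4 / 2 = 10 pairs of positions, which matches the 10 neighboring tours of the example. The first neighbor generated is "baedc", the result of swapping the first two positions.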

Local search starts with making a complete initial solution of a problem. This initial solution is created by some heuristic constructive algorithm. It may not even be a feasible solution yet. Next, a local search algorithm searches through the solution space by continually moving from a solution to one of its neighbors. Choosing a solution from the neighborhood is done using a decision strategy. The process is repeated until some stopping criterion is met. A typical criterion that can be used in the TSP example is to stop if the neighborhood of a solution does not contain a better solution. Alternatively, timing constraints can also serve as stopping criteria.

In the simplest setup (a greedy hill climber, or steepest ascent algorithm, see Algorithm 2.1) the current solution is always replaced by the best solution in the neighborhood, until no further improvements are possible. In the TSP example, the stopping criterion is already met after one neighborhood investigation, after which the algorithm returns one of the two best solutions. This resulting solution is a local optimum, which is not necessarily a global optimum (see Fig. 2.5).
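The steepest ascent loop can be sketched in Python as follows. This is an illustrative reading of the description above, not a transcription of Algorithm 2.1. The city coordinates in COORDS are hypothetical (the coordinates of the example are not given in the text), and the objective is the negated tour length, in line with the maximization convention adopted earlier.

```python
import math
from itertools import combinations
from typing import Callable, Dict, Iterable, List, Tuple, TypeVar

Solution = TypeVar("Solution")


def steepest_ascent(initial: Solution,
                    neighborhood: Callable[[Solution], Iterable[Solution]],
                    f: Callable[[Solution], float]) -> Solution:
    """Move to the best neighbor until no neighbor improves on f."""
    current = initial
    while True:
        best = max(neighborhood(current), key=f, default=current)
        if f(best) <= f(current):
            return current  # local optimum: no strictly better neighbor
        current = best


# Hypothetical coordinates: five cities on a convex pentagon.
COORDS: Dict[str, Tuple[float, float]] = {
    "a": (0, 0), "b": (2, 0), "c": (3, 2), "d": (1, 3), "e": (-1, 2),
}


def swap_neighborhood(tour: str) -> List[str]:
    """All tours obtained by swapping the cities at two positions."""
    result = []
    for i, j in combinations(range(len(tour)), 2):
        cities = list(tour)
        cities[i], cities[j] = cities[j], cities[i]
        result.append("".join(cities))
    return result


def negative_tour_length(tour: str) -> float:
    """Objective to maximize: minus the length of the closed tour."""
    legs = zip(tour, tour[1:] + tour[0])
    return -sum(math.dist(COORDS[u], COORDS[v]) for u, v in legs)


local_optimum = steepest_ascent("abedc", swap_neighborhood,
                                negative_tour_length)
```

Because the loop only accepts strictly improving moves, it terminates on any finite solution space. The returned tour is guaranteed to be at least as good as all of its neighbors, but not necessarily as good as every tour in the solution space.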