Models for Predictive Railway Traffic Management

Pavle Kecman

TRAIL Thesis Series T2014/5

Summary

The potential growth in transport demand in the next decade and beyond requires a change from reactive to proactive traffic control to maintain and improve the reliability of railway traffic. In order to enable an anticipative approach to traffic management, it is necessary to develop the tools for monitoring, prediction and optimisation of the traffic operations. This thesis presents the models that can be used as components for a decision support system for predictive traffic management.

About the Author

Pavle Kecman received his M.Sc. degree from the University of Belgrade in 2008. In June 2010 he joined the Department of Transport and Planning, Delft University of Technology, as a Ph.D. candidate. He currently works as a postdoctoral researcher at the Department of Science and Technology, Linköping University in Sweden.

TRAIL Research School ISBN 978-90-5584-175-2


Propositions pertaining to the dissertation

Models for Predictive Railway Traffic Management Pavle Kecman

20 October 2014

1. Extracting and processing information from a real-time data stream requires all typical steps for offline data analysis to be performed simultaneously. (Chapter 3)

2. Having in mind the observed variability of running and dwell times, accurate modelling of the latter requires more attention in order to create valid railway planning and control models. (Chapter 4)

3. A data-driven approach outperforms microscopic simulation tools for real-time prediction in terms of prediction quality, requirements for implementation and computation speed. (Chapter 5)

4. A macroscopic rescheduling model that takes into account minimum headway times in stations and overtaking constraints on open track provides fast solutions of good quality. Thus it is applicable to serve as a decision support system for traffic controllers. (Chapter 6)

5. In order to ensure sustainable mobility, the transport market should be strictly regulated based on the proven (dis)advantages of certain modes for certain trips.

6. The number of citations does not say much about the paper quality just like the number of sold copies or tickets is not a quality indicator of a music record or a movie.

7. George Orwell’s dystopian principle: “Who controls the past, controls the future” is turning out to be correct with the increasing impact of historical data on decision making processes in economy, finance, trade and retail.

8. Rational people push the world forward but it’s the irrational people that make it worth living in.

9. Copyright infringement has a better effect on arts and popular culture than the restrictive intellectual property laws.

10. Everything looks bad if you think about it long enough.

These propositions are considered opposable and defendable and have been approved as such by the promotor Prof. Dr.-Ing. I.A. Hansen.



funded by the Ministry of Economic Affairs.

Dissertation

for the purpose of obtaining the degree of doctor at the Technische Universiteit Delft,

by the authority of the Rector Magnificus prof. ir. K.Ch.A.M. Luyben, chairman of the Board for Doctorates,

to be defended publicly on Monday 20 October 2014 at 10:00 by

Pavle Kecman

Master of Science in Traffic & Transport Engineering, University of Belgrade

Copromotor: Dr. R.M.P. Goverde

Composition of the doctoral committee:

Rector Magnificus, chairman

Prof. Dr.-Ing. I.A. Hansen, Technische Universiteit Delft, promotor

Dr. R.M.P. Goverde, Technische Universiteit Delft, copromotor

Prof. dr. ir. R.P.B.J. Dollevoet, Technische Universiteit Delft

Prof. dr. L.G. Kroon, Erasmus Universiteit Rotterdam

Prof. Dr.-Ing. J. Pachl, Technische Universität Braunschweig

Prof. dr. C. Roberts, University of Birmingham

Prof. dr. D. Mandić, University of Belgrade

Prof. dr. ir. S.P. Hoogendoorn, Technische Universiteit Delft, reserve member

This thesis is the result of a Ph.D. study carried out from 2010 to 2014 at Delft University of Technology, Faculty of Civil Engineering and Geosciences, Department of Transport and Planning.

TRAIL Thesis Series no. T2014/5, the Netherlands TRAIL Research School

TRAIL P.O. Box 5017 2600 GA Delft The Netherlands Phone: +31 (0) 15 278 6046 Fax: +31 (0) 15 278 4333 E-mail: info@rsTRAIL.nl ISBN 978-90-5584-175-2

Copyright 2014 by Pavle Kecman.

This work is licensed under the Creative Commons Attribution-NonCommercial 3.0 Unported License. It may be freely shared, copied and redistributed in any medium or format. Transformation and building upon the material is permitted for non-commercial purposes under the condition that the work is properly cited.

Preface

All successful Ph.D. projects are the same, every problematic Ph.D. project is problematic in its own way. The so-called Anna Karenina Principle, named after the first sentence of the great book by Leo Tolstoy, applied in the context of Ph.D. research implies that little can be said about a project that went according to plan during its whole course. An interesting and well-defined topic, good supervision and my passion for railways and research made the previous four years an enjoyable and fruitful period. Or is it just my memory playing tricks because the work is finally completed?

The work presented in this thesis was carried out as a part of the joint project “Model-predictive railway traffic management” between the Department of Transport and Planning (T&P) and Delft Centre for Systems and Control (DCSC) of the Delft University of Technology (DUT). The project was funded by the Dutch Technology Foundation STW. The goal was to develop new models and a new model-predictive controller for anticipative management of railway networks. The work described in this thesis represents the first step towards reaching this objective. It is planned to have the presented models integrated in a closed-loop control approach that will be presented in a separate Ph.D. thesis completed at DCSC.

There are many people who have directly and indirectly helped me to produce this dissertation. My supervisors and colleagues deserve special gratitude for their help and dedication. My direct supervisor Rob Goverde has been closely involved in this research from the very beginning to the final approval of the thesis. He was always available to help and I am very grateful for his contribution and guidance at the difficult points in my research. It was a great pleasure and an honour to work with him and learn from him. I would also like to thank my advisor Professor Ingo Hansen for his critical point of view that was always followed by useful advice to help me improve my work. On a more general note, I am also grateful to him for the fact that his enthusiasm helped to establish railway operations research as an independent scientific discipline represented by high-quality journals and conferences.

I would furthermore like to thank the other people involved in the research project: Bart Kersbergen, Nicolás Weiss, Ton van den Boom and Bart De Schutter on behalf of DCSC. Keeping to schedule was helped to a great extent by the fact that we presented our main findings and research plans on a regular basis to the user committee consisting of experts from academia and industry. The committee members Bob Jansen, Leo Kroon, Edo Nugteren, Alfons Schaafsma, Ello Weits and Jianxin Yuan helped a lot with their comments, questions and advice. Direct support for this research was provided by ProRail, Dick Middelkoop in particular, who helped by providing the data sets needed to build and test our models.

Furthermore, I would like to thank Francesco Corman and Andrea D’Ariano for their help and contribution to the part of this thesis related to real-time rescheduling. Working with them at the early stage of my research was truly a lesson in efficiency, precision and quality that helped me adopt such an attitude in my work. I owe a lot of gratitude to my dear colleagues from the rail group at T&P and the University of Belgrade. Not a single part of this work remained undiscussed between us. From the problem definition, via methodology and programming, to the clarity of the figures, they provided useful feedback for each aspect at any time and place. I am very grateful to Daniel Sparing, Francesco Corman, Lingyun Meng, Egidio Quaglietta, Nikola Bešinović, Nadjla Ghaemi and Peca Jovanović for their time and patience in many casual brainstorming sessions. It was surely the most fun part of doing research. Finally, I would like to thank all technical and administrative staff of T&P and TRAIL Research School for taking care of many practical issues, which allowed me to fully focus on the project.

During the last four years spent in the Netherlands I was lucky to be surrounded by many wonderful people to rely on and have fun with at work and outside. My dear friends in Delft, Rotterdam, The Hague, Amsterdam, Almelo and Groningen were there for me to offer a good laugh and their advice and solution to all problems from doing laundry to weltschmerz and existential crises. Having them in my life is definitely my most important achievement from this period. Having a delayed train is not that bad if you’re in good company. And of course, I am always most grateful to my family for their unreserved support and love that makes the physical distance between us seem so unimportant.

Pavle Kecman Belgrade, August 2014

Contents

Preface

1 Introduction
1.1 Background
1.2 Railway traffic control in the Netherlands
1.3 Motivation
1.3.1 Short-term traffic prediction
1.3.2 Network-wide traffic management
1.3.3 Model-predictive control
1.4 Thesis objectives
1.4.1 Research objective 1 – Monitoring and traffic state prediction
1.4.2 Research objective 2 – Rescheduling models for network-wide traffic control
1.5 Thesis contributions
1.5.1 Monitoring and real-time traffic state prediction
1.5.2 Macroscopic models for network-wide traffic rescheduling
1.6 Thesis outline and scope

2 An overview of railway operation planning and control
2.1 Introduction
2.2 Terminology and basic concepts of railway traffic
2.2.1 Railway timetable
2.2.2 Signalling and interlocking
2.2.3 Blocking time theory
2.2.4 Train position detection
2.2.5 Classification of train delays
2.2.6 Operational control of railway traffic and transport
2.3 Review of approaches for data mining of traffic realisation data
2.4 Review of approaches for process time estimation
2.4.1 Running time estimation
2.4.2 Dwell time estimation
2.4.3 Headway times
2.5 Review of delay propagation analysis and prediction models
2.5.1 Delay propagation analysis
2.5.2 Identifying structural timetable errors and systematic delays
2.5.3 Delay propagation models
2.5.4 Models for delay prediction in real-time
2.6 Review of rescheduling models
2.7 Discussion

3 Process mining of train describer event data
3.1 Introduction
3.2 Methodological framework of the process mining tool
3.2.1 Process mining
3.2.2 Process model
3.3 The Dutch train describer system
3.3.1 System architecture
3.3.2 Data structure and information contained in log archives
3.3.3 Shortcomings in TROTS log files
3.4 Traffic monitoring on open track and in stations
3.4.1 Associating signal messages to train number steps
3.4.2 Logging of automatic block signal passing events
3.4.3 Logging of station events
3.5.1 Process mining train describer data
3.5.2 Main algorithm
3.5.3 Process discovery
3.5.4 Automatic identification of route conflicts
3.5.5 Identification of hindering trains
3.5.6 Estimation of departure and arrival times
3.6 Process mining tool
3.6.1 Case study
3.6.2 Graphical user interface
3.7 Conclusions

4 Data analysis and estimation of process times
4.1 Introduction
4.2 Methodological framework for statistical analysis
4.2.1 Description of the data set
4.2.2 Global model
4.2.3 Local model
4.3 Statistical learning methods
4.3.1 Multiple linear regression
4.3.2 Tree-based non-linear methods
4.4 Process time estimates – global model
4.4.1 Running time estimates derived from the global model
4.4.2 Dwell time estimates derived from the global model
4.5 Process time estimates – local model
4.5.1 Estimation of running times over a particular block
4.5.2 Estimation of dwell times for a particular station
4.6 Comparison of statistical models
4.6.1 Comparison of running time estimation models
4.6.2 Comparison of dwell time estimation models
4.6.3 Comparison of prediction accuracy for scheduled processes
4.7 Conclusions

5 Real-time prediction of train event times
5.1 Introduction
5.2 Framework of the real-time prediction tool
5.3 Microscopic graph based model
5.3.1 The graph model
5.3.2 Graph construction
5.4 Computation of arc weights
5.4.1 Running and dwell arc weights
5.4.2 Headway and connection arc weights
5.4.3 Online process time estimation
5.4.4 Time loss due to route conflicts
5.5 Online prediction of event times
5.5.1 Prediction algorithm
5.5.2 Adjusting the running time estimates due to route conflicts
5.5.3 Adaptive adjustments of running time predictions
5.6 Application on a case study
5.6.1 Experimental setup
5.6.2 Description of the case study
5.6.3 Comprehensive analysis
5.6.4 Example of algorithm execution
5.7 Conclusions and outlook

6 Rescheduling models for real-time traffic management in large networks
6.1 Introduction
6.2 Macroscopic modelling of railway operations
6.2.1 Timed event graphs
6.2.2 Alternative graphs
6.2.3 Conversion of timed event graphs to alternative graphs
6.2.4 Resources as building blocks of alternative graphs
6.3 Models examined
6.3.1 Macroscopic models
6.3.2 Mesoscopic model
6.3.3 Overview of the five models
6.4 Test case A: corridor Utrecht – Den Bosch
6.4.1 Test case settings
6.4.2 Comprehensive evaluation
6.5 Test case B: Dutch national railway network
6.5.1 Description of the tested instances
6.5.2 Comprehensive evaluation
6.5.3 Network-wide effects of reducing delay propagation
6.6 Conclusions and outlook

7 Conclusions
7.1 Summary of the main findings and contributions
7.1.1 Monitoring and traffic state prediction
7.1.2 Network-wide traffic rescheduling
7.2 Recommendations for future work

Bibliography
List of acronyms
Summary
Samenvatting
About the author

List of Figures

1.1 Hierarchical structure of traffic control
1.2 Railway map of the Netherlands
1.3 Workplace of a local traffic controller in Amsterdam
1.4 Cascade MPC framework for traffic control
1.5 Research objectives integrated in a closed loop
1.6 Integration of requirements for real-time prediction tool
1.7 Flowchart of the thesis structure

2.1 Blocking time
2.2 Route conflict
2.3 Blocking time stairways
2.4 Infrastructure based train detection
2.5 Regular time interval train detection
2.6 Structure and information flow within operational planning level
2.7 Illustrative example of the parallel-shift prediction method

3.1 Process mining framework
3.2 Events and processes in micro and mesoscopic models
3.3 Three-layer process model
3.4 Screenshot of a TROTS log file
3.5 Process mining TROTS data
3.6 Flowchart of the process mining algorithm
3.7 Example network
3.8 Observed area for the case study
3.9 Graphical user interface
3.10 Train selection panel
3.11 Infrastructure selection panel
3.12 Time distance diagram
3.13 Blocking time diagram

4.1 Regression tree for running time estimation
4.2 Relative running time prediction error depending on the tree size
4.3 R² of the running time model depending on the tree size
4.4 MSE of running time model depending on the number of trees
4.5 R² of running time model depending on the number of trees
4.6 Regression tree for dwell time estimation
4.7 Relative dwell time prediction error depending on the tree size
4.8 R² of the dwell time model depending on the tree size
4.9 MSE of the dwell time model depending on the number of trees
4.10 R² of the dwell time model depending on the number of trees
4.11 Dependence of running time on delay (left) and box-plots of running times for punctual and delayed trains (right)
4.12 R² for prediction of running time on The Hague HS – Rotterdam corridor
4.13 Delay over corridor Leiden – Dordrecht for train line 2200
4.14 R² for prediction of dwell times on Leiden – Dordrecht corridor
4.15 Dependence of dwell time on delay (left) and box-plots of dwell time (right)
4.16 Dependence of dwell time on scheduled departure time
4.17 Prediction error for dwell times of delayed trains
4.18 Prediction error of running time estimation models
4.19 Prediction error of dwell time estimation models
4.20 Precision of dwell time and running time estimates
4.21 Precision of dwell time and running time estimates relative to scheduled process time

5.2 Space-based train separation
5.3 Time-based train separation
5.4 An example of a mesoscopic DAG
5.5 Time loss dependence on conflict duration: quadratic fit for short (up) and linear fit for long conflicts (down)
5.6 An example of execution of Algorithm 2
5.7 An example of route conflict prediction
5.8 A schematic example of adaptive prediction
5.9 Network and train lines for the case study
5.10 Box plots of prediction error distributions for different prediction horizons
5.11 Mean absolute prediction error depending on prediction horizon
5.12 MAE comparison for adaptive and nonadaptive prediction
5.13 MAE of scheduled event times for a parallel shift strategy and the real-time prediction
5.14 Time-distance diagram of predicted (at 7:13) and realised train paths
5.15 Blocking time diagram predicted at 7:13 (up), realized blocking time diagram (down)
5.16 Effects of adaptive prediction

6.1 Graph representation of resources with infinite capacity
6.2 Graph representation of resources with infinite capacity and headway constraint (left) and a possible selection (right)
6.3 Graph representation of resources with infinite capacity and FIFO constraint (left) and a possible selection (right)
6.4 Graph representation of resources with finite capacity (left) and a possible selection (right)
6.5 Example of sequence-dependent setup times
6.6 Layout of the illustrative example
6.7 Illustrative example – Model 1
6.9 Incompatibility graph of illustrative example
6.12 Layout of infrastructure and main stations
6.13 Timetable
6.14 Dutch railway network considered (in black), with main stations

List of Tables

2.1 Summary of presented approaches for real-time rescheduling
3.1 Train number messages generated by TROTS
4.1 Summary of the training set for running time estimation
4.2 Summary of the LTS model for running time prediction
4.3 Summary of the training set for dwell time estimation
4.4 Summary of the LTS model for dwell time prediction
5.1 Model size for different prediction horizons
6.1 Operational constraints in models
6.2 Quantitative assessment of the 5 models
6.3 Difference in orders between the mesoscopic and each macroscopic model. Direction Ht → Ut
6.4 Characteristics of the network-wide test case
6.5 Quantitative assessment of the macroscopic models on test case B

1 Introduction

1.1 Background

A railway system integrates infrastructure, rolling-stock, staff and a set of strict operational rules into a functional system for transporting passengers and goods. Each of the aforementioned subsystems represents a very complex structure of interconnected entities. Following the directive of the European Commission (2001), a number of railway systems in Europe are horizontally separated into infrastructure managers (IMs) and train operating companies (TOCs). This reform was introduced in order to improve the competitiveness and modal share of railways on the transport market (C. Nash, 2010). The IM is responsible for management and maintenance of railway infrastructure, allocating railway capacity (train paths) to TOCs and organizing and controlling traffic along the network. On the other hand, the main task of a TOC is planning, operation and control of passenger and freight transport.

An IM has the task to coordinate train path requests of TOCs, allocate infrastructure capacity through a timetable in the tactical planning stage, and control traffic in real time at the operational planning level. A timetable in railway traffic is the master schedule that reflects the relationship between the supply and demand in the railway sector. It contains the scheduled process time of each operation, i.e., a train dwelling in a station, running between two scheduled stops, passenger transfers, rolling-stock or crew connections, etc. However, daily variations and unforeseeable disruptive events may render the planned timetable infeasible. Such events are inevitable in modern railway systems with many interacting processes that depend on human behaviour, technical devices, and the environment. In busy and heavily utilized networks, a deviation from the planned path of a single train can easily propagate as a secondary delay to other trains that run over the same infrastructure or have a planned passenger, rolling-stock or crew connection. Prevention and minimisation of delay propagation, and maintaining timetable feasibility are the main responsibilities of operational traffic control (Pachl, 2009).
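The propagation mechanism described above can be illustrated with a small sketch. It is not the model developed in this thesis: it assumes a toy acyclic event graph in which every arc carries a minimum process time, and the function name `propagate_delays`, the event labels and all numbers are hypothetical. Each event occurs no earlier than its scheduled time and no earlier than the realised time of each preceding event plus the minimum process time.

```python
from collections import defaultdict


def propagate_delays(scheduled, arcs, primary_delay):
    """Return the delay (in seconds) of every event.

    scheduled:     dict event -> scheduled time [s]
    arcs:          list of (from_event, to_event, min_process_time [s])
    primary_delay: dict event -> initial (primary) delay [s]

    Assumption: every arc points from an earlier to a later scheduled event,
    so sorting by scheduled time gives a valid processing order.
    """
    incoming = defaultdict(list)
    for origin, target, min_time in arcs:
        incoming[target].append((origin, min_time))

    realised = {}
    for event in sorted(scheduled, key=scheduled.get):
        # An event never occurs before its scheduled time.
        earliest = scheduled[event] + primary_delay.get(event, 0)
        # It also has to wait for every preceding process to finish.
        for origin, min_time in incoming[event]:
            earliest = max(earliest, realised[origin] + min_time)
        realised[event] = earliest
    return {event: realised[event] - scheduled[event] for event in scheduled}


# Hypothetical example: train A runs from X to Y (600 s scheduled, 540 s minimum),
# train B waits in Y for a passenger connection from A (120 s minimum transfer).
scheduled = {"A_dep_X": 0, "A_arr_Y": 600, "B_dep_Y": 720}
arcs = [("A_dep_X", "A_arr_Y", 540), ("A_arr_Y", "B_dep_Y", 120)]
delays = propagate_delays(scheduled, arcs, {"A_dep_X": 180})
print(delays)
```

In this example the 60 s running time supplement absorbs part of the 180 s primary delay of train A, and the remaining 120 s reaches train B as a secondary delay through the connection.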

Railway traffic control is typically hierarchically structured into a local and a global (network) level (Figure 1.1). Local traffic control has the task to perform all safety related actions, set routes for trains, predict and solve conflicts, and control processes that take place on the designated part of infrastructure (Kecman, Goverde, & Van den Boom, 2011). A train typically crosses multiple traffic control areas controlled by different local controllers (signallers and/or dispatchers). The global level (regional or network controllers) comprises the supervision of the state of traffic on the network level, detection of deviations from the timetable, resolution of conflicts affecting the overall network performance, handling failures and events that may have big impact on performance indicators, etc.

Figure 1.1: Hierarchical structure of traffic control (a network controller above local controllers 1, 2, ..., n)

1.2 Railway traffic control in the Netherlands

The railway system in the Netherlands is a typical example of a highly interconnected transport and traffic network. It is one of the most densely utilized networks in the world (Hansen, Wiggenraad, & Wolff, 2013). More than 3000 km of railway tracks, which connect 404 stations in virtually all cities in the country (Figure 1.2), are managed by the infrastructure manager ProRail (ProRail, 2013). With regard to the traffic volumes, i.e., the number of trains, train kilometres and the amount of passengers and goods transported per line kilometre, the Dutch railway network performs almost as well as Switzerland and Japan (Wolff, 2011). At any moment in time during peak hours, there are approximately 400 running trains, 93% of which are passenger trains, mainly operated by the national operator Netherlands Railways (NS) (NS Group, 2013). In the heavily utilized Dutch network, trains are scheduled with short headway times and small time supplements using the advanced timetabling tool Design of Network Schedules (DONS) (Hooghiemstra, Kroon, Odijk, Salomon, & Zwaneveld, 1999). Due to dense traffic and interconnected services, delays and incidents that occur in one part of the network may easily propagate through the whole network (Goverde, 2007). Therefore, operational traffic control has an important task to maintain the planned schedule and recover from disruptions as quickly as possible in order to minimise delays and increase traffic reliability.

Railway traffic control in the Netherlands is divided in two levels with 13 local traffic control centres (Figure 1.3) and a network control centre Operational Control Centre Rail (OCCR). Timetable updates are negotiated and determined in the OCCR between the network traffic controllers of ProRail and transport controllers of NS. The network traffic controllers receive information about the current delays and traffic condition from the local level. Their task, in cooperation with the TOC, is to coordinate the dispatching measures that would decrease deviations from the planned timetable on the network level. The controllers on behalf of NS verify that the proposed updates are feasible with respect to rolling-stock and crew circulation plans. Finally, timetable updates are transmitted to the local level for implementation.

Figure 1.2: Railway map of the Netherlands (legend: lines in use for passenger and cargo trains; lines for cargo trains only)

Local traffic control centres are staffed with signallers, who are in charge of controlling the signals and setting routes in stations, and dispatchers, who observe and supervise traffic conditions and resolve conflicting train routes. In order to manage the complex tasks of controlling dense traffic, controllers and signallers are supported by computer tools for traffic monitoring, remote control of signals and switches, and automatic route setting (Renkema & Van Visser, 1996). Process plans containing the planned routes and schedules for each train are transmitted to the traffic control centres one day in advance. Train positions are monitored and reported by the train describer system Train Tracking and Observation System (TROTS). TROTS messages are received by the traffic control system Verkeersleiding (VKL) that compares the train arrival and departure times with the daily process plan. Train delays are derived by the VKL


full minutes.

Figure 1.3: Workplace of a local traffic controller in Amsterdam (Source: ProRail, photo by Jos van Zetten)

The dispatching support system Procesleiding (PRL) is provided with the process plan and actual train delays from VKL. It comprises the command and monitoring interface with the signalling and interlocking system. Traffic is visualised in the form of a dynamic planned time-distance diagram. PRL is furthermore equipped with an automatic route setting system Automatische Rijweginstelling (ARI), which sets the routes for trains according to the actual process plan and train delays (Berends & Ouburg, 2005). Route conflicts are resolved according to the first-come-first-served principle based on the current train positions or according to the relative train order defined in the timetable. Signallers may also use PRL to set the routes manually in order to manage disruptions and resolve conflicting routes in the process plan. However, they have to rely on their own experience and a set of predetermined what-if scenarios. More details about the Dutch railway system and practice in traffic control are given by Goverde (2005).

1.3 Motivation

While a timetable is carefully planned a year in advance using sophisticated mathematical models, the daily operational control of disruptions and delays still relies predominantly on predetermined rules and the experience and skills of personnel. Traffic controllers do not commonly have any intelligent support tools such as short-term traffic prognosis, conflict detection and prediction or optimal dispatching. Moreover, working in a preventive manner is poorly supported and train traffic controllers are usually restricted to just solving problems as they occur (Kauppi, Wikström, Sandblad, & Andersson, 2005).


1.3.1 Short-term traffic prediction

Signallers typically do not have any intelligent decision support system to estimate the expected running times. In the current Dutch practice, their situational awareness about current train delays is further limited due to the imprecision in the measurements of actual delays. Local traffic control usually assumes that the expected arrival delay equals the current upstream delay, as it has no information about the possible recovery times (except from experience) (D’Ariano, 2008). This method neglects the fact that some trains make up for their delays by running in the maximum performance regime and exploiting the running time supplements incorporated in the timetable. On the other hand, other trains may get even more delayed due to a possible time loss in route conflicts.
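The difference between the two assumptions can be made concrete with a short sketch. It is only an illustration and not the prediction model of this thesis: the function names and all numbers are hypothetical, and the second function simply assumes that a delayed train runs at its minimum running time plus any time lost in a route conflict.

```python
def parallel_shift_delay(current_delay_s):
    """Assume the arrival delay equals the current upstream delay (no recovery, no time loss)."""
    return current_delay_s


def recovery_aware_delay(current_delay_s, scheduled_run_s, min_run_s, conflict_loss_s=0.0):
    """Arrival delay if the train runs at minimum running time plus a conflict time loss.

    The running time supplement (scheduled_run_s - min_run_s) is used to recover delay;
    a train never arrives earlier than scheduled, so the result is clipped at zero.
    """
    predicted_run_s = min_run_s + conflict_loss_s
    return max(0.0, current_delay_s + predicted_run_s - scheduled_run_s)


# Hypothetical train: 180 s late at departure, 600 s scheduled and 540 s minimum running time.
shifted = parallel_shift_delay(180.0)                      # 180 s
recovered = recovery_aware_delay(180.0, 600.0, 540.0)      # 120 s: the 60 s supplement is used
hindered = recovery_aware_delay(180.0, 600.0, 540.0, conflict_loss_s=90.0)  # 210 s
print(shifted, recovered, hindered)
```

The first estimate ignores both effects named in the text, whereas the second one shows how the same upstream delay can shrink through the supplement or grow through a route conflict.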

Delay propagation could be prevented or reduced if the traffic was managed proactively, i.e., if controllers had reliable predictions of route and connection conflicts and the means to prevent them. The main advantage of predictive traffic management is that the traffic controllers can anticipate the occurrence of conflicts so that they have enough time to prevent them using a conflict resolution method. Potentially conflicting train paths in the current process plan need to be predicted in advance based on the accurate monitoring of train positions, speed and infrastructure conditions such as temporary track or speed restrictions that can affect the process times. As a result, unscheduled stops before red signals could be avoided by resolving such conflicts in advance using e.g. rerouteing, changing the train order at the conflict point, retiming or giving speed advice to the drivers.

Several approaches to traffic state prediction can be found in the current practice or academic literature. Macroscopic models (Berger, Gebhardt, Müller-Hannemann, & Ostrowski, 2011; Hansen, Goverde, & Van der Meer, 2010) focus on predicting only the event times in stations (departures, arrivals and through rides). That way, the train separation principles cannot be accurately modelled and the route conflicts on open track sections (between two stations) cannot be predicted. On the other hand, the mesoscopic prediction models that are integrated in the traffic control systems such as Rail Control System (RCS) in Switzerland (Dolder, Krista, & Voelcker, 2009), Styrning av Tåg genom Elektronisk Graf (STEG) in Sweden (Isaksson-Lutteman, 2012) or the short-term prediction module of the rescheduling system Railway traffic Optimization by Means of Alternative Graphs (ROMA) (D’Ariano, 2008) may support the traffic controllers in conflict detection. Every signal passing event is explicitly included in those models. However, the estimates of running and dwell times are computed independently from the actual traffic state. Different performance regimes between delayed and punctual trains, the impact of peak hours or behavioural factors are not considered in the predictions. We use the terminology and definition of macroscopic and microscopic level of modelling described by Radtke (2008) and Schlechte, Borndörfer, Erol, Graffagnino, and Swarat (2011). Macroscopic models consider only station events such as departures, arrivals and through rides. On the other hand, a train run is modelled microscopically on a detailed level of track-clear detection sections. In this aspect we also define a class of mesoscopic models that model traffic on the level of block sections and station routes. The different levels of modelling are explicitly defined in Section 3.5.

1.3.2 Network-wide traffic management

The current practice in operational control of disruptions and delays still relies predominantly on the predetermined rules and experience and skills of personnel. Neither local nor network traffic controllers have a reliable supporting tool to make dispatching decisions, predict their effect and evaluate them. For local traffic controllers, that often leads to creating new conflicts in the adjacent areas and suboptimal effects on the network-wide level. Even the advanced, recently developed tools that can produce optimal solutions for traffic disruptions for a single traffic control area (Caimi, Fuchsberger, Laumanns, & Lüthi, 2012; D’Ariano, Pranzo, & Hansen, 2007) or multiple areas (Corman, D’Ariano, Pacciarelli, & Pranzo, 2012b) are not able to tackle large, network-wide instances, due to the demanding computational requirements of inter-area coordination (Corman, D’Ariano, Pacciarelli, & Pranzo, 2014).

A decision support system for network-wide traffic management is required to continuously supervise traffic on the network and create network-optimal updates to the timetable that can be used as a reference master schedule by the local control level. Due to a high combinatorial complexity of the train rescheduling problem (Törnquist, 2006) the tools developed so far are not directly applicable for large and busy networks. An important requirement for real-time railway applications is the knowledge of the actual train positions, speed and the time needed for computing the solution, perception, decision and subsequent execution of the dispatching measure (e.g. lock of signal or set-up of a new route). A feasible or (near) optimal solution has to be produced and implemented before it is outdated. In other words, the computation time must not exceed the validity of prediction of the traffic state that is given as an input to the rescheduling problem (Lüthi, 2009).

Simplifications to the existing microscopic and mesoscopic tools that reduce the problem complexity were therefore required for applications on the level of national networks. A series of macroscopic models was developed that do not regard all capacity constraints on open track sections and in station areas (Törnquist, 2007; Van den Boom & De Schutter, 2007). Another stream of research was directed to so-called delay management with the purpose to optimise passenger delays by controlling the planned passenger transfer connections (Dollevoet, Huisman, Schmidt, & Schöbel, 2012; Schachtebeck & Schöbel, 2010; Schöbel, 2007). However, these macroscopic models for rescheduling were tested mostly on subnetworks of a national network or large urban networks. Therefore, the problem of controlling railway traffic on the level of national network still remains unsolved.

An important aspect with high impact on applicability of the existing rescheduling tools in practice is the way they handle uncertainty. Running and dwell times are in practice characterised by variability. Moreover, the interdependence of train runs may increase the uncertainty of delays in future. Apart from the recent contributions directed at non-anticipative delay management (Bauer & Schöbel, 2014; Gatto, 2007), other rescheduling models assume full knowledge of current delays and their propagation (with or without any rescheduling actions), which is a limitation for practical application. For that reason, the accuracy of delay predictions is very important for validity of offline rescheduling models.

1.3.3 Model-predictive control

A possible way to model and optimise railway traffic control and overcome the problem of uncertainty is through a closed-loop control paradigm, called model-predictive control (MPC) (Maciejowski, 2002). The essential characteristic of the proposed framework is that it suggests proactive and anticipative (in contrast to reactive) traffic management. Real-time information can be used to predict the occurrence of potential conflicts. Moreover, delay propagation, resulting from route conflicts and planned connections, is prevented by computing optimal control actions.

The theoretical framework of the closed-loop railway traffic control is presented in Figure 1.4. A cascade control system is used to model the hierarchical relationship between the global and a local control level (Lüthi, 2009). Trains are operated according to a timetable and a daily process plan. Due to inevitable disturbances and deviations from the planned schedule, train runs need to be continuously monitored. By monitoring we mean keeping track of all performance indicators such as the actual train positions, delays, realised running and dwell times of all trains, etc. Monitoring therefore provides the actual traffic state that can be used to predict the future evolution of traffic on the network. A predictive traffic model continuously provides the local control level with the information about the expected traffic conditions. It can further be used to evaluate the impact of traffic control actions. In case of larger disruptions that may affect the traffic in a wider area, network traffic controllers can use the prediction model to optimise the traffic on the network, compute the network-optimal timetable updates and transmit them as a reference to the local level. That way all traffic control actions on the local level will lead to the network-optimal traffic state.

MPC has been implemented on the station level by Caimi et al. (2012) and on the level of a corridor by Quaglietta, Corman, and Goverde (2013). Whereas the rescheduling models embedded in these approaches can efficiently control traffic in (multiple) control areas, the prediction component relies on the theoretical estimation of running times and minimum dwell times. Variability of the process times is therefore not incorporated in predictions and prediction accuracy has not been tested against the realised train event times. Moreover, due to high computational requirements, these approaches are not directly applicable for controlling traffic on the level of national network.


[Diagram labels: Timetable, Network controller, Local controller, Railway operations, Monitoring, Prediction, Reference]

Figure 1.4: Cascade MPC framework for traffic control

1.4 Thesis objectives

The main objective of the research presented in this thesis is to develop systems for monitoring, traffic state prediction and network-wide rescheduling that can be embedded in the cascade MPC framework presented in the previous section. The main objective is divided into two research objectives:

• Research objective 1 (RO1) – Develop a tool for monitoring and traffic state prediction

• Research objective 2 (RO2) – Develop a macroscopic model for network-wide traffic control

Research objectives integrated in a feedback loop are illustrated in Figure 1.5. Train positions are reported by a train describer system. The system for monitoring and traffic state prediction (RO1) uses the live stream to determine the actual and future traffic conditions. In case of deviations with impact on a large part of the network, a rescheduling tool (RO2) can produce a network-optimal timetable update as the new reference for railway operation.

1.4.1 Research objective 1 – Monitoring and traffic state prediction

The first research objective in this thesis is to develop a system for monitoring and traffic state prediction. A way to overcome the drawbacks of the current practice and the existing tools for monitoring and short-term traffic prediction (§1.3.1) emerged with the availability of historical traffic realisation data. A real-time stream of raw train describer data can be processed in a way that extracts the actual traffic conditions in the network: train positions, accurate estimates of current delays and realised running and dwell times. Moreover, the archives of event logs can be used to learn how trains behave depending on the traffic conditions. The variability of process times can thus be explained by isolating the factors with a high impact on the corresponding process time. Estimates of future process times depend on the current or predicted values of explanatory variables. Therefore, the predictions will incorporate the empirically determined variation of process times due to e.g. driving style, passenger behaviour or peak hours.

In order to develop a system for monitoring and traffic state prediction, a number of requirements need to be fulfilled.

The first requirement is to develop a data processing tool consisting of a detailed process model of railway traffic and an environment comprising the objects that represent the infrastructure elements and trains. The archives of event logs of the Dutch train describer systems have already been used for reconstructing the realised train paths (Goverde & Hansen, 2000) and identifying route conflicts (Daamen, Goverde, & Hansen, 2008). However, the changes in the data structure and system architecture require a fundamentally different approach. Such an environment should be compatible with an online stream of train describer messages. All objects need to be updated in real time, thus providing the current state of traffic in order to raise the situational awareness of controllers. Moreover, the data processing tool needs to be applicable for ex post processing of traffic realisation data.

[Figure 1.5 diagram labels: Live data stream, Monitoring and prediction (RO1), Actual and future traffic state, Network-wide rescheduling (RO2), Railway operations]

The second requirement is to derive robust estimates of process times. The application of the data processing tool results in the clean, structured data that are prepared for analysis. The earlier efforts in punctuality analysis (Goverde, 2005; Yuan, 2006) focused on computing the descriptive statistical parameters and deriving probability distributions. The resulting distributions can be used for an ex ante timetable analysis and development of stochastic models (Büker & Seybold, 2012; Medeossi, Longo, & de Fabris, 2011). However, in order to develop the predictive models, it is necessary to determine a set of explanatory variables and quantify their impact on process times. The aim is therefore to apply statistical predictive modelling on a training set of historical traffic realisation data. The predictive models can be used to compute the estimates of process times with respect to the actual values of explanatory variables that reflect the state of the traffic on the network. An important requirement in predictive modelling of process times is that the resulting process time estimates need to be robust to noise, missing data and the outliers in the real-time data stream. The robustness of the models is considered as one of the key criteria for selection of the appropriate statistical learning technique (§4.3). Robust estimates and coefficients that reflect the dependence of process times on explanatory variables are stored in a database of historical data.
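Why robustness to outliers matters for process time estimates can be shown with a small sketch. It compares ordinary least squares with the Theil-Sen estimator, which is used here only as a simple stand-in for the robust techniques selected in this thesis. The data (running time as a function of departure delay, with one corrupted log entry) are invented for illustration.

```python
from itertools import combinations
from statistics import median


def least_squares(x, y):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def theil_sen(x, y):
    """Theil-Sen fit: median of all pairwise slopes, then median intercept."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2) if x[j] != x[i]]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Hypothetical data: departure delay [s] vs. realised running time [s].
# Delayed trains run slightly faster; the last observation is a logging error.
delay = [0, 60, 120, 180, 240]
running_time = [300, 294, 288, 282, 600]

ols_slope, ols_intercept = least_squares(delay, running_time)
robust_slope, robust_intercept = theil_sen(delay, running_time)
```

With these numbers the single corrupted observation turns the least squares slope positive, whereas the median-based fit still recovers the decreasing trend of the four clean observations.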

The third requirement is to build a real-time prediction tool. A live stream of train describer data can be processed with the processing tool and give the actual state of the network. Given the information on the current train positions, actual delays, and the realised running and dwell times, the robust estimates of process times can be computed in real time using the predefined predictive models. However, accurate modelling of dependencies between processes of a single train as well as among multiple trains is required in order to predict traffic in large networks over long prediction horizons. The third requirement is therefore to create a fast prediction algorithm that calibrates the realistic traffic model in real time based on the current (future) traffic condition, and estimates the realisation times of all events within a prediction horizon.

The integration of the three requirements defined to reach the first research objective in this thesis is illustrated in Figure 1.6. The approach consists of two parts. The offline part (dash-dot box) comprises data mining of a training set of raw train describer data and creating a database of historical traffic realisation data. The online part (dashed box) processes the live data stream, determines the traffic conditions in the network and predicts the future traffic state using the historical database. Note that the data processing tool is used in both parts.

1.4.2 Research objective 2 – Rescheduling models for network-wide traffic control

The second research objective in this thesis is to develop and validate a macroscopic rescheduling model that can be applied for optimal control of traffic in large and heavily utilised networks. An approach of representing the railway rescheduling problem as a job-shop scheduling problem modelled with alternative graphs has been developed by Mascis and Pacciarelli (2002) and Mascis, Pacciarelli, and Pranzo (2002). In a series of improvements of the solution procedure, the model has been successfully applied for optimal control of traffic in station areas (Mazzarello & Ottaviani, 2007), on a corridor (D’Ariano, Pranzo, & Hansen, 2007) or in multiple traffic control areas (Corman et al., 2012b). However, the problem of controlling country-wide traffic is still open since the coordination of local areas is hard to tackle within a short time and there are multiple interdependencies between trains across the whole network. Therefore, the granularity of the alternative graph model needs to be modified in order to become applicable to problems of rescheduling trains on a large-scale, network-wide level.

[Diagram labels: Training data set, Live data stream, Data processing, Process time estimates, Database, Actual traffic state, Prediction, Future traffic state; boxes marked Offline and Online]

Figure 1.6: Integration of requirements for real-time prediction tool

An important objective of this work is to search for a compromise between the precise modelling of railway capacity constraints and a reasonable time to compute the alternative solutions for the large-scale railway traffic management instances. A suitable choice of granularity of the macroscopic model is crucial in order to find the balance between limiting the problem complexity and maintaining the feasibility of produced solutions.
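The idea of an alternative graph can be made concrete with a toy instance. The sketch below is an illustration under simplifying assumptions and is not the model developed in this thesis: two trains compete for one block section, each train order corresponds to one arc of an alternative pair, and every selection is evaluated by a longest path computation. The node numbering, the arc weights and the objective (the latest exit time) are hypothetical, and the exhaustive enumeration only works for tiny instances.

```python
from itertools import product


def longest_paths(n_nodes, arcs, source=0):
    """Longest path lengths from source (Bellman-Ford style relaxation).
    Returns None if a positive-length cycle makes the selection infeasible."""
    dist = [float("-inf")] * n_nodes
    dist[source] = 0.0
    for _ in range(n_nodes):
        changed = False
        for u, v, w in arcs:
            if dist[u] != float("-inf") and dist[u] + w > dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return dist
    return None  # still improving after n_nodes passes: positive cycle


def best_selection(n_nodes, fixed_arcs, alternative_pairs, sink):
    """Try every way of picking one arc per alternative pair and keep the
    feasible selection with the smallest longest path to the sink."""
    best = None
    for choice in product((0, 1), repeat=len(alternative_pairs)):
        arcs = fixed_arcs + [pair[c] for pair, c in zip(alternative_pairs, choice)]
        dist = longest_paths(n_nodes, arcs)
        if dist is not None and (best is None or dist[sink] < best[0]):
            best = (dist[sink], choice)
    return best


# Nodes: 0 source, 1 A enters block, 2 A leaves, 3 B enters, 4 B leaves, 5 sink.
fixed = [(0, 1, 100), (0, 3, 60),   # earliest entry times of trains A and B [s]
         (1, 2, 120), (3, 4, 90),   # running times through the block [s]
         (2, 5, 0), (4, 5, 0)]
# Alternative pair: either B enters 10 s after A leaves, or A enters 10 s after B leaves.
pairs = [((2, 3, 10), (4, 1, 10))]

result = best_selection(6, fixed, pairs, sink=5)
```

In this instance letting train B go first gives the smaller latest exit time (280 s against 320 s). The number of selections doubles with every additional alternative pair, which is the combinatorial growth that makes the choice of model granularity discussed above so important.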

1.5 Thesis contributions

This section presents the main theoretical, methodological and practical contributions of the research project presented in this thesis. As outlined in the previous section, the research focused on studying, extending knowledge and improving the two main aspects of operational traffic control.


1.5.1 Monitoring and real-time traffic state prediction

Data processing tool

The presented work on fulfilling this requirement provides an insight into data structure and system architecture, and into the advantages and drawbacks of using the new Dutch train describer system TROTS for traffic performance analysis. The previously developed algorithms for processing train describer event data (Goverde & Hansen, 2000) and automatic conflict identification (Daamen et al., 2008) have been taken as a starting point in developing the new data processing tool. However, the new data structure implemented in TROTS required a fundamental modification of the existing algorithms. The tool is developed in an object-oriented environment, which makes it suitable for real-time application in monitoring train movements over the network. In other words, the tool is able to process large data sets in a short time; thus it is applicable for processing live data streams as well as large archives of traffic realisation data. An important contribution is a data mining algorithm that can learn the mutual dependence between track sections and signals implemented in signalling and interlocking systems from the TROTS data. This allows straightforward coupling of signal aspect changes to the train numbers that have caused them, since the TROTS data structure does not reveal a connection between messages coming from signals and train number messages. Moreover, the automatic block signals on open tracks between stations are not logged. This problem has been overcome by incorporating the signalling and interlocking logic in the data mining algorithm, thus enabling accurate monitoring or reconstruction of the realised train paths even in these ‘dark territories’.

The methodology of process mining (Van der Aalst, 2011) has been applied for the first time for mining the train describer event data. This required the development of a 3-level process model that reflects the majority of microscopic operational constraints of railway traffic. The work resulted in a software tool for automatic recovery of train paths and route conflict identification. Application of the developed tool to a set of train describer data significantly increases the precision of delay estimates and enables distinction between primary and secondary delays.

The tool is equipped with a graphical user interface that simplifies analysis of the realised or actual traffic conditions. Traffic data for a particular corridor or station can be selected and visualised to enable the analyst to focus on a specific instance. Moreover, the tabular output of occupation and blocking times of infrastructure elements can be exported and used for analysis of process times and realised capacity utilisation. Realised train paths and route conflicts are visualised using time-distance and blocking time diagrams.

The main contributions are summarised in the following list:


• recovery of process times on the level of signals and block sections in station areas and open tracks

• automatic identification of route conflicts in station areas and open tracks

• computation of delays for all scheduled arrivals and departures

Robust estimates of process times

Advanced predictive modelling and statistical learning techniques are used to develop process time prediction models with strong predictive power. Two different approaches are presented. A single general predictive model is developed that, given the current traffic condition, accurately predicts all running and dwell time estimates. The results of this generic approach can be generalised to the parts of the network and train lines that are not included in the training set. Strong and quantified predictive power of the presented models indicates the applicability of the presented approach for deriving accurate process time estimates.

Moreover, the data structures obtained using the data processing tool motivated the development of separate statistical models for each block section (station route, platform) and each train line. The variability of running and dwell times was explained with greater precision by significantly reducing the number of predictors. Both approaches are validated on an independent test set. We show that the application of the local statistical models produces more accurate predictions. Earlier approaches in this direction (Van der Meer, Goverde, & Hansen, 2010) are improved to include running times on the level of block sections, headway times and time loss due to route conflicts. Robust regression (Rousseeuw & Driessen, 2006), regression trees (Breiman, Friedman, Olshen, & Stone, 1984) and random forests (Breiman, 2001) were used for computing the process time estimates. The resulting estimates are insensitive to outliers and data errors, which is crucial for real-time applications. Therefore, stability of process time predictions is ensured, which is of utmost importance for reliability of estimates and controlling the error propagation to other dependent processes. Moreover, all captured dependencies and results are interpreted and validated using domain knowledge.

The main contributions are summarised in the following list:

• a set of predictors for estimation of dwell times and running times on the level of block sections

• global predictive models for process time estimation based on robust linear regression, regression trees and random forests

• robust regression models for process time estimation for each combination of block, station and train lines


Real-time traffic state prediction

A real-time prediction of train event times is the main contribution of this thesis. Very little work on real-time prediction exists in the current literature and the existing approaches rely on the static predictions that are independent of the actual traffic conditions. Therefore, this work required the development of a new methodology to predict the traffic state. The data-driven approach monitors the current traffic conditions on the network and performs the prediction of the future events.

A mesoscopic traffic model is developed that reflects the microscopic traffic constraints on open track sections and in station areas. The graph model can be continuously updated with new information about the train positions or traffic control actions. Furthermore, a fast prediction algorithm has been implemented that in a single execution calibrates the model and computes the predicted realisation times of all events within a prediction horizon. The model is calibrated depending on the actual traffic conditions on the observed parts of the network.
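How event times can be propagated through a graph of dependencies in a single pass is sketched below. This is a minimal illustration under stated assumptions rather than the algorithm of this thesis: events form an acyclic graph, each arc carries a process time estimate, already realised events are fixed, and every other event takes the latest of its earliest allowed time and the times implied by its predecessors. The event names and numbers are hypothetical.

```python
from graphlib import TopologicalSorter


def predict_event_times(not_before, arcs, realised):
    """not_before: event -> earliest allowed time [s] (e.g. scheduled departure).
    arcs: (from_event, to_event, process_time_s) precedence relations.
    realised: event -> observed time [s] for events that already happened."""
    preds = {event: [] for event in not_before}
    for u, v, w in arcs:
        preds[v].append((u, w))
    # TopologicalSorter expects a mapping node -> predecessors.
    order = TopologicalSorter(
        {event: [u for u, _ in preds[event]] for event in not_before}
    ).static_order()
    times = {}
    for event in order:
        if event in realised:
            times[event] = realised[event]  # observed, no prediction needed
            continue
        times[event] = max([not_before[event]] +
                           [times[u] + w for u, w in preds[event]])
    return times


# Train A left station X 120 s late; train B must follow A out of station Y.
not_before = {"dep_A_X": 0, "arr_A_Y": 0, "dep_A_Y": 400, "dep_B_Y": 450}
arcs = [("dep_A_X", "arr_A_Y", 300),   # running time estimate
        ("arr_A_Y", "dep_A_Y", 60),    # minimum dwell time estimate
        ("dep_A_Y", "dep_B_Y", 120)]   # minimum headway
predicted = predict_event_times(not_before, arcs, realised={"dep_A_X": 120})
```

In this example the departure delay of train A propagates to its arrival, its next departure and, through the headway arc, to train B. Updating the graph with a new observation only requires adding the event to `realised` and running the pass again.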

The mesoscopic character of the tool allows the accurate prediction of route and connection conflicts. For every predicted route conflict, the time loss of the hindered train due to braking, re-acceleration, running with lower speed and unscheduled stops is modelled realistically. The dependence of time loss on the conflict duration is determined from the historical traffic realisation data and quantified. The train dynamics can thus be accurately modelled and the computationally demanding iterative approach to deriving the running times of hindered trains (D’Ariano, Pranzo, & Hansen, 2007) can be avoided.

In order to further increase the accuracy of predictions in real time, an adaptive online error-smoothing component has been implemented. The prediction errors for running trains are monitored and an adaptive filter computes the adjustment to the downstream process time estimates. The trains that significantly deviate from the estimated trajectories are therefore identified in real time and the prediction error for future processes is decreased.
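The principle of such an online correction can be sketched with simple exponential smoothing of the observed prediction errors of one train. The filter type, the smoothing factor `alpha` and the class name are assumptions made for illustration; the adaptive filter used in this thesis is not reproduced here.

```python
class ErrorSmoother:
    """Exponentially smoothed prediction error of a single running train."""

    def __init__(self, alpha: float = 0.5):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.correction = 0.0  # current adjustment in seconds

    def update(self, predicted_s: float, realised_s: float) -> float:
        """Blend the newest error (realised minus unadjusted estimate)
        into the running correction and return it."""
        error = realised_s - predicted_s
        self.correction = self.alpha * error + (1.0 - self.alpha) * self.correction
        return self.correction

    def adjust(self, estimate_s: float) -> float:
        """Apply the current correction to a downstream process time estimate."""
        return estimate_s + self.correction


# A train that is consistently about 10 s slower than estimated per block.
smoother = ErrorSmoother(alpha=0.5)
first = smoother.update(predicted_s=100.0, realised_s=110.0)
second = smoother.update(predicted_s=200.0, realised_s=210.0)
adjusted = smoother.adjust(300.0)
```

A larger `alpha` reacts faster to a train that deviates from its estimated trajectory, at the cost of passing more measurement noise on to the downstream estimates.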

A comprehensive analysis of algorithm performance has been carried out on a real-life case study. The computation speed and accuracy of predictions prove the applicability of the concept of data-driven predictions. The obtained results indicate a significant improvement of precision compared to the approaches used in the current practice and implemented in the relevant academic tools. The stability of predictions over different horizons is examined and the optimal prediction horizon is determined with respect to the accuracy of predicted arrival and departure times and accurate prediction of route conflicts.

• a mesoscopic traffic model that reflects microscopic constraints on open track sections and in stations


• a prediction algorithm that can quickly predict the traffic evolution in large and busy networks over a long prediction horizon

• adjustment of running times of the trains hindered in route conflicts

• adaptive adjustment of process time estimates in real-time

1.5.2 Macroscopic models for network-wide traffic rescheduling

The main methodological contribution of the work on this objective is a realistic macroscopic model for real-time rescheduling that can solve the network-wide problem instances in short time. The appropriate macroscopic rescheduling model is created as a result of investigating the trade-off between the quality of solutions and the computation time. The effect of increasing the number of considered macroscopic constraints on solution quality, feasibility and the corresponding computation time is presented. The macroscopic models are validated by comparing their performance with the results obtained using a detailed mesoscopic model.

Aggregation of mesoscopic constraints to the macroscopic level was performed in a realistic manner that ensures the feasibility of the solutions produced by the macroscopic model. This includes computation of minimum headway times with respect to blocking time theory instead of using the predetermined norms, which is a common approach in current practice and academic research. A modification of the existing alternative graph models is therefore presented that enables computation of minimum headway times with respect to train orders.
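The dependence of the minimum headway on the train order follows directly from blocking time theory and can be sketched in a few lines. The sketch assumes that both trains use the same sequence of block sections and that the blocking interval of each block is given relative to the train's own departure; the block names and times are invented for illustration.

```python
def minimum_headway(first, second):
    """Minimum departure headway [s] of `second` behind `first`.

    Each argument maps a block name to its blocking interval (start, end) in
    seconds relative to that train's own departure. The second train may not
    start blocking a shared block before the first train has released it, so
    the headway is the largest value of end_first - start_second."""
    shared = first.keys() & second.keys()
    if not shared:
        raise ValueError("the trains share no block section")
    return max(first[b][1] - second[b][0] for b in shared)


# Hypothetical blocking times of a slow and a fast train over three blocks.
slow = {"b1": (-30, 90), "b2": (60, 200), "b3": (170, 330)}
fast = {"b1": (-30, 60), "b2": (40, 120), "b3": (100, 180)}

fast_after_slow = minimum_headway(slow, fast)
slow_after_fast = minimum_headway(fast, slow)
```

For these values a fast train behind a slow one needs 230 s, whereas the reverse order needs only 90 s. A single predetermined norm cannot capture this asymmetry, which is why order-dependent headways are useful in a macroscopic rescheduling model.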

The feasibility of the approach is demonstrated by a real-world case study for the Dutch national railway network. It is based on the DONS database represented in the form of a timed event graph (TEG) (Goverde, 2007). A data mining algorithm was developed that sweeps through a TEG, builds the macroscopic resources, i.e., stations, open track sections, and converts the TEG into an alternative graph based on running, dwell, headway and connection constraints. The model is applied to a substantial number of realistic disruption scenarios in a large instance that includes a peak hour of traffic in the complete Dutch railway network.

• a mesoscopic alternative graph model (D’Ariano, 2008) modified in order to incorporate macroscopic operational constraints

• an approach to convert a timed event graph into an alternative graph

• four macroscopic rescheduling models created to investigate the trade-off between solution quality and computation time

• the most complex model built with respect to macroscopic operational constraints produces feasible solutions in short computation time


1.6 Thesis outline and scope

This thesis consists of seven chapters (including this one). Based on the content, the first two chapters can be grouped into an introductory part. Similarly, the three chapters focusing on monitoring and traffic state prediction can be grouped into a coherent part. The structure of the parts and their relationship is illustrated in a flowchart in Figure 1.7. Related chapters are grouped and arrows indicate the order in which the chapters could be read.

[Flowchart boxes: Introduction; An overview of railway operation planning and control; Process mining of train describer event data; Data analysis and estimation of process times; Real-time prediction of train event times; Rescheduling models for real-time traffic management in large networks; Conclusions]

Figure 1.7: Flowchart of the thesis structure

Part I contains the first two chapters of this thesis. In Chapter 2 the main concepts of railway systems, definitions and terminology needed for understanding the remainder of the thesis are introduced. Moreover, a review of the most important contributions in the scientific literature related to the problem of railway operation and traffic control is presented.

Part II of the thesis consists of the chapters related to data-driven decision support systems for monitoring and real-time traffic state prediction. Chapter 3 presents a process model that is used for mining the traffic realisation data. The developed process mining method is applied for real-time monitoring of railway traffic and ex post analysis, i.e. recovery of realised train paths and identification of route conflicts. The chapter first describes the data structure of TROTS files, preprocessing steps and input preparation. Moreover, the underlying algorithms and procedures are described with a great level of detail. Finally, the graphical user interface and visualisation component is presented that can be used to raise the situational awareness of traffic controllers or simplify performance analysis, depending on the application of the tool.

Chapter 4 focuses on statistical analysis of traffic realisation data and computing robust estimates of process times. The statistical learning tools used are described, followed by the descriptive and inferential statistical analyses of process times and route conflicts. Furthermore, we use the real-life data set to test and validate the common assumptions used to describe the variability of process times. We test the impact of delays and peak hours on process times. Finally, the results of model performance in an application to a test set of historical data are presented.

Chapter 5 gives a description of a real-time prediction tool as well as its position in the railway traffic control loop. The main prediction algorithm is presented followed by a description of the adaptive components that modify the estimates of process times with respect to the current (unexpected) traffic conditions. A real-life case study is further described that is used to test the performance of the complete tool for monitoring and traffic state prediction. The integration of the components for data processing, analysis and prediction is described and model accuracy is extensively discussed.

Chapter 6 presents different rescheduling models for dynamic management of large-scale networks. The principles of deriving alternative graph models from macroscopic data are described, as well as the procedure to convert a timed event graph into a rescheduling model. Furthermore, this part focuses on the procedure to find an appropriate level of granularity for modelling railway traffic on a macroscopic scale with respect to the basic requirements for rescheduling problems such as the solution quality and computation time. A comprehensive analysis of the model performance with respect to the validated mesoscopic model is given, followed by the results of the application to a real-life case study of the Dutch national network.

Finally, Chapter 7 summarises the main findings and contributions of the thesis. Limitations of the performed research are also discussed and clear directions for further research are given.

As illustrated in Figure 1.7, there are multiple ways to read this thesis depending on the prior knowledge and interest of the reader. Readers with good knowledge of railway terminology and system properties may proceed directly to Chapter 3. Similarly, readers with particular interest in the rescheduling aspect of dynamic traffic control can, after the introductory part, proceed directly to Chapter 6.


An overview of railway operation planning and control

2.1 Introduction

The research motivation and main objectives addressed in this thesis were described in the previous chapter. Before presenting the main contributions of this research in the following chapters, it is important to define the problems of traffic control in more detail and review the existing contributions from the scientific community.

This chapter first presents the adopted terminology and the basic concepts of railway traffic. The operational rules, implemented in the signalling system and timetable, are of crucial importance for developing new and analysing the existing mathematical models of railway traffic. Moreover, the current practice of traffic control is presented and the main problems are identified. The problems related to railway traffic control and performance analysis have been addressed by numerous contributions. We give a critical review of the existing approaches and emphasise the gaps in the state-of-the-art models that are filled by the tools presented in this thesis.

In the first part of the chapter, the basic definitions related to railway timetables, signalling and safety systems, train delays and traffic control are given (§2.2). This is followed by a separate literature review for each research objective and the corresponding requirements. Section 2.3 gives the literature review of processing and mining the train describer and traffic realisation data. The description of running, dwell and headway times, and approaches to their computation and estimation is given in Section 2.4. Section 2.5 presents the recent works on delay analysis, propagation modelling and prediction. The recent real-time rescheduling models are presented in Section 2.6. Finally, we discuss the existing practice and models and analyse their applicability for monitoring, traffic state prediction and network-wide rescheduling (§2.7).


2.2 Terminology and basic concepts of railway traffic

2.2.1 Railway timetable

Railway traffic usually operates according to a timetable. The railway timetable in the Netherlands is periodic, meaning that the pattern of arrivals and departures of all trains is repeated at regular intervals. The timetable construction for the Dutch network is supported by a sophisticated mathematical optimisation model based on the periodic event scheduling problem (PESP) (Serafini & Ukovich, 1989), which was applied to the train timetabling problem by Schrijver and Steenbeek (1993). The DONS database contains the running and dwell times for each train, as well as the headway and connection constraints that need to be respected in order to design a feasible timetable for dense traffic of interconnected train lines (Hooghiemstra, 1996).
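The periodicity described above can be illustrated with a minimal sketch. It assumes event times are integer minutes within a period T and uses the standard PESP constraint form, l ≤ t_j − t_i + pT ≤ u for some integer p. The function names and the numbers in the usage note are hypothetical and are not taken from DONS.

```python
def pesp_satisfied(t_i, t_j, lower, upper, period=60):
    """Check the periodic constraint l <= t_j - t_i + p*T <= u for some integer p.

    Times are event minutes within the period (assumption: integer minutes).
    """
    # Shifting by the lower bound and reducing modulo T picks the smallest
    # non-negative slack; the constraint holds if it fits in the window u - l.
    return (t_j - t_i - lower) % period <= upper - lower


def expand_periodic(event_minute, period, start, end):
    """List all occurrences of a periodic event in the half-open window [start, end)."""
    first = start + (event_minute - start) % period
    return list(range(first, end, period))
```

For example, with an hourly period, a departure at minute 58 followed by an arrival at minute 3 satisfies a process time window of 4 to 6 minutes, because the time difference is taken modulo the period.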

The running time comprises the period between the train departure and the complete halt at the arrival station. It contains the outbound running time from the platform track to the departure signal, the running time from the departure signal to the home signal at the arrival station and the inbound running time from the home signal until standstill at the platform track.

We distinguish between minimum running times and scheduled running times. Minimum running times are computed with respect to the defined maximum speed on the train route and dynamic properties of rolling stock and infrastructure which reflect the acceleration and braking characteristics (Brünger & Dahlhaus, 2008). In terms of dynamic properties, a train run in full performance regime between two scheduled stops can be divided into acceleration, cruising at the maximum speed and braking continuously at the standard braking rate until stop at the platform (Albrecht, Goverde, Weeda, & van Luipen, 2006).
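The three phases of a full-performance run can be sketched as a simple kinematic calculation. This is a deliberately simplified illustration: it assumes constant acceleration and braking rates, a single speed limit and level track, whereas real running time calculation uses speed-dependent tractive effort and resistance. The function and parameter names are hypothetical.

```python
import math


def minimum_running_time(distance_m, v_max, accel, brake):
    """Running time in seconds for accelerate / cruise / brake between two stops.

    Assumptions (not from the source): constant accel and brake rates in m/s^2,
    one speed limit v_max in m/s, no gradients or resistance.
    """
    d_acc = v_max ** 2 / (2 * accel)   # distance needed to reach v_max
    d_brk = v_max ** 2 / (2 * brake)   # distance needed to stop from v_max
    if d_acc + d_brk <= distance_m:
        # Long section: the train reaches v_max and cruises in between.
        cruise_time = (distance_m - d_acc - d_brk) / v_max
        return v_max / accel + cruise_time + v_max / brake
    # Short section: the train brakes before reaching v_max.
    v_peak = math.sqrt(2 * distance_m * accel * brake / (accel + brake))
    return v_peak / accel + v_peak / brake
```

With a 10 km section, a speed limit of 40 m/s and rates of 0.5 m/s² for both acceleration and braking, the sketch gives 80 s of acceleration, 170 s of cruising and 80 s of braking.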

The scheduled running times are given in the timetable. In order to increase the reliability and robustness of the timetable to varying running times and decrease energy consumption, the scheduled running times contain a certain amount of running time supplements (Goverde, 2005). The value of the running time supplement, currently in use in the Netherlands, is 5% of the minimum running time. The running time supplements can also be used for energy-efficient driving. The typical strategies are cruising at a lower speed than the maximum and/or coasting before braking to standstill (Albrecht et al., 2006).
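As a small worked example of the supplement rule, the sketch below adds a proportional supplement to a minimum running time. The 5% default follows the value quoted above; the function name is hypothetical, and any rounding of scheduled times to timetable precision is left out because it is not described here.

```python
def scheduled_running_time(min_running_time_s, supplement_ratio=0.05):
    """Return (scheduled time, supplement) in seconds.

    The supplement is a fixed fraction of the minimum running time;
    rounding to timetable precision is intentionally not modelled.
    """
    supplement = min_running_time_s * supplement_ratio
    return min_running_time_s + supplement, supplement
```

A minimum running time of 600 s therefore yields a supplement of about 30 s and a scheduled running time of about 630 s.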

The dwell time is the time between the arrival of a train to standstill at the platform track and its subsequent departure after the scheduled stop. Weidmann (1995) determined the factors with high impact on the duration of dwell times. They include, among others, the number, structure and distribution of passengers on the platform as well as the vehicle and platform design. Dwell times are modelled in great detail by dividing them into several sub-processes: door unblocking, door opening, passenger boarding and alighting, door closing and train dispatching (Buchmueller, Weidmann, & Nash, 2008).
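The decomposition of the dwell time into sub-processes can be represented as a small data structure. The sketch below assumes that the sub-processes run strictly one after another, so that the dwell time is their sum; this is a simplification for illustration, and the class name and the durations in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DwellTime:
    """Durations in seconds of the dwell sub-processes listed above."""
    door_unblocking: float
    door_opening: float
    boarding_alighting: float
    door_closing: float
    dispatching: float

    def total(self) -> float:
        # Assumption: sub-processes are sequential and do not overlap.
        return (self.door_unblocking + self.door_opening
                + self.boarding_alighting + self.door_closing
                + self.dispatching)
```

For instance, `DwellTime(2, 3, 40, 5, 10).total()` gives 60 seconds, most of which is contributed by passenger boarding and alighting.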