Day-to-Day Origin-Destination Tuple Estimation and Prediction with Hierarchical Bayesian Networks Using Multiple Data Sources


Yinyi Ma (Corresponding author)
Erasmus University Rotterdam, Department of Decision and Information Science
Burg. Oudlaan 50, 3062 PA Rotterdam, The Netherlands
Tel. +31 10 4089783, Fax +31 10 4089010, yma@rsm.nl

Roelof Kuik
Erasmus University Rotterdam, Department of Decision and Information Science
Burg. Oudlaan 50, 3062 PA Rotterdam, The Netherlands
Tel. +31 10 4082019, Fax +31 10 4089010, rkuik@rsm.nl

Henk J. van Zuylen
Delft University of Technology, Section Transportation and Planning
2600 GA Delft, The Netherlands
Tel. +31 13 5056331, Fax +31 13 5056332, h.j.vanzuylen@tudelft.nl

Date of Submission: 15 Nov, 2012

Word Count: 5497 + 250 × (6 figures + 2 tables) = 7497

Submitted for presentation and publication to the 92nd Annual Meeting of the Transportation Research Board, 13-17 January 2013, Washington, D.C.


ABSTRACT

Predicting traffic demand is essential, either to understand the future traffic state or to take the measures needed to alleviate congestion in the next time period. In transportation planning, an origin destination (OD) matrix is usually used to represent traffic demand between two zones. Vehicles are assumed to be homogeneous and the trips of each vehicle are examined separately. This traditional OD matrix lacks a behavioral basis and trip-based model structure. A second research stream, travel activity-based research, examines individual travel behavior and takes the trip chain of travelers into account. However, its scope is limited to the attributes of the trips and ignores the road network. To link these two fields and to better predict traffic demand, we propose the concept of the Origin Destination Tuple (ODT), a sequence of dependent OD pairs. With the help of advanced monitoring systems that identify and track vehicles in the road network, the additional uncertainty introduced by ODTs can be mitigated and the under-specification of the estimation problem reduced. We propose a Hierarchical Bayesian Networks mechanism in Gaussian space with a multi-process model to obtain the posterior of the uncertain parameters. The model includes level and trend components to predict future traffic volumes. A case study demonstrates that the proposed method is feasible for predicting demand and that path flows from cameras can reduce the uncertainty in the estimation and prediction process, especially for the OD tuples.

KEY WORDS: Origin Destination Tuple, Hierarchical Bayesian Networks, Multi-process, Demand Prediction, Multiple Data Sources.


1. INTRODUCTION

Predicting traffic demand is essential for policy makers who need to understand what may happen in the road network in the future. For instance, road congestion occurs at particular times and on particular days, such as during peak hours, bad weather and festivals. Consequently, authorities want to predict the travel volume during these special events so as to alleviate congestion.

The concept of traffic demand is derived from trip generation in transportation planning. It addresses the production and attraction between two zones, represented by origin destination (OD) pairs. The vehicles travelling between OD pairs to fulfill demand are usually assumed to be homogeneous and the trips of each vehicle are taken to be separate. In reality, however, the trips of vehicles are inter-related. One vehicle may appear in the time-dependent OD-matrices several times a day, according to its schedule or travel plan. Commuters travel from home to work in the morning and back home in the afternoon. Trucks with multiple tasks drive every day from a distribution center to a store and later to a port area, for instance. Drivers have to find a resting area after two hours of driving, according to the rule. The traditional definition of the OD matrix lacks a behavioral basis and trip-based model structure (1). This setup ignores the behavioral fact that people plan ahead and choose the attributes of each trip (including mode, destination, and departure time) while considering the entire trip chain, not each individual trip separately (1). In traffic engineering, the dynamic OD-matrices are mainly taken as inputs to dynamic traffic assignment, which generates the traffic flow on each link. Loop detectors, often the only observation instrument on highways, are used to capture road network information, to calibrate traffic simulation and to further estimate the OD-matrices. Thus, one reason why the trip chain is ignored in the OD-matrices may be that loop detector data are anonymous, so that vehicles cannot be identified.

Meanwhile, another research stream, travel activity-based research, examines individual travel behavior, such as activity schedules and travel choices. Jones et al. (2) provide a comprehensive definition of activity analysis: it is a framework in which travel is analyzed as daily or multi-day patterns of behavior, related to and derived from differences in life styles and activity participation among the population. This research takes into account that travelers have travel plans in the form of a trip chain, such as from home to work and back (3). Survey data (4, 5) are the main information source supporting this research. Although surveys may reveal some trip chains of travelers, the sparseness of survey data restricts the representation of travel behavior. It is consequently hard to replace the OD-matrices by patterns of behavior as the input of dynamic traffic assignment. Thus, the scope of research on patterns of behavior is normally limited to the demand side and ignores the road network.

Due to the conceptual differences and the data capture limitations, there is no link between the OD-matrices in transportation planning and the trip chains in activity-based research, although many similarities and potential benefits have been shown (1, 2, 3). To fill this gap, we introduce the concept of the Origin Destination Tuple (ODT) to represent traffic demand. A tuple, as used in set theory, is a sequence of elements. An Origin Destination Tuple is a sequence of OD pairs within a certain time period, representing a trip chain in the road network. The traditional OD pair is the simplest case of an OD tuple. However, introducing ODTs brings an extra challenge for estimating and predicting demand if measurement is based on link flows only. It is usually assumed that the demand for traffic from origin to destination acts as an antecedent for the travel volume on links in the network. Since the number of OD pairs is much larger than the number of links, the estimation problem becomes under-specified (6). The use of ODTs aggravates this issue, because the number of unknowns grows further.

Fortunately, advanced monitoring systems that can identify individual vehicles, such as Automated Number Plate Recognition (ANPR) cameras and Bluetooth scanners, are now being installed along the road network on a large scale. GPS navigation systems installed in vehicles record their exact routes. These devices can deliver rich traffic information, which is potentially useful to understand and even predict the trip chains of travelers in the road network. Good use of these advanced monitoring systems should decrease the uncertainty of estimating and predicting OD tuples.

The research questions are therefore: how to connect the macro-level Origin Destination Tuple with the micro activity-based level; how to predict the ODT; and how to fuse multiple data sources to reduce the uncertainty in the ODT.

In the following section, the literature on dynamic OD estimation and prediction and on travel activity-based models is reviewed. The methodology is presented in Section 3. The case study in Section 4 illustrates the methodology, and Section 5 concludes the paper with a discussion.

2. LITERATURE REVIEW

In this section, the literature on dynamic OD estimation and prediction is reviewed with respect to the different methodologies and the diverse data combinations. In addition, papers on travel activity-based models are analyzed from the concept and data perspectives.

2.1 Dynamic Origin-Destination Estimation and Prediction

There are three main methods to estimate and predict dynamic OD-matrices: least squares, Kalman filtering and the Bayesian method. Other methods that have drawn attention are entropy maximization (7), variational inequality (8), gradient approximation (9) and the Horvitz-Thompson estimator (10). Most researchers use loop detector data, while some more recent studies combine these with automatic vehicle identification (AVI) data. The way in which they apply the AVI data ignores the vehicle identification, so the trip chain in the road network is unaccounted for. TABLE 1 presents an overview of the literature.

TABLE 1 Summary of OD Estimation Literature

| Data source | Least Squares | Kalman Filtering | Bayesian Method | Other Methods |
|---|---|---|---|---|
| Loop | Li and Moor (2002); Lin and Chang (2007) | Chang and Wu (1993); Ashok and Ben-Akiva (2000); Zhou and Mahmassani (2007) | Hazelton (2008) | Nguyen (1988): entropy maximization; Nie and Zhang (2008): variational inequality; Frederix and Viti (2011): gradient approximation |
| Loop, AVI/Bluetooth/GSM | Asakura (2000); Zhou and Mahmassani (2006) | Dixon (2002); Barcelo (2010) | Van der Zijp (1997) | Zhang and Qin (2010): Horvitz-Thompson estimator |
| Loop, Survey | Cascetta (1993) | | | |

Researchers apply least squares to minimize the deviation between historical OD-matrices and the estimated OD-matrices that fit the traffic flow best (11, 12, 13). Estimators based on least squares have the advantage of being mathematically rather easy to solve, especially for large problems.

Kalman filtering is widely used to adapt model parameters to the measured characteristics of the modeled reality. Working in state space, this method usually assumes Gaussian errors, where normality makes the computation easy and efficient. Chang and Wu (14), Ashok and Ben-Akiva (15), Dixon (16), Zhou and Mahmassani (17), and Barcelo (18) use Kalman filtering to estimate and predict dynamic OD matrices.

The Bayesian method (19, 20) is a classical way to update information by incorporating prior information in a natural manner. It can deal with all kinds of distributions, which makes it more realistic and less restrictive, but the computation is quite time-consuming due to the requirement of codifying prior knowledge into a prior distribution.

These three methods are very similar in their basic mechanism if normally distributed errors are assumed. Kalman filtering, in which all distributions are normal, is a special case of recursive Bayesian estimation.

2.2 Travel Activity-Based Model

The travel activity-based model integrates household activities, regional demographics and transportation networks in an explicitly time-dependent fashion (21). Activity-based research enriches trip generation in the conventional four-step transportation planning process. To understand individual choice behavior, researchers represent it with discrete choice models (3, 22) and micro-simulation (21), based on survey data (3, 23), GIS (24, 25) and GPS data (23, 25).

Bowman and Ben-Akiva (3) presented a disaggregate discrete choice activity schedule model. The model is designed to capture interactions among an individual's decisions throughout a 24-hour day by explicitly representing tours and their interrelationships in an activity pattern. They generate time- and mode-specific trip matrices for prediction at the transportation system level from available daily survey data. Wang and Cheng (24) develop a spatio-temporal data model to support activity-based transport demand modeling in a GIS environment, identifying spatial and temporal opportunities for activity participation.

2.3 Summing up

Traditional OD-matrix estimation provides the input for the road network with anonymous vehicles. In contrast, the travel activity-based model focuses on individual behavior. The two research streams have so far remained unconnected, due to either the research scope or the lack of individual data in the road network. Yet there is potential benefit in integrating the two. Our research fills the gap between the two research streams through the concept of the OD tuple and uses it to predict traffic demand.

3. METHODOLOGY

In this section, the concept of the Origin Destination Tuple is elaborated, and then the Hierarchical Bayesian Networks mechanism is applied to obtain the posterior predicted Origin Destination Tuple. Considering the stochastic nature of the model and computational efficiency, Kalman filtering in Gaussian space is used to obtain the mean and variance of the ODT.

3.1 Origin Destination Tuple

The proposed concept of the Origin Destination Tuple is an extension of the Origin Destination pair. An ODT is an ordered set of OD pairs, travelled by a number of vehicles with the same entries to and exits from the road network. Clustering travelers with the same travel pattern, from a geographic point of view and during a certain time period, takes individual travel behavior into account. The ODT brings travel demand from the aggregate level to the individual behavior level. It addresses not only the number of anonymous vehicles from an origin to a destination, but also the trip chain of vehicles with the same travel pattern.

For example, in the network of FIGURE 3, the traditional OD data for a whole day could be 6000 vehicles for 3→4 and 5000 for 6→7. Suppose that, among this demand, 500 vehicles make the trip 3→4 first and then 6→7, as an ODT. The demand data should then consist of 3 OD tuples instead of 2 OD pairs: 5500 vehicles for 3→4, 4500 for 6→7, and 500 for 3→4~6→7. The ODTs are assigned to the network by the same rules as All-or-Nothing or Stochastic User Equilibrium assignment, except that a vehicle must be counted multiple times in the mapping to loop detectors if it passes the same link more than once during a certain time interval.
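The arithmetic of this example can be sketched in a few lines of Python. This is an illustration only: the function names `split_od_into_odt` and `link_counts` and the dictionary layout are assumptions made here, not part of the paper, and the example is read as OD pairs 3→4 and 6→7 with 500 chained vehicles.

```python
def split_od_into_odt(od_counts, chained):
    """Split daily OD counts into OD tuples (ODTs).

    od_counts: {(origin, destination): vehicles per day}
    chained:   {((o1, d1), (o2, d2), ...): vehicles making that trip chain}
    Returns a dict keyed by tuples of OD pairs; a plain OD pair is a 1-tuple.
    """
    odt = {(pair,): count for pair, count in od_counts.items()}
    for chain, count in chained.items():
        for pair in chain:
            # Chained vehicles are no longer counted as independent trips.
            odt[(pair,)] -= count
        odt[chain] = count
    return odt


def link_counts(odt, links_of_pair):
    """Loop-detector view: a chained vehicle is counted on every link it passes."""
    counts = {}
    for chain, vehicles in odt.items():
        for pair in chain:
            for link in links_of_pair[pair]:
                counts[link] = counts.get(link, 0) + vehicles
    return counts


od = {(3, 4): 6000, (6, 7): 5000}
odt = split_od_into_odt(od, {((3, 4), (6, 7)): 500})
print(odt)
```

With one hypothetical link per OD pair, `link_counts` returns the original 6000 and 5000 counts, which shows why anonymous link counts alone cannot reveal the 500 chained vehicles.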

In the short term, predicting ODTs can help to better understand the interaction between activity-driven travel and real travel behavior in the network. In the long term, transport policies that aim to improve travel conditions, such as road pricing or tolling systems, may draw on ODTs.

3.2 Hierarchical Bayesian Networks to Predict Origin Destination Tuple

Hierarchical Bayesian Networks represent the probabilistic dependencies between variables as a directed acyclic graph, where each node of the graph corresponds to a random variable and is linked by the conditional probability of that variable given the values of its parents in the graph (27). FIGURE 1 shows the Hierarchical Bayesian Network for predicting the Origin Destination Tuple, with three layers: hyper-parameter, parameter and data. The survey data and an a priori ODT distribution are located in the hyper-parameter layer. The variables corresponding to ODT volumes are in the parameter layer. The observations in the data layer are link flows and path flows.

[Figure 1 diagram. "HYPER-PARAMETER" layer: Individual Trip Chain C(r,i,t,d), Aggregated Trip Chain S(i,t,d), Upscaled Factor, A Priori ODT U(i,t,d). "PARAMETER" layer: Estimated ODT D(i,t,d) and Predicted ODT D(i,t,d+1), linked by the dynamics. "DATA" layer: Link Flows V(l,h,d) and Path Flows W(c1c2,h,d).]

FIGURE 1 Hierarchical Bayesian Networks for Predicting Origin Destination Tuple.

3.2.1 Hyper-Parameter Level

At the hyper-parameter level, the individual trip chains are first obtained from survey data. Aggregating the travelers who have the same travel pattern gives the aggregated trip chain. Since the aggregated trip chain is sample data, it is scaled up to generate the a priori demand.

The individual activity trip chain $C_{r,i,t,d}$, with individual traveler $r$ having ODT pattern $i$ at departure time interval $t$ on day $d$, is related to both the scheduled time and the activity location. With the scheduled time, travelers determine the departure time, and with the activity locations, travelers decide the travel patterns. Here we assume that the travel modes are either cars or trucks.

Aggregating the vehicles $r$ with the same ODT pattern, $S_{i,t,d} = \sum_{r} C_{r,i,t,d}$, gives the sample demand of ODT $i$ at time interval $t$ and day $d$. A random growth factor $\alpha_{i,t,d}$ is then used to scale up the sample demand, giving the a priori ODT $U_{i,t,d}$. This random factor may bring uncertainty to the subsequent estimation, especially in the extreme case of only one or zero vehicles in a specific ODT.

$$U_{i,t,d} = \alpha_{i,t,d}\, S_{i,t,d} \qquad (1)$$
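The aggregation and up-scaling of Equation 1 might look as follows in Python. This is a minimal sketch under stated assumptions: the record fields (`odt`, `interval`, `day`), the function names, and the range of the random growth factor (5 to 15) are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def aggregate_trip_chains(survey_records):
    """S[i, t, d]: number of surveyed travelers with ODT pattern i, interval t, day d."""
    return Counter((rec["odt"], rec["interval"], rec["day"]) for rec in survey_records)


def upscale(sample_demand, low=5.0, high=15.0, seed=0):
    """U[i, t, d] = growth factor * S[i, t, d], with a random factor per entry.

    The bounds low/high are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    return {key: rng.uniform(low, high) * s for key, s in sample_demand.items()}


# Hypothetical survey records: two travelers on 3-4, one on the chain 3-4 then 6-7.
survey = [
    {"traveler": 1, "odt": "3-4", "interval": 8, "day": 1},
    {"traveler": 2, "odt": "3-4", "interval": 8, "day": 1},
    {"traveler": 3, "odt": "3-4~6-7", "interval": 8, "day": 1},
]
sample = aggregate_trip_chains(survey)
prior = upscale(sample)
print(sample, prior)
```

An ODT observed by a single traveler, as in the last record, gets a prior that depends entirely on one random factor, which is the extreme case mentioned in the text.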

Furthermore, the distribution of the a priori ODT $U_{i,t,d}$ may have multiple peaks. Kernel density estimation, a non-parametric approach, is applied to smooth the density function over the whole observation period and to estimate the density function directly from the available a priori data over time. The known data are treated as stochastic, following a certain kernel type such as Gaussian, triangular, rectangular, biweight or Epanechnikov (26). The individual kernels are mixed to generate one overall kernel density that represents the density function of a random variable. Here, we assume that each kernel density of the a priori ODT $i$ at departure time interval $t$, $k(\hat{U}_{i,t,d})$, follows the normal distribution. It is given in Equation 2, with the average a priori ODT over the day $\hat{U}_{i,t,d}$, the total number of time intervals $H$ and the smoothing parameter $w$.

$$k(\hat{U}_{i,t,d}) = \frac{1}{Hw} \sum_{t=1}^{H} k\!\left(\frac{U_{i,t,d} - \hat{U}_{i,t,d}}{w}\right) \qquad (2)$$

3.2.2 Parameter Level

The ODT, denoted as $D_{i,t,d}$, is a parameter in this layer. The ODT of the next day, $D_{i,t,d+1}$, is predicted by exponential smoothing with the weights $\theta$. To examine the shares of the past demands in predicting the future demand, we treat the weights $\theta$ as unknown. The applied trend model of the dynamics is therefore a Multi-process Model (28). We treat the weights as constant in the demand dynamics model, but the uncertainty about their values is updated through the flow observations in the measurement model, which is discussed in subsection 3.2.3.

$$D_{i,t,d+1} = \sum_{z=0}^{x-1} \theta_z\, D_{i,t,d-z} + \omega_{t,d+1} \qquad (3)$$

To represent this evolution process conveniently, we collect the demands of different days into the vector $\mathbf{D}_{i,t,d} = (D_{i,t,d}, D_{i,t,d-1}, \ldots, D_{i,t,d-(x-1)})^{T}$, which includes the demands from day $d$ back to day $d-(x-1)$, where $x$ is the length of the recursive process. The first component of the error vector, $\omega_{t,d+1}$, is assumed to follow a normal distribution with zero mean and variance $\Omega_{t,d+1}$.

$$\mathbf{D}_{i,t,d+1} = B_d\, \mathbf{D}_{i,t,d} + \begin{pmatrix} \omega_{t,d+1} \\ \mathbf{0} \end{pmatrix} \qquad (4)$$

The trend model can also produce the demand at day $d+k$ by applying Equation 4 recursively, which gives the analytical expression

$$\mathbf{D}_{i,t,d+k} = B_d^{k}\, \mathbf{D}_{i,t,d} + \sum_{z=0}^{k-1} B_d^{z} \begin{pmatrix} \omega_{t,d+k-z} \\ \mathbf{0} \end{pmatrix} \qquad (5)$$

where

$$B_d = \begin{pmatrix} \theta_0 & \theta_1 & \cdots & \theta_{x-2} & \theta_{x-1} \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}$$
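The vector form of the trend model can be sketched in plain Python as below. This is an illustration under assumptions: the function names are hypothetical, the weights are indexed from the most recent day backwards, and the error term is set to zero so that only the mean forecast is propagated.

```python
def evolution_matrix(weights):
    """Evolution matrix B: smoothing weights in the first row, shifted identity below."""
    x = len(weights)
    B = [list(weights)]
    for row in range(1, x):
        B.append([1.0 if col == row - 1 else 0.0 for col in range(x)])
    return B


def step(B, state):
    """One noise-free day: new state = B * state (state[0] is the most recent day)."""
    return [sum(b * s for b, s in zip(row, state)) for row in B]


def predict(weights, state, k):
    """Noise-free k-day-ahead state, i.e. B^k applied to the current state."""
    B = evolution_matrix(weights)
    for _ in range(k):
        state = step(B, state)
    return state


# Four-day memory with equal weights; the demand values are made up.
print(predict((0.25, 0.25, 0.25, 0.25), [100.0, 104.0, 96.0, 100.0], 1))
```

The first component of the returned state is the forecast for the next day; the remaining components are the previous days shifted down by one position.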

If we want the future demand $k$ days ahead, the approach based on Equation 3 is as follows. Given the demand of the past days, for instance today and yesterday, we would like an expression for the demand forecast $k$ days into the future, $D_{d+k}$. For ease of exposition we work out the two-day case in more detail. First, since the eigenvalues of the evolution matrix

$$B = \begin{pmatrix} \theta & 1-\theta \\ 1 & 0 \end{pmatrix}$$

are $1$ and $-(1-\theta)$, we introduce two new variables $X_d$ and $\Delta D_d$, defined as

$$X_d = D_d + (1-\theta)\, D_{d-1}, \qquad \Delta D_d = D_d - D_{d-1}.$$

These variables correspond to the two eigenvalues and therefore have simple dynamics: $X_d$ stays constant, while $\Delta D_d$ follows a damped trend that guarantees convergence of the demand.

$$X_{d+1} = X_d$$

$$\Delta D_{d+1} = -(1-\theta)\, \Delta D_d$$

The variables can then be generalized to $k$ days ahead, where the sign of the deviation between demands flips over time depending on the parity of $k$.

$$X_{d+k} = X_d$$

$$\Delta D_{d+k} = [-(1-\theta)]^{k}\, \Delta D_d$$

Based on this, we can derive the future demand in Equation 6.

$$\begin{aligned} D_{d+k} &= \frac{1}{2-\theta}\left[X_{d+k} + (1-\theta)\,\Delta D_{d+k}\right] \\ &= \frac{1}{2-\theta}\left\{D_d + (1-\theta)\, D_{d-1} + (1-\theta)\,[-(1-\theta)]^{k}\,(D_d - D_{d-1})\right\} \\ &= \frac{1}{2-\theta}\left[1 + (-1)^{k}(1-\theta)^{k+1}\right] D_d + \frac{1}{2-\theta}\left[(1-\theta) - (-1)^{k}(1-\theta)^{k+1}\right] D_{d-1} \end{aligned} \qquad (6)$$

Setting $r = \theta - 1$ gives

$$D_{d+k} = \frac{1}{1-r}\left[(1 - r^{k+1})\, D_d - r\,(1 - r^{k})\, D_{d-1}\right]$$

Since $\frac{1-r^{k}}{1-r} = \sum_{l=0}^{k-1} r^{l}$, the relation between the demand $k$ days ahead and the measured demands of the two consecutive days is expressed in Equation 7.

$$\begin{aligned} D_{d+k} &= \sum_{l=0}^{k} (\theta-1)^{l}\, D_d - \sum_{l=0}^{k-1} (\theta-1)^{l+1}\, D_{d-1} \\ &= D_d + \sum_{l=0}^{k-1} (\theta-1)^{l+1}\, (D_d - D_{d-1}) \\ &= D_d + \sum_{l=0}^{k-1} (\theta-1)^{l+1}\, \Delta D_d \end{aligned} \qquad (7)$$

In the extreme case of predicting the demand in the far future, $D_{d+\infty}$, the predicted ODT converges to a fixed value as in Equation 8, which stays close to $D_d$ if the deviation of demand $\Delta D_d$ is small. Convergence requires $|\theta - 1| < 1$, so the weight $\theta$ can be in the range $(0, 2)$.

$$D_{d+\infty} = D_d + \frac{\theta-1}{2-\theta}\, \Delta D_d \qquad (8)$$
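The two-day derivation can be checked numerically. The sketch below compares the noise-free recursion with the closed form that sums powers of (θ − 1), and with the limit for a far horizon. The function names and the numbers (θ = 0.6, demands 110 and 100) are assumptions chosen for illustration, and the weight is written `theta` here.

```python
def recursive_forecast(theta, d_today, d_yesterday, k):
    """Apply D[d+1] = theta*D[d] + (1-theta)*D[d-1] k times, without noise."""
    cur, prev = d_today, d_yesterday
    for _ in range(k):
        cur, prev = theta * cur + (1.0 - theta) * prev, cur
    return cur


def closed_form_forecast(theta, d_today, d_yesterday, k):
    """D[d+k] = D[d] + sum_{l=0}^{k-1} (theta-1)^(l+1) * (D[d] - D[d-1])."""
    delta = d_today - d_yesterday
    return d_today + sum((theta - 1.0) ** (l + 1) for l in range(k)) * delta


def limit_forecast(theta, d_today, d_yesterday):
    """Limit of the closed form for k -> infinity; requires 0 < theta < 2."""
    return d_today + (theta - 1.0) / (2.0 - theta) * (d_today - d_yesterday)


for k in (1, 2, 5, 50):
    print(k, recursive_forecast(0.6, 110.0, 100.0, k),
          closed_form_forecast(0.6, 110.0, 100.0, k))
print("limit", limit_forecast(0.6, 110.0, 100.0))
```

For θ between 0 and 1 the forecasts oscillate around the limit with shrinking amplitude, which is the sign flip with the parity of k described in the text.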

3.2.3 Data Level with Multiple Data Sources

For the measurement model at the data level, there are two types of flow data, generated by different devices. The first is the link flow, the traffic count on a link during a certain time period. The second is the path flow, the count of the traffic that passes a particular path consisting of multiple links. Path flows may carry origin-destination information, which can reduce the uncertainty caused by the under-specification issue.

Loop detectors, which measure link flows, provide anonymous counts only. In principle, they cannot distinguish the trip chains of vehicles. Denoting the flow observation on link $l$ at observation time $h$ on day $d$ as $V_{l,h,d}$, the observed flows are related to the ODT through the route proportion $A_{(l,h,d)(i,t)}$. The error $\varepsilon_{l,h,d}$ is assumed to be white noise.

$$V_{l,h,d} = \sum_{i,t} A_{(l,h,d)(i,t)}\, D_{i,t,d} + \varepsilon_{l,h,d} \qquad (9)$$

Path flows are generated by devices that can identify vehicles. For instance, cameras track the trajectory of each vehicle, which is a rich information source for obtaining the travel routes and even ODT information. By identifying individual vehicles, the cameras provide the pair-wise flow, which is a part of the vehicle trajectory. Denoting the path flow as $V_{cc,h,d}$, with $cc$ the multiple cameras passed, and the route proportion as $A_{(cc,h,d)(i,t)}$, the linear relation between path flow and ODT is expressed in Equation 10. The route proportion changes over time if a camera recognizes that a vehicle is back in the road system on a secondary trip. This may add complexity to real-time simulation, but it is not an issue in the day-to-day setting, even if a multi-day trip chain is broken up per day. The error $\eta_{cc,h,d}$ is white noise, independent of the error $\varepsilon_{l,h,d}$ in Equation 9.

$$V_{cc,h,d} = \sum_{i,t} A_{(cc,h,d)(i,t)}\, D_{i,t,d} + \eta_{cc,h,d} \qquad (10)$$
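A small numerical sketch may help to see why path flows reduce the under-specification. The route-proportion rows below are hypothetical (one loop detector per OD pair and one camera pair that only sees the chained vehicles), and the function names are assumptions made for this illustration; the demand values reuse the example of section 3.1.

```python
import random


def expected_flows(assignment, demand):
    """Noise-free part of Equations 9 and 10: flow = sum over ODTs of A * D."""
    return [sum(a * d for a, d in zip(row, demand)) for row in assignment]


def observed_flows(assignment, demand, variances, seed=0):
    """Add independent zero-mean Gaussian measurement noise to each flow."""
    rng = random.Random(seed)
    return [f + rng.gauss(0.0, var ** 0.5)
            for f, var in zip(expected_flows(assignment, demand), variances)]


# ODT order: [3->4, 6->7, 3->4 then 6->7]
loop_rows = [[1.0, 0.0, 1.0],    # loop on the 3->4 section counts chained vehicles too
             [0.0, 1.0, 1.0]]    # loop on the 6->7 section
camera_rows = [[0.0, 0.0, 1.0]]  # camera pair matching vehicles seen on both sections

with_chain = [5500.0, 4500.0, 500.0]
without_chain = [6000.0, 5000.0, 0.0]

print(expected_flows(loop_rows, with_chain), expected_flows(loop_rows, without_chain))
print(expected_flows(camera_rows, with_chain), expected_flows(camera_rows, without_chain))
print(observed_flows(loop_rows + camera_rows, with_chain, [50.0, 50.0, 0.1]))
```

Both demand vectors give identical loop counts, while the camera row separates them, which is the role of the path flow in the measurement model.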

3.3 Posterior Estimation Method in the Gaussian Space

Given the hierarchical Bayesian model with its relations among layers, the posterior ODT is estimated and predicted taking the stochastic features into account. We consider Gaussian errors, for which an analytical approach is effective. Specifically, Kalman filtering is a special method for recursive Bayesian inference in Gaussian space, under the assumption that all error terms have multivariate normal distributions. It operates recursively on streams of noisy input data to produce a statistically optimal estimate of the underlying system state.

Kalman filtering has two main steps: predicting and updating. The advantage of this two-step procedure is reduced computational complexity. The procedure is illustrated in FIGURE 2. First, the observed flow at day d is used to estimate the demand D(d|d), starting from the a priori information. Then, through dynamic prediction with the trend model, the demand at day d+1 is predicted. The predicted demand in turn predicts the flow at day d+1, f(d+1|d), based on Equations 9 and 10. Once the observed flow at day d+1 is obtained, there is likely to be a deviation between the simulated flow f(d+1|d) and the observed flow W(d+1). The deviation and the prior predicted state at day d+1 are then used to obtain the posterior demand. This updating procedure has the same mechanism as predicting flow directly, but the computation time of updating the ODT in the Kalman filtering framework is much less than the time needed to update flows.

[Figure 2 diagram: Demand (d|d), estimated/calibrated with Observed Flow W(d), is propagated by the dynamics (filtering/predicting) to Demand (d+1|d), which gives Flow f(d+1|d); the deviation from Observed Flow W(d+1) is used to estimate/calibrate Demand (d+1|d+1).]

FIGURE 2 Kalman Filter for ODT Estimation and Prediction in the Day Level.

The Kalman filtering update in our case follows four steps (28) to obtain the posterior ODT at day $d+1$.

Step 1: Initialize with the posterior at day $d$, which is normal with mean $m_d$ and covariance $\Sigma_d$.

$$(D_d \mid W_d) \sim N(m_d, \Sigma_d)$$

Step 2: Compute the predicted ODT with mean $a_{d+1}$ and covariance $R_{d+1}$, where $\Omega_{d+1}$ is the covariance matrix of the dynamics error.

$$(D_{d+1} \mid W_d) \sim N(a_{d+1}, R_{d+1})$$

where $a_{d+1} = B_{d+1}\, m_d$ and $R_{d+1} = B_{d+1}\, \Sigma_d\, B'_{d+1} + \Omega_{d+1}$.

Step 3: Make the one-step forecast of the flow data with mean $f_{d+1}$ and covariance $Q_{d+1}$, where $v_{d+1}$ is the covariance matrix of the measurement errors.

$$(W_{d+1} \mid W_d) \sim N(f_{d+1}, Q_{d+1})$$

where $f_{d+1} = A_{d+1}\, a_{d+1}$ and $Q_{d+1} = A_{d+1}\, R_{d+1}\, A'_{d+1} + v_{d+1}$.

Step 4: Update the posterior at day $d+1$ with mean $m_{d+1}$ and covariance $\Sigma_{d+1}$.

$$(D_{d+1} \mid W_{d+1}) \sim N(m_{d+1}, \Sigma_{d+1})$$

$$m_{d+1} = a_{d+1} + X_{d+1}\, e_{d+1} \quad \text{and} \quad \Sigma_{d+1} = R_{d+1} - X_{d+1}\, Q_{d+1}\, X'_{d+1}$$

where $X_{d+1} = R_{d+1}\, A'_{d+1}\, Q_{d+1}^{-1}$ and $e_{d+1} = W_{d+1} - f_{d+1}$.
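The four steps can be sketched in plain Python with small list-based matrix helpers. This is an illustrative sketch, not the authors' implementation: the helper names, the argument names (`Sigma` for the posterior covariance, `Omega` for the dynamics error covariance, `V` for the measurement error covariance) and the numbers in the example are assumptions. Vectors are stored as one-column matrices.

```python
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def add(X, Y, sign=1.0):
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def inverse(M):
    """Gauss-Jordan inverse with partial pivoting; adequate for small matrices."""
    n = len(M)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kalman_step(m, Sigma, B, Omega, A, V, w_obs):
    """One day-to-day update from the posterior at day d to the posterior at day d+1."""
    a = matmul(B, m)                                        # Step 2: predicted mean
    R = add(matmul(matmul(B, Sigma), transpose(B)), Omega)  # predicted covariance
    f = matmul(A, a)                                        # Step 3: flow forecast
    Q = add(matmul(matmul(A, R), transpose(A)), V)          # forecast covariance
    X = matmul(matmul(R, transpose(A)), inverse(Q))         # Step 4: gain
    e = add(w_obs, f, sign=-1.0)                            # forecast error
    m_new = add(a, matmul(X, e))
    Sigma_new = add(R, matmul(matmul(X, Q), transpose(X)), sign=-1.0)
    return m_new, Sigma_new


# Scalar example with made-up numbers: prior demand 100, observed flow 110.
m1, S1 = kalman_step([[100.0]], [[4.0]], [[1.0]], [[1.0]], [[1.0]], [[5.0]], [[110.0]])
print(m1, S1)
```

In the scalar example the gain is 0.5, so the posterior mean moves halfway from the forecast towards the observation and the variance shrinks from 5 to 2.5.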

3.4 Approach to Handle the Evolution Parameters $\theta$ in the Multi-Process Model

The evolution parameters $\theta$ in the Multi-Process Model are constant over time but unknown. We assume an a priori probability distribution on a finite set of possible values for $\theta$. The set of evolution parameters, denoted as $\Theta$, is discrete with $\theta \in \Theta$. Given any weights $\theta$, the trend model in Equation 3 can be analyzed in Gaussian space to produce sequences of prior, posterior and forecast distributions of the ODT that are sequentially updated over time as flow observations are processed (28). The means and variances of these distributions all depend on the specific weight value $\theta$ under consideration. The inference about the ODT at day $d+1$ is based on the density of demand given all available flow observations over historical time, $p(D_{d+1} \mid \theta, V_d, V_{d-1}, \ldots)$.

To obtain the density of the weight $\theta$ given all the observed flow data, $p(\theta \mid V_d, V_{d-1}, \ldots)$, we start with an initial a priori density $p(\theta \mid V_0)$. Information is processed sequentially to provide inference about $\theta$ via the posterior $p(\theta \mid V_d, V_{d-1}, \ldots)$, which is updated using Bayes' theorem as in Equation 11.

$$p(\theta \mid V_d, V_{d-1}, \ldots) \propto p(\theta \mid V_{d-1}, V_{d-2}, \ldots)\; p(V_d \mid \theta, V_{d-1}, V_{d-2}, \ldots) \qquad (11)$$

Then, to make inferences about $D_{d+1}$ without reference to any particular value of $\theta$, the required unconditional density is given in Equation 12.

$$p(D_{d+1} \mid V_d, V_{d-1}, \ldots) = \int_{\Theta} p(D_{d+1} \mid \theta, V_d, V_{d-1}, \ldots)\; p(\theta \mid V_d, V_{d-1}, \ldots)\, d\theta \qquad (12)$$
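On a finite set of weight scenarios, the sequential update of Equation 11 and the mixing of Equation 12 reduce to a few sums. The sketch below assumes a scalar flow observation and that each scenario supplies its own one-step forecast mean and variance of the flow (for instance from a Kalman filter run under that scenario). The function names, the scenario labels and all numbers are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def update_scenario_probs(prior, forecasts, observed):
    """Equation 11 on a discrete set: posterior is proportional to
    prior times the one-step predictive density of the observed flow.

    prior:     {scenario: probability before today's flow is seen}
    forecasts: {scenario: (f, Q)}, forecast mean and variance of the flow
    """
    unnorm = {s: prior[s] * normal_pdf(observed, *forecasts[s]) for s in prior}
    total = sum(unnorm.values())
    return {s: p / total for s, p in unnorm.items()}


def mix_demand(post, demand_means):
    """Mean of Equation 12 when the integral becomes a sum over scenarios."""
    return sum(post[s] * demand_means[s] for s in post)


prior = {1: 0.1, 2: 0.4, 3: 0.5}
forecasts = {1: (120.0, 4.0), 2: (100.0, 4.0), 3: (110.0, 4.0)}
post = update_scenario_probs(prior, forecasts, observed=100.0)
print(post)
print(mix_demand(post, {1: 118.0, 2: 101.0, 3: 109.0}))
```

A scenario whose forecast is far from the observed flow loses almost all of its probability after a single update, whatever its initial trust level.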


4. CASE STUDY ON A15 MOTORWAY IN THE NETHERLANDS

The proposed method is tested on a real network, a part of the A15 motorway (between entry 17 and exit 15, from east to west) in the Netherlands. There are seven highway sections, four on-ramps as origins and four off-ramps as destinations. Loop detectors are installed on each highway section. Cameras are installed on highway sections 3, 4, and 6.

FIGURE 3 A15 Network in the Netherlands.

The travel survey of individual trip chains in this area is obtained from Statistics Netherlands and includes the departure time, travel pattern and so on. The travel pattern information is used to understand the types of trip chain of these travelers within one day. Besides the OD pairs in this network, two types of trip chain are present in the survey: one is in3-out4 followed by in6-out7; the other is in3-out5 followed by in6-out7. These two types of trip chain are the Origin Destination Tuples introduced above. The 14 OD pairs in this case are the simplest case of ODT. By clustering the travelers with the same travel pattern within one day, the initial sampled demand data are derived. The expectation of the a priori demand ($m_d$) is available after up-scaling the sampled demand. The variance of the a priori demand ($\Sigma_d$) is specified per ODT and assumed to be one thousandth of each ODT demand, to be relatively realistic. Based on this one-day demand, and given a random factor between 0.8 and 1.2, the means of the initial four-day demands are obtained.

Three scenarios for the stationary weights ($\theta$) used to predict demand are designed: (0, 0, 0.1, 0.9), (0.25, 0.25, 0.25, 0.25) and (0.2, 0.4, 0.1, 0.3), with probabilities of 10%, 40% and 50%, respectively. These probabilities represent the level of trust in the scenarios. For instance, scenario 1 is believed to be very unlikely and is therefore assigned a probability of 10%.

To demonstrate the behavior of the model, two tests are carried out. In the first, we generate demand data with OD tuples from stationary weights that belong to the designed scenarios; in the second, the stationary weights lie outside the designed scenarios. The first test shows that the multi-process model can identify the right scenario. The second test is more realistic: in real demand forecasting, people can design different weight scenarios, but the real weights are unknown and may not be among the designed ones. This test therefore examines whether the model can find the scenario that leads to the best predicted demand. For both tests, the contribution of cameras is presented as well.

4.1 Test One: Stationary Weights within the Designed Scenarios

We generate 500 days of day-to-day demand from the dynamics in Equation 3 with an evolution variance of 1. The stationary weights are taken from scenario 2 (0.25, 0.25, 0.25, 0.25), which means that each of the previous four days contributes a share of 0.25 to the predicted demand. From these 500 days of demand data, the traffic flows at the loop detectors and cameras are derived through the linear relations in Equations 9 and 10. The measurement errors ($v$) of the flow data are multivariate normal random numbers with zero means and variances of 50 for loops and 0.1 for cameras. Thus, the ratio of the evolution variance of demand to the measurement variance of loop flow is 0.02 (1/50), which means that the most recent real-time estimate eventually receives a relatively small weight (17). The corresponding ratio for cameras is 10 (1/0.1), implying that the camera flow data carry less randomness and higher accuracy and therefore play a substantial role.
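The data generation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single ODT whose flow equals its demand (instead of the linear relations of Equations 9 and 10), a hypothetical one-day demand of 100, and that the four weights apply to the previous four days ordered from oldest to most recent. The names `simulate_demand`, `observe` and `BASE_DEMAND` are introduced here for illustration only.

```python
import random

def simulate_demand(weights, initial_days, n_days, evolution_var, rng):
    """Generate demand where each new day is a weighted sum of the previous
    four days plus zero-mean Gaussian noise with variance `evolution_var`."""
    demand = list(initial_days)  # needs at least four starting days
    sd = evolution_var ** 0.5
    for _ in range(n_days):
        # Assumption: weights are paired with the last four days, oldest first.
        mean = sum(w * d for w, d in zip(weights, demand[-4:]))
        demand.append(mean + rng.gauss(0.0, sd))
    return demand[len(initial_days):]

def observe(demand, measurement_var, rng):
    """Add zero-mean Gaussian measurement noise to a demand series."""
    sd = measurement_var ** 0.5
    return [d + rng.gauss(0.0, sd) for d in demand]

rng = random.Random(0)
BASE_DEMAND = 100.0  # hypothetical one-day demand of a single ODT

# Initial four days: the one-day demand scaled by a random factor in [0.8, 1.2].
initial = [BASE_DEMAND * rng.uniform(0.8, 1.2) for _ in range(4)]

# 500 days generated with the scenario 2 weights and evolution variance 1.
demand = simulate_demand((0.25, 0.25, 0.25, 0.25), initial, 500, 1.0, rng)
loop_flow = observe(demand, 50.0, rng)   # loop detectors: variance 50
camera_flow = observe(demand, 0.1, rng)  # cameras: variance 0.1

# Ratios of evolution variance to measurement variance quoted in the text.
ratio_loop = 1.0 / 50.0
ratio_camera = 1.0 / 0.1
print(ratio_loop, ratio_camera)
```

In this sketch the camera series stays much closer to the generated demand than the loop series, which is the intuition behind the two variance ratios of 0.02 and 10.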

After running the model both with cameras plus loops and with loops only, the probability of scenario 2 jumps from 40% to 100% almost immediately and stays at 100%, as shown in FIGURE 4. The probabilities of the other scenarios drop to zero, although they were initially believed with probabilities of 10% and 50%. This indicates that the proposed Hierarchical Bayesian Networks with the multi-process trend model can identify the scenario that was actually used to generate the data.
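The way scenario probabilities shift toward the best-predicting weights can be sketched with a single Bayes update. This is a simplified illustration, not the paper's full model: it assumes that the demand itself is observed, that every scenario shares a Gaussian forecast error with variance 1, and that the four demand values and the new observation are hypothetical numbers chosen for the example. The function names are likewise illustrative.

```python
import math

def forecast(weights, last_four):
    """One-step forecast: weighted sum of the previous four days."""
    return sum(w * d for w, d in zip(weights, last_four))

def update_probabilities(prior, forecasts, observed, variance):
    """Bayes update of scenario probabilities with a Gaussian likelihood."""
    # The Gaussian normalizing constant is identical for all scenarios
    # (common variance), so it cancels and is left out.
    likelihood = [math.exp(-0.5 * (observed - f) ** 2 / variance)
                  for f in forecasts]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Weight scenarios and prior probabilities from the case study.
scenarios = [(0.0, 0.0, 0.1, 0.9),
             (0.25, 0.25, 0.25, 0.25),
             (0.2, 0.4, 0.1, 0.3)]
prior = [0.10, 0.40, 0.50]

last_four = [100.0, 104.0, 98.0, 102.0]  # hypothetical demands, oldest first
observed = 101.0                         # hypothetical demand on the new day

forecasts = [forecast(w, last_four) for w in scenarios]
posterior = update_probabilities(prior, forecasts, observed, variance=1.0)
print([round(p, 3) for p in posterior])
```

With these illustrative numbers, scenario 2 forecasts the observation exactly, so its probability rises from 0.40 to roughly 0.51 after one day. Repeating the update day after day is what lets one scenario absorb nearly all of the probability mass.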

FIGURE 4 Probabilities of Three Scenarios in Testing One.

Meanwhile, consider the convergence of the posterior demands, taking ODT3-4~6-7 as an example. The deviation between the posterior and true demands in scenario 2 (solid lines in FIGURE 5) converges fastest among the scenarios, both with and without cameras. FIGURE 5 also shows that the deviations fluctuate more randomly without cameras than with cameras. With only loops installed, the posterior demands of ODT3-4~6-7 under scenarios 1 and 3 do not even converge to the true demand because of this randomness.

FIGURE 5 Posterior Demand of ODT3-4~6-7 in 3 Scenarios With and Without Cameras.

In addition, the absolute deviations between true (generated) demand and estimated demand in TABLE 2 (a) and (b) are, as expected, much lower in scenario 2, the scenario used to generate the demand, than in the other two scenarios. The summed difference ratios between real and posterior demand in scenario 2, 0.04% with cameras and 0.05% without, are also much lower than the rest: 1.21% and 6.38% for scenario 1, and 0.35% and 1.93% for scenario 3.

Furthermore, cameras play a significant role in predicting demand: they increase the prediction accuracy. In general, the absolute deviations with cameras and loops are lower than those with only loops in all scenarios. In particular, for the ODTs identified by cameras, as marked in TABLE 2 (a) and (b), the predicted demands are practically the same as the real demands when cameras are used. This is also visible in FIGURE 5.
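The overall comparison between the two sensor configurations can be checked directly from the absolute deviations reported in TABLE 2 (a) and (b). The short sketch below copies those values and totals them per scenario. The variable and function names are illustrative, and the column totals are a simple aggregate chosen here, not the sum|%| measure used in the table.

```python
# Absolute deviations (Sce 1, Sce 2, Sce 3) copied from TABLE 2 (a) and (b).
ODTS = ["1-4", "1-5", "1-7", "1-8", "2-4", "2-5", "2-7", "2-8",
        "3-4", "3-5", "3-7", "3-8", "6-7", "6-8", "3-4~6-7", "3-5~6-7"]

CAMERAS_AND_LOOPS = [
    (1.34, 0.10, 0.39), (1.23, 0.02, 0.24), (1.12, 0.05, 0.28),
    (3.83, 0.01, 1.12), (1.92, 0.11, 0.64), (2.04, 0.05, 0.51),
    (1.70, 0.08, 0.52), (5.21, 0.00, 1.42), (0.00, 0.01, 0.05),
    (0.01, 0.00, 0.09), (0.00, 0.00, 0.09), (3.06, 0.27, 0.89),
    (0.01, 0.01, 0.03), (6.14, 0.13, 1.56), (0.00, 0.00, 0.01),
    (0.01, 0.01, 0.01),
]

LOOPS_ONLY = [
    (1.64, 0.09, 0.29), (1.22, 0.06, 0.39), (0.65, 0.05, 0.20),
    (3.30, 0.00, 1.06), (2.07, 0.13, 0.47), (1.74, 0.08, 0.69),
    (0.78, 0.06, 0.37), (4.76, 0.02, 1.38), (3.83, 0.11, 0.97),
    (6.20, 0.08, 1.93), (5.44, 0.07, 1.69), (1.76, 0.04, 0.62),
    (5.89, 0.00, 1.81), (10.02, 0.09, 3.01), (4.02, 0.01, 1.24),
    (6.10, 0.01, 1.84),
]

def column_totals(rows):
    """Sum the absolute deviations over all ODTs, one total per scenario."""
    return [round(sum(row[i] for row in rows), 2) for i in range(3)]

totals_cameras = column_totals(CAMERAS_AND_LOOPS)
totals_loops = column_totals(LOOPS_ONLY)

for i in range(3):
    print(f"Scenario {i + 1}: cameras + loops {totals_cameras[i]:.2f}, "
          f"loops only {totals_loops[i]:.2f}")
```

The totals are lower with cameras in every scenario, even though for some individual ODTs that do not involve on-ramps 3 or 6 the loops-only deviation is the smaller of the two.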

In a nutshell, the Hierarchical Bayesian Networks with the multi-process trend model is able to identify the weight scenario with which the 500-day demand was generated. The posterior demand under the right weights converges fastest. Cameras help to obtain more accurate results, especially for the ODTs that cameras identify.

TABLE 2 Absolute Deviations between True and Estimated Demand

Weights as Scenario 2

                  (a) Cameras + Loops        (b) Loops
|Dev|             Sce 1   Sce 2   Sce 3      Sce 1   Sce 2   Sce 3
ODT 1-4            1.34    0.10    0.39       1.64    0.09    0.29
ODT 1-5            1.23    0.02    0.24       1.22    0.06    0.39
ODT 1-7            1.12    0.05    0.28       0.65    0.05    0.20
ODT 1-8            3.83    0.01    1.12       3.30    0.00    1.06
ODT 2-4            1.92    0.11    0.64       2.07    0.13    0.47
ODT 2-5            2.04    0.05    0.51       1.74    0.08    0.69
ODT 2-7            1.70    0.08    0.52       0.78    0.06    0.37
ODT 2-8            5.21    0.00    1.42       4.76    0.02    1.38
ODT 3-4            0.00    0.01    0.05       3.83    0.11    0.97
ODT 3-5            0.01    0.00    0.09       6.20    0.08    1.93
ODT 3-7            0.00    0.00    0.09       5.44    0.07    1.69
ODT 3-8            3.06    0.27    0.89       1.76    0.04    0.62
ODT 6-7            0.01    0.01    0.03       5.89    0.00    1.81
ODT 6-8            6.14    0.13    1.56      10.02    0.09    3.01
ODT 3-4~6-7        0.00    0.00    0.01       4.02    0.01    1.24
ODT 3-5~6-7        0.01    0.01    0.01       6.10    0.01    1.84
sum|%|             1.21    0.04    0.35       6.38    0.05    1.93

Extra Weights

                  (c) Cameras + Loops        (d) Loops
|Dev|             Sce 1   Sce 2   Sce 3      Sce 1   Sce 2   Sce 3
ODT 1-4            3.37    1.68    1.39       3.73    2.13    1.66
ODT 1-5            2.85    1.70    1.22       2.80    1.82    1.27
ODT 1-7            2.64    1.48    1.07       1.65    1.08    0.75
ODT 1-8            8.45    4.20    3.11       7.92    4.31    3.37
ODT 2-4            4.65    2.20    1.85       5.09    2.67    2.30
ODT 2-5            4.54    2.57    1.86       4.42    2.60    2.04
ODT 2-7            3.86    2.06    1.49       2.41    1.34    1.13
ODT 2-8           11.68    5.92    4.37      10.99    6.24    4.63
ODT 3-4            0.08    0.03    0.05       8.96    5.04    4.04
ODT 3-5            0.14    0.02    0.11      14.59    8.60    6.63
ODT 3-7            0.14    0.04    0.10      13.20    7.69    6.00
ODT 3-8            6.96    3.73    2.87       4.30    2.19    1.50
ODT 6-7            0.04    0.03    0.04      14.21    8.13    6.17
ODT 6-8           14.65    7.43    5.59      24.09   13.90   10.51
ODT 3-4~6-7        0.00    0.01    0.01       9.71    5.73    4.53
ODT 3-5~6-7        0.00    0.01    0.00      14.72    8.65    6.85
sum|%|             2.91    1.51    1.15      16.00    9.32    7.29

4.2 Test Two: Stationary Weights beyond the Designed Scenarios

In reality, the real weights for demand prediction are unknown and most likely lie outside the designed scenarios. Keeping the model settings of the first test, we generate 500 days of demand data with an extra set of stationary weights, (0.1, 0.1, 0.7, 0.1).

First, the probability of the weights in scenario 3 converges to unity and the probabilities of the other scenarios converge to zero, both with and without cameras. This means that, among the designed scenarios, the weights in scenario 3 lead to the most likely posterior demands.

Second, even though the designed weights do not include the true weights, scenario 3, which attains unit probability, yields the lowest summed difference ratios between real and posterior demand in TABLE 2 (c) and (d): 1.15% with cameras and 7.29% without cameras. Comparing the absolute deviations of scenario 3 in TABLE 2 (c) and (d), especially for the ODTs that cameras can identify (shaded in the table), the posterior demands with cameras almost reach the real demands, with absolute deviations close to zero, whereas the absolute deviations exceed 4 with only loops.

Third, the posterior demands converge significantly faster when cameras are installed than when they are not, especially for the ODTs that cameras can identify. An example for ODT 3-4~6-7 is shown in FIGURE 6. The dashed line represents the posterior demand over 500 days with only loop data. It takes almost 400 days to converge, and the converged demand deviates by 4.53 from the real demand, as listed in TABLE 2 (d).

FIGURE 6 Posterior Demand of ODT3-4~6-7.

In a nutshell, the Hierarchical Bayesian Networks with the multi-process trend model is able to find the scenario that achieves the lowest deviations between real (generated) and posterior demand, even if the real weights lie outside the designed scenarios. Cameras play a significant role in this test. The deviations in the selected scenario are much lower with cameras than without, although they do not reach zero as in Test One. Convergence is also much faster with cameras than without.

4.3 Summing up

The Hierarchical Bayesian Networks with the multi-process trend model is a feasible method for finding the weight scenario with the lowest deviation between the real and posterior demands. The path flows observed by cameras are essential in the realistic situation where the right weights are not among the designed ones. Camera data result in fast convergence and low deviations.

5. CONCLUSION

This paper makes three main contributions, which also answer the research questions in Section 1. First, we propose the new concept of the Origin Destination Tuple, which captures the sequential dependence between OD pairs and fills the gap between transportation modeling and activity-based research. To connect these micro and macro levels, kernel density estimation is applied to smooth the probability density of the a priori demand.

Second, we take advantage of monitoring systems to identify the trip chains of vehicles. OD tuples make the estimation and prediction problem more seriously under-specified; the path flows from identification devices such as cameras significantly decrease the resulting uncertainty. The case study demonstrates that the path flows lead to more accurate prediction, with almost zero absolute deviation, and to fast convergence of the predicted demand in the long term.

Third, the Hierarchical Bayesian Networks with the multi-process trend model is suitable for predicting demand. The method is able to find the right weight scenario with unit probability and to generate the lowest deviation between the true and posterior demands.

ACKNOWLEDGEMENTS

We would like to thank Statistics Netherlands for funding this research.

REFERENCES

1. Kitamura, R. 1996. Application of Models of Activity Behavior for Activity Based Demand Forecasting. From website: http://media.tmiponline.org/clearinghouse/abtf/kitamura.pdf (June 30, 2012)

2. Jones, P., F. Koppelman and J.P. Orfeuil (1990) Activity analysis: State-of-the-art and future direction. In P. Jones (ed.) Developments in Dynamic and Activity-Based Approaches to Travel Analysis, Gower Publishing, Aldershot.

3. Bowman, J.L. and Ben-Akiva, M.E. Activity-based disaggregate travel demand model system with activity schedules. Transportation Research Part A 35 (2000) 1-28.

4. Stavins, R. 1999. The Costs of Carbon Sequestration: A Revealed-Preference Approach. American Economic Review, Vol. 89, No. 4, pp. 994-1009.

5. Kroes, E., and Sheldon, R. 1988. Stated Preference Methods: an Introduction. Journal of Transport Economics and Policy, Vol. 22, No. 1, pp. 11-25.

6. Van Zuylen, H.J., and L.G. Willumsen. The Most Likely Trip Matrix Estimated from Traffic Counts. Transportation Research 14B, 1980, 281-293.

7. Nguyen, S., Morello, E. and Pallottino, S. Discrete Time Dynamic Estimation Model for Passenger Origin/Destination Matrix on Transit Networks. Transportation Research Part B, Vol. 22B, No. 4, pp. 251-260, 1988.

8. Nie, Y. and Zhang, H.M. A Variational Inequality Formulation for Inferring Dynamic Origin-destination Travel Demands. Transportation Research Part B 42, 635-662, 2008.

9. Frederix, R. and Viti, F. A New Gradient Approximation Method for Dynamic Origin-destination Matrix Estimation on Congested Networks. Transportation Research Board 2011 Annual Meeting CD-ROM.

10. Zhang, Y., Qin, X. and Dong, S. Daily OD Matrix Estimation using Cellular Probe Data. Transportation Research Board 2010 Annual Meeting CD-ROM.

11. Cascetta, E. Dynamic Estimators of Origin-destination Matrices Using Traffic Counts. Transportation Science, Vol. 27, No. 4, November 1993.

12. Asakura, Y., Hato, E. and Kashiwadani, M. Origin-destination Matrices Estimation Model using Automatic Vehicle Identification Data and Its Application to the Han-shin Expressway Network. Transportation 27: 419-438, 2000.

13. Zhou, X. and Mahmassani, H. Dynamic Origin-Destination Demand Estimation Using Automatic Vehicle Identification Data. IEEE Transactions on Intelligent Transportation Systems, Vol. 7, No. 1, March 2006.

14. Chang, G. and Wu, J. Recursive Estimation of Time-varying Origin-destination Flows from Traffic Counts in Freeway Corridors. Transpn. Res.-B, Vol. 28B, No. 2, pp. 141-160, 1994.

15. Ashok, K. and Ben-Akiva, M.E. Alternative Approaches for Real-Time Estimation and Prediction of Time-dependent Origin-Destination Flows. Transportation Science, Vol. 34, No. 1, February 2000.

16. Dixon, M. and Rilett, L. Real-time OD Estimation Using Automatic Vehicle Identification and Traffic Count Data. Computer-Aided Civil and Infrastructure Engineering 17, 7-21, 2002.

17. Zhou, X. and Mahmassani, H. A Structural State Space Model for Real-time Traffic Origin-destination Demand Estimation and Prediction in a Day-to-day Learning Framework. Transportation Research Part B 41, 823-840, 2007.

18. Barcelo, J., Montero, L. and Marques, L. Travel Time Forecasting and Dynamic OD Estimation in Freeways based on Bluetooth Traffic Monitoring. TRB 2010 Annual Meeting CD-ROM.

19. Van der Zijpp, N. Dynamic OD-Matrix Estimation from Traffic Counts and Automated Vehicle Identification Data. Transportation Research Record 1607, 1997.

20. Hazelton, M. Statistical Inference for Time Varying Origin-destination Matrices. Transportation Research Part B 42, 542-552, 2008.

21. McNally, M. 1996. An Activity-Based Microsimulation Model for Travel Demand Forecasting. From website: http://escholarship.org/uc/item/4z11d19q (June 6, 2012)

22. Chorus, C., Arentze, T., and Timmermans, H. 2008. A Random Regret-Minimization Model of Travel Choice. Transportation Research Part B 42 (2008) 1-18.

23. Frignani, M., Auld, J. and Mohammadian, A. 2010. Urban Travel Route and Activity Choice Survey: An Internet-based Prompted Recall Activity Travel Survey using GPS Data. TRB 2010 Annual Meeting CD-ROM.

24. Wang, D. and Cheng, T. 2010. A Spatio-temporal Data Model for Activity-Based Transport Demand Modeling. International Journal of Geographical Information Science, 15:6, 561-585.

25. Moiseeva, A., Jessurun, J., and Timmermans, H. 2010. Semi-Automatic Imputation of Activity-Travel Diaries using GPS Traces, Prompted Recall and Context-Sensitive Learning Algorithms. TRB 2010 Annual Meeting CD-ROM.

26. Silverman, B.W., 1981. Using kernel density estimates to investigate multimodality. Journal of the Royal Statistical Society Series B-Methodological 43 (1), 97-99.

27. Gyftodimos, E. and Flach, P. Hierarchical Bayesian Networks: A Probabilistic Reasoning Model for Structured Domains. From http://www.cs.bris.ac.uk/Publications/Papers/1000650.pdf (June 1, 2012)

28. West, M. and Harrison, J. Bayesian Forecasting and Dynamic Models. 1989. Springer Series in Statistics.