Travel Time Prediction Issues in a Community-based Car Navigation System


Krzysztof Dembczyński2 Przemysław Gaweł1 Andrzej Jaszkiewicz2 Wojciech Kotłowski2 Marek Kubiak1 Robert Susmaga2

Przemysław Wesołek2 Piotr Zielniewicz2

1 NaviExpert Sp. z o. o., Poland


3 ... Car Navigation Systems Run on Data

4 Explicit Data in Action

5 Eternal Warfare: Static versus Dynamic Model


• What problems does a modern car navigation system face?

• How – by clever data acquisition and transformation – can solutions to these problems be found?


• First generation: systems capable of finding a route from A to Z through B, C, ... (unless stated otherwise, the ‘route’ is implied to be of minimal length and, as such, it will be further referred to as ‘the shortest route’)

I Example

• Find a route from Poznań to Copenhagen through Berlin.

I Principal features

• Principal capability: navigating in unfamiliar environment

• No direct control over road types

• No capability of finding fastest routes (‘the fastest route’ is the route of minimal travel time)

• No capability of reacting to rapidly changing, non-typical traffic situations


• Second generation: systems capable of finding a route from A to Z through B, C, ... that satisfies some constraints regarding selected static parameters of the road network

I Example

• Find a route from Poznań to Kraków by motorways (wherever possible)

I Principal features

• Principal capability: navigating in unfamiliar environment

• Direct control over road types (this control gives the user some comfort; it may also result in finding faster or slower routes)

• No capability of finding fastest routes

• No capability of reacting to rapidly changing, non-typical traffic situations


routes from A to Z through B, C, ... that additionally satisfy pre-defined constraints regarding static parameters of the road network

I Example

• Find the fastest route from Poznań Coach Station (located close to the city centre) to Poznań Ławica Airport (located a few kilometres west of the city) about 16:00

I Principal features

• Principal capability: navigating in unfamiliar environment

• Direct control over road types

• Capability of finding fastest routes (under typical traffic conditions)

• No capability of reacting to rapidly changing, non-typical traffic situations


shortest and fastest routes from A to Z through B, C, ... that additionally satisfy pre-defined constraints regarding static parameters of the road network

I Example

• Find the fastest route from Poznań Coach Station (located close to the city centre) to Poznań Ławica Airport (located a few kilometres west of the city) about 16:00 (caution: unexpected jam-generating roadworks on Bukowska Street!)

I Principal features

• Principal capability: navigating in unfamiliar environment

• Direct control over road types (gives the user some comfort; may result in finding faster or slower routes)


• Types of input data

I (Static) data on the road network (maps with additional information)

I (Dynamic) data on the actual geographical coordinates (current GPS position)

I (Static + dynamic) data on various aspects of the actual traffic situation ... to be continued!


• Targets of data processing

I Finding shortest routes (with potentially pre-defined constraints)

I Finding fastest routes (with potentially pre-defined constraints)

I ... as far as its complexity (although not in the sense of ‘computational complexity’) is concerned, the latter differs from the former very much


• Practical means of reaching the appointed targets

I Finding shortest routes (with potentially pre-defined constraints)

• Representation of the road network consisting of segments of known lengths (as up-to-date as possible, updates on a day-to-day basis)

• (formally: a directed graph, with vertices characterized by...)

• The lengths are fairly easy to acquire (and they are stable in time)

I Finding fastest routes (with potentially pre-defined constraints)

• Representation of the road network consisting of segments of known lengths and of known travel times (as up-to-date as possible, updates on a minute-to-minute basis)

• (formally: a directed graph, with vertices characterized by...)

• The travel times are fairly difficult to acquire (and they are changing all the time)


• Acquiring the segment travel times is something completely different from acquiring the segment lengths, as it requires entirely new types of data (from renewable sources)

• ... and so we are back again with ‘the (static + dynamic) data on various aspects of the actual traffic situation’ (the implicit data)

• Generally, two sources of such data exist

I Stationary sources (various road sensors)

I Mobile sources (floating car data)


• Levels of floating car data processing

I Decoding, storing, encoding

• Basically no problems

I Translating GPS positions to segment passage information

• Problems: GPS fluctuations (inaccuracy and imprecision)

I Converting passage information to travel times (effectively: predicting the travel times)

• Nothing but problems!

• Requires a specialized model (or more specialized models)

• For some time now, a combination of two distinct models (a static one and a dynamic one) has been used effectively


• The two fundamental models

I static

I dynamic


• Static model

I aimed at predicting traffic patterns

I well suited for long-term predictions

I fairly useless in unexpected situations

• Dynamic model

I aimed at predicting deviations from traffic patterns (in practice: from the predictions of the static model)

I well suited for short-term predictions


• ‘Historical versus current’ data duality principle:

I historical data are abundant but out-of-date

I current data are up-to-date but scarce

• As a result

I Static model working conditions

• usually much data to process

• seldom recalculated (potentially expensive!)

I Dynamic model working conditions

• often recalculated


• An attractive solution: the exclusive dynamic model

• Experimentally confirmed, but only with satisfactory (in practice: very large) amounts of data

I Obviously, the exclusive static model (no matter how excellent) does not have such possibilities

• Practically hard to achieve because of too few users (and because of the flow of time...)


• The lack of data generates some very acute problems

I highly variable/unstable predictions


• This problem has found a new solution:

Inviting users to submit pieces of information on different traffic situations

• Effectively, these constitute still another kind of data (the explicit data)

• In many respects the new data seems to be an attractive solution – in particular perfectly suited for unpredictable traffic situations


• Unfortunately:


1 user spots a camera

2 user notifies the operator giving the exact location of the camera

3 operator warns other users about the camera


• thousands of notifications: no chance for a human operator to handle them

• different reported positions for the same camera (report latency)

• GPS fluctuations

notifications grouping

• human factor: mistakes, malicious users


• variant of k-means

• number of groups (i.e. cameras) not known in advance

• maximal group diameter assumed (ca. 400 m)
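The grouping idea above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the system's actual algorithm: reports are assumed to be already projected to planar coordinates in metres, a new group is opened whenever a report lies farther than half the maximal diameter (200 m) from every current centre, and centres are then refined k-means-style. The names `group_reports` and `distance_m` and the number of passes are hypothetical.

```python
import math

MAX_DIAMETER_M = 400.0  # value from the slide; everything else is an assumption


def distance_m(p, q):
    """Planar distance between two (x, y) points given in metres."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def group_reports(points, max_diameter=MAX_DIAMETER_M, passes=10):
    """Group report positions without knowing the number of groups in advance."""
    radius = max_diameter / 2.0
    centroids = []
    groups = []
    for _ in range(passes):
        groups = [[] for _ in centroids]
        for p in points:
            # Index of the nearest current centre (None if there is none yet).
            best = min(range(len(centroids)),
                       key=lambda k: distance_m(p, centroids[k]),
                       default=None)
            if best is None or distance_m(p, centroids[best]) > radius:
                # Too far from every centre: open a new group at this report.
                centroids.append(p)
                groups.append([p])
            else:
                groups[best].append(p)
        # Drop empty groups and move each centre to the mean of its members.
        groups = [g for g in groups if g]
        new_centroids = [(sum(x for x, _ in g) / len(g),
                          sum(y for _, y in g) / len(g)) for g in groups]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return groups


reports = [(0, 0), (50, 0), (0, 60), (1000, 1000), (1020, 990)]
for group in group_reports(reports):
    print(len(group), "reports:", group)
```

Once the loop converges, every report lies within 200 m of its group centre, so no two reports in a group are more than 400 m apart.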


• reputation systems

• simple voting schemes


• large number of applications

• users rate each other after transaction (eBay, Allegro)

• not well suited to our problem domain

I transaction-like applications only

I one-to-one relationship between users


• $N_e$ independent binary events $A_1, A_2, \ldots, A_{N_e}$

• $A_i \in \{0, 1\}$; $A_i = 1$ means there is a camera at place $i$

Observers = users

• $N_o$ different independent observers $O_1, O_2, \ldots, O_{N_o}$

Reports = user notifications about cameras

• $I_{ij}$: binary report presence variable

• $I_{ij} = 1$ if $O_j$ sent a report about $A_i$, 0 otherwise


• all users are treated equally, including malicious ones

• voting for an event

$\gamma_i = P(A_i = 1) = \frac{\sum_{j=1}^{N_o} I_{ij} \cdot X_{ij}}{\sum_{j=1}^{N_o} I_{ij}}$

• cut-off threshold: we assume $A_i = 1$ iff $P(A_i = 1) > 0.6$

• modified to compensate for a small number of reports

$\gamma'_i = \gamma_i \times \ldots\, 1 + \sum^{N_o} \ldots$
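The basic (unmodified) voting rule can be sketched as below. This is a minimal sketch that assumes $X_{ij}$ is the content of the report (1 = "camera present", 0 = "no camera") and that only users with $I_{ij} = 1$ are passed in; the function name `vote` is hypothetical, and the small-sample correction is not implemented.

```python
def vote(report_values, threshold=0.6):
    """Simple voting for one place.

    report_values: X_ij values (0 or 1) from the users who actually
    reported on this place (i.e. those with I_ij = 1).
    Returns (gamma, decision); gamma is None when there are no reports.
    """
    if not report_values:
        return None, False
    gamma = sum(report_values) / len(report_values)
    # Cut-off from the slide: accept the camera iff gamma > 0.6.
    return gamma, gamma > threshold


print(vote([1, 1, 1, 0]))  # three of four users confirm the camera
print(vote([1, 0]))        # evenly split reports stay below the threshold
```

Because every report carries the same weight, a few malicious users can tip the outcome for places with only a handful of reports.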


• credibility of the user $O_j$, i.e. the probability that $X_{ij} = A_i$:

$p_{O_j} = p(X_{ij} = 1 \mid A_i = 1) = p(X_{ij} = 0 \mid A_i = 0)$

• a priori probability of an event $A_i$: $p_A = P(A_i = 1)$


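$\gamma_i = P(A_i = 1 \mid X_{ij}, I_{ij}) = \frac{a_i}{a_i + b_i}$

where

$a_i = P(A_i = 1) \cdot P(X_{ij} \mid A_i = 1, I_{ij}) = p_A \prod_{j=1}^{N_o} \left( p_{O_j}^{X_{ij}} \cdot (1 - p_{O_j})^{1 - X_{ij}} \right)^{I_{ij}}$

$b_i = P(A_i = 0) \cdot P(X_{ij} \mid A_i = 0, I_{ij}) = (1 - p_A) \prod_{j=1}^{N_o} \left( (1 - p_{O_j})^{X_{ij}} \cdot p_{O_j}^{1 - X_{ij}} \right)^{I_{ij}}$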
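The posterior above can be evaluated directly once the credibilities and the prior are known. The following is a minimal sketch under that assumption; the function name `posterior_camera` and the input layout (one `(x, p_o)` pair per user who reported, i.e. with $I_{ij} = 1$) are hypothetical.

```python
def posterior_camera(reports, p_a):
    """Posterior probability that a camera is present at one place.

    reports: list of (x, p_o) pairs, where x is the user's claim
             (1 = camera, 0 = no camera) and p_o is that user's credibility.
    p_a:     a priori probability that a camera is present.
    """
    a = p_a          # accumulates P(A=1) * P(reports | A=1)
    b = 1.0 - p_a    # accumulates P(A=0) * P(reports | A=0)
    for x, p_o in reports:
        if x == 1:
            a *= p_o
            b *= 1.0 - p_o
        else:
            a *= 1.0 - p_o
            b *= p_o
    # Bayes' theorem; a + b is zero only for contradictory reports
    # from users with credibility exactly 0 or 1.
    return a / (a + b)


# One credible user confirms the camera, one unreliable user denies it.
print(posterior_camera([(1, 0.9), (0, 0.6)], p_a=0.5))
```

Unlike simple voting, a report from a user with credibility near 0.5 barely moves the posterior, while a report from a highly credible user dominates it.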


• Bayes’ theorem

• total probability theorem

• assumption of independent observers

Requires knowledge of:

• users’ credibility parameters, $p_{O_j}$

• a priori probability of an event, $p_A$

Decision maker’s remarks:


• data on notifications

• a probabilistic model with unknown values of parameters

Question

What is the probability (likelihood) of generating that data with the model?

Answer:

• maximize the likelihood of generating that data
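To make the maximum-likelihood idea concrete, the sketch below evaluates the log-likelihood of a set of notifications for given parameter values. It assumes, by the total probability theorem and the independence of events, that the likelihood of the reports for place $i$ is $a_i + b_i$ and that places contribute independently; the function name, the data layout and the example data are hypothetical, and the actual optimisation procedure used by the authors is not shown.

```python
import math


def log_likelihood(reports_by_place, credibility, p_a):
    """Log-likelihood of all reports given user credibilities and prior p_a.

    reports_by_place: dict place -> list of (user, x) with x in {0, 1}
    credibility:      dict user -> p_O (probability the user reports truthfully)
    """
    total = 0.0
    for reports in reports_by_place.values():
        a, b = p_a, 1.0 - p_a
        for user, x in reports:
            p = credibility[user]
            a *= p if x == 1 else (1.0 - p)
            b *= (1.0 - p) if x == 1 else p
        # Marginalise over the unknown state of the place: P(reports) = a + b.
        total += math.log(a + b)
    return total


# Invented data: two users who always agree with each other.
data = {"place1": [("u1", 1), ("u2", 1)], "place2": [("u1", 0), ("u2", 0)]}
for p in (0.5, 0.7, 0.9):
    print(p, log_likelihood(data, {"u1": p, "u2": p}, p_a=0.5))
```

In this toy data the consistent reports are better explained by high credibilities, so a maximiser would push both parameters upwards.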


• simple voting and the Bayesian approach were compared

• 29 reported places where speed cameras were actually present

• 14 reported places where speed cameras were not actually present

• 954 user reports about the above places

• quality measure: number of cameras “detected” by the method and MSE


      #warnings  MSE    #warnings  MSE
0.0   34         0.120  35         0.094
0.1   34         0.120  34         0.096
0.2   34         0.120  31         0.080
0.3   34         0.120  31         0.080
0.4   32         0.118  22         0.022
0.5   30         0.115  21         0.014
0.6   29         0.109  20         0.000
0.7   24         0.052  20         0.000
0.8   16         0.013  19         0.000


Again: static or dynamic?

• As of today, the systems (including the NX-CT system) do not have enough users to cover the voracious needs of the exclusive dynamic model, so the static model survives

• The general trend, however, seems to be changing

I more users, more floating car data I more computing power, easier processing

• It seems that in the near future the static model may fade altogether, making the dynamic one exclusive


• Exercise:

Find a route (and predict its travel time) from Poznań to Opole through Wrocław

• Departure: 07:30

I In Poznań about 07:30–08:00 – prediction: heavy jams!

I In Wrocław about 11:30–12:00 – prediction: no jams

• Departure: 11:00

I In Poznań about 11:00–11:30 – prediction: no jams

I In Wrocław about 15:30–16:00 – prediction: heavy jams!

• Departure: 01:00


• A new target of data processing:

Finding shortest or fastest routes (with potentially pre-defined constraints) of non-negligible duration

or

Finding shortest or fastest routes (with potentially pre-defined constraints) in the (immediate) future

• In both cases

I Representation of the road network consisting of segments of known lengths and of known travel times (time = duration) as function of time (time = time/date)

I (formally: a directed graph, with vertices characterized by...)

I The functions of time are very difficult to acquire
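Why travel times must be functions of time can be illustrated with a short sketch: the travel time of each segment is evaluated at the moment the vehicle enters it, not at the departure time. The function names and all numbers below (jam windows and durations, in minutes since midnight) are invented for illustration only.

```python
def route_travel_time(segment_time_functions, departure_min):
    """Total travel time along a fixed route with time-dependent segments.

    segment_time_functions: one function per segment, mapping the entry
    time (minutes since midnight) to the segment travel time in minutes.
    """
    clock = departure_min
    for travel_time_at in segment_time_functions:
        clock += travel_time_at(clock)  # evaluated at the actual entry time
    return clock - departure_min


# Made-up profiles: a morning jam in the first city, an afternoon jam in the second.
def first_city(entry_min):
    return 45.0 if 450 <= entry_min < 480 else 15.0   # jam 07:30-08:00


def motorway(entry_min):
    return 255.0                                       # constant


def second_city(entry_min):
    return 40.0 if 930 <= entry_min < 960 else 15.0    # jam 15:30-16:00


route = [first_city, motorway, second_city]
for departure in (450, 660, 60):  # 07:30, 11:00, 01:00
    print(departure, route_travel_time(route, departure))
```

With these invented profiles, the 11:00 departure avoids the morning jam but reaches the second city exactly during its afternoon jam, which a prediction made only for the departure time would miss.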


• The new target can be handled with exclusively static models (at least, until clairvoyant models start emerging)

• The practical conclusion for the near future: no escape from the ubiquitous combinations of a static model and a dynamic model!


• The goal can be stated as predicting the unknown travel time $y_{st}$ of a vehicle on a particular road segment $s \in \{1, \ldots, S\}$ at a given time point $t$.

• The task is then to learn a function $f(s, t)$ that predicts the true value of $y_{st}$ as accurately as possible.

• The accuracy of a single prediction $\hat{y}_{st} = f(s, t)$ is measured by a loss function $L(y_{st}, \hat{y}_{st})$.

• A reasonable loss function in this case is the squared error loss: $L(y_{st}, \hat{y}_{st}) = (y_{st} - \hat{y}_{st})^2$.


$f^*(s, t) = \arg\min_f R(f) = \arg\min_f \mathbb{E}_{(s,t)} \mathbb{E}_{y \mid (s,t)} (y - f(s, t))^2.$

• Because minimizing the risk $R(f)$ directly is impossible, as the distribution of $y_{st}$ is unknown, we rely on a finite set of training examples, $\{(y^{(i)}, s^{(i)}, t^{(i)})\}_{i=1}^N$, and learn a model that minimizes the empirical risk:

$R_{emp}(f) = \frac{1}{N} \sum_{i=1}^N L(y^{(i)}, f(s^{(i)}, t^{(i)})).$
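As a small illustration of the empirical risk, the sketch below averages the squared error loss of a candidate predictor over a list of training examples. The function names and the toy data are hypothetical.

```python
def squared_loss(y, y_hat):
    """Squared error loss L(y, y_hat) = (y - y_hat)^2."""
    return (y - y_hat) ** 2


def empirical_risk(examples, f, loss=squared_loss):
    """Average loss of predictor f over examples given as (y, s, t) triples."""
    return sum(loss(y, f(s, t)) for y, s, t in examples) / len(examples)


# Toy examples: (travel time, segment id, time point); values are invented.
examples = [(2.0, 1, 0), (4.0, 1, 1), (3.0, 2, 0)]


def constant_predictor(s, t):
    return 3.0  # predicts the same travel time everywhere


print(empirical_risk(examples, constant_predictor))
```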


• Construction of the static model $f_s(x)$ is similar to a typical regression task.

• The training data are represented in a tabular form $\{(y^{(i)}, x^{(i)})\}_{i=1}^N$.

• We use two rather simple static models that exploit only a limited number of features describing road segments and time points.


• We estimate a single value that corresponds to the average unit travel time (or average inverse velocity) over all historical observations:

$\bar{v}^{-1} = \frac{\sum_{i=1}^N y^{(i)}}{\sum_{i=1}^N x_l^{(i)}},$

where $x_l$ is the length of the l-th segment.

• The prediction for a given road segment is then given by: $\hat{y} = \bar{v}^{-1} x_l$.


• This form of the prediction is reasonable, as the average unit travel time is the solution to the optimization problem:

$\bar{v}^{-1} = \arg\min_a \sum_{i=1}^N x_l^{(i)} \left( \frac{y^{(i)}}{x_l^{(i)}} - a \right)^2,$

where the loss for each single observation is multiplied by the length of the segment.

• In other words, the average unit travel time minimizes the length-weighted squared loss on the training set.
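The global mean model can be sketched in a few lines. This is a minimal sketch that assumes the prediction for a segment is the average unit travel time multiplied by the segment length; the function names, the units (minutes and kilometres) and the sample data are hypothetical.

```python
def fit_global_mean(observations):
    """Average unit travel time: total travel time divided by total length.

    observations: list of (travel_time_min, segment_length_km) pairs.
    """
    total_time = sum(y for y, _ in observations)
    total_length = sum(x for _, x in observations)
    return total_time / total_length


def predict_travel_time(unit_time, segment_length_km):
    """Predicted travel time for a segment of the given length."""
    return unit_time * segment_length_km


history = [(1.0, 0.5), (3.0, 1.5)]        # invented passages
unit_time = fit_global_mean(history)       # minutes per kilometre
print(unit_time, predict_travel_time(unit_time, 0.25))
```

Dividing the sums, rather than averaging the per-passage ratios, is what gives long segments more weight, in line with the length-weighted loss above.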


• The segment mean model averages the travel time on each road segment separately:

$\hat{y}_{sm} = f_s(x) = \frac{\sum_{i: x_{id}^{(i)} = x_{id}} y^{(i)}}{\sum_{i: x_{id}^{(i)} = x_{id}} 1}.$

• The segment/time period mean additionally considers the time of day of the passage.

• Using expert knowledge on weekly traffic trends, we define five time periods and separately compute the mean travel time for each road segment and each time period.
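Both mean models amount to grouping observations by a key and averaging. The sketch below shows this under stated assumptions: the five expert-defined time periods are not given in the slides, so `time_period` is a purely hypothetical placeholder split, and the field names and data are invented.

```python
from collections import defaultdict


def fit_group_means(observations, key):
    """Mean travel time per group; `key` maps an observation to its group."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for obs in observations:
        group = key(obs)
        sums[group] += obs["travel_time"]
        counts[group] += 1
    return {group: sums[group] / counts[group] for group in sums}


def time_period(hour):
    """Hypothetical five-way split of the day (placeholder, not the authors')."""
    if hour < 6:
        return "night"
    if hour < 10:
        return "morning_peak"
    if hour < 15:
        return "midday"
    if hour < 19:
        return "afternoon_peak"
    return "evening"


observations = [
    {"segment_id": 1, "hour": 8, "travel_time": 1.0},
    {"segment_id": 1, "hour": 8, "travel_time": 3.0},
    {"segment_id": 1, "hour": 12, "travel_time": 0.5},
    {"segment_id": 2, "hour": 8, "travel_time": 2.0},
]

# Segment mean (SM): one average per segment.
segment_means = fit_group_means(observations, key=lambda o: o["segment_id"])
# Segment/time period mean (STP): one average per (segment, period) pair.
stp_means = fit_group_means(
    observations, key=lambda o: (o["segment_id"], time_period(o["hour"])))
print(segment_means)
print(stp_means)
```

A practical implementation would also need a fallback (for example the segment mean or the global mean) for groups with no historical observations.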


• The dynamic model fd consists of two parts: a simple moving average model and a linear adaptation part.

• The simple moving average model treats the incoming observations as time series.

• The linear adaptation takes the predictions from the static and time series models and adapts them to the current traffic situation.


• Prediction $\hat{y}_{st}$ for a given segment $s$ and time point $t$ is computed using recent observations $y_{st_i}$, $t_i < t$, from segment $s$.

• We use a simple moving average model that averages past observations from a road segment $s$ within a time interval $T$:

$\hat{y}_{st} = f_{dT}(s, t) = \frac{\sum_{t - t_i < T} y_{st_i}}{\sum_{t - t_i < T} 1}.$
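The moving average for one segment can be sketched as follows. The function name, the time unit (minutes) and the choice to return `None` when the window is empty are assumptions made for illustration.

```python
def moving_average(observations, t, window):
    """Simple moving average of travel times observed on one segment.

    observations: list of (t_i, y) pairs for a single segment s.
    Only observations with t_i < t and t - t_i < window are used.
    Returns None when no observation falls into the window.
    """
    recent = [y for t_i, y in observations if t_i < t and t - t_i < window]
    if not recent:
        return None
    return sum(recent) / len(recent)


# Invented passages on one segment: (time in minutes, travel time in minutes).
passages = [(0, 5.0), (50, 1.0), (55, 3.0)]
print(moving_average(passages, t=60, window=15))   # averages the last two passages
print(moving_average(passages, t=200, window=15))  # no recent data
```

The empty-window case matters in practice: with scarce current data many segments have no recent passages, which is one reason the dynamic part is combined with a static model.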


• The linear adaptation produces the final travel time estimate as a linear combination of the static and dynamic model:

$f(s, t) = a_0 x_l + a_1 f_s(x_{st}) + a_2 f_d(s, t),$

where

I $a_0$ models the recent increase/decrease of the average travel time (per unit length),

I $a_1$ proportionally adjusts the static model to the current traffic,

I $a_2$ determines the reliability of the dynamic model.

• This model is trained by linear regression every 5 minutes on the most recent observations (e.g. a time window of the past few hours) from
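Fitting the three coefficients is an ordinary least-squares problem without an intercept. The sketch below solves the normal equations in pure Python; it is only an illustration under stated assumptions (the helper names and the sample rows are hypothetical, and a production system would more likely use a numerical library).

```python
def solve_linear_system(matrix, rhs):
    """Solve a square system by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_linear_adaptation(rows, targets):
    """Least-squares fit of (a0, a1, a2).

    rows:    list of (x_l, f_s, f_d) = (segment length, static prediction,
             dynamic prediction) for recent observations.
    targets: observed travel times for the same observations.
    """
    k = 3
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(ata, atb)


def predict_adapted(coeffs, x_l, f_s, f_d):
    """Final estimate f(s, t) = a0 * x_l + a1 * f_s + a2 * f_d."""
    a0, a1, a2 = coeffs
    return a0 * x_l + a1 * f_s + a2 * f_d


rows = [(1.0, 2.0, 3.0), (2.0, 1.0, 0.5), (0.5, 3.0, 1.0), (1.5, 2.5, 2.0)]
targets = [2.6, 0.9, 1.95, 2.2]  # invented observed travel times
print(fit_linear_adaptation(rows, targets))
```

Refitting on a sliding window, as the slide describes, only requires calling `fit_linear_adaptation` again on the newest rows.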


• We use real-life floating car data provided by NaviExpert.

• The observations are coming from a specific area within the city of Poznań.


I Global Mean (GM),

I Segment Mean (SM),

I Segment/Time Period Mean (STP),

I Dynamic Model (DM) – adaptation of the moving average and the segment mean.

• We use data from the 12th till 25th of September to train and tune the models.

• We test the models on data from September 26 till October 10.

• The dynamic model additionally uses the most recent observations from the test set (but each prediction is entirely based on earlier


Table: Results of the four models on test set.

Model MAE[min] MAE[%] RMSE[min] RMSE[%]

GM 0.3307 119.80 0.8556 108.00

SM 0.2761 100.00 0.7922 100.00

STP 0.2649 95.97 0.7776 98.15

DM 0.2556 92.60 0.6415 80.98

• SM and STP improve significantly over GM, although it is DM that achieves the best results, due to its adaptive nature.


• On September 26, 2011, a lorry broke down on Jana Pawła II Street and was not removed until about 9 pm, resulting in unusual congestion lasting into the late evening.

• This incident also coincided with the beginning of a new academic year, when students return to the city, further increasing the traffic.

• We test the performance and behavior of the model on this day (26th) and on the same weekday a week earlier (19th, without any unusual traffic conditions).

[Figure: rmse [min] (0.0 to 4.0) by hour of day (6 to 23)]

[Figure: a1 (0.0 to 4.0) by hour of day (6 to 23)]

• Based on gathered NX-CT data (and a few external sources. . . )

• Answers the question:

How much do you really spend in jams, and why

• Ambition: provide a clear methodology that can be applied to different data

I different cities

I different "user groups" (e.g. vehicle types)

I comparative studies (2011.09 vs. 2012.09)


http://ophelia.cs.put.poznan.pl/projects/adaptivetraffic/korki-warszawa-201109.pdf

• We were noticed (a little bit of self-praise)

I 45 Internet portals (ca. 20 major ones, e.g. Gazeta, Onet, WP, Money.pl)

I radio "shorts" (Zet, Polskie Radio 3, Antyradio, Planeta FM)

I printed press: Super Express, Motor

I TV: Superstacja


$S_i$ – sessions made during the i-th hour of the day

$s_d$, $s_t$ – session length and time, resp.

$v_i$ – average speed between hours $i$ and $i + 1$: $v_i = \frac{\sum_{s \in S_i} s_d}{\sum_{s \in S_i} s_t}$ [km/h]

$t_i$ – average time to travel 1 km: $t_i = \frac{60}{v_i}$ [min/km]

$\ell$ – average travel time growth
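The hourly speed and unit-time profile defined above can be computed as in the sketch below. The function name and the session layout (hour of day, distance in km, duration in hours) are assumptions, and the sample sessions are invented.

```python
from collections import defaultdict


def hourly_profile(sessions):
    """Average speed v_i [km/h] and unit travel time t_i [min/km] per hour.

    sessions: iterable of (hour, distance_km, duration_h) tuples.
    Returns a dict hour -> (v_i, t_i).
    """
    distance = defaultdict(float)
    duration = defaultdict(float)
    for hour, s_d, s_t in sessions:
        distance[hour] += s_d
        duration[hour] += s_t
    profile = {}
    for hour in distance:
        v = distance[hour] / duration[hour]   # total distance over total time
        profile[hour] = (v, 60.0 / v)         # 60 / v converts km/h to min/km
    return profile


sessions = [(8, 10.0, 0.5), (8, 5.0, 0.25), (3, 30.0, 0.5)]
for hour, (v, t) in sorted(hourly_profile(sessions).items()):
    print(hour, v, t)
```

Comparing $t_i$ for a rush hour with $t_i$ for a free-flow hour (for example at night) gives a natural basis for measuring how much travel time grows in jams.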


• What else is needed?

I hourly profile of number of travels

I number of travellers

I profile of transport means

I number of travellers per car

• NX-CT can’t be treated as a representative sample

• Options for data sources: Warszawskie Badanie Ruchu, GUS


                                                 (working days)   (250 working days)
Data from external sources
Total number of trips                            1 047 747        261 936 715
Total number of car routes                       805 959          201 489 781
Total time spent travelling [h]                  576 261          144 065 194
Study results
Total increase in travel time [h]                184 398          46 099 384
Total loss of "productive time" [8-hour days]    17 731           4 432 633
Total cost expressed as wages [PLN]              3 988 200        997 049 889
Total fuel cost [PLN]                            808 512          202 128 068
