Modelling travel time perception in transport mode choices


Silvia F. Varotto (corresponding author)
Dipartimento di Ingegneria e Architettura
Università degli Studi di Trieste
Via Valerio 6/4, 34127 Trieste, Italy
Currently:
Department of Transport and Planning
Faculty of Civil Engineering and Geosciences
Delft University of Technology
Stevinweg 1, P.O. Box 5048, 2600 GA Delft, The Netherlands
Phone: +31 (0)15 2789575, Fax: +31 (0)15 2783179, Email: s.f.varotto@tudelft.nl

Aurélie Glerum
Transport and Mobility Laboratory
School of Architecture, Civil and Environmental Engineering
Ecole Polytechnique Fédérale de Lausanne
Station 18, 1015 Lausanne, Switzerland
Phone: +41 (0)21 6932435, Fax: +41 (0)21 6938060, Email: aurelie.glerum@epfl.ch

Amanda Stathopoulos
Currently:
McCormick School of Engineering & Applied Science
Northwestern University, TECH A335
2145 Sheridan Rd, Evanston, IL 60208
Phone: +1 847-491-5629, Email: a-stathopoulos@northwestern.edu

Michel Bierlaire
Phone: +41 (0)21 6932437, Fax: +41 (0)21 6938060, Email: michel.bierlaire@epfl.ch

Giovanni Longo
Dipartimento di Ingegneria e Architettura
Università degli Studi di Trieste
Via Valerio 6/4, 34127 Trieste, Italy
Phone: +39 040 5583576, Fax: +39 040 5583580, Email: longo@dica.units.it

Word count: Abstract 243, Main text 5412, Figures 3 (750), Tables 2 (500), Total 6905; References (n) 31

Paper submitted to TRB Annual Meeting 2015

November 14th, 2014


ABSTRACT

Travel behaviour models typically rely on data afflicted by errors in perception (e.g., over- or under-estimation by the traveller) and in measurement (e.g., software or researcher imputation error). Such errors have been shown to have a relevant impact on model outputs. So far, a comprehensive framework to deal with the different types of bias related to travel model inputs is missing.

In this paper, focusing on travel time, we include two types of measures (i.e., "calculated" and "reported") as indicators of an unobservable true travel time. The aim of including these travel time indicators is to investigate how travellers' underlying perception of travel time influences modal choice, compared to the role of externally obtained measurements. The model framework is a latent variable structure in which the different travel time indicators serve as manifestations of the true travel time, each characterized by different types of bias. The model is applied to a mode choice case study from Trieste (Italy). Notably, for this dataset, the calculated travel time distributions (i.e., obtained from tools such as an assignment model developed with the software Visum, and Google Maps) do not match the reported travel time distributions (i.e., reported by respondents in the survey). Therefore, a discrete choice model that employs the available data and accounts for their limitations is developed. Comparing the base model, which assumes error-free inputs, with the integrated models shows that the latter yield more consistent and plausible outputs, such as the value of time.

Keywords: travel time perception, measurement error, mode choice, discrete choice.


INTRODUCTION

It is well established in the travel behaviour field that travellers often overestimate or underestimate the true travel time of their trip. Such perception errors then influence their travel decisions. Similarly, when travel attributes such as time are imputed using software, there may be relevant measurement errors. These measures are typically used in travel choice models. In addition, the "calculated" travel time distributions (i.e., calculated by software) could differ from the respondent-declared "reported" travel time distributions. While these issues have been raised in the literature, it is still poorly understood what constitutes the ideal framework to deal with the different types of bias related to each source of travel time model inputs. There is no clear view of the consequences of disregarding the measurement errors related to each data source.

The subjective nature of temporal judgement is established in psychological research. Hornik (1), for instance, finds that a good mood leads to retrieving biased memories of time that are congruent with the mood. In transportation research, Bates et al. (2) argue that travellers are likely to maximize utility according to their own divergent views of the travel time distribution, notwithstanding actual measurements. Consequently, travellers will differ in their optimal choices depending on how far their subjective distribution is distorted with respect to the actual measurement distribution. In relation to this, Rietveld (3) notes that in travel surveys most respondents round departure and arrival times to multiples of 5, 15 and 30 minutes. A possible explanation for this effect is that scheduled activities force people to plan their trips in advance, which provides them with anchor points for their memory afterwards. Explicitly addressing rounding leads to a considerably better treatment of the variances of reported travel times and makes it possible to avoid biases in the computation of average transport times based on travel surveys.

Although measurement errors are treated extensively in the econometric literature, few studies directly address measurement errors in transportation modelling and in choice models. McFadden (4) notes that aggregate travel time data are often not sufficient and that individual measurements of travel times are fundamental for modelling travel behaviour.

In the last decade, the popularity of hybrid choice models has grown considerably in a wide range of disciplines, including transport (5, 6, 7, 8, 9). Integrated Choice and Latent Variable (ICLV) models are primarily employed to include attitudes and perceptions as explanatory variables of the choice, using psychometric scales as indicators of unobservable latent constructs (10, 11, 12). This methodology could potentially be used to deal with any type of variable that affects the choice. Walker et al. (13) focus on how to estimate travel demand models when the underlying quality of the level-of-service data (times and costs) is poor. They demonstrate that a choice model with measurement errors yields inconsistent parameter estimates, so methods to correct for measurement errors need to be employed. The authors propose to use the hybrid choice framework to integrate travel time as a latent variable and to use the measured travel time as an indicator of the latent true travel time. The ICLV model for true travel time leads to significant shifts in both the travel time and travel cost parameters, resulting eventually in a large increase in the value of time (VOT). In the context of their case study, the VOT calculated with the hybrid choice model seems to be more realistic than the VOT calculated with the base model. Hess et al. (14) develop a latent variable approach to deal with missing values and measurement errors relating to income. The reported income is replaced by a latent income variable in the choice model, using the stated income as an indicator of the unobservable true income in a measurement model. In contrast with the imputation of missing values, the simultaneous estimation with the choice model allows the observed choices to affect the latent variable. Furthermore, unlike approaches relying on stated income or on imputed values, the method is directly applicable for forecasting. Indeed, the approach of estimating separate cost sensitivities for respondents with missing income does not easily carry over into forecasting when the forecast population is different from the estimation sample.

Turning to missing observations, research has shown that imputing such values typically generates additional error. This applies both when missing data are treated analytically (15) and when applying multiple imputation (16), originally proposed by Rubin (17). Multiple imputation can be used when accurate data for a subsample of the observations are available. Bhat (18) imputes a continuous variable for missing values, meaning that the variable is drawn from the observed variables.


The aim of this paper is to propose a new model framework that relies on existing and potentially biased data (subjectively reported or imputed with instruments). The goal is to develop demand models that overcome the inherent limitations of the data and provide more robust estimates and better forecasting accuracy than standard models, which treat the imprecise values as true.

The issues of poorly measured level-of-service data and insufficiently detailed travel behaviour information are explored in a real mode choice case study for a university campus in Trieste (Italy). It is important to note that this dataset was collected for an assignment model and that no discrete choice model has previously been estimated on it. Level-of-service measures for the public transport network are estimated by means of an assignment model developed with the software Visum. The imputation of travel time using such a model could potentially lead to large measurement errors. In addition, the travel time distributions calculated with the help of instruments (i.e., "calculated travel time") could differ from the travel time distributions reported by respondents (i.e., "reported travel time"). In order to deal with these limitations, the ICLV framework is proposed to correct for measurement errors in a mode choice context (13, 14).

The paper is structured as follows. The next section presents the transport mode choice case study and the data processing procedure. The following section describes the methodology and the specification of the Multinomial Logit model and the Integrated Choice and Latent Variable models. The estimation results, obtained with the extended software package BIOGEME (19), are then discussed, followed by the validation and policy analysis. The last section provides the conclusions and suggestions for future research, and also discusses the limitations of the dataset used and possible extensions.

SURVEY AND DATA COLLECTION

A comprehensive data collection campaign was carried out in Trieste between November 2009 and January 2010, in the framework of UniMob, a Mobility Management project for the University of Trieste studying staff and student travel demand. The survey included a quantitative part collecting socio-demographic data and information on the role occupied, the length of service, the frequency of being at the university, the residence status and the means of transport available. In addition, respondents were asked to report in detail the home-university trips made during the day, including their origins, destinations, chosen modes, and arrival and departure times. Information on travel habits, activities during travel, potential reasons for mode switching, perception of the risks associated with each mode and opinions on topics related to urban mobility was also gathered. The main data source used in this research consists of revealed preference (RP) data on mode choice (20). The survey was performed as an online questionnaire, which allowed the whole university population (24685 users) to participate. During November 2009 all regularly registered students (21601 users) and all teaching and administrative staff (3084 users) were invited by email to complete the questionnaire (21). In total, 3976 valid questionnaires were collected (response rate: 16.11%). Descriptive statistics are available in Varotto et al. (22).

Data processing

The available data have to be processed in order to extract all the variables necessary to define the utility functions of the alternative modes. The choice of transportation mode is assumed to be among four alternatives: car, motorcycle, public transport (PT) and walk. Each user has all or only some of these modes available.

Origin-destination (OD) matrices are constructed using a Visum assignment model as reference (23). In this model, each zone is represented by a point placed at the barycentre of the zone, and each trip is modelled as a journey between the barycentres of the corresponding OD zones. The OD pairs are assigned to each user controlling for the declared residential status, the frequency with which the respondent went to the university, the modes used, the sequence of modes and the reported travel time.


Travel time is imputed for each alternative mode separately. Different tools are used, namely the assignment model built in Visum and Google Maps. Distances between each origin and destination are calculated with Google Maps, based on the origin and destination addresses reported by respondents. Travel costs are calculated for the chosen and the unchosen alternatives. More details about the data processing are available in Varotto et al. (22). In addition, formal comparisons reveal significant differences between the staff and student samples, and for the purpose of the current study only the more homogeneous staff sample is selected.

Travel time indicators

As a first exploratory investigation of the underlying travel time perception, the travel durations reported by respondents for the chosen alternative are analysed and compared to the travel times obtained from the assignment software Visum (PT) and from Google Maps (car and motorcycle). For the chosen alternative two main indicators are used:

• Reported arrival and departure times;

• Calculated travel time (imputed using the procedures described above).

For the unchosen alternatives only the calculated travel time is available. Figure 1 illustrates the difference between the two measures for PT.

FIGURE 1 Histogram of reported and calculated travel time for public transport (x-axis: travel time [minutes], 0 to 95; y-axis: respondents [%], 0% to 20%).

The comparison reveals that the means of the reported and calculated travel times do not match for any mode. Specifically, the reported travel time exceeds the calculated travel time for all modes except walk. Moreover, Figure 1 shows that the reported values are clustered around multiples of 5, 15, 30 and 60 minutes. This preliminary analysis points towards the relevance of developing methods to account for imprecision in travel time reporting.
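The clustering of reported durations around round values can be quantified with a simple share-of-multiples statistic. The following is a minimal Python sketch, not the procedure used in the paper: the function name `heaping_shares` and the sample durations are hypothetical and serve only as an illustration.

```python
def heaping_shares(minutes, bases=(5, 15, 30, 60)):
    """Share of reported durations that are exact multiples of each base."""
    n = len(minutes)
    return {b: sum(1 for m in minutes if m % b == 0) / n for b in bases}


# Hypothetical reported durations in minutes (illustrative, not survey data).
reported = [10, 15, 20, 25, 30, 30, 45, 60, 22, 37]
shares = heaping_shares(reported)
print(shares)
```

If durations were reported without rounding, roughly one value in five would be expected to be a multiple of 5, so a much larger share would be consistent with the rounding behaviour described by Rietveld (3).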


METHODOLOGY

The aim of the present research is to investigate the impact of individual travellers' underlying travel time perception on mode choices. Therefore, the unobservable true travel time that drives the choices is modelled as a latent variable. This approach aims to understand the consequences of including time measures directly in mode choice models under the assumption that they are error-free. In the following, two main models including different travel time indicators within the ICLV framework are presented:

• Specification 1 uses the calculated travel time, assumed to be the best revealer of the true travel time and affected by errors related to the instrument employed for imputation;

• Specification 2 uses the reported travel time, assumed instead to give better insight into the decision process and to be affected by perceptual biases.

Results are compared with a reference Multinomial Logit (MNL) model with an equivalent specification but using the calculated travel time directly in the utility function for all modes. The methodology is employed to correct the travel time of public transport, because the imputation of travel time using the Visum assignment model could potentially lead to large measurement errors. In addition, the gap between the reported and calculated travel time seems to affect travellers who choose PT more than those who choose any of the remaining alternatives.

ICLV specification 1

The first specification is an ICLV model which assumes that the true travel time is a latent attribute and that the calculated travel time can be used as its indicator. The model, shown schematically in Figure 2, is called the ICLV for the true travel time, since a latent attribute (the true travel time) is integrated into the choice model. A similar framework was proposed by Walker et al. (13) to deal with measurement errors in the calculated travel time. Observable variables such as explanatory variables, indicators and choices are represented by rectangular boxes, and latent variables such as utilities and latent attributes are represented by ovals. Structural equations are represented by straight arrows, while measurement equations are represented by dashed arrows.

FIGURE 2 ICLV-1 for true travel time for public transport.

[Diagram. Latent attribute model: the true travel time, with the calculated travel time as its indicator. Choice model: the attributes of the alternatives (distance, travel cost, travel time), the characteristics of the traveller (female, residents, year of service, indirect trip, trips to other faculties, faculty in Città Vecchia) and the true travel time enter the utility, which determines the choice among car, moto, PT and walk.]

Latent Variable model: structural equation for latent attribute

In the Latent Variable model, the true travel time $TT_n^*$ (i.e., the latent attribute) is assumed to be given by equation (1) for the public transport alternative and for each individual n:

$$TT_n^* = c + \sigma \cdot \delta_n, \quad \text{with } \delta_n \sim N(0,1) \tag{1}$$

where c and $\sigma$ are parameters to be estimated. The mean of the true travel time $TT_n^*$ is represented by the parameter c. The error term has a mean equal to zero and a standard deviation equal to the parameter $\sigma$. The distribution of the latent variable $TT^*$ is $f_1(TT_n^*; c, \sigma)$.

Latent Variable model: measurement equation for latent attribute

Measurement equations are built with the corresponding indicators of travel time as given in equation (2). $I_{1n}$ represents the indicator of calculated travel time for respondent n:

$$I_{1n} = \alpha_1 + \lambda_1 \cdot TT_n^* + \sigma_1 \cdot \delta_{1n}, \quad \text{with } \delta_{1n} \sim N(0, \sigma_1) \tag{2}$$

where $\alpha_1$ and $\lambda_1$ are parameters which are fixed for normalization purposes, $\sigma_1$ is a parameter to be estimated and $TT_n^*$ is the latent attribute. The error term has a mean equal to zero and a standard deviation equal to the parameter $\sigma_1$. The measurement equation is based on a continuous scale since the calculated travel time is continuous. The distribution of the indicator $I_1$ is $f_2(I_{1n} \mid TT_n^*; \alpha, \lambda, \sigma)$.
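To make the two-equation structure concrete, the sketch below simulates the structural equation (1) and the measurement equation (2) in Python with NumPy. It rests on several assumptions that are not part of the original specification: the normalization is taken as $\alpha_1 = 0$ and $\lambda_1 = 1$, the measurement error is drawn with standard deviation $\sigma_1$ (as stated in the text following equation (2)), and the numerical values are the ICLV-1 estimates reported in Table 1, used here in absolute value purely for illustration. The function name `simulate_spec1` is hypothetical.

```python
import numpy as np


def simulate_spec1(n, c, sigma, sigma1, alpha1=0.0, lambda1=1.0, seed=0):
    """Draw latent true travel times and their calculated-time indicators."""
    rng = np.random.default_rng(seed)
    # Structural equation (1): latent true travel time.
    tt_star = c + sigma * rng.standard_normal(n)
    # Measurement equation (2): error term with standard deviation sigma1
    # (assumption: alpha1 = 0 and lambda1 = 1 as the normalization).
    indicator = alpha1 + lambda1 * tt_star + sigma1 * rng.standard_normal(n)
    return tt_star, indicator


# Illustrative values taken from the ICLV-1 column of Table 1.
tt_star, i1 = simulate_spec1(200_000, c=19.466, sigma=0.551, sigma1=2.425)
print(round(i1.mean(), 2), round(i1.std(), 2))
```

Under these assumptions the indicator is marginally normal with mean c and variance $\sigma^2 + \sigma_1^2$, so most of its spread would come from the measurement error rather than from the latent attribute.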

Discrete choice model

The latent attribute $TT_n^*$ is introduced into the utility function of the public transport alternative in place of the calculated travel time. It is essential to note that including the calculated travel time directly in the utility function assumes that the value is measured without error, while including the latent attribute $TT_n^*$ accounts for its distribution. The utility $U_{in}$ of an alternative i for a decision-maker n is expressed as a function V of the observed characteristics $X_{in}$, $X_n$ and of the latent attribute $TT_n^*$, as given in equation (3):

$$U_{in} = V(X_{in}, X_n, TT_n^*; \beta) + \varepsilon_{in}, \quad \text{with } \varepsilon_{in} \sim EV(0,1) \tag{3}$$

where

$X_{in}$ is a vector representing the attributes of the alternative i;

$X_n$ is a vector representing the characteristics of the decision-maker;

$TT_n^*$ is the unobservable true travel time;

$\beta$ is a vector of parameters to estimate;

$\varepsilon_{in}$ is the error term.

Integrated model framework

Under the assumption that the error terms are independent and that $y_{in}$ is the indicator of choice for an individual n (equal to 1 if alternative i is chosen, and 0 otherwise), the likelihood function for an individual n is given by formula (4):

$$\mathcal{L}_n(y_{in}, I_1 \mid X_{in}, X_n; \alpha, \lambda, \beta, \sigma) = \int_{TT^*} P(y_{in} \mid X_{in}, X_n, TT_n^*; \beta) \cdot f_2(I_{1n} \mid TT_n^*; \alpha, \lambda, \sigma) \cdot f_1(TT_n^*; c, \sigma) \, dTT^* \tag{4}$$

where

$P(y_{in} \mid X_{in}, X_n, TT_n^*; \beta)$ is the probability that alternative i is chosen by individual n;

$f_2(I_{1n} \mid TT_n^*; \alpha, \lambda, \sigma)$ is the density function of the indicator $I_1$;

$f_1(TT_n^*; c, \sigma)$ is the density function of the latent attribute $TT_n^*$.
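The integral in (4) has no closed form, but it is one-dimensional and can be approximated numerically. The following Python sketch evaluates the likelihood of one individual with Gauss-Hermite quadrature. It is an illustration under simplifying assumptions, not the BIOGEME implementation used in the paper: the PT utility is reduced to a hypothetical form `asc_pt + beta_tt_pt * TT*`, the utilities of the other alternatives are passed in as fixed numbers, PT is the last alternative, and the normalization $\alpha_1 = 0$, $\lambda_1 = 1$ is assumed.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and std. dev."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def individual_likelihood(chosen, v_other, asc_pt, beta_tt_pt, indicator,
                          c, sigma, sigma1, alpha1=0.0, lambda1=1.0,
                          n_nodes=40):
    """Approximate likelihood (4) for one individual.

    chosen  : index of the chosen alternative (PT is the last index)
    v_other : systematic utilities of the non-PT alternatives (hypothetical)
    """
    # Nodes and weights for the weight function exp(-x^2 / 2).
    nodes, weights = hermegauss(n_nodes)
    tt_star = c + sigma * nodes                  # equation (1) at each node
    v_pt = asc_pt + beta_tt_pt * tt_star         # simplified PT utility
    v = np.column_stack([np.tile(v_other, (n_nodes, 1)), v_pt])
    v -= v.max(axis=1, keepdims=True)            # numerical stability
    prob = np.exp(v)
    prob /= prob.sum(axis=1, keepdims=True)      # logit choice probabilities
    f2 = normal_pdf(indicator, alpha1 + lambda1 * tt_star, sigma1)
    # Dividing by sqrt(2*pi) turns the weights into standard normal weights.
    return float(np.sum(weights * prob[:, chosen] * f2) / np.sqrt(2.0 * np.pi))


# Hypothetical individual: PT chosen, calculated PT time of 20 minutes.
lik = individual_likelihood(3, np.array([1.0, 0.2, -0.5]), 3.0, -0.2, 20.0,
                            c=19.466, sigma=0.551, sigma1=2.425)
print(lik)
```

Summing the logarithm of this quantity over individuals would give the objective maximized in estimation. Simulation with random draws is a common alternative to quadrature when more than one latent variable is present.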

The parameters of the integrated model are estimated using maximum likelihood techniques as presented in equation (5):

$$\max_{\alpha, \lambda, \beta, \sigma} \sum_n \log \left( \mathcal{L}_n(y_{in}, I_1 \mid X_{in}, X_n; \alpha, \lambda, \beta, \sigma) \right) \tag{5}$$

ICLV specification 2

The second specification employs the reported travel time as an indicator of the true travel time. Aside from this, an important difference from the previous model is the inclusion of elements of travel behaviour in the measurement equation, on the assumption that they modulate the reported travel time and help explain the perceived duration of the trip. This inclusion is therefore central to the goal of modelling more accurately the travel time that actually drives the choices. The specification is shown in Figure 3.

FIGURE 3 ICLV-2 for true travel time for public transport.

[Diagram. Latent attribute model: the true travel time depends on the calculated travel time, and its indicator is the reported travel time, which is also affected by elements of travel behaviour (music, reading). Choice model: the attributes of the alternatives (distance, travel cost, travel time, parking time in Cattinara and Stazione), the characteristics of the traveller (female, residents, year of service, indirect trip, trips to other faculties, faculty in Città Vecchia) and the true travel time enter the utility, which determines the choice among car, moto, PT and walk.]

Latent Variable model: structural equation for latent attribute

In the Latent Variable model, the true travel time $TT_n^*$ (i.e., the latent attribute) is assumed to be given by equation (6) for the public transport alternative and for each individual n:

$$TT_n^* = c + \lambda \cdot TT_n + \sigma \cdot \delta_n, \quad \text{with } \delta_n \sim N(0,1) \tag{6}$$

where $TT_n$ is the calculated travel time and c, $\lambda$ and $\sigma$ are parameters to be estimated. The distribution of the latent variable $TT_n^*$ is $f_1(TT_n^* \mid TT_n; c, \lambda, \sigma)$.

Latent Variable model: measurement equation for latent attribute

Assuming that elements of travel behaviour $X_n$ affect the reported travel time, the measurement equation for the indicator of reported travel time $I_{2n}$ for respondent n (with departure/arrival time) is built as given in equation (7):

$$I_{2n} = \alpha_2 + \lambda_2 \cdot TT_n^* + \beta_2 \cdot X_n + \sigma_2 \cdot \delta_{2n}, \quad \text{with } \delta_{2n} \sim N(0, \sigma_2) \tag{7}$$

where $\alpha_2$ and $\lambda_2$ are parameters which are fixed for normalization purposes, $\beta_2$ and $\sigma_2$ are parameters to be estimated, $TT_n^*$ is the latent attribute and $X_n$ are the socio-economic variables and elements of travel behaviour of respondent n. The measurement equation is based on a continuous scale. The distribution of the indicator $I_{2n}$ is $f_2(I_{2n} \mid X_n, TT_n^*; \alpha, \lambda, \beta, \sigma)$.
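As a worked step, substituting the structural equation (6) into the measurement equation (7) gives a reduced form that shows how the reported travel time relates to the observed quantities. This derivation is added here for illustration and is not part of the original specification; it uses only the symbols defined in (6) and (7).

```latex
\begin{aligned}
I_{2n} &= \alpha_2 + \lambda_2 \cdot \left( c + \lambda \cdot TT_n + \sigma \cdot \delta_n \right)
          + \beta_2 \cdot X_n + \sigma_2 \cdot \delta_{2n} \\
       &= \left( \alpha_2 + \lambda_2 \cdot c \right) + \lambda_2 \lambda \cdot TT_n
          + \beta_2 \cdot X_n + \lambda_2 \sigma \cdot \delta_n + \sigma_2 \cdot \delta_{2n}
\end{aligned}
```

Since $\delta_n$ and $\delta_{2n}$ have mean zero, the expected reported travel time conditional on $TT_n$ and $X_n$ is $(\alpha_2 + \lambda_2 c) + \lambda_2 \lambda \, TT_n + \beta_2 X_n$. The reported time therefore combines a rescaled calculated time with a behaviour-dependent shift, and the latent error $\delta_n$ is the only term it shares with the utility function.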

To construct the measurement equation of the reported travel time, which is assumed to be affected by elements of travel behaviour, a principal component analysis (PCA) is performed on the whole set of elements of travel behaviour as an exploratory step. The PCA is performed in the statistical software R, using the package psych version 1.2.8 developed by Revelle (24). As a result, two main travel behaviour features are included:

• Reading_STAFF, corresponding to the statement "I never read during the home-university trip" (five-point Likert scale);

• Music_STAFF, corresponding to the statement "I never listen to music during the home-university trip" (five-point Likert scale).

ESTIMATION RESULTS

The models are estimated by maximum likelihood using the extended software package BIOGEME (19). The ICLV models are estimated using the sequential approach as described by Walker (8). The log-likelihood, the goodness of fit and the estimation results are reported in Table 1. The final log-likelihood and rho-bar-squared values are calculated for the choice component of the ICLVs only, so that they are comparable with the base model. The fit and the log-likelihood of the choice component of the ICLVs are lower than those of the MNL model. These results are consistent with recent findings by Vij and Walker (25), who observed that any ICLV model can be reduced to a choice model without latent variables that fits the data at least as well as the original ICLV model from which it was obtained.
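The adjusted rho-bar-squared values in Table 1 can be cross-checked against each other. The Python sketch below assumes the standard definition, rho-bar-squared = 1 - (LL - K) / LL0, where LL is the final log-likelihood, K the number of parameters of the choice component and LL0 the null log-likelihood. Neither this formula nor LL0 is stated in the paper, so LL0 is backed out from the base model figures and should be read as an implied value only.

```python
def adjusted_rho_squared(final_ll, null_ll, n_params):
    """Adjusted rho-squared under the standard definition (an assumption)."""
    return 1.0 - (final_ll - n_params) / null_ll


def implied_null_ll(final_ll, n_params, rho_bar_sq):
    """Null log-likelihood implied by a reported adjusted rho-squared."""
    return (final_ll - n_params) / (1.0 - rho_bar_sq)


# Base model figures from Table 1: LL = -417.961, K = 15, rho-bar-sq = 0.564.
null_ll = implied_null_ll(-417.961, 15, 0.564)

# Choice components of the ICLV models (K = 15 and K = 14 in Table 1).
rho_iclv1 = adjusted_rho_squared(-490.554, null_ll, 15)
rho_iclv2 = adjusted_rho_squared(-438.501, null_ll, 14)
print(round(null_ll, 1), round(rho_iclv1, 3), round(rho_iclv2, 3))
```

Under this assumption the implied null log-likelihood is about -993, and the values recomputed for the two ICLV models agree with the reported 0.491 and 0.544 to within rounding.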

For the utility parameters of the MNL model, the signs and magnitudes of the parameters related to the attributes and to the other explanatory variables are discussed below. All the parameters are statistically significant.

The parameters of the modal attributes travel time, cost, distance and parking time affect the utility negatively, in line with expectations. For PT, the calculated travel time was missing for 14.21% of the users in this sample. Thus, we introduce a variable MissingTimePT_STAFF which is equal to 1 when the PT travel time is missing and 0 otherwise. For the car alternative, a variation in travel time has a different effect on mode choice depending on whether the distance travelled is short or long. For this reason, the interaction TT_CAR/D is introduced into the utility function, implying that the coefficient of TT_CAR (i.e., βTIME_CAR_STAFF/D) is scaled by the distance D, expressed in kilometres, in Table 1. This interaction is significant and improves the fit for the car alternative only. In addition, the parking time parameter is only significant for the staff members working at the faculties located near Ospedale Maggiore and near the railway station, where the number of parking spaces is limited and the time necessary to find one is high (e.g., 10-15 minutes).
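The effect of the TT_CAR/D interaction can be illustrated with a short Python sketch. The function name `car_time_term` and the trip values are hypothetical; the coefficient is the base model estimate of βTIME_CAR_STAFF from Table 1, and only the travel time part of the car utility is shown.

```python
def car_time_term(beta_time_car, travel_time_min, distance_km):
    """Travel time contribution to the car utility with the TT/D interaction."""
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return beta_time_car * travel_time_min / distance_km


beta = -0.893  # base model estimate of the car time coefficient (Table 1)

# Hypothetical trips: the same 10 minutes over a short and a longer distance.
short_trip = car_time_term(beta, 10.0, 2.0)
long_trip = car_time_term(beta, 10.0, 5.0)
print(short_trip, long_trip)
```

With this form, one extra minute costs beta / D utility units, so the same delay weighs more heavily on a short trip than on a long one.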

TABLE 1 Statistics and estimation results for the ICLV models

| Statistics | Base model | ICLV specification 1 | ICLV specification 2 |
|---|---|---|---|
| Number of parameters | 15 | 18 | 21 |
| Number of parameters (choice component) | 15 | 15 | 14 |
| Number of observations | 901 | 901 | 901 |
| Final log-likelihood | -417.961 | -490.554 | -438.501 |
| Adjusted rho-bar-squared | 0.564 | 0.491 | 0.544 |

| Parameters | Base model: Estimate | Base model: T-test | ICLV 1: Estimate | ICLV 1: T-test | ICLV 2: Estimate | ICLV 2: T-test |
|---|---|---|---|---|---|---|
| ASC_CAR_STAFF | 1.129 | 2.70 | 2.284 | - | 1.290 | 2.08 |
| ASC_MOTO_STAFF | 0.245 | 0.49 * | 0.150 | - | 0.079 | 0.16 |
| ASC_PT_STAFF | 3.277 | 5.71 | 7.730 | - | 6.860 | 4.38 |
| ASC_SM_STAFF | - | - | -0.170 | - | - | - |
| βCOST_STAFF | -4.826 | -11.98 | -4.690 | -8.65 | -4.750 | -11.43 |
| βTIME_CAR_STAFF | -0.893 | -5.59 | -1.160 | -6.85 | -0.927 | -5.83 |
| βTIME_MOTO_STAFF | -0.142 | -3.29 | -0.110 | -2.43 | -0.129 | -3.05 |
| βTIME_PT_STAFF | -0.109 | -7.13 | -0.347 | -4.08 | -0.178 | -4.59 |
| βTIME_WALK_STAFF | -0.083 | -9.33 | -0.082 | -7.72 | -0.085 | -8.96 |
| βMISSING_TIME_PT_STAFF | -5.430 | -6.62 | - | - | - | - |
| βPARKING_CAR_STAFF | -0.300 | -4.85 | -0.291 | -4.71 | -0.252 | -4.40 |
| βFEMALE_MOTO_STAFF | -0.956 | -2.64 | -0.974 | -2.64 | -0.936 | -2.62 |
| β20YEAR_CAR_STAFF | 0.018 | 2.35 | 0.018 | 2.22 | 0.018 | 2.35 |
| βINDIRECT_TRIP_CAR_STAFF | 1.487 | 5.16 | 1.630 | 5.31 | 1.590 | 5.48 |
| βFACULTIES_PT_STAFF | 0.787 | 3.03 | 0.616 | 2.12 | 0.509 | 2.01 |
| βCITTAVECCHIA_WALK_STAFF | 0.780 | 1.96 | 0.664 | 1.56 | 0.848 | 2.19 |
| Latent time, c | - | - | 19.466 | 42.36 | 22.445 | 5.69 |
| Latent time, λ | - | - | - | - | 0.587 | 6.26 |
| Latent time, σ | - | - | -0.551 | -3.79 | -1.322 | -1.08 * |
| βREADING_STAFF | - | - | - | - | -7.900 | -2.69 |
| βMUSIC_STAFF | - | - | - | - | 6.245 | 2.16 |
| βMISSING_READING_MUSIC | - | - | - | - | -2.199 | -0.20 * |
| Measurement equation, σ1 | - | - | 2.425 | 76.53 | 2.390 | 34.68 |

Moreover, some socio-economic variables have a significant effect on the choice of transport mode. The parameters of the explanatory variables introduced have the expected signs, and further observations are presented below:

• Female staff members are less likely to choose motorcycles than any other mode, as implied by the negative sign of βFEMALE_MOTO_STAFF;

• Staff members who reported more than 20 years of service are more likely to choose car than any other mode, and this effect increases linearly with the number of years of service (positive β20YEAR_CAR_STAFF). Possible explanations are, first, that older people have a stronger preference for travelling by car and, second, that people with longer working experience may have a higher income and can afford a car;

• Staff members who reported having stopped once or more for other purposes during their home-university trip are more likely to choose car. Possible explanations are, first, that the necessity to stop for other purposes creates a need for more flexible forms of transport and, second, that the user may need to drive other passengers to different destinations;

• Staff members are more likely to choose PT than any other mode for their home-university trip when they need to visit other faculties during the day. A possible explanation is that it is harder to find a free parking space near the other faculties during the day than in the morning, when the respondent arrives at the university.

The estimated parameters of the MNL model and of the choice component of the ICLV models are close to each other, with the exception of the time parameters. Walker et al. (13) also note significant shifts in the time parameter of the ICLV model. The time parameter of PT, which is associated with the estimated latent time, increases in magnitude in both ICLVs compared to the MNL.

Looking at the Latent Variable models, the means of the latent travel times are expected to have a magnitude and sign comparable to the mean values of the reported travel time (31.170 minutes) and of the calculated travel time (19.160 minutes) in the raw dataset.

In the ICLV-1, the mean of the latent time c is equal to 19.466 minutes and the standard deviation $\sigma$ is 0.551 minutes. The measured travel time is assumed to be normally distributed with mean equal to the latent travel time and standard deviation $\sigma_1$ equal to 2.425 minutes. This standard deviation is significant, indicating that there is a measurement error inherent in the network-derived travel times.

In the ICLV-2, the constant c of the structural equation of the latent time is equal to 22.445 minutes, the calculated travel time affects the latent time significantly with a magnitude $\lambda$ equal to 0.587, and the standard deviation $\sigma$ is 1.322 minutes. The reported travel time is assumed to be normally distributed with mean equal to the latent travel time and standard deviation $\sigma_1$ equal to 2.390 minutes. Moreover, the parameters βREADING_STAFF and βMUSIC_STAFF, which refer to the elements of travel behaviour introduced in the measurement equation of the reported travel time, have the expected signs, in accordance with the results of the exploratory PCA. Clear conclusions for missing values cannot be drawn because the coefficient βMISSING_READING_MUSIC has a low t-statistic. The habit of never listening to music during the trip positively affects the reported time through the measurement equation, meaning that people who usually listen to music during the trip are more likely to report a travel time that is closer to the calculated one. The habit of never reading for leisure during the trip negatively affects the measurement equation of the reported travel time, meaning that people who usually read for leisure during the trip are more likely to report a travel time that is far from the calculated one. In general, the reported travel time seems to be an overestimation of the calculated travel time, in accordance with the descriptive statistics.

VALIDATION AND POLICY ANALYSIS

First, a validation analysis is carried out to assess whether the model could be applied to other data sets. Second, the value of time, which indicates the willingness to pay (WTP) of individuals to reduce the duration of their trip by one hour, is computed for each alternative mode.
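As a reminder of how such an indicator is commonly derived, the following is a sketch under the assumption that the systematic utility $V_i$ of mode $i$ is linear in travel time $t_i$ (in minutes) and travel cost $c_i$ (in euros). The symbols $\beta_{\mathrm{time},i}$ and $\beta_{\mathrm{cost}}$ are illustrative names for the corresponding coefficients and are not taken from the estimation tables.

```latex
\mathrm{VOT}_i
  = \frac{\partial V_i / \partial t_i}{\partial V_i / \partial c_i}
  = \frac{\beta_{\mathrm{time},i}}{\beta_{\mathrm{cost}}}
  \quad [\text{\euro/min}],
\qquad
\mathrm{VOT}_i^{\mathrm{hour}}
  = 60 \cdot \frac{\beta_{\mathrm{time},i}}{\beta_{\mathrm{cost}}}
  \quad [\text{\euro/h}].
```

Under this assumption, any correction that shifts the time coefficient of one mode changes the VOT of that mode proportionally, while a change in the cost coefficient affects the VOT of every mode sharing it.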

Validation

A proper validation of the model would require applying it to a different data set, but no similar data set is available. Therefore, the available data set is split into two parts. First, 70% of the observations are selected randomly and the models are estimated on this subsample. The ICLV models require a correction of the constants to reproduce the choice probabilities correctly within that sample (26). Second, the models are applied to the remaining 30% of the observations. The average number of alternatives available to each respondent and the corresponding chance level are calculated. Table 2 reports, for each model, the percentages of choice probabilities higher than 0.33 (chance level), 0.50, 0.70 and 0.90. The choice probabilities are well predicted by all three models. Cross-validation favours the MNL and ICLV-2 models at the chance-level threshold, while the ICLV-1 predicts choices better at the 0.70 and 0.90 thresholds (where one alternative is utility-dominant).
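The hold-out procedure described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that the percentages refer to the predicted probability of the alternative actually chosen by each hold-out respondent, and the function names and the small probability list are hypothetical.

```python
import random


def split_sample(n_obs, train_share=0.7, seed=0):
    """Randomly split observation indices into an estimation part and a hold-out part."""
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)
    cut = round(train_share * n_obs)
    return indices[:cut], indices[cut:]


def chance_level(n_available):
    """Chance level = 1 / average number of alternatives available per respondent."""
    return 1.0 / (sum(n_available) / len(n_available))


def share_above(chosen_probs, thresholds):
    """Percentage of observations whose predicted probability of the chosen
    alternative is strictly higher than each threshold."""
    n = len(chosen_probs)
    return {t: 100.0 * sum(p > t for p in chosen_probs) / n for t in thresholds}


# Hypothetical predicted probabilities of the chosen alternative for five
# hold-out respondents (illustrative numbers, not data from the survey).
example_probs = [0.95, 0.80, 0.60, 0.40, 0.20]
example_shares = share_above(example_probs, [0.33, 0.50, 0.70, 0.90])
estimation_idx, holdout_idx = split_sample(10)
```

With an average of three available alternatives, `chance_level` returns 1/3, which corresponds to the 0.33 threshold used in Table 2.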

TABLE 2 Percentages of choice probabilities higher than given thresholds and value of time

                      Base model   ICLV specification 1   ICLV specification 2
Threshold
  33 %                83.08 %      80.07 %                83.08 %
  50 %                75.56 %      72.55 %                72.93 %
  70 %                60.90 %      61.27 %                59.77 %
  90 %                40.60 %      43.98 %                37.59 %
Value of Time
  Car                 4.69 €/h     6.27 €/h               4.94 €/h
  Moto                1.76 €/h     1.40 €/h               1.62 €/h
  PT                  1.36 €/h     4.43 €/h               2.25 €/h
  Parking search time 3.73 €/h     3.71 €/h               3.19 €/h

Value of time

The value of time (VOT) is presented in Table 2. In the base model, the VOT is higher for car than for all the other modes. The magnitude could be explained by the very low travel costs for private motorized modes (i.e., cost of fuel) and for public transport (i.e., a 1-hour urban bus ticket costs 1.10 €). In addition, the travel time considered corresponds only to the in-vehicle time for car, motorcycle and PT.

The VOT calculated for car seems consistent with the reference values for the VOT in Italy found in the literature. Indeed, many authors report a value of time for urban commuting trips by car of around 4.00–5.00 €/h for users who work (27, 28, 29, 30). In addition, Rotaris et al. (31) estimate the VOT for students enrolled at the University of Trieste, developing a methodology which combines revealed and stated preferences: the VOTs vary from 1.40 to 2.80 €/h. The VOT obtained for PT seems to be lower than expected.

Referring to the ICLV models, the value of time is calculated for each mode because the correction for measurement errors implemented for public transport affects the whole model. In both estimated ICLVs, the time parameter for PT shifts and the estimated VOT increases significantly, as in the case study presented by Walker et al. (13). In the ICLV-1, the value of time increases by over 200%, from 1.36 €/h to 4.43 €/h. In the ICLV-2, the value of time increases by over 65%, from 1.36 €/h to 2.25 €/h. In addition, the 90% confidence bounds for the VOT of PT are calculated: in the base model the confidence interval is (1.15, 1.55); in the ICLV-1 it is (3.57, 5.25); in the ICLV-2 it is (1.58, 2.86). The confidence intervals do not overlap between the three models, meaning that different treatments of the time factor can lead to significantly different VOTs.
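The comparison above can be checked with a few lines of arithmetic. The sketch below uses only the PT figures quoted in the text and in Table 2; the function names `relative_increase` and `intervals_overlap` are illustrative.

```python
def relative_increase(base, new):
    """Percentage increase of `new` with respect to `base`."""
    return 100.0 * (new - base) / base


def intervals_overlap(a, b):
    """True if the closed intervals a = (low, high) and b = (low, high) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]


# VOT for PT (EUR/h) and 90% confidence intervals as reported in the text.
vot_pt = {"MNL": 1.36, "ICLV-1": 4.43, "ICLV-2": 2.25}
ci_90 = {"MNL": (1.15, 1.55), "ICLV-1": (3.57, 5.25), "ICLV-2": (1.58, 2.86)}

increase_iclv1 = relative_increase(vot_pt["MNL"], vot_pt["ICLV-1"])
increase_iclv2 = relative_increase(vot_pt["MNL"], vot_pt["ICLV-2"])

# Pairwise overlap check between the three models.
models = list(ci_90)
any_overlap = any(
    intervals_overlap(ci_90[m], ci_90[n])
    for i, m in enumerate(models)
    for n in models[i + 1:]
)
```

The two increases come out at roughly 226% and 65%, and no pair of intervals overlaps, although the upper bound of the MNL interval (1.55) and the lower bound of the ICLV-2 interval (1.58) are close.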

The VOTs obtained for PT within the hybrid choice framework seem to be more consistent with the reference values found in the literature than the values obtained for PT with the MNL model. Cherchi (28) points out that, when the parameters are estimated with an MNL, the VOTs are largely underestimated and that more realistic results can be obtained by accounting for variations in sensitivity among respondents. Fiorello and Pasti (30) report a value of time for PT trips performed by working commuters of around 3.00–4.00 €/h. This value seems to be consistent with the VOT obtained in the ICLV-1. Cherchi (28) reports a VOT for urban trips performed by working commuters by PT of around 2.00 €/h. This value seems to be consistent with the VOT obtained in the ICLV-2.

CONCLUSION AND FUTURE RESEARCH

The aim of the present research is to develop methods for discrete choice modelling that account for limitations in available data. In order to deal with measurement errors in travel time, the use of the hybrid choice framework proposed by Walker (8) and Walker et al. (13) is explored: travel time is integrated into the choice model as a latent variable. This approach is applied to a data set from a university survey which was collected in Trieste (Italy) for a public transport assignment model developed in Visum (23).

Two Latent Variable models for the travel time that drives mode choices (i.e., the “true travel time”) are integrated into the discrete choice model. The methodology is employed to correct the travel time of public transport, because the accuracy of the network-derived level of service (i.e., the travel time calculated by the assignment model in Visum) is expected to be lower for this alternative than for all the other modes. In addition, the gap between the travel time reported by respondents (i.e., the “reported travel time”) and the “calculated travel time” seems to be larger for travellers who choose PT. In the first Latent Variable model, the calculated travel time is assumed to be the best revealer and is therefore used as an indicator of the true travel time. In the second Latent Variable model, the reported travel time is assumed to give better insight into the decision process and is thus used as an indicator of the true travel time. In addition, the reported time is assumed to be affected by elements of travel behaviour which influence the perceived duration of the trip, such as the habits of listening to music and reading during the home-university trip. The estimation results obtained with the MNL and with the ICLVs are compared based on statistical significance.

Of the two ICLVs estimated in the case study, the second specification, where the reported travel time is used as an indicator, fits the data better than the first one, where the calculated travel time is used as an indicator. However, the first specification has slightly better forecasting power.

In terms of demand analysis, the results indicate that the MNL, which does not correct for measurement errors, seems to underestimate travellers’ value of time (1.36 €/h). In addition, the value of time computed using the ICLV-1 (4.43 €/h) is higher than that computed using the ICLV-2 (2.25 €/h). Compared to the MNL, the ICLVs appear to produce more consistent parameters for the travel time variable, which yield more realistic travel demand indicators, closer to the reference values found in the literature for urban public transport in Italy: 2.00 €/h (28) and 3.00–4.00 €/h (30).

The key point of this research is that measurement error can cause serious biases, and methods that explicitly recognize and correct for such errors are necessary to improve the realism of the resulting analysis. There are many directions for future research: first, the ICLVs should be applied (e.g., calculating market shares and demand elasticities, and testing scenarios in which the travel time calculated by the assignment model is corrected in the LV models) and the implications for the resulting policy analysis should be investigated; second, a mixed discrete-continuous distribution of travel time could be introduced in the measurement equation of the Latent Variable model, explicitly addressing the rounding of reported travel time; third, the methodology could be used to correct for measurement errors for each mode; fourth, this framework has the potential to explicitly correct biases related to travel time reporting for the chosen and the unchosen alternatives (if available) using two different measurement equations.

In addition, the outcomes suggest that a new survey should be carried out in order to collect more detailed information: first, further data should be collected to analyse the choice set of each user more realistically; second, more detailed disaggregate data regarding the level of service of public transport should be derived (access time, waiting time, number of transfers, in-vehicle time and egress time); third, respondents should be asked to report the travel time for both the chosen and the unchosen alternatives, specifying the access time, the in-vehicle time and the egress time; fourth, more detailed information regarding the household structure (such as income, number of people in the household, availability of a private parking space) seems to be necessary to better understand travel behaviour and to improve the predictive power of the choice models.

ACKNOWLEDGMENTS

The authors would like to thank Evanthia Kazagli at Ecole Polytechnique Fédérale de Lausanne for assistance and advice during the research.

REFERENCES

1. Hornik, J. Time estimation and orientation mediated by transient mood. Journal of Socio-Economics, Vol. 21, No. 3, 1992, pp. 209–227.
2. Bates, J., J. Polak, P. Jones, and A. Cook. The valuation of reliability for personal travel. Transportation Research Part E, Vol. 37, 2001, pp. 191–229.
3. Rietveld, P. Rounding of arrival and departure times in travel surveys: an interpretation in terms of scheduled activities. Journal of Transportation and Statistics, Vol. 5, No. 1, 2002, pp. 71–81.
4. McFadden, D. Disaggregate Behavioral Travel Demand’s RUM Side: A 30-Year Retrospective. Presented at the International Association of Travel Behavior Research, Gold Coast, Queensland, Australia, 2000.
5. Ben-Akiva, M., D. McFadden, T. Garling, D. Gopinath, J. Walker, D. Bolduc, A. Boersch-Supan, P. Delquie, O. Larichev, T. Morikawa, A. Polydoropoulou, and V. Rao. Extended framework for modeling choice behaviour. Marketing Letters, Vol. 10, No. 3, 1999, pp. 187–203.
6. Ben-Akiva, M., D. McFadden, K. Train, J. Walker, C. Bhat, M. Bierlaire, D. Bolduc, A. Boersch-Supan, D. Brownstone, D. Bunch, A. Daly, A. de Palma, D. Gopinath, A. Karlstrom, and M. A. Munizaga. Hybrid choice models: Progress and challenges. Marketing Letters, Vol. 13, No. 3, 2002, pp. 163–175.
7. Bolduc, D., M. Ben-Akiva, J. Walker, and A. Michaud. Hybrid choice models with Logit kernel: Applicability to large scale models. In: Lee-Gosselin, M. and S. Doherty. Integrated Land-Use and Transportation Models: Behavioural Foundations. Elsevier, Oxford, 2005, pp. 275–302.
8. Walker, J. Extended discrete choice models: integrated framework, flexible error structures, and latent variables. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts, 2001.
9. Walker, J., and M. Ben-Akiva. Generalized random utility model. Mathematical Social Sciences, Vol. 43, 2002, pp. 303–343.
10. Atasoy, B., A. Glerum, and M. Bierlaire. Mode choice with attitudinal latent class: a Swiss case-study. Proceedings of the Second International Choice Modeling Conference, Leeds, UK, 2011.
11. Glerum, A., B. Atasoy, A. Monticone, and M. Bierlaire. Adjectives qualifying individuals’ perceptions impacting on transport mode preferences. Proceedings of the Second International Choice Modeling Conference, Leeds, UK, 2011.
12. Schüssler, N., and K. Axhausen. Psychometric scales for risk propensity, environmentalism and variety seeking. Proceedings of the 9th International Conference on Survey Methods in Transport, 2011.
13. Walker, J., J. Li, S. Srinivasan, and D. Bolduc. Travel demand models in the developing world: correcting for measurement errors. Transportation Letters, Vol. 2, 2010, pp. 231–243.
14. Hess, S., N. Sanko, J. Dumont, and A. Daly. A latent variable approach to dealing with missing or inaccurately measured variables: the case of income. Proceedings of the Third International Choice Modelling Conference, 2013.
15. Daly, A., and S. Zachary. The Effect of Free Public Transport on the Journey to Work. Transport and Road Research Laboratory Report SR388, 1977.
16. Brownstone, D., and S. Steinmetz. Estimating commuters Value of Time with noisy data. Transportation Research Part B, Vol. 39, No. 10, 2005, pp. 865–889.
17. Rubin, D. Multiple Imputation for Nonresponse in Surveys. Wiley, New York, 1987.

18. Bhat, C. Imputing a continuous income variable from grouped and missing income observations. Economics Letters, Vol. 46, 1994, pp. 311–319.
19. Bierlaire, M., and M. Fetiarison. Estimation of discrete choice models: extending BIOGEME. Proceedings of the 9th Swiss Transport Research Conference, Ascona, Switzerland, 2009.
20. Progetto UniMob. La domanda di mobilità verso l’Università – Indagine quantitativa. Università degli Studi di Trieste, Trieste, Italy, 2009-2010.
21. Progetto UniMob. Analisi della popolazione universitaria. Università degli Studi di Trieste, Trieste, Italy, 2009-2010.
22. Varotto, S. F., A. Glerum, A. Stathopoulos, and M. Bierlaire. Modelling travel time perception in transport mode choices. Proceedings of the 14th Swiss Transport Research Conference, Ascona, Switzerland, 2014.
23. Progetto UniMob. Valutazione del sistema di trasporto pubblico. Università degli Studi di Trieste, Trieste, Italy, 2009-2010.
24. Revelle, W. Psych: Procedures for Personality and Psychological Research. R package version 1.2.8, Northwestern University, Evanston, 2012.
25. Vij, A., and J. Walker. Hybrid choice models: holy grail... or not? Proceedings of the 13th International Conference on Travel Behaviour Research, Toronto, Canada, 2012.
26. Ben-Akiva, M., and S. R. Lerman. Discrete choice analysis: theory and application to travel demand. MIT Press, Cambridge, Massachusetts, 1985.
27. Catalano, M., B. Lo Casto, and M. Migliore. Car sharing demand estimation and urban transport demand modelling using stated preference techniques. European Transport\Trasporti Europei, Vol. 40, 2008, pp. 33–50.
28. Cherchi, E. Il Valore del Tempo nella Valutazione dei Sistemi di Trasporto. Franco Angeli, Milano, Italy, 2003.
29. de Jong, G. and H. F. Gunn. Recent Evidence on Car cost and Time Elasticities of Travel Demand in Europe. Journal of Transport Economics and Policy, Vol. 35, No. 2, 2001, pp. 137–160.
30. Fiorello, D., and G. Pasti. Il valore del tempo di viaggio – Guida teorica e applicativa. Quaderno RT n.5, TRT Trasporti e Territorio, Milano, Italy, 2003.
31. Rotaris, L., R. Danielis, and P. Rosato. Value of travel time for university students: a revealed / stated preference analysis. Journal of Environmental Economics and Policy, 2012, pp. 1–21.