Prediction interval methodology based on fuzzy numbers and its extension to fuzzy systems and neural networks

(1)

Delft University of Technology

Prediction interval methodology based on fuzzy numbers and its extension to fuzzy

systems and neural networks

Marín, Luis G.; Cruz, Nicolás; Sáez, Doris; Sumner, Mark; Núñez, Alfredo

DOI

10.1016/j.eswa.2018.10.043

Publication date

2019

Document Version

Final published version

Published in

Expert Systems with Applications

Citation (APA)

Marín, L. G., Cruz, N., Sáez, D., Sumner, M., & Núñez, A. (2019). Prediction interval methodology based on

fuzzy numbers and its extension to fuzzy systems and neural networks. Expert Systems with Applications,

119, 128-141. https://doi.org/10.1016/j.eswa.2018.10.043

Important note

To cite this publication, please use the final published version (if applicable).

Please check the document version above.

Copyright

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons. Takedown policy

Please contact us and provide details if you believe this document breaches copyrights. We will remove access to the work immediately and investigate your claim.

This work is downloaded from Delft University of Technology.

(2)

‘You share, we take care!’ – Taverne project

https://www.openaccess.nl/en/you-share-we-take-care

Otherwise as indicated in the copyright section: the publisher

is the copyright holder of this work and the author uses the

Dutch legislation to make this work public.

(3)

ContentslistsavailableatScienceDirect

Expert

Systems

With

Applications

journalhomepage:www.elsevier.com/locate/eswa

Prediction

interval

methodology

based

on

fuzzy

numbers

and

its

extension

to

fuzzy

systems

and

neural

networks

Luis G. Marín

a

_{, Nicolás Cruz}

a

_{, Doris Sáez}

a

_{, Mark Sumner}

b

_{, Alfredo Núñez}

c,∗

a Department of Electrical Engineering, University of Chile, Santiago, Chile

b Department of Electrical and Electronic Engineering, University of Nottingham, Nottingham NG7 2RD, UK c Section of Railway Engineering, Delft University of Technology, Delft, The Netherlands

a

r

t

i

c

l

e

i

n

f

o

Article history: Received 6 April 2018 Revised 8 September 2018 Accepted 20 October 2018 Available online 24 October 2018

Keywords: Prediction interval Fuzzy number Neuronal network Renewable energy Microgrid

a

b

s

t

r

a

c

t

Prediction interval modelling has been proposed in the literature to characterize uncertain phenomena and provide useful information from a decision-making point of view. In most of the reported studies, assumptions about the data distribution are made and/or the models are trained at one step ahead, which can decrease the quality of the interval in terms of the information about the uncertainty modelled for a higher prediction horizon. In this paper, a new prediction interval modelling methodology based on fuzzy numbers is proposed to solve the abovementioned drawbacks. Fuzzy and neural network prediction interval models are developed based on this proposed methodology by minimizing a novel criterion that includes the coverage probability and normalized average width. The fuzzy number concept is considered because the aﬃne combination of fuzzy numbers generates, by deﬁnition, prediction intervals that can handle uncertainty without requiring assumptions about the data distribution. The developed models are compared with a covariance-based prediction interval method, and high-quality intervals are obtained, as determined by the narrower interval width of the proposed method. Additionally, the proposed prediction intervals are tested by forecasting up to two days ahead of the load of the Huatacondo microgrid in the north of Chile and the consumption of the residential dwellings in the town of Loughborough, UK. The results show that the proposed models are suitable alternatives to electrical consumption forecasting because they obtain the minimum interval widths that characterize the uncertainty of this type of stochastic process. Furthermore, the information provided by the obtained prediction interval could be used to develop robust energy management systems that, for example, consider the worst-case scenario. © 2018 Elsevier Ltd. All rights reserved.

1. Introduction

Nonlinear models can provide excellent insight into complex

real-worldprocesses andsystems. Such models representthe

re-lationshipsamongvariablesandareusefulinplanningand

opera-tionalstagesaswellasintheanalysisofmeasureddata(Rencher& Schaalje,2008).Ingeneral,theaimofpredictive modelsisto ob-taina reliable representationofthe target system(Ghanbari, Ha-davandi, & Abbasian-Naghneh, 2010). In recent decades, several

methodologieshavebeenproposedtosolvenonlinearmodel

iden-tiﬁcationproblemsthatuseaﬁnitenumberofmeasureddataand

consideranoptimalitycriterion(Škrjanc,2011).Manystudieshave

examinedmethodsforimprovingtheaccuracyoftheseapproaches

∗ _{Corresponding author.}

E-mail addresses: luis.marin@ing.uchile.cl (L.G. Marín), nicolas.cruz@ing.uchile.cl

(N. Cruz), dsaez@ing.uchile.cl (D. Sáez), mark.sumner@nottingham.ac.uk

(M. Sumner), a.a.nunezvicencio@tudelft.nl (A. Núñez).

toobtainhigherprecisioninexpectedvalueprediction(Khodayar, Wang,&Manthouri,2018;Kroll&Schulte,2014).

Neural networks and fuzzy systems are eﬃcient for

nonlin-ear modellingbecause they havea highﬁttingaccuracy for

non-linear systems (Veltman, Marín, Sáez, Gutierrez, & Nuñez, 2015; Xu,Zhang,Zhu, & He,2017). Althoughcomputational intelligence

methods exhibit adequate performance inestimation and

predic-tion,uncertaintyisnottypicallyquantiﬁedbythesemodelling

ap-proaches,andonlyexpectedvalueisobtained. However,

informa-tion onthedispersion ofthe outputof themodelprovides more

informationaboutthephenomenamodelledwithuncertaintyand

more useful information from a decision-making point of view

thanthemodelswithonlyexpectedvalue(Kabir,Khosravi,Hosen,

&Nahavandi,2018;Shrivastava,Lohia,&Panigrahi,2016).

Conﬁdence intervals and prediction intervals have been

pro-posedtomodeltheuncertaintiesofasystem.Conﬁdenceintervals

are used to capture uncertaintiesin the unknown parameters of

a model.Conﬁdence intervalsare usually associatedwith

param-eters ratherthan withobservations. Prediction intervalsare used

https://doi.org/10.1016/j.eswa.2018.10.043

(4)

to capture uncertaintiesin random variables yet to be observed andprovideaprobabilitythattherandomvariablewillbewithina giveninterval(Dybowski&Roberts,2001;Heskes,1997;Ramezani, Bashiri,&Atkinson,2011;Rencher&Schaalje,2008).Prediction

in-tervals consider more sources of uncertainty than do conﬁdence

intervals;theseadditionalsourcesofuncertaintyincludemodel

er-ror and noise variance. The predicted outputs are intervals that

represent(withagivencoverageprobability)themostlikelyregion

deﬁned by theupper andlower boundsof theinterval towhich

theoutputoftheuncertainphenomenawillbelong.

Inthispaper,predictioninterval modelsare usedto represent

both nonlinear behaviour and uncertainty derived from

noncon-ventional energy sources and electrical demand. The

uncertain-tiesassociated withwind andphotovoltaic power aredue tothe

stochasticintermittencyoftheprimaryinput(windspeedand

so-larradiation),andtheuncertaintyofthedemandproﬁlesinenergy

communities (microgrids) isdue to minorload variations, which

cangeneratelargechangesinthetotalproﬁle(Parhizi,Lotﬁ, Kho-daei,&Bahramirad,2015).Moreover,forcontrolofmicrogrids,the

uncertaintyassociatedwithintermittentpowersourcesandloadis

typically handledusingarobust modelpredictivecontrol (Robust

MPC)intheformulationoftheenergymanagementsystem(EMS)

(Sáez,Ávila,Olivares,Cañizares,&Marín,2015).TheEMSisa

con-trol strategy that allows coordination of the energy resources to

supplythedemand,guaranteeinganeconomicandreliable

opera-tion.RobustEMShasbecomepopularbecauseitcanbeapplied

us-inguncertaintysetsratherthanprobabilisticmodels,reducingthe diﬃcultiesrelatedtoPDFidentiﬁcation(Lara,Olivares,&Cañizares, 2018).

In the work of Valencia, Collado, Sáez, and Marín (2016), a

wind-basedenergysourcewasmodelledbyafuzzyprediction

in-tervalbasedonthemethodreportedintheworkofŠkrjanc,Blažiˇc, and Agamennoni (2005), and a Robust EMS was achieved using

the convexsum ofthe lower (worstcase) andupper (best case)

boundsoftheavailablewindenergy.Inasimilarway,intheworks ofValenciaetal.(2016)andXiang,Liu,andLiu(2016),prediction

intervalmodelsofthesolarpower,windpower,andelectrical

de-mandofamicrogridweregeneratedtoformulateascenario-based

Robust EMS. InValencia etal. (2016),the combination ofall the

lower and upper bounds of the prediction intervals allowed the

various scenarios forRobustEMS to be deﬁned, andthesolution

wasobtainedusinga second-orderconeoptimizationproblem.In

Xiangetal.(2016),scenariosweregeneratedviaTaguchi’s

orthog-onal array testing method using the prediction intervals of the

uncertain variables modelled, and the optimization problem was

solved usinga search strategy basedon an orthogonal array.The

resultsofthepreviousstudiesshowedthatamoresecureand

re-liableoperationisachievedwithRobustEMSthanwithEMS

with-out uncertainty. However, the performance of a RobustEMS

de-pendsonthequalityofthepredictionintervalmodelsoverthe

fu-ture time horizon; therefore,improved prediction interval model

designs arerequired(Marín,Valencia,& Sáez,2016;Parhizietal., 2015;Sáezetal.,2015).

Several approaches have been proposed that use neural

net-works and fuzzy systems to generate prediction interval models

(seeSection2).Inmanycases,theseapproachescarryhigh

compu-tational costs and/or requiremaking assumptions aboutthedata.

Additionally,inseveralofthereportedapproaches,theprediction

interval models are tuned only one step ahead, which could

de-creasethequalityoftheintervalintermsoftheinformationabout

theuncertaintymodelledforahigherpredictionhorizon.

Thispaperpresentsanewmethodologyfordeveloping

predic-tionintervalmodelsusinganovelcriterionthatincludesthe cover-ageprobabilityandthenormalizedaveragewidthoftheintervalas metricsfortrainingmodelsatfuturesteps.Thus,theprediction

in-tervalmodelsaimtoachievethedesiredcoverageprobabilitywith

thesharpestintervalpossible.Notethatthenarrowerthewidthof

thepredictionintervals, themore accurate theinformationabout

theuncertaintyphenomena. However,a widththatis toonarrow

mightcompromisethecoverageprobability.

Themaincontributionofthisworkisanewmodelling

method-ologyforconstructingpredictionintervalsbasedonfuzzynumbers

anditsextension tofuzzyandneural networkpredictioninterval

models.Thefuzzynumberconceptisusedbecausetheaﬃne

com-binationofintervalfuzzynumbersgenerates,bydeﬁnition,

predic-tionintervalmodelsthatcanhandleuncertaintywithoutrequiring

assumptionstobemadeaboutthedataandthenoisedistribution.

Theproposedpredictionintervalisdevelopedintwostages.First,

model identiﬁcationis performed to tune the parameters

neces-saryforobtainingtheexpectedvalue.Then,thespreadsofthe

pa-rametersof the predictioninterval are found for thefuture step.

Theproposedmethodologycanbeusedtodescribealargefamily

ofuncertainnonlinearfunctions, suchastheelectrical demandin

smallcommunities.

The remainder of this paper is organized as follows:

Section 2 presents a literature review of prediction

inter-val modelling. Section 3 introduces the problem statement.

Section 4 presents prediction interval models based on interval

fuzzy numbers and the extension to fuzzy and neural network

models. Section 5describes the proposed methodfor developing

prediction interval models. Section 6 presents the results of a

benchmarktestandtwocasestudiesinvolvingloadforecastingfor

residential dwellings in the town of Loughborough, UK, and for

the isolated microgridinstalled atHuatacondoin northern Chile.

The last section provides the main conclusions and the focus of

futurework.

2. Literaturereviewforpredictionintervalmodelling

Inthe specialized literature, several methods based onneural

networksandfuzzysystemshavebeenproposedtoobtain

predic-tionintervals. Inthework ofKhosravi,Nahavandi,Creighton,and

Atiya(2011a),traditionalmethodsbasedonneuralnetworkswere

analysed,includingthedelta, Bayesian,mean-varianceestimation,

andbootstrapmethods.Theseapproachesarecomputationally

ex-pensive and/or make assumptions about noise. For instance, the

deltamethodassumesthatnoise ishomogeneous,andthe

calcu-lation ofJacobian matricesis required. The Bayesian method

as-sumes that the parameters are a random set of variables with a

distribution deﬁned a priori, and it requires the computation of

theHessianmatrix.Themean-varianceestimationmethod

consid-ersthatmodelerrorsarenormallydistributedaroundthetrue

tar-get;therefore,themethodrequirestheknownmeanandvariances.

Thisapproachassumesthatthecorrespondingneuralnetwork

ac-curatelyestimatesthetruetargets,whichisnotalwaystrue,

lead-ingto alow coverage probability.The bootstrap methodassumes

thatahigh-precisionestimateofthetruetargetswillbeproduced

byagroupofneuralnetworks.Thismethodisthemost

computa-tionallydemandinginthedevelopmentstagebecauseseveral

neu-ralnetworksarenecessarytoestimatevariance.However,afterthe

modelsaretrainedoﬄine,onlinecomputationsaresimpleanddo

notrequiretheevaluationofcomplexmatricesorderivatives.

Inthe works of Škrjanc(2011) andŠkrjanc etal. (2005),two

predictioninterval methods based on type-1 fuzzysystems were

proposed.InŠkrjanc(2011),theupperandlower boundsthat

de-ﬁnetheinterval areconstructedbased ontheerrorcovarianceof

eachruleofthefuzzymodel.However,inthisapproach,regionally

dependentnoisewithanormaldistribution,withzeromeanvalue

andvariance, is an a priori assumption. In Škrjanc etal. (2005),

anoptimizationprocedurewasusedtoﬁndthelower-and

upper-boundparametersofafuzzymodel.Thismethodforprediction

(5)

boundsare obtainedasthe result, andthe optimizationproblem

doesnotimposea desiredvalue forcoverage probabilityor

inter-valwidth. Inthework ofSáezetal.(2015),one-day-aheadfuzzy

predictioninterval models were developed,supported by the

co-variance methodderived in Škrjanc(2011). Thisprediction

inter-valmodelwasvalidatedusingrenewablegeneration (photovoltaic

andwind)anddemanddatafroman installedmicrogridin

Huat-acondo,Chile.Khosravi,Nahavandi,andCreighton(2011)usedthe

deltamethodtodevelop an adaptiveneuro-fuzzy (ANFIS)

predic-tioninterval model.TheparametersoftheANFISprediction

inter-valwere found by minimizing a nonlinear cost function that

in-cludes the coverage probability and sharpness of the interval. A

simulatedannealingalgorithm wasusedasasolutionmethodfor

theoptimizationproblem.

Khosravi,Nahavandi,Creighton,andAtiya(2011b)proposedthe

lowerupperbound estimation(LUBE)methodforbuildingneural

networkpredictionintervalsasanalternativetosolvingthe

prob-lemregardingtheassumptionsaboutthedataand/ortheintensive

computationalburden.SeveralstudieshaveusedtheLUBEmethod

to develop prediction interval models of distributed energy

re-sourcesandelectricconsumption.Forinstance,Khosraviand Naha-vandi(2013)andKhosravi,Nahavandi,andCreighton(2013)

devel-opedpredictionintervalsforforecastingwindfarmpower

genera-tion.In asimilarway,Quan,Srinivasan, andKhosravi(2014)used

bothelectrical loadandwindpower generationdatato construct

prediction interval models, and Wang, and Jia (2015) used

pho-tovoltaic powerdata to constructprediction intervalsbased ona

radialbasis function(RBF)neural network.However, inthese

ap-proaches,neuralnetworktrainingisbasedonacostfunctionthat

includesthecombinationalcoveragewidth-basedcriterion(CWC).

The problem with this criterion, as established in Shrivastava,

Khosravi, and Panigrahi (2015), Wan, Xu, Østergaard, Dong, and Wong (2014), Pinson and Tastu (2014), and Khosravi,and Naha-vandi (2014b), occurs when an extremely narrow interval width

is obtained: the entireterm (CWC) becomeszero irrespective of

theprediction interval coverage probability; therefore,the cover-ageprobabilitycanbeverylow.Additionally,inKhosravi,and Na-havandi(2014a)anintervaltype-2fuzzysystemwasproposedfor

construction of prediction intervals. The left and right points of

thetype-reducedsetweredeﬁnedasthelowerandupperbounds

ofthepredictioninterval.However, theparameters ofthesystem

wereobtainedusingthesameCWCcriteria.

Other approachesregardingthe predictionintervals of

renew-able resources, the price of energy, and the electricity demand

havebeenreported(Hu,Hu,Yue,Zhang,&Hu,2017;Lietal.,2018; Shrivastavaetal.,2015,2016;Voyantetal.,2018).Intheworksof

Shrivastavaetal.(2016)andShrivastavaetal.(2015),

methodolo-gieswere proposed based on the support vector machine (SVM)

to generatethe prediction intervalsfor wind speed and

electric-itycosts. InShrivastavaetal.(2016),a multi-objectivedifferential

evolutionalgorithmwasusedtotunemodelparameterssuchthat

multiple opposing objectives were achieved to generate

Pareto-optimalsolutions.InShrivastavaetal.(2015),usingparticleswarm

optimization(PSO), theoptimalmodel parameters were obtained

byminimizing the interval widthwhile a desiredcoverage

prob-abilitywas achieved. In both studies, SVMs were used to

gener-atetheupper andlower boundsofthe predictioninterval.

How-ever,theupperandlowervaluesforthetrainingprocesswere

un-known;theywere artiﬁciallygeneratedbymodifying thetraining

values within a given percentage. Hu et al.(2017) used the

ker-nelextremelearningmachine(KELM)methodtodevelop the

pre-dictioninterval forwindpower usingdatafromtwo windfarms.

The artiﬁcial bee colony algorithm wasused to ﬁnd the

param-etersnecessary for the KELMmodels. The optimization was

per-formed using a cost function that included the coverage

proba-bility,the sharpness of the interval andthe average deviation of

the data from the prediction interval asmetrics. In the work of

Voyantetal.(2018),predictionintervalmodels oftheglobal

hor-izontalirradiation using regressiontreemethods were presented.

Several prediction models were tested, including classic, pruned,

bagged andboostedregressiontreeandclassic andsmart

persis-tencemodels. Severalpredictorsbased onsubsets ofthetraining

datawereusedtobuildthepredictioninterval.Then,acumulative

distribution function (CDF) was constructed based on predicted

valuesfromeachregressiontreemodeldeveloped.Intheworkof

Lietal. (2018),an improved bootstrap methodwasproposed for

constructing predictionintervals usingextreme gradientboosting

(XGB) as the basemodel. The approach wascompared with

tra-ditionalbootstrap, LUBEandSVR-2D using solarpowerdata. The

proposed methodin thisstudyachievedthe bestperformance in

termsofthequalityofthepredictioninterval.Althoughthe

predic-tion interval performedwell in previous studies,they considered

onlyshortpredictionhorizons(afewhours ahead),andforsome

applications,forinstance,forEMSsinmicrogridoperationsbased

onrecedinghorizoncontrol, ahigherpredictionhorizoncould be

necessary.

Severaltypes of Takagi–Sugeno fuzzy systems that model the

uncertaintyintheantecedent, intheconsequentorinboth parts

of the fuzzy rules have been discussed in the literature. Type-2

fuzzy sets are used to model the uncertainty in the antecedent,

and fuzzy numbers are used to model the uncertainty in the

consequents (Khosravi, Nahavandi, Creighton, & Srinivasan, 2012; Mendel, 2017). Most reported studies have demonstrated that

these kinds of systems are excellent tools for handling

uncer-tainty andhavea degreeofaccuracysuperiorto traditional

type-1 fuzzy systems. For instance, in the work of Jafarzadeh, Fadali,

andYaman(2013) type-1andinterval type-2fuzzysystems were

proposed for theprediction of solarpower. Type-2 systemswith

type-2antecedentsandcrispconsequentsprovidedthemore

pre-ciseoutput.Khosravietal.(2012)comparedneuralnetworks,

tra-ditional type-1 fuzzy systems and interval type-2 fuzzy systems

for electrical demand prediction. The results show that the

in-terval type-2 fuzzymodel showedimprovedthe prediction

accu-racy compared to the other approaches dueto its additional

de-grees offreedom. In a similar way,in the work ofKhosravi, and

Nahavandi(2014c),an interval type-2fuzzysystemwasproposed

forone-day-aheadloadprediction.Anoptimaltype reducerbased

on a neuralnetwork wasproposed to improveprediction

perfor-mancewithoutincreasingcomputationalburdencomparedto

tra-ditional type reduction. In Begian, Melek, and Mendel (2008) a

novel inference engine for a type-2 fuzzysystem waspresented.

Thisapproachusesaclosedformforinferenceinsteadofthe

type-reductionprocess.Theresultsshowedthattheproposedinference

mechanismoutperformsthetype-1fuzzysystems.

Additionally,severalstudieshavedemonstratedthat theuseof

complexfuzzysetsandlogicinintelligentsystemscanimprovethe predictionoffutureobservationsinatimeseries(Yazdanbakhsh& Dick,2018).IntheworkofChen,Aghakhani,Man,andDick(2011),

an adaptiveneuro-complexfuzzyinferentialsystem(ANCFIS)was

proposed. This system was applied to time-series prediction for

synthetic and real-world datasets. The results showed that

com-plexfuzzysetsareausefultoolinintelligentsystemsdesign.

AN-CFIS achievedgoodperformance usingamaximumofthree rules

forallexperiments;incontrast,thebestANFISnetworkused128

rulesforthesyntheticdataset.Yazdanbakhsh,andDick(2017)

pro-posedanextensionoftheANCFIStothemultivariabletime-series

prediction. The proposed approach was compared with the

re-sults showed in the work of Li, andChiang (2013) forthe

NAS-DAQ dataset. The results showed that ANCFIS has superior

per-formance regarding the complex neuro-fuzzy autoregressive

in-tegrated moving average (CNFS-ARIMA) approach. The work of

(6)

ANFIS, radial basis function network (RBFN) and complex fuzzy

logic(ANCFIS)forphotovoltaicpowerprediction.ANCFISwasmore

accuratethantheotherapproachesregardingone-step-ahead

pre-diction.

However, the previous studies have focused only on

improv-ing theprecision ofthe expectedvalue ratherthanobtaining the

prediction interval. Because these kinds of the systems can

nat-urally provide a predictioninterval, with the type-1 fuzzy set at

theconsequent, somestudies haveincludedthedispersionofthe

output in the design of the system to obtain the prediction

in-terval in an active way. For instance, Veltman et al. (2015)

de-velopedafuzzypredictioninterval modelforanelectricload

ap-plication. The fuzzy interval wasobtained by including only the

uncertainty in the parameters of the consequences. All

parame-ters(premises andconsequences)ofthe fuzzypredictioninterval

modelwerefoundusingan improvedteaching-learning-based

op-timization algorithm (ITLBO) to minimize a multi-objective cost

function. Marín et al. (2016) characterized uncertainty in wind

power and electric load using type-2 fuzzy prediction interval

models. As Mendel (2017) mentioned, this work is arguably the

ﬁrst paper to use the type-reduced set in an active way (rather

than using it only as a means to obtain the expected value)

during parameter identiﬁcation. In a manner similar to that

re-ported in the work of Veltman et al. (2015), all parameters of

the type-2fuzzysystemwere tuned usingsome optimality

crite-riaofthepredictioninterval, suchasthecoverage probability,

in-terval widthandpredictionerror.Consequently,manyparameters

must be tuned in these approaches using an optimization

algo-rithm.ThepredictionintervalmodelsinVeltmanetal.(2015)and

Marín etal.(2016) were validatedforpredictions upto twodays

aheadusingdataontheenergyresourcesofanisolatedmicrogrid

installedinChile.Althoughthedesiredcoverageprobabilityin pre-viousstudiesisﬁxedduringthetrainingprocessoftheprediction interval,thiscoverageprobabilitycoulddecreaseasthenumberof

future steps increasesbecausethesemodels are trainedone step

ahead.Therefore,predictionintervalsthatgeneratethemost

infor-mation intermsoftherelationshipbetweenthewidthofthe

in-tervalandthecoverageprobabilityoverfuturestepsarerequired.

Deep learning has achieved state-of-the-art results in

sev-eral areas,such as computer vision. However, only recentlyhave

deep learning-based algorithms become a popular solution for

time-series forecasting (Chen, Zeng, Zhou, Du, & Lu, 2018; Qiu, Zhang, Ren, Suganthan, & Amaratunga, 2014; Rodrigues, Markou, &Pereira,2018).Severaldeeplearningmethodsexistforthis

pur-pose. The mostpopular is the long short-term memory network

(LSTM), whichis a formofrecurrent neuralnetwork (RNN) with

the ability to model complex patterns in time series due to its

specialized cell architecture. LSTM networks are not affected by

the explodinggradient problem that is common in regular RNN

when trainedtopredict valuesatfuturesteps. These

characteris-ticshavemadeLSTMnetworkspopularfortime-seriesforecasting

(Bao,Yue, & Rao, 2017; Bui,Le, & Cha, 2018;Liu, Wang,Yang, & Zhang,2017).Mostreportedworksindeeplearningfortime-series forecastingaimtopredicttheexpectedvalueratherthana

predic-tion interval. Some development hasalso been madefor the

es-timation of uncertainty using deep learning models. In Gal, and

Ghahramani (2016), a Monte Carlo dropout approach for

repre-sentingmodeluncertaintywaspresented.Thesameapproachwas

usedinZhu,andLaptev(2017),whereaLSTMencoder-decoder

ar-chitecturewasusedtopredictthedailycompletedtripsprocessed

by the Uberplatform using uncertainty estimation based on the

Monte Carlo dropout ofhiddenunits.However, theseapproaches

make several assumptions on the process distribution.

Addition-ally, although LSTM networks are currently used for time-series

forecasting,sometimestheimprovementofthepredictiondoesnot

compensateforthehighercomplexityoftheLSTMnetwork.

3. Problemstatement

In this paper, prediction interval modelling based on fuzzy

numbersprovidesasystematicframeworkforrepresenting

uncer-taintyandnonlineardynamics,whichmakesitusefulfor forecast-ingtheuncertaintyassociatedwithstochasticvariables,suchas re-newableenergy-basedgenerationvariables.Next,thegeneral inter-valmodellingproblemisdetailed.

TheresultofmappinganinputvectorZ(k)ontoanonlinearreal

continuousfunctiongcanbewrittenasfollows:

y

(

k

)

=g

(

Z

(

k

)

,w

)

+

ε

(

k

)

k=1,...,N (1)

whereZ(k)={z1(k),z2(k),…,zp (k)}representstheinputvectorofall

measurementsattimek, y(k)istheoutput obtainedfromtheset

ofmeasured dataattimek, wisthetrueparameterset,and

ran-domvariable

ε

(k) isnoise.The aimofthe modelistoﬁndareal

function g ∈G that belongsto the modelclass G, such that g is

the best representation of the system (Shrivastava et al., 2016).

The condition for selecting the model is

y

(

k

)

_{− ˆ}y

(

k

)

_≤

ε

0 k=

1,...,N, where yˆ

(

k

)

=g

(

Z

(

k

)

,wˆ

)

is the model output attime k, ˆ

waretheestimatedparametersand

ε0

isthedesirederrormodel.

The error maybe dueto unknown or unobserved variables that

affectthe model output yˆ

(

k

)

(Heskes, 1997; Rencher& Schaalje,

2008). When the nonlinear real function g is an uncertain

func-tion,itcanbeassumedthatitisamemberofthefollowingfamily offunctions(Škrjanc,2011;Škrjancetal.,2005):

G=

g:S→R1

|

g

(

Z

(

k

)

=gnom

(

Z

(

k

)

+

g

(

Z

(

k

)

(2)

where gnom represents the nominal function and

g models

theuncertaintyandsatisﬁessupZ∈S

|

g

(

Z

)

|

≤ c,c∈R.Accordingto

Eq.(2), the function g ∈G can be used to predict a new

obser-vation, and its uncertainty based on observed data.This type of

function(g∈G)iscalledapredictioninterval model.Thegoalof predictionintervalmodellingis toﬁndthelower function yˆ_Land theupperfunctionyˆ_Uthatsatisfy:

ˆ

yL

(

k

)

≤ g

(

Z

(

k

)

,w

)

≤ ˆyU

(

k

)

∀

Z

(

k

)

∈ S (3)

Inthisrespect, afunction g fromthe classG can be found in

thebanddeﬁnedbytheupperandlowerfunctions.Theprediction

intervalsaredevelopedwithacertaincoverageprobability(1−

α

)%

thatfutureobservationsoftheuncertainphenomenabelongtothe

interval deﬁnedby thelower yˆ_Landupperyˆ_Ubounds (Aketal., 2013):

P

yˆL

(

k

)

≤ y

(

k

)

≤ ˆyU

(

k

)

≥

(

1−

α

)

% (4)

As in the works of Veltman et al. (2015), Marín et al.

(2016),Shrivastavaetal.(2016),Khosravietal.(2011a,2011b),and

Khosravi, Nahavandi, and Creighton (2010), in this methodology, thepredictionintervalcoverageprobability (PICP)andthe

predic-tionintervalnormalizedaveragewidth(PINAW)arethemetricsto

beincorporatedintheidentiﬁcationprocessofpredictionintervals.

PICP isused toquantify thenumberof measuredvalues thatfall

within the interval deﬁnedby the model,and PINAW is used to

measurethewidthoftheinterval.

Inthispaper,newpredictionintervalmodelsbasedonthe

con-ceptoffuzzynumbersarederivedsuchthatthewidthdeﬁnedby

theupperyˆ_U

(

k

)

andlower yˆ_L

(

k

)

valuesof theinterval isas nar-rowaspossiblewhiletheintervalcontainsacertainpercentageof measured datay(k). This conditionimplies that,to generate

pre-diction intervals, theaverage widthmeasured by PINAWmustbe

minimizedwhileconsidering acertain desiredcoverage

probabil-itymeasuredbyPICP.Inthenextsection,theproposedprediction

intervalmodelsbasedonfuzzyandneuralnetworkmodellingare

(7)

4. Predictionintervalmodelsbasedonfuzzynumbers

Inthissection,a newapproachtodevelopingprediction

inter-valsbasedonfuzzyandneuralnetworkmodelsisderived.In

gen-eral,themodels consider aset ofpinputs (z1(k) ∈Z1,…, zp (k) ∈

Zp )thatrepresenttheinputmeasurementdataattimestepk. Whenan aﬃnelinearmodelisused,themodeloutputyˆ

(

k

)

at timekisdeﬁnedasfollows:

ˆ

y

(

k

)

=

θ

o +

θ

1z1

(

k

)

+· · · +

θ

p zp

(

k

)

, (5) where

θ

_i(i₌0,1,…,p)are theregressioncoeﬃcients. Inthis pa-per,toincludeuncertainty,thecoeﬃcients

θ

_iaredeﬁnedas

inter-val fuzzynumbers (Lee, 2005; Mendel, 2017). Therefore, the

pa-rametersareexpressedasafuzzysetthatdeﬁnesafuzzyinterval forrepresentingthevalueof

θ

_i.

Thus,theparameters

θ

_i(intervalfuzzynumbers)are character-ized by a mean(m) and spread (s). The uncertainty distribution

regardingtheexpectedvalueischaracterizedusingvariousspread

values,i.e.,

θ

_i=[m_i− si ,mi +¯si ]. Thelower bound

(

yˆL

)

andupper bound

(

yˆU

)

that deﬁne the prediction interval are deﬁned based

onthetheoremoftheaﬃnecombinationoftype-1 intervalfuzzy

numbers(seeKarnikandMendel(2001)andMendel(2017)for

de-tailsonthistheorem):

ˆ y_L

(

k

)

= p i =1 m_iz_i

(

k

)

+m0− p i =1

|

z_i

(

k

)

|

s_i (6) ˆ y_U

(

k

)

= p i =1 m_iz_i

(

k

)

+m0+ p i =1

|

z_i

(

k

)

|

¯s_i (7)

BasedonEqs.(6)and(7),theexpectedvalueischaracterizedby themean(m_i).Thelast terminbothequationsisassociatedwith theprediction interval, andit ischaracterized by the parameters

(

¯s_i,si

)

.

In thisinterval modellingapproach,the parametersassociated

with spread

(

¯s_i,s_i

)

are obtained to assure the desired coverage probability(1−

α

)%withthesmallestintervalwidthatthedeﬁned

future prediction horizons. The proposed method for identifying

theseparameters (spreads)is describedinSection 5. Themodels

thusprovidethevaluesoftheupper

(

yˆ_U

)

andlower

(

yˆ_L

)

bounds givenacoverageprobabilityandtheexpectedvalueyˆ

(

k

)

.

The proposed method isused to characterizeuncertainty.

Un-certainty corresponds to the ﬁtting errorbetween the prediction

ˆ

y

(

k

)

and the actual output y(k); thus, uncertainty is deﬁned by theinterval[yˆ_L_,yˆ_U]to whichthepredictedvalue couldbelong. In thenextsection,bothfuzzyandneuralnetworkpredictioninterval

modelsbasedonfuzzynumbersarepresented.

4.1.Fuzzypredictionintervalmodelling

Mathematically, a fuzzysystemis deﬁnedbya setofpinputs

(z1(k)∈Z1,…,zp (k)∈Zp ),asetofrules,andanoutputyˆj

(

k

)

related toeach ruleattimek.TherulesoftheTakagi–Sugenomodelsare expressedasfollows: Rj :i fz1

(

k

)

isF1j and· · · andzp

(

k

)

isF j p then (8) ˆ yj

(

k

)

=

θ

j o +

θ

1j z1

(

k

)

+· · · +

θ

j p zp

(

k

)

j=1,…,M, where M is the number of rules. Let Fj

(

Z

(

k

)

= p

i =1

μ

F j i

(

z_i

(

k

)

be the activation degree of each rule. Then, the normalizedactivationdegree

β

j (Z(k))isdeﬁnedasfollows:

β

j

₍

_Z

₍

_k

₎₎

₌Fj

(

Z

(

k

))

M

j=1Fj

(

Z

(

k

))

(9)

In this paper, singleton fuzziﬁcation, Gaussian membership

functions

(

F_ij

)

, and the t-norm product are used to provide the

outputofthefuzzysystem:

ˆ y

(

k

)

= M j=1

β

j

₍

_Z

₍

_k

₎₎

_y_ˆj

₍

_k

₎

₍₁₀₎

Consideringtheproposed intervalmodellingframework,inthe

fuzzypredictionintervalmodels,theconsequenceparameters

(

θ

_ij

)

ofeachrule(Eq.(8))canbeconsideredasintervalfuzzynumbers with their corresponding means

(

m_ij

)

and spreads

(

¯s_ij _,s_ij

)

. Thus, thelocalinterval outputforeach rule(j) iscalculatedasfollows:

ˆ y_Lj

(

k

)

= p i =1 m_ij zi

(

k

)

+m₀j − p i =1

|

zi

(

k

)

|

s_ij (11) ˆ y_Uj

(

k

)

= p i =1 m_ij z_i

(

k

)

+m₀j + p i =1

|

z_i

(

k

)

|

¯s_ij (12)

Finally,theloweryˆ_L

(

k

)

andupperyˆ_U

(

k

)

boundsarecalculated consideringtheactivationdegreesEq.(9))andthelocaloutputsof eachrule(Eqs.(11)and((12))asfollows:

ˆ yL

(

k

)

= M j=1

β

j

₍

_Z

₍

_k

₎₎

_y_ˆj L

(

k

)

(13) ˆ yU

(

k

)

= M j=1

β

j

₍

_Z

₍

_k

₎₎

_y_ˆj U

(

k

)

(14)

Inthispaper,afuzzyclusteringmethodisconsideredfor

deﬁn-ingtherulenumbersandtheparameters(centreandstandard

de-viation) of the Gaussian membership functions

(

F_ij

)

. The means

(

m_ij

)

of the consequences are estimated by the minimum

least-squaresoptimizationmethod(Babuška,1998).Themethodfor

tun-ingthespreads

(

¯s_ij _,s_ij

)

isexplainedinSection5.

Thelowerandupperboundsofthefuzzypredictionmodelfor

forecastingtheoutputoffuturestepsaredeﬁnedasfollows:

ˆ y_L

(

k+h

)

= ff uzzy

(

Z

(

k+h

)

,

β

j

₍

_Z

₍

_k₊_h

₎₎

_,_mj i ,si j

(

k+h

))

∀

h=1,...,Np (15) ˆ yU

(

k+h

)

= ff uzzy

(

Z

(

k+h

)

,

β

j

(

Z

(

k+h

))

,m_ij,¯s_ij

(

k+h

))

∀

h=1,...,Np (16)

wherej=1,…,Mistherulenumber,i=1,…,pistheinputnumber, andNp isthepredictionhorizon.Notethat theparameterss_ij

(

k+

h

)

and ¯s_ij

(

k+h

)

are thespreads tuned h˜ steps ahead,where ˜h∈

{

1_,_._._._,N_p

}

_, using experimental data with certain coverage

prob-ability atthe future steps. After the tuning process is completed

andthepredictionintervalisobtained,theseparametersare held

constant throughhorizonprediction,i.e., s_ij

(

k₊h

)

₌s_ij

(

k₊h˜

)

and ¯s_ij

(

k+h

)

=¯s_ij

(

k+_h˜

₎

_for_h₌_1,…,_N_p_._More_details _about_the_tuning methodareprovidedinSection5.

4.2. Neuralnetworkpredictionintervalmodelling

Mathematically,aneuralnetworksystemisdeﬁnedbyasetof

pinputs(z1(k)∈Z1,…,zp (k)∈Zp ),asetofweights(w)andbiases

(b)perlayer,andanactivationfunctionperlayer.Iftheneural

net-workusesa hyperbolictangentactivationfunction forthehidden

layerandalinearactivationfunctionfortheoutputlayer,the

out-putoftheneuralnetworkattimekisdeﬁnedasfollows:

ˆ y_l

(

k

)

= L j=1 w0 j,l

tanh

_p i =1 wh _j_,iz_i

(

k

)

+bh _j

+b0 l (17)

(8)

j=1,…,L,whereListhenumberofhiddenlayerunitsandlisthe

number ofoutput units; in thispaper, l=1. The hiddenweights,

hiddenbias,output weightsandoutput biasare wh _j_,i,bh _j, w0

j,l and

b0

l respectively. Theneural network inEq.(17)can be written as

follows: ˆ y

(

k

)

= L j=1 w0 j Z˜j

(

k

)

+b0 (18) where: ˜ Zj

(

k

)

=tanh

_p i =1 wh_j,izi

(

k

)

+bhj

(19)

Inthispaper,Bayesianregularizationisusedtotraintheneural

network. Bayesian regularization consistsofa paradigm designed

tominimizeoverﬁttingofneuralnetworks.Themethodprovidesa

Bayesian criterion forterminatingtraining, thus generatingbetter resultsforthetestdataset(Gençay&Qi,2001).

Inthisapproach, theneuralnetwork predictioninterval is

de-veloped such that the output weights

(

w0

j

)

are considered inter-valfuzzynumberswiththeir means(m_j)andspreads

(

s_j,¯s_j

)

.The

lower and upperbounds ofthe prediction interval can be

calcu-latedasfollows: ˆ yL

(

k

)

= L j=1 mj Z˜j

(

k

)

+b0− L j=1

_Z˜_j

₍

_k

₎

_s_j ₍₂₀₎ ˆ y_U

(

k

)

= L j=1 m_jZ˜_j

(

k

)

+b0₊ L j=1

Z˜_j

(

k

)

¯s_j (21)

Theneuralnetworkcanbedeﬁnedasaneuralnetworkwhose

outputsaretheupperandlowerboundsandthetargetprediction.

Asfuzzypredictionintervalmodels,neuralnetworkprediction

in-terval models are used to forecast the output of future steps as

follows: ˆ y_L

(

k+h

)

=fNN

(

Z˜_j

(

k+h

)

,m_j,s_j

(

k+h

))

∀

h=1,...,Np (22) ˆ yU

(

k+h

)

=fNN

(

Z˜j

(

k+h

)

,mj,¯sj

(

k+h

))

∀

h=1,...,Np (23)

wherej=1,…,ListhenumberofhiddenlayerunitsandNp isthe predictionhorizon. Note thatthe parameters s_j

(

k₊h

)

and ¯s_j

(

k₊ h

)

arethespreadstunedh˜steps ahead,where˜h∈

{

1,...,Np

}

,

us-ing experimental data with a certain coverage probability at the

future steps. After the tuning process is completed andthe

pre-diction interval is obtained, these parameters are held constant

throughhorizonprediction.

Next,themethodforidentifyingtheparameters ofthe

predic-tion interval based on fuzzy systems andneural networks is

ex-plained.

5. Proposedmethodfordevelopingpredictionintervalsbased onfuzzysystemsandneuralnetworks

Theidentiﬁcationprocedureforderivingthepredictioninterval

models isshownin Fig.1.The ﬁrst partofthisprocedure

corre-sponds tothe identiﬁcationmethod ofthefuzzy andneural

net-workmodelsforobtainingtheexpectedvalue,andthesecondpart

isthemethodforpredictioninterval parameter(spreads)

identiﬁ-cation.

Regarding model identification (Fig. 1), the first step involves datacollectionfortraining,validationandtesting;sufficient

infor-mation iscollectedto representthevarious operationalpointsof

theprocesstobe modelled.Thetrainingdatasetisusedtoobtain

Fig. 1. Methodology for developing prediction intervals.

themodelparameters. The validationdatasetis notdirectly used

inthe trainingprocess; however,it allows the model

generaliza-tioncapacitygivenbythemodelbehaviourtobeevaluatedunder

anewdataset.Finally,thetestdatasetisusedtoevaluatethe

per-formanceoftheobtainedmodel.

Inthisprocedure,astructuraloptimizationismade.The

struc-turaloptimizationoffuzzyandneuralnetworkmodelsconsistsof

proposingseveralstructures.Speciﬁcally,severalfuzzymodelsare

obtainedwhenthenumberofclusters(rules)ismodiﬁed,and

sev-eralneuralnetworkmodelsareobtainedbymodifyingthehidden

neuronnumber.Then,relevantinputvariablesareselectedvia sen-sitivityanalysis,andastructuraloptimizationismade.Finally,the

parameters necessaryforobtaining theexpectedvalue are

calcu-latedusingtherelevantinputvariables,theoptimalstructureand

thetrainingdataset.As proposedin Sáez,andZuñiga(2004),the

best structure is deﬁned when the validation error is either

in-creasedorstabilized incomparisonwiththe trainingerror when

thestructureofthemodelincreasesincomplexity.

Forthefuzzymodels,theGustafson–Kesselclusteringalgorithm

isusedtoobtainthepremiseparameters,andtheconsequence

pa-rameters are estimated by the minimum least-squares

optimiza-tionmethod.Bayesianregularization isusedtoobtainthe

param-eters of the neural network models. Finally, the model is

evalu-atedusingatestdatasettoverifymodelperformance.Then,ifthe

performanceofthemodelisnot suitable,themodelidentiﬁcation

procedureinpreviousstepsmustbereviewed;otherwise,this

pro-cedureiscompleted(Sáez&Zuñiga,2004).

Afterthe modelidentiﬁcationprocedure,theparameters

asso-ciated with providing the expected value are obtained. In fuzzy

models, the standard deviationand centre ofthe Gaussian

func-tions

(

F_ij

)

ofthefuzzymodelare found,wherep(i=0,1,…,p) are therelevantinputsidentiﬁed andM(j₌1,…,M)istherules

num-ber.These parametersare necessaryforobtainingthe normalized

activation degree (

β

j (Z(k))) of the premises. The identiﬁed con-sequenceparameters

(

θ

_ij

)

(see Eq.(8)) areassigned tothe mean values

(

m_ij =

θ

_ij

)

requiredinEqs.(11) and(12),andtheexpected value(Eq.(10))canbeobtained.

Regarding neural network models, hidden weights

(

wh _j_,i

)

and hidden biases

(

bh _j

)

are found. With these parameters, the term ˜

Zj

(

k

)

inEq.(19)iscalculated,wherep(i=1,…,p)aretherelevant inputsidentiﬁed andL (j=1,…,L) is the number of hidden layer

(9)

units.Additionally,theoutput weights

(

w0

j

)

andoutput bias(b0)

are identiﬁed. Finally,the output weights are used toobtain the

expectedvalue,wherem_j=w0

j (used inEqs.(20)and(21)).After

themodel identiﬁcationstage, thespreads of the parameters for

developingthepredictioninterval atfuturestepsmust be identi-ﬁed(seeFig.1).Thismethodisdescribedinthefollowingsection.

5.1.Parametersidentiﬁcationforpredictionintervals

This identiﬁcation method stage obtains the parameters

(spreads) of the prediction interval models such that the upper

andlowervaluesoftheintervalareasnarrowaspossibleandthe

intervalcontainsacertain percentageofmeasured data.The

pre-dictioninterval modelsderived inthispapercan include

endoge-nousy(k) and exogenous variables u(k), where Z(k)=[y(k− 1),…,

y(k− q1),u(k− 1),u(k− 2),…,u(k− q2)]T is the vector of regressors

associatedwiththe outputandinput variables. Then,the

predic-tioninterval is a function of thereal and/or prediction data, de-pendingonthenumberoffuturesteps (Sáezetal., 2015). Inthis paper,thespreadsfordevelopingthepredictionintervalaretuned

accordingtotherequiredstepsahead.Then, basedonthe

formu-lationdescribed inthe previous sectionsfor developingthe

pre-dictioninterval modelsandthemetrics forevaluatingthe

perfor-manceofthe predictioninterval, the spreadidentiﬁcation

proce-dureconsistsofthesolutiontothefollowingoptimizationproblem

(24):

min

s (k +˜_h₎_,¯s(k +_h˜₎PINAW

st.PICP=1−

α

(24)

whereh˜∈

{

1,...,Np

}

is thenumber of steps ahead and(1−

α

)% isthedesiredcoverageprobability.Thepredictioninterval

normal-ized averagewidth (PINAW) andthe prediction interval coverage

probability(PICP)forNp stepsaheadaredeﬁnedasfollows:

PICP=_N1 N k =1

δ

k +h × 100% (25) PINAW = 1 N· R N k =1

ˆ yU

(

k+h

)

− ˆyL

(

k+h

)

× 100%

∀

h=1,...,Np (26) where

δ

_k₊_h₌1 if y

(

k₊h

)

_∈[yˆ_L

(

k₊h

)

_,yˆ_U

(

k₊h

)

]_{; otherwise,}

δ

k +h =0.Theparameters

(

s

(

k+h˜

)

,¯s

(

k+h˜

))

arethedecision

vari-ables in the optimization problem, and the dimensionalities of

theseparametersdependonthemodelselected.Forfuzzymodels,

2pM parameters that correspond to the spreads

(

s_ij

(

k+˜_h

₎

_,_¯sj i

(

k+ ˜

h

))

and 2Lparameters for theneural network model that

corre-sponds to the spreads

(

s_j

(

k₊h˜

)

_,¯s_j

(

k₊h˜

))

should be identiﬁed. Then,the parameters

(

s

(

k+˜_h

₎

_,_¯s

₍

_k₊_h˜

₎₎

₍_Eq.₍₂₄₎₎ _must_be com-putedsuchthati)PICPisgreaterthanorequaltothedesired cov-erageprobability (1−

α

)%andii)PINAWisassmallaspossibleat futuresteps.TheequalityconstraintPICP₌(1₋

α

)%inEq.(24)isa hardconstraintandisthereforeincludedintheoptimization prob-lem as a barrier function to relax this constraint. Therefore, the

solutionoftheminimization problem(24)iscomputedfollowing

theprocedurefortheunconstrainedminimizationproblem:

min

s (k +˜_h₎_,¯s(k +_h˜₎ J=

η

1PINAW+exp

−η2(P ICP −(1−α)) ₍₂₇₎

In(27),

η1

isaweightingfactorand

η2

isapenaltyfactor.These parametersarechosen such that,ifPICPislessthan(1−

α

)%,the termexp−η2 (PICP −(1−α))_is_the_dominant_term_in_the_cost_function;

otherwise,PINAWisdominant.Finally,thesolutiontothe

nonlin-ear optimizationproblem(27) iscomputed usingparticle swarm

optimization(PSO),asoutlinedinthenextsection.

5.2. Solutionmethod

TosolvethenonlinearoptimizationprobleminEq.(27),

tradi-tionalalgorithms,such asgradientdescentmethods, arenot

ade-quate.Thesemethods entailarisk offallingintoalocaloptimum

whensolving non-convexoptimizationproblems.Therefore,other

optimizationmethodsareneeded(Quanetal.,2014).Inthispaper,

PSOisusedtosolvetheproblembecauseitgenerallyoutperforms

other algorithms interms ofsuccessrateandsolutionquality,as

reportedinthework ofElbeltagi,Hegazy,andGrierson(2005).In PSO,thegeneratedsolutionsarecalledparticles,andeachparticle hasapositionvectorwithanassociatedvelocityvector(Tran, Wu & Nguyen, 2013). The ﬁrst step in the algorithm consists of the initializationofparticlepositionsxi,j andvelocitiesvi,j forthej-th dimensionofthei-thparticle.Inthispaper,theparticlepositions are all the spread parameters

(

s

(

k+_h˜

₎

_,¯s

(

k+_h˜

₎₎

_required _to de-velopthepredictionintervalmodel,asexplainedinSection4.

Thevelocityv_i,jandpositionx_i,jinthej-th dimensionofevery

i-thparticleareupdatedaccordingtothefollowingrelations:

v_{i, j}

(

t+1

)

=Wv_{i, j}

(

t

)

+c1rand

()(

Pbesti, j

(

t

)

− xi, j

(

t

))

+c2rand

()(

gbestj

(

t

)

− xi, j

(

t

))

xi, j

(

t+1

)

=xi, j

(

t

)

+vi, j

(

t+1

)

(28)

i=1,2,…,NPwhereNPisthenumberofparticles,andj=1,2,…,N0

isthetotalnumberofparameterstobeidentiﬁed,whichdepends

onthe type ofmodelused(fuzzy orneural). Wisan inertia

fac-tor,Pbestisthebestprevioussolutionoftheparticle,andgbestis

thebestsolution oftheswarm upto thecurrentstep.Theterms

c1 andc2arecalledthecognitiveandsocialaccelerationconstants,

andrand()isarandomnumberbetween0and1.Thetraining

ter-minationcriterionissetwhenaminimumerrororadeﬁned

max-imumnumberofiterationsisachieved.Oncethetrainingprocess

terminates, the gbest value is chosen as thespread parameter to

generatethepredictionintervalmodel.

PINAWand PICP areused asmetrics for theevaluation ofthe

quality of the interval. Additionally, the root mean square error

(RMSE) and the mean absolute error (MAE) are included as

per-formanceindicestoevaluatetheaccuracyofthepredictionmodel

associated with the expectedvalue. All indices are evaluated for

several prediction horizons with the test dataset. In this paper,

thepredictionintervalmodelsbasedonfuzzysystemsandneural

networksare used to representthe nonlinear behaviour and

un-certainty derivedfromelectricitydemand;however,theproposed

methodologycanbeusedtodescribealargefamilyofuncertainty

nonlinearfunctions.

6. Results

Acomparativeanalysisbetweentheproposed prediction

inter-val models based oninterval fuzzynumbers (PI-IFN)and

covari-ance predictionintervalmodels ispresentedfollowingthe

deﬁni-tionpresentedinRencherandSchaalje(2008)andŠkrjanc(2011).

Thepredictionintervalbasedonthecovarianceestablishesthe in-tervalbasedontheerrorbetweentheobserveddatay(k) andthe modeloutput yˆ

(

k

)

. Thismethodisbased ontheassumption that

thenoiseisnormallydistributedwithazeromeanvalueand

vari-ance

σ

2 _that_is_expressed_as_e₌_N_(0,

_σ

2₎₍_Škrjanc,₂₀₁₁_).

As indicated in Eq. (18), the neural network model is a

linear model of the parameters. Therefore, the prediction

in-terval based on the covariance method can be developed

us-ing Eqs. (29) and (30), following the deﬁnition presented in

RencherandSchaalje(2008):

ˆ y_U=Z˜∗TW0₊_b0₊_t α

σ

e

1+Z˜∗T

Z˜T Z˜

−1Z˜∗

1_/ 2 (29)

(10)

100 150 200 250 300 350 400 450 500 step k -6 -4 -2 0 2 4 6 y(k) y(k) noise(k) u(k)

Fig. 2. Modiﬁed Chen series for 400 training data .

ˆ y_L=Z˜∗TW0₊_b0_{− t} α

σ

e

1+Z˜∗T

Z˜T Z˜

−1Z˜∗

1_/ 2 (30)

where

σ

e isthevarianceoftheerror,t_αistheparameterrelatedto theintervalwidth,Z˜∗isthenewdatumusedtopredictthefuture observationandZ˜isthematrixthatconsidersalldatausedinthe

trainingprocessinwhichtheoutputweightsandoutputbiaswere

determined.

Finally,thefuzzypredictionintervalmodelbasedoncovariance

proposed inŠkrjanc(2011)isusedtoobtainthe upperandlower

boundsofthelocallinearmodelasfollows:

ˆ y_Uj =

ψ

∗T j

θ

j +tα

σ

j

1+

ψ

∗T j

ψ

j

ψ

T j

−1

ψ

∗ j

1_/ 2 j=1,...,M (31) ˆ y_Lj =

ψ

∗T j

θ

j− tα

σ

j

1+

ψ

∗T j

(

ψ

j

ψ

T _j

)

−1

ψ

∗_j

1/ 2 j=1,...,M (32)

j=1,…,M, where M is the total number of rules composing the

fuzzy system,

σ

_jis the local variance of the error, and

ψ

T _j ₌

β

j

₍

_Z

₎

_[1_ZT _]_is_the_matrix_that _considers_all_the_values_used_in_the trainingprocess.

Forallthemodels(fuzzyandneural),t_α istunedusing exper-imental datato achievethe desiredcoverageprobability (1−

α

)%, asexplainedin Sáezetal.(2015).Inthe next section,theresults

ofabenchmarkandtheloadforecastwiththeproposedprediction

intervalmodelsarepresented.

6.1. Benchmark

In this paper, the original Chen series in Chen, Billings, and

Grant (1990)is modiﬁedand usedto evaluate the prediction in-tervalmodels:

y

(

k

)

=

(

0.8− 0.5exp

(

−y2

₍

_k_{− 1}

₎₎₎

_y

₍

_k_{− 1}

₎

−

(

0.3+0.9exp

(

−y2

₍

_k_{− 1}

₎₎₎

_y

₍

_k_{− 2}

₎

₊_u

₍

_k_{− 1}

₎

+0.2u

(

k− 2

)

+0.1u

(

k− 1

)

u

(

k− 2

)

+e

(

k

)

(33)

wherethesystemnoisee(k)=0.5exp(− y2₍_k_{− 1))}

_γ

₍_k₎_depends_on

thepreviousstateoftheoutputmodeland

γ

(k)iswhitenoise.The system inputu(k) is band-limitedGaussian white noise.The

sys-temis simulated,and10,000datapointsare generated.Thedata

aredividedintotraining,validationandtestingsetsaccountingfor

55%, 25% and 20% of the total dataset,respectively. Fig. 2 shows

theinput,outputandnoiseofthemodiﬁedChenseriessimulation

for400 trainingdata. Asshown inFig. 2,the noise level ishigh whentheoutputy(k)isclosetozero.

The regressors u(k− 1), u(k− 2), y(k− 1) and y(k− 2) are de-ﬁnedastheinputsforderivingthepredictionintervalmodels.

Re-garding the structure of the fuzzymodel, ﬁve rules are deﬁned,

whereas eight hiddenlayer units are deﬁnedfor the neural

net-workmodel.Withthesestructuresdeﬁned,theparameters

associ-atedwithprovidingtheexpectedvalue areobtainedasexplained

in Section 5. The PSO algorithm is used to identify the spread parameters for generatingthe prediction interval at futuresteps. Thedesiredcoverage probability (1−

α

)=90%, theweighting fac-tor

η1

=250andthepenaltyfactor

η2

=150inEq.(27)aredeﬁned. Aparticle sizeof 50andthe parameters c1=2.5 andc2=1.5are

used.Finally, Wrunsfrom 0.9to 0.3duringoﬄine optimization.

ThenumberofiterationsforPSOissetto5000,theoptimizations areexecuted severaltimes,andthe bestsolution isselected.The cost function value (J) in Eq.(27) andthe developed metrics are

reportedinTable1 forthetest datasetusingvarious numbersof

stepsahead.

AsshowninTable1,thefuzzyandneuralnetworkmodels

pro-videcostfunctionvalues(J)lowerthanthoseofthelinearmodel,

whichisconsistentwiththenonlinearbenchmarkstructure.These

resultsareexpectedbecauseoftheabilityofthefuzzyandneural

network models to better ﬁt the dynamics andnonlinearities of

thesystems,whicharemorenotableforlongerfuture-step

predic-tions.Additionally,itcanbeobservedthatthecostfunctionofthe

proposed method (Eq. (27)) is lower than that of the covariance

methodforallmodels(i.e.,linear,fuzzyandneuralnetwork).The

RMSE andMAE values are equal in the proposed and covariance

methodsbecausetheidentiﬁcationmethodisthe same.However,

the predictionerror increasesfor a largerhorizon prediction be-causetheaccumulativeerrorofthemodelislargerwhenthesteps ofthehorizonincrease,asshowninTable1.

Furthermore,it canbe observedthat thePICP termiscloseto

90%becausetheintervalmodelsaretrainedtomaintainPICPnear

thedesiredvalueforvariousstepsahead.Intermsofprediction

in-tervals,theproposedmethod(PI-IFN)provides narrowerintervals

for all step-ahead forecasts. While the covariancemethod

main-tains a constant width for the interval (see Figs. 3(a), 4(a) and

5(a)),theproposed methodachievesanarrowerinterval instates

withlittlenoiseandanintervalwithawidthsimilartothatofthe

covariancemethodinstateswithhighnoise.

Importantly,theinformationleveldeliveredbyaprediction in-tervalisdirectlyrelatedtoits width(Marínetal., 2016;Xuetal., 2017);thus,theproposedmethodyieldsabetterinformationlevel

regardingthecovariancemethod(smallerwidths).Widerintervals

couldproduceahigherPICP,buttheseintervalsprovidelessuseful

informationabouttheuncertaintyofthemodelledphenomena.In

thisrespect,theneuralnetworkmodelsexhibitsharperprediction

intervalsthanthelinearandfuzzymodels.

Figs.3–5showsixteen-step-aheadforecastsofthelinear,fuzzy

andneural network predictioninterval models. The ﬁgures show

thatnearlyall thedataare includedintheinterval;onlythe out-liersofthetime seriesare left outsidetheregion constructedby thepredictioninterval.Additionally,theintervalsproducedbythe

proposedmethod(PI-IFN)arenarrowerthanthoseobtainedbythe

methodusedforcomparison,asshownintheﬁgures.

6.2.Applicationforloadforecasting

Inthissection, twocasestudies involvingthe implementation

ofthepredictionintervalmodelsarepresentedtoaddressthe un-certaintyassociatedwithaload.Theﬁrstcaseisfromtheisolated

microgridinthevillageofHuatacondoinChile,andthesecond is

from20 residential dwellings in the town of Loughborough, UK.

(11)

Table 1

Performance indices.

Prediction horizon Performance indices Linear models Fuzzy models Neural models Covariance PI-IFN Covariance PI-IFN Covariance PI-IFN

One step ahead J 52.23 32.62 26.26 23.60 20.93 18.20

RMSE 0.4312 0.4312 0.3517 0.3517 0.2372 0.2372

MAE 0.2883 0.2883 0.2253 0.2253 0.1241 0.1241

PINAW (%) 20.89 12.40 10.09 9.37 7.90 6.85

PICP (%) 98.15 89.68 89.98 91.18 89.89 89.95

Four steps ahead J 57.71 47.47 40.27 35.56 36.58 24.36

RMSE 0.6124 0.6124 0.5266 0.5266 0.4076 0.4076

MAE 0.5010 0.5010 0.3765 0.3765 0.2446 0.2446

PINAW (%) 23.05 17.35 15.14 13.73 14.18 9.18

PICP (%) 91.62 89.06 89.41 89.86 89.92 89.77

Eight steps ahead J 57.98 49.32 43.39 39.48 39.94 32.62

RMSE 0.6761 0.6761 0.5621 0.5621 0.5105 0.5105

MAE 0.4647 0.4647 0.4015 0.4015 0.3127 0.3127

PINAW (%) 23.19 18.16 17.08 15.29 15.13 11.72

PICP (%) 93.26 89.09 90.25 89.85 89.50 89.20

Sixteen steps ahead J 58.16 50.75 45.21 41.40 45.93 36.15

RMSE 0.6814 0.6814 0.5839 0.5839 0.5937 0.5937

MAE 0.4717 0.4717 0.4167 0.4167 0.3685 0.3685

PINAW (%) 23.26 18.97 17.85 15.98 18.20 14.10

PICP (%) 93.19 89.20 90.36 89.75 90.57 90.07

Fig. 3. Sixteen-step-ahead linear prediction interval model: (a) covariance and (b) proposed methods .

on the concept of interval fuzzy numbers (PI-IFN) were used to

developthe loadforecastingmodelssupported bythe results

ob-tainedwiththebenchmark.

6.2.1. Huatacondomicrogrid

In thissection, the proposed prediction interval model based

ontheconceptofintervalfuzzynumbers(PI-IFN)isidentiﬁed us-ingdatafromanisolatedmicrogridinthevillageofHuatacondoin

theAtacama Desert,Chile.Thedatasetused correspondsto a

pe-riod of147 days, spanningNovember 1st of2012 to March27th

of2013,withasamplingtimeof15min,andthepeakpowerload

is27.54kW. Thetrainingdataset correspondsto theperiod

span-ningNovember1stof2012toJanuary19thof2013,thevalidation

datasetcorrespondstotheperiodspanningJanuary20thto

Febru-ary26thof 2013, andthe test datasetcorresponds to the period

spanningFebruary27thtoMarch27thof2013.

Based on the obtaineddata andusing the identiﬁcation

pro-cedure describedin Section 5, an optimalstructure consisting of

threerulesandnineregressorsisobtainedforthefuzzymodel: ˆ

pL(k) = ff uzzy(pL(k− 1 ),pL(k− 2 ),pL(k− 3 ),pL(k− 4 ),pL(k− 92 ),...

pL(k− 93 ),pL(k− 95 ),pL(k− 96 ),pL(k− 100 )) (34)

Similarly,eightneuronsinthehiddenlayerandtenregressors

are obtained for the neural network model asan optimal

struc-ture: ˆ

pL(k) = fNN(pL(k− 1),pL(k− 2),pL(k− 3),pL(k− 4),pL(k− 92),...

pL(k− 93),pL(k− 95),pL(k− 96),pL(k− 97),pL(k− 100)) (35)

Notethat exogenous variables are not included inthe models

used to represent the behaviour of the load. During the model

identiﬁcation stage, a prediction horizonof Np =1 is considered, asexplainedinSection5.However,thespreadparametersfor

gen-eratingthepredictionintervalmodels areidentiﬁedusingPSOby

consideringvarioussteps aheadwithadesiredcoverage