Delft University of Technology
Prediction interval methodology based on fuzzy numbers and its extension to fuzzy
systems and neural networks
Marín, Luis G.; Cruz, Nicolás; Sáez, Doris; Sumner, Mark; Núñez, Alfredo
DOI
10.1016/j.eswa.2018.10.043
Publication date
2019
Document Version
Final published version
Published in
Expert Systems with Applications
Citation (APA)
Marín, L. G., Cruz, N., Sáez, D., Sumner, M., & Núñez, A. (2019). Prediction interval methodology based on
fuzzy numbers and its extension to fuzzy systems and neural networks. Expert Systems with Applications,
119, 128-141. https://doi.org/10.1016/j.eswa.2018.10.043
Important note
To cite this publication, please use the final published version (if applicable).
Please check the document version above.
Copyright
Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons. Takedown policy
Please contact us and provide details if you believe this document breaches copyrights. We will remove access to the work immediately and investigate your claim.
This work is downloaded from Delft University of Technology.
‘You share, we take care!’ – Taverne project
https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher
is the copyright holder of this work and the author uses the
Dutch legislation to make this work public.
ContentslistsavailableatScienceDirect
Expert
Systems
With
Applications
journalhomepage:www.elsevier.com/locate/eswa
Prediction
interval
methodology
based
on
fuzzy
numbers
and
its
extension
to
fuzzy
systems
and
neural
networks
Luis G. Marín
a, Nicolás Cruz
a, Doris Sáez
a, Mark Sumner
b, Alfredo Núñez
c,∗a Department of Electrical Engineering, University of Chile, Santiago, Chile
b Department of Electrical and Electronic Engineering, University of Nottingham, Nottingham NG7 2RD, UK c Section of Railway Engineering, Delft University of Technology, Delft, The Netherlands
a
r
t
i
c
l
e
i
n
f
o
Article history: Received 6 April 2018 Revised 8 September 2018 Accepted 20 October 2018 Available online 24 October 2018
Keywords: Prediction interval Fuzzy number Neuronal network Renewable energy Microgrid
a
b
s
t
r
a
c
t
Prediction interval modelling has been proposed in the literature to characterize uncertain phenomena and provide useful information from a decision-making point of view. In most of the reported studies, as- sumptions about the data distribution are made and/or the models are trained at one step ahead, which can decrease the quality of the interval in terms of the information about the uncertainty modelled for a higher prediction horizon. In this paper, a new prediction interval modelling methodology based on fuzzy numbers is proposed to solve the abovementioned drawbacks. Fuzzy and neural network prediction in- terval models are developed based on this proposed methodology by minimizing a novel criterion that includes the coverage probability and normalized average width. The fuzzy number concept is considered because the affine combination of fuzzy numbers generates, by definition, prediction intervals that can handle uncertainty without requiring assumptions about the data distribution. The developed models are compared with a covariance-based prediction interval method, and high-quality intervals are obtained, as determined by the narrower interval width of the proposed method. Additionally, the proposed predic- tion intervals are tested by forecasting up to two days ahead of the load of the Huatacondo microgrid in the north of Chile and the consumption of the residential dwellings in the town of Loughborough, UK. The results show that the proposed models are suitable alternatives to electrical consumption forecast- ing because they obtain the minimum interval widths that characterize the uncertainty of this type of stochastic process. Furthermore, the information provided by the obtained prediction interval could be used to develop robust energy management systems that, for example, consider the worst-case scenario. © 2018 Elsevier Ltd. All rights reserved.
1. Introduction
Nonlinear models can provide excellent insight into complex
real-worldprocesses andsystems. Such models representthe
re-lationshipsamongvariablesandareusefulinplanningand
opera-tionalstagesaswellasintheanalysisofmeasureddata(Rencher& Schaalje,2008).Ingeneral,theaimofpredictive modelsisto ob-taina reliable representationofthe target system(Ghanbari, Ha-davandi, & Abbasian-Naghneh, 2010). In recent decades, several
methodologieshavebeenproposedtosolvenonlinearmodel
iden-tificationproblemsthatuseafinitenumberofmeasureddataand
consideranoptimalitycriterion(Škrjanc,2011).Manystudieshave
examinedmethodsforimprovingtheaccuracyoftheseapproaches
∗ Corresponding author.
E-mail addresses: luis.marin@ing.uchile.cl (L.G. Marín), nicolas.cruz@ing.uchile.cl
(N. Cruz), dsaez@ing.uchile.cl (D. Sáez), mark.sumner@nottingham.ac.uk
(M. Sumner), a.a.nunezvicencio@tudelft.nl (A. Núñez).
toobtainhigherprecisioninexpectedvalueprediction(Khodayar, Wang,&Manthouri,2018;Kroll&Schulte,2014).
Neural networks and fuzzy systems are efficient for
nonlin-ear modellingbecause they havea highfittingaccuracy for
non-linear systems (Veltman, Marín, Sáez, Gutierrez, & Nuñez, 2015; Xu,Zhang,Zhu, & He,2017). Althoughcomputational intelligence
methods exhibit adequate performance inestimation and
predic-tion,uncertaintyisnottypicallyquantifiedbythesemodelling
ap-proaches,andonlyexpectedvalueisobtained. However,
informa-tion onthedispersion ofthe outputof themodelprovides more
informationaboutthephenomenamodelledwithuncertaintyand
more useful information from a decision-making point of view
thanthemodelswithonlyexpectedvalue(Kabir,Khosravi,Hosen,
&Nahavandi,2018;Shrivastava,Lohia,&Panigrahi,2016).
Confidence intervals and prediction intervals have been
pro-posedtomodeltheuncertaintiesofasystem.Confidenceintervals
are used to capture uncertaintiesin the unknown parameters of
a model.Confidence intervalsare usually associatedwith
param-eters ratherthan withobservations. Prediction intervalsare used
https://doi.org/10.1016/j.eswa.2018.10.043
to capture uncertaintiesin random variables yet to be observed andprovideaprobabilitythattherandomvariablewillbewithina giveninterval(Dybowski&Roberts,2001;Heskes,1997;Ramezani, Bashiri,&Atkinson,2011;Rencher&Schaalje,2008).Prediction
in-tervals consider more sources of uncertainty than do confidence
intervals;theseadditionalsourcesofuncertaintyincludemodel
er-ror and noise variance. The predicted outputs are intervals that
represent(withagivencoverageprobability)themostlikelyregion
defined by theupper andlower boundsof theinterval towhich
theoutputoftheuncertainphenomenawillbelong.
Inthispaper,predictioninterval modelsare usedto represent
both nonlinear behaviour and uncertainty derived from
noncon-ventional energy sources and electrical demand. The
uncertain-tiesassociated withwind andphotovoltaic power aredue tothe
stochasticintermittencyoftheprimaryinput(windspeedand
so-larradiation),andtheuncertaintyofthedemandprofilesinenergy
communities (microgrids) isdue to minorload variations, which
cangeneratelargechangesinthetotalprofile(Parhizi,Lotfi, Kho-daei,&Bahramirad,2015).Moreover,forcontrolofmicrogrids,the
uncertaintyassociatedwithintermittentpowersourcesandloadis
typically handledusingarobust modelpredictivecontrol (Robust
MPC)intheformulationoftheenergymanagementsystem(EMS)
(Sáez,Ávila,Olivares,Cañizares,&Marín,2015).TheEMSisa
con-trol strategy that allows coordination of the energy resources to
supplythedemand,guaranteeinganeconomicandreliable
opera-tion.RobustEMShasbecomepopularbecauseitcanbeapplied
us-inguncertaintysetsratherthanprobabilisticmodels,reducingthe difficultiesrelatedtoPDFidentification(Lara,Olivares,&Cañizares, 2018).
In the work of Valencia, Collado, Sáez, and Marín (2016), a
wind-basedenergysourcewasmodelledbyafuzzyprediction
in-tervalbasedonthemethodreportedintheworkofŠkrjanc,Blažiˇc, and Agamennoni (2005), and a Robust EMS was achieved using
the convexsum ofthe lower (worstcase) andupper (best case)
boundsoftheavailablewindenergy.Inasimilarway,intheworks ofValenciaetal.(2016)andXiang,Liu,andLiu(2016),prediction
intervalmodelsofthesolarpower,windpower,andelectrical
de-mandofamicrogridweregeneratedtoformulateascenario-based
Robust EMS. InValencia etal. (2016),the combination ofall the
lower and upper bounds of the prediction intervals allowed the
various scenarios forRobustEMS to be defined, andthesolution
wasobtainedusinga second-orderconeoptimizationproblem.In
Xiangetal.(2016),scenariosweregeneratedviaTaguchi’s
orthog-onal array testing method using the prediction intervals of the
uncertain variables modelled, and the optimization problem was
solved usinga search strategy basedon an orthogonal array.The
resultsofthepreviousstudiesshowedthatamoresecureand
re-liableoperationisachievedwithRobustEMSthanwithEMS
with-out uncertainty. However, the performance of a RobustEMS
de-pendsonthequalityofthepredictionintervalmodelsoverthe
fu-ture time horizon; therefore,improved prediction interval model
designs arerequired(Marín,Valencia,& Sáez,2016;Parhizietal., 2015;Sáezetal.,2015).
Several approaches have been proposed that use neural
net-works and fuzzy systems to generate prediction interval models
(seeSection2).Inmanycases,theseapproachescarryhigh
compu-tational costs and/or requiremaking assumptions aboutthedata.
Additionally,inseveralofthereportedapproaches,theprediction
interval models are tuned only one step ahead, which could
de-creasethequalityoftheintervalintermsoftheinformationabout
theuncertaintymodelledforahigherpredictionhorizon.
Thispaperpresentsanewmethodologyfordeveloping
predic-tionintervalmodelsusinganovelcriterionthatincludesthe cover-ageprobabilityandthenormalizedaveragewidthoftheintervalas metricsfortrainingmodelsatfuturesteps.Thus,theprediction
in-tervalmodelsaimtoachievethedesiredcoverageprobabilitywith
thesharpestintervalpossible.Notethatthenarrowerthewidthof
thepredictionintervals, themore accurate theinformationabout
theuncertaintyphenomena. However,a widththatis toonarrow
mightcompromisethecoverageprobability.
Themaincontributionofthisworkisanewmodelling
method-ologyforconstructingpredictionintervalsbasedonfuzzynumbers
anditsextension tofuzzyandneural networkpredictioninterval
models.Thefuzzynumberconceptisusedbecausetheaffine
com-binationofintervalfuzzynumbersgenerates,bydefinition,
predic-tionintervalmodelsthatcanhandleuncertaintywithoutrequiring
assumptionstobemadeaboutthedataandthenoisedistribution.
Theproposedpredictionintervalisdevelopedintwostages.First,
model identificationis performed to tune the parameters
neces-saryforobtainingtheexpectedvalue.Then,thespreadsofthe
pa-rametersof the predictioninterval are found for thefuture step.
Theproposedmethodologycanbeusedtodescribealargefamily
ofuncertainnonlinearfunctions, suchastheelectrical demandin
smallcommunities.
The remainder of this paper is organized as follows:
Section 2 presents a literature review of prediction
inter-val modelling. Section 3 introduces the problem statement.
Section 4 presents prediction interval models based on interval
fuzzy numbers and the extension to fuzzy and neural network
models. Section 5describes the proposed methodfor developing
prediction interval models. Section 6 presents the results of a
benchmarktestandtwocasestudiesinvolvingloadforecastingfor
residential dwellings in the town of Loughborough, UK, and for
the isolated microgridinstalled atHuatacondoin northern Chile.
The last section provides the main conclusions and the focus of
futurework.
2. Literaturereviewforpredictionintervalmodelling
Inthe specialized literature, several methods based onneural
networksandfuzzysystemshavebeenproposedtoobtain
predic-tionintervals. Inthework ofKhosravi,Nahavandi,Creighton,and
Atiya(2011a),traditionalmethodsbasedonneuralnetworkswere
analysed,includingthedelta, Bayesian,mean-varianceestimation,
andbootstrapmethods.Theseapproachesarecomputationally
ex-pensive and/or make assumptions about noise. For instance, the
deltamethodassumesthatnoise ishomogeneous,andthe
calcu-lation ofJacobian matricesis required. The Bayesian method
as-sumes that the parameters are a random set of variables with a
distribution defined a priori, and it requires the computation of
theHessianmatrix.Themean-varianceestimationmethod
consid-ersthatmodelerrorsarenormallydistributedaroundthetrue
tar-get;therefore,themethodrequirestheknownmeanandvariances.
Thisapproachassumesthatthecorrespondingneuralnetwork
ac-curatelyestimatesthetruetargets,whichisnotalwaystrue,
lead-ingto alow coverage probability.The bootstrap methodassumes
thatahigh-precisionestimateofthetruetargetswillbeproduced
byagroupofneuralnetworks.Thismethodisthemost
computa-tionallydemandinginthedevelopmentstagebecauseseveral
neu-ralnetworksarenecessarytoestimatevariance.However,afterthe
modelsaretrainedoffline,onlinecomputationsaresimpleanddo
notrequiretheevaluationofcomplexmatricesorderivatives.
Inthe works of Škrjanc(2011) andŠkrjanc etal. (2005),two
predictioninterval methods based on type-1 fuzzysystems were
proposed.InŠkrjanc(2011),theupperandlower boundsthat
de-finetheinterval areconstructedbased ontheerrorcovarianceof
eachruleofthefuzzymodel.However,inthisapproach,regionally
dependentnoisewithanormaldistribution,withzeromeanvalue
andvariance, is an a priori assumption. In Škrjanc etal. (2005),
anoptimizationprocedurewasusedtofindthelower-and
upper-boundparametersofafuzzymodel.Thismethodforprediction
boundsare obtainedasthe result, andthe optimizationproblem
doesnotimposea desiredvalue forcoverage probabilityor
inter-valwidth. Inthework ofSáezetal.(2015),one-day-aheadfuzzy
predictioninterval models were developed,supported by the
co-variance methodderived in Škrjanc(2011). Thisprediction
inter-valmodelwasvalidatedusingrenewablegeneration (photovoltaic
andwind)anddemanddatafroman installedmicrogridin
Huat-acondo,Chile.Khosravi,Nahavandi,andCreighton(2011)usedthe
deltamethodtodevelop an adaptiveneuro-fuzzy (ANFIS)
predic-tioninterval model.TheparametersoftheANFISprediction
inter-valwere found by minimizing a nonlinear cost function that
in-cludes the coverage probability and sharpness of the interval. A
simulatedannealingalgorithm wasusedasasolutionmethodfor
theoptimizationproblem.
Khosravi,Nahavandi,Creighton,andAtiya(2011b)proposedthe
lowerupperbound estimation(LUBE)methodforbuildingneural
networkpredictionintervalsasanalternativetosolvingthe
prob-lemregardingtheassumptionsaboutthedataand/ortheintensive
computationalburden.SeveralstudieshaveusedtheLUBEmethod
to develop prediction interval models of distributed energy
re-sourcesandelectricconsumption.Forinstance,Khosraviand Naha-vandi(2013)andKhosravi,Nahavandi,andCreighton(2013)
devel-opedpredictionintervalsforforecastingwindfarmpower
genera-tion.In asimilarway,Quan,Srinivasan, andKhosravi(2014)used
bothelectrical loadandwindpower generationdatato construct
prediction interval models, and Wang, and Jia (2015) used
pho-tovoltaic powerdata to constructprediction intervalsbased ona
radialbasis function(RBF)neural network.However, inthese
ap-proaches,neuralnetworktrainingisbasedonacostfunctionthat
includesthecombinationalcoveragewidth-basedcriterion(CWC).
The problem with this criterion, as established in Shrivastava,
Khosravi, and Panigrahi (2015), Wan, Xu, Østergaard, Dong, and Wong (2014), Pinson and Tastu (2014), and Khosravi,and Naha-vandi (2014b), occurs when an extremely narrow interval width
is obtained: the entireterm (CWC) becomeszero irrespective of
theprediction interval coverage probability; therefore,the cover-ageprobabilitycanbeverylow.Additionally,inKhosravi,and Na-havandi(2014a)anintervaltype-2fuzzysystemwasproposedfor
construction of prediction intervals. The left and right points of
thetype-reducedsetweredefinedasthelowerandupperbounds
ofthepredictioninterval.However, theparameters ofthesystem
wereobtainedusingthesameCWCcriteria.
Other approachesregardingthe predictionintervals of
renew-able resources, the price of energy, and the electricity demand
havebeenreported(Hu,Hu,Yue,Zhang,&Hu,2017;Lietal.,2018; Shrivastavaetal.,2015,2016;Voyantetal.,2018).Intheworksof
Shrivastavaetal.(2016)andShrivastavaetal.(2015),
methodolo-gieswere proposed based on the support vector machine (SVM)
to generatethe prediction intervalsfor wind speed and
electric-itycosts. InShrivastavaetal.(2016),a multi-objectivedifferential
evolutionalgorithmwasusedtotunemodelparameterssuchthat
multiple opposing objectives were achieved to generate
Pareto-optimalsolutions.InShrivastavaetal.(2015),usingparticleswarm
optimization(PSO), theoptimalmodel parameters were obtained
byminimizing the interval widthwhile a desiredcoverage
prob-abilitywas achieved. In both studies, SVMs were used to
gener-atetheupper andlower boundsofthe predictioninterval.
How-ever,theupperandlowervaluesforthetrainingprocesswere
un-known;theywere artificiallygeneratedbymodifying thetraining
values within a given percentage. Hu et al.(2017) used the
ker-nelextremelearningmachine(KELM)methodtodevelop the
pre-dictioninterval forwindpower usingdatafromtwo windfarms.
The artificial bee colony algorithm wasused to find the
param-etersnecessary for the KELMmodels. The optimization was
per-formed using a cost function that included the coverage
proba-bility,the sharpness of the interval andthe average deviation of
the data from the prediction interval asmetrics. In the work of
Voyantetal.(2018),predictionintervalmodels oftheglobal
hor-izontalirradiation using regressiontreemethods were presented.
Several prediction models were tested, including classic, pruned,
bagged andboostedregressiontreeandclassic andsmart
persis-tencemodels. Severalpredictorsbased onsubsets ofthetraining
datawereusedtobuildthepredictioninterval.Then,acumulative
distribution function (CDF) was constructed based on predicted
valuesfromeachregressiontreemodeldeveloped.Intheworkof
Lietal. (2018),an improved bootstrap methodwasproposed for
constructing predictionintervals usingextreme gradientboosting
(XGB) as the basemodel. The approach wascompared with
tra-ditionalbootstrap, LUBEandSVR-2D using solarpowerdata. The
proposed methodin thisstudyachievedthe bestperformance in
termsofthequalityofthepredictioninterval.Althoughthe
predic-tion interval performedwell in previous studies,they considered
onlyshortpredictionhorizons(afewhours ahead),andforsome
applications,forinstance,forEMSsinmicrogridoperationsbased
onrecedinghorizoncontrol, ahigherpredictionhorizoncould be
necessary.
Severaltypes of Takagi–Sugeno fuzzy systems that model the
uncertaintyintheantecedent, intheconsequentorinboth parts
of the fuzzy rules have been discussed in the literature. Type-2
fuzzy sets are used to model the uncertainty in the antecedent,
and fuzzy numbers are used to model the uncertainty in the
consequents (Khosravi, Nahavandi, Creighton, & Srinivasan, 2012; Mendel, 2017). Most reported studies have demonstrated that
these kinds of systems are excellent tools for handling
uncer-tainty andhavea degreeofaccuracysuperiorto traditional
type-1 fuzzy systems. For instance, in the work of Jafarzadeh, Fadali,
andYaman(2013) type-1andinterval type-2fuzzysystems were
proposed for theprediction of solarpower. Type-2 systemswith
type-2antecedentsandcrispconsequentsprovidedthemore
pre-ciseoutput.Khosravietal.(2012)comparedneuralnetworks,
tra-ditional type-1 fuzzy systems and interval type-2 fuzzy systems
for electrical demand prediction. The results show that the
in-terval type-2 fuzzymodel showedimprovedthe prediction
accu-racy compared to the other approaches dueto its additional
de-grees offreedom. In a similar way,in the work ofKhosravi, and
Nahavandi(2014c),an interval type-2fuzzysystemwasproposed
forone-day-aheadloadprediction.Anoptimaltype reducerbased
on a neuralnetwork wasproposed to improveprediction
perfor-mancewithoutincreasingcomputationalburdencomparedto
tra-ditional type reduction. In Begian, Melek, and Mendel (2008) a
novel inference engine for a type-2 fuzzysystem waspresented.
Thisapproachusesaclosedformforinferenceinsteadofthe
type-reductionprocess.Theresultsshowedthattheproposedinference
mechanismoutperformsthetype-1fuzzysystems.
Additionally,severalstudieshavedemonstratedthat theuseof
complexfuzzysetsandlogicinintelligentsystemscanimprovethe predictionoffutureobservationsinatimeseries(Yazdanbakhsh& Dick,2018).IntheworkofChen,Aghakhani,Man,andDick(2011),
an adaptiveneuro-complexfuzzyinferentialsystem(ANCFIS)was
proposed. This system was applied to time-series prediction for
synthetic and real-world datasets. The results showed that
com-plexfuzzysetsareausefultoolinintelligentsystemsdesign.
AN-CFIS achievedgoodperformance usingamaximumofthree rules
forallexperiments;incontrast,thebestANFISnetworkused128
rulesforthesyntheticdataset.Yazdanbakhsh,andDick(2017)
pro-posedanextensionoftheANCFIStothemultivariabletime-series
prediction. The proposed approach was compared with the
re-sults showed in the work of Li, andChiang (2013) forthe
NAS-DAQ dataset. The results showed that ANCFIS has superior
per-formance regarding the complex neuro-fuzzy autoregressive
in-tegrated moving average (CNFS-ARIMA) approach. The work of
ANFIS, radial basis function network (RBFN) and complex fuzzy
logic(ANCFIS)forphotovoltaicpowerprediction.ANCFISwasmore
accuratethantheotherapproachesregardingone-step-ahead
pre-diction.
However, the previous studies have focused only on
improv-ing theprecision ofthe expectedvalue ratherthanobtaining the
prediction interval. Because these kinds of the systems can
nat-urally provide a predictioninterval, with the type-1 fuzzy set at
theconsequent, somestudies haveincludedthedispersionofthe
output in the design of the system to obtain the prediction
in-terval in an active way. For instance, Veltman et al. (2015)
de-velopedafuzzypredictioninterval modelforanelectricload
ap-plication. The fuzzy interval wasobtained by including only the
uncertainty in the parameters of the consequences. All
parame-ters(premises andconsequences)ofthe fuzzypredictioninterval
modelwerefoundusingan improvedteaching-learning-based
op-timization algorithm (ITLBO) to minimize a multi-objective cost
function. Marín et al. (2016) characterized uncertainty in wind
power and electric load using type-2 fuzzy prediction interval
models. As Mendel (2017) mentioned, this work is arguably the
first paper to use the type-reduced set in an active way (rather
than using it only as a means to obtain the expected value)
during parameter identification. In a manner similar to that
re-ported in the work of Veltman et al. (2015), all parameters of
the type-2fuzzysystemwere tuned usingsome optimality
crite-riaofthepredictioninterval, suchasthecoverage probability,
in-terval widthandpredictionerror.Consequently,manyparameters
must be tuned in these approaches using an optimization
algo-rithm.ThepredictionintervalmodelsinVeltmanetal.(2015)and
Marín etal.(2016) were validatedforpredictions upto twodays
aheadusingdataontheenergyresourcesofanisolatedmicrogrid
installedinChile.Althoughthedesiredcoverageprobabilityin pre-viousstudiesisfixedduringthetrainingprocessoftheprediction interval,thiscoverageprobabilitycoulddecreaseasthenumberof
future steps increasesbecausethesemodels are trainedone step
ahead.Therefore,predictionintervalsthatgeneratethemost
infor-mation intermsoftherelationshipbetweenthewidthofthe
in-tervalandthecoverageprobabilityoverfuturestepsarerequired.
Deep learning has achieved state-of-the-art results in
sev-eral areas,such as computer vision. However, only recentlyhave
deep learning-based algorithms become a popular solution for
time-series forecasting (Chen, Zeng, Zhou, Du, & Lu, 2018; Qiu, Zhang, Ren, Suganthan, & Amaratunga, 2014; Rodrigues, Markou, &Pereira,2018).Severaldeeplearningmethodsexistforthis
pur-pose. The mostpopular is the long short-term memory network
(LSTM), whichis a formofrecurrent neuralnetwork (RNN) with
the ability to model complex patterns in time series due to its
specialized cell architecture. LSTM networks are not affected by
the explodinggradient problem that is common in regular RNN
when trainedtopredict valuesatfuturesteps. These
characteris-ticshavemadeLSTMnetworkspopularfortime-seriesforecasting
(Bao,Yue, & Rao, 2017; Bui,Le, & Cha, 2018;Liu, Wang,Yang, & Zhang,2017).Mostreportedworksindeeplearningfortime-series forecastingaimtopredicttheexpectedvalueratherthana
predic-tion interval. Some development hasalso been madefor the
es-timation of uncertainty using deep learning models. In Gal, and
Ghahramani (2016), a Monte Carlo dropout approach for
repre-sentingmodeluncertaintywaspresented.Thesameapproachwas
usedinZhu,andLaptev(2017),whereaLSTMencoder-decoder
ar-chitecturewasusedtopredictthedailycompletedtripsprocessed
by the Uberplatform using uncertainty estimation based on the
Monte Carlo dropout ofhiddenunits.However, theseapproaches
make several assumptions on the process distribution.
Addition-ally, although LSTM networks are currently used for time-series
forecasting,sometimestheimprovementofthepredictiondoesnot
compensateforthehighercomplexityoftheLSTMnetwork.
3. Problemstatement
In this paper, prediction interval modelling based on fuzzy
numbersprovidesasystematicframeworkforrepresenting
uncer-taintyandnonlineardynamics,whichmakesitusefulfor forecast-ingtheuncertaintyassociatedwithstochasticvariables,suchas re-newableenergy-basedgenerationvariables.Next,thegeneral inter-valmodellingproblemisdetailed.
TheresultofmappinganinputvectorZ(k)ontoanonlinearreal
continuousfunctiongcanbewrittenasfollows:
y
(
k)
=g(
Z(
k)
,w)
+ε
(
k)
k=1,...,N (1)whereZ(k)={z1(k),z2(k),…,zp (k)}representstheinputvectorofall
measurementsattimek, y(k)istheoutput obtainedfromtheset
ofmeasured dataattimek, wisthetrueparameterset,and
ran-domvariable
ε
(k) isnoise.The aimofthe modelistofindarealfunction g ∈G that belongsto the modelclass G, such that g is
the best representation of the system (Shrivastava et al., 2016).
The condition for selecting the model is
y(
k)
− ˆy(
k)
≤ε
0 k=1,...,N, where yˆ
(
k)
=g(
Z(
k)
,wˆ)
is the model output attime k, ˆwaretheestimatedparametersand
ε0
isthedesirederrormodel.The error maybe dueto unknown or unobserved variables that
affectthe model output yˆ
(
k)
(Heskes, 1997; Rencher& Schaalje,2008). When the nonlinear real function g is an uncertain
func-tion,itcanbeassumedthatitisamemberofthefollowingfamily offunctions(Škrjanc,2011;Škrjancetal.,2005):
G=
g:S→R1|
g(
Z(
k)
)
=gnom(
Z(
k)
)
+g
(
Z(
k)
)
(2)where gnom represents the nominal function and
g models
theuncertaintyandsatisfiessupZ∈S
|
g
(
Z)
|
≤ c,c∈R.AccordingtoEq.(2), the function g ∈G can be used to predict a new
obser-vation, and its uncertainty based on observed data.This type of
function(g∈G)iscalledapredictioninterval model.Thegoalof predictionintervalmodellingis tofindthelower function yˆL and theupperfunctionyˆU thatsatisfy:
ˆ
yL
(
k)
≤ g(
Z(
k)
,w)
≤ ˆyU(
k)
∀
Z(
k)
∈ S (3)Inthisrespect, afunction g fromthe classG can be found in
thebanddefinedbytheupperandlowerfunctions.Theprediction
intervalsaredevelopedwithacertaincoverageprobability(1−
α
)%thatfutureobservationsoftheuncertainphenomenabelongtothe
interval definedby thelower yˆL andupperyˆU bounds (Aketal., 2013):
P
yˆL(
k)
≤ y(
k)
≤ ˆyU(
k)
≥(
1−α
)
% (4)As in the works of Veltman et al. (2015), Marín et al.
(2016),Shrivastavaetal.(2016),Khosravietal.(2011a,2011b),and
Khosravi, Nahavandi, and Creighton (2010), in this methodology, thepredictionintervalcoverageprobability (PICP)andthe
predic-tionintervalnormalizedaveragewidth(PINAW)arethemetricsto
beincorporatedintheidentificationprocessofpredictionintervals.
PICP isused toquantify thenumberof measuredvalues thatfall
within the interval definedby the model,and PINAW is used to
measurethewidthoftheinterval.
Inthispaper,newpredictionintervalmodelsbasedonthe
con-ceptoffuzzynumbersarederivedsuchthatthewidthdefinedby
theupperyˆU
(
k)
andlower yˆL(
k)
valuesof theinterval isas nar-rowaspossiblewhiletheintervalcontainsacertainpercentageof measured datay(k). This conditionimplies that,to generatepre-diction intervals, theaverage widthmeasured by PINAWmustbe
minimizedwhileconsidering acertain desiredcoverage
probabil-itymeasuredbyPICP.Inthenextsection,theproposedprediction
intervalmodelsbasedonfuzzyandneuralnetworkmodellingare
4. Predictionintervalmodelsbasedonfuzzynumbers
Inthissection,a newapproachtodevelopingprediction
inter-valsbasedonfuzzyandneuralnetworkmodelsisderived.In
gen-eral,themodels consider aset ofpinputs (z1(k) ∈Z1,…, zp (k) ∈
Zp )thatrepresenttheinputmeasurementdataattimestepk. Whenan affinelinearmodelisused,themodeloutputyˆ
(
k)
at timekisdefinedasfollows:ˆ
y
(
k)
=θ
o +θ
1z1(
k)
+· · · +θ
p zp(
k)
, (5) whereθ
i (i=0,1,…,p)are theregressioncoefficients. Inthis pa-per,toincludeuncertainty,thecoefficientsθ
i aredefinedasinter-val fuzzynumbers (Lee, 2005; Mendel, 2017). Therefore, the
pa-rametersareexpressedasafuzzysetthatdefinesafuzzyinterval forrepresentingthevalueof
θ
i .Thus,theparameters
θ
i (intervalfuzzynumbers)are character-ized by a mean(m) and spread (s). The uncertainty distributionregardingtheexpectedvalueischaracterizedusingvariousspread
values,i.e.,
θ
i =[mi − si ,mi +¯si ]. Thelower bound(
yˆL)
andupper bound(
yˆU)
that define the prediction interval are defined basedonthetheoremoftheaffinecombinationoftype-1 intervalfuzzy
numbers(seeKarnikandMendel(2001)andMendel(2017)for
de-tailsonthistheorem):
ˆ yL
(
k)
= p i =1 mi zi(
k)
+m0− p i =1|
zi(
k)
|
si (6) ˆ yU(
k)
= p i =1 mi zi(
k)
+m0+ p i =1|
zi(
k)
|
¯si (7)BasedonEqs.(6)and(7),theexpectedvalueischaracterizedby themean(mi ).Thelast terminbothequationsisassociatedwith theprediction interval, andit ischaracterized by the parameters
(
¯si ,si)
.In thisinterval modellingapproach,the parametersassociated
with spread
(
¯si ,si)
are obtained to assure the desired coverage probability(1−α
)%withthesmallestintervalwidthatthedefinedfuture prediction horizons. The proposed method for identifying
theseparameters (spreads)is describedinSection 5. Themodels
thusprovidethevaluesoftheupper
(
yˆU)
andlower(
yˆL)
bounds givenacoverageprobabilityandtheexpectedvalueyˆ(
k)
.The proposed method isused to characterizeuncertainty.
Un-certainty corresponds to the fitting errorbetween the prediction
ˆ
y
(
k)
and the actual output y(k); thus, uncertainty is defined by theinterval[yˆL ,yˆU ]to whichthepredictedvalue couldbelong. In thenextsection,bothfuzzyandneuralnetworkpredictionintervalmodelsbasedonfuzzynumbersarepresented.
4.1.Fuzzypredictionintervalmodelling
Mathematically, a fuzzysystemis definedbya setofpinputs
(z1(k)∈Z1,…,zp (k)∈Zp ),asetofrules,andanoutputyˆj
(
k)
related toeach ruleattimek.TherulesoftheTakagi–Sugenomodelsare expressedasfollows: Rj :i fz1(
k)
isF1j and· · · andzp(
k)
isF j p then (8) ˆ yj(
k)
=θ
j o +θ
1j z1(
k)
+· · · +θ
j p zp(
k)
j=1,…,M, where M is the number of rules. Let Fj
(
Z(
k)
)
= pi =1
μ
F j i(
zi
(
k)
)
be the activation degree of each rule. Then, the normalizedactivationdegreeβ
j (Z(k))isdefinedasfollows:β
j(
Z(
k))
=Fj(
Z(
k))
M
j=1Fj
(
Z(
k))
(9)
In this paper, singleton fuzzification, Gaussian membership
functions
(
Fi j)
, and the t-norm product are used to provide theoutputofthefuzzysystem:
ˆ y
(
k)
= M j=1β
j(
Z(
k))
yˆj(
k)
(10)Consideringtheproposed intervalmodellingframework,inthe
fuzzypredictionintervalmodels,theconsequenceparameters
(
θ
i j)
ofeachrule(Eq.(8))canbeconsideredasintervalfuzzynumbers with their corresponding means(
mi j)
and spreads(
¯si j ,si j)
. Thus, thelocalinterval outputforeach rule(j) iscalculatedasfollows:ˆ yL j
(
k)
= p i =1 mi j zi(
k)
+m0j − p i =1|
zi(
k)
|
si j (11) ˆ yU j(
k)
= p i =1 mi j zi(
k)
+m0j + p i =1|
zi(
k)
|
¯si j (12)Finally,theloweryˆL
(
k)
andupperyˆU(
k)
boundsarecalculated consideringtheactivationdegreesEq.(9))andthelocaloutputsof eachrule(Eqs.(11)and((12))asfollows:ˆ yL
(
k)
= M j=1β
j(
Z(
k))
yˆj L(
k)
(13) ˆ yU(
k)
= M j=1β
j(
Z(
k))
yˆj U(
k)
(14)Inthispaper,afuzzyclusteringmethodisconsideredfor
defin-ingtherulenumbersandtheparameters(centreandstandard
de-viation) of the Gaussian membership functions
(
Fi j)
. The means(
mij)
of the consequences are estimated by the minimumleast-squaresoptimizationmethod(Babuška,1998).Themethodfor
tun-ingthespreads
(
¯si j ,si j)
isexplainedinSection5.Thelowerandupperboundsofthefuzzypredictionmodelfor
forecastingtheoutputoffuturestepsaredefinedasfollows:
ˆ yL
(
k+h)
= ff uzzy(
Z(
k+h)
,β
j(
Z(
k+h))
,mj i ,si j(
k+h))
∀
h=1,...,Np (15) ˆ yU(
k+h)
= ff uzzy(
Z(
k+h)
,β
j(
Z(
k+h))
,mi j,¯si j(
k+h))
∀
h=1,...,Np (16)wherej=1,…,Mistherulenumber,i=1,…,pistheinputnumber, andNp isthepredictionhorizon.Notethat theparameterssi j
(
k+h
)
and ¯si j(
k+h)
are thespreads tuned h˜ steps ahead,where ˜h∈{
1,...,Np}
, using experimental data with certain coverageprob-ability atthe future steps. After the tuning process is completed
andthepredictionintervalisobtained,theseparametersare held
constant throughhorizonprediction,i.e., si j
(
k+h)
=si j(
k+h˜)
and ¯si j(
k+h)
=¯si j(
k+h˜)
forh=1,…,Np .Moredetails aboutthetuning methodareprovidedinSection5.4.2. Neuralnetworkpredictionintervalmodelling
Mathematically,aneuralnetworksystemisdefinedbyasetof
pinputs(z1(k)∈Z1,…,zp (k)∈Zp ),asetofweights(w)andbiases
(b)perlayer,andanactivationfunctionperlayer.Iftheneural
net-workusesa hyperbolictangentactivationfunction forthehidden
layerandalinearactivationfunctionfortheoutputlayer,the
out-putoftheneuralnetworkattimekisdefinedasfollows:
ˆ yl
(
k)
= L j=1 w0 j,l tanh p i =1 wh j,i zi(
k)
+bh j +b0 l (17)j=1,…,L,whereListhenumberofhiddenlayerunitsandlisthe
number ofoutput units; in thispaper, l=1. The hiddenweights,
hiddenbias,output weightsandoutput biasare wh j,i ,bh j , w0
j,l and
b0
l respectively. Theneural network inEq.(17)can be written as
follows: ˆ y
(
k)
= L j=1 w0 j Z˜j(
k)
+b0 (18) where: ˜ Zj(
k)
=tanh p i =1 whj,i zi(
k)
+bhj (19)Inthispaper,Bayesianregularizationisusedtotraintheneural
network. Bayesian regularization consistsofa paradigm designed
tominimizeoverfittingofneuralnetworks.Themethodprovidesa
Bayesian criterion forterminatingtraining, thus generatingbetter resultsforthetestdataset(Gençay&Qi,2001).
Inthisapproach, theneuralnetwork predictioninterval is
de-veloped such that the output weights
(
w0j
)
are considered inter-valfuzzynumberswiththeir means(mj )andspreads(
sj ,¯sj)
.Thelower and upperbounds ofthe prediction interval can be
calcu-latedasfollows: ˆ yL
(
k)
= L j=1 mj Z˜j(
k)
+b0− L j=1Z˜j
(
k)
sj (20) ˆ yU
(
k)
= L j=1 mj Z˜j(
k)
+b0+ L j=1Z˜j
(
k)
¯sj (21)
Theneuralnetworkcanbedefinedasaneuralnetworkwhose
outputsaretheupperandlowerboundsandthetargetprediction.
Asfuzzypredictionintervalmodels,neuralnetworkprediction
in-terval models are used to forecast the output of future steps as
follows: ˆ yL
(
k+h)
=fNN(
Z˜j(
k+h)
,mj ,sj(
k+h))
∀
h=1,...,Np (22) ˆ yU(
k+h)
=fNN(
Z˜j(
k+h)
,mj,¯sj(
k+h))
∀
h=1,...,Np (23)wherej=1,…,ListhenumberofhiddenlayerunitsandNp isthe predictionhorizon. Note thatthe parameters sj
(
k+h)
and ¯sj(
k+ h)
arethespreadstunedh˜steps ahead,where˜h∈{
1,...,Np}
,us-ing experimental data with a certain coverage probability at the
future steps. After the tuning process is completed andthe
pre-diction interval is obtained, these parameters are held constant
throughhorizonprediction.
Next,themethodforidentifyingtheparameters ofthe
predic-tion interval based on fuzzy systems andneural networks is
ex-plained.
5. Proposedmethodfordevelopingpredictionintervalsbased onfuzzysystemsandneuralnetworks
Theidentificationprocedureforderivingthepredictioninterval
models isshownin Fig.1.The first partofthisprocedure
corre-sponds tothe identificationmethod ofthefuzzy andneural
net-workmodelsforobtainingtheexpectedvalue,andthesecondpart
isthemethodforpredictioninterval parameter(spreads)
identifi-cation.
Regarding model identification (Fig. 1), the first step involves datacollectionfortraining,validationandtesting;sufficient
infor-mation iscollectedto representthevarious operationalpointsof
theprocesstobe modelled.Thetrainingdatasetisusedtoobtain
Fig. 1. Methodology for developing prediction intervals.
themodelparameters. The validationdatasetis notdirectly used
inthe trainingprocess; however,it allows the model
generaliza-tioncapacitygivenbythemodelbehaviourtobeevaluatedunder
anewdataset.Finally,thetestdatasetisusedtoevaluatethe
per-formanceoftheobtainedmodel.
Inthisprocedure,astructuraloptimizationismade.The
struc-turaloptimizationoffuzzyandneuralnetworkmodelsconsistsof
proposingseveralstructures.Specifically,severalfuzzymodelsare
obtainedwhenthenumberofclusters(rules)ismodified,and
sev-eralneuralnetworkmodelsareobtainedbymodifyingthehidden
neuronnumber.Then,relevantinputvariablesareselectedvia sen-sitivityanalysis,andastructuraloptimizationismade.Finally,the
parameters necessaryforobtaining theexpectedvalue are
calcu-latedusingtherelevantinputvariables,theoptimalstructureand
thetrainingdataset.As proposedin Sáez,andZuñiga(2004),the
best structure is defined when the validation error is either
in-creasedorstabilized incomparisonwiththe trainingerror when
thestructureofthemodelincreasesincomplexity.
Forthefuzzymodels,theGustafson–Kesselclusteringalgorithm
isusedtoobtainthepremiseparameters,andtheconsequence
pa-rameters are estimated by the minimum least-squares
optimiza-tionmethod.Bayesianregularization isusedtoobtainthe
param-eters of the neural network models. Finally, the model is
evalu-atedusingatestdatasettoverifymodelperformance.Then,ifthe
performanceofthemodelisnot suitable,themodelidentification
procedureinpreviousstepsmustbereviewed;otherwise,this
pro-cedureiscompleted(Sáez&Zuñiga,2004).
Afterthe modelidentificationprocedure,theparameters
asso-ciated with providing the expected value are obtained. In fuzzy
models, the standard deviationand centre ofthe Gaussian
func-tions
(
Fi j)
ofthefuzzymodelare found,wherep(i=0,1,…,p) are therelevantinputsidentified andM(j=1,…,M)istherulesnum-ber.These parametersare necessaryforobtainingthe normalized
activation degree (
β
j (Z(k))) of the premises. The identified con-sequenceparameters(
θ
ij)
(see Eq.(8)) areassigned tothe mean values(
mi j =θ
i j)
requiredinEqs.(11) and(12),andtheexpected value(Eq.(10))canbeobtained.Regarding neural network models, hidden weights
(
wh j,i)
and hidden biases(
bh j)
are found. With these parameters, the term ˜Zj
(
k)
inEq.(19)iscalculated,wherep(i=1,…,p)aretherelevant inputsidentified andL (j=1,…,L) is the number of hidden layerunits.Additionally,theoutput weights
(
w0j
)
andoutput bias(b0)are identified. Finally,the output weights are used toobtain the
expectedvalue,wheremj =w0
j (used inEqs.(20)and(21)).After
themodel identificationstage, thespreads of the parameters for
developingthepredictioninterval atfuturestepsmust be identi-fied(seeFig.1).Thismethodisdescribedinthefollowingsection.
5.1.Parametersidentificationforpredictionintervals
This identification method stage obtains the parameters
(spreads) of the prediction interval models such that the upper
andlowervaluesoftheintervalareasnarrowaspossibleandthe
intervalcontainsacertain percentageofmeasured data.The
pre-dictioninterval modelsderived inthispapercan include
endoge-nousy(k) and exogenous variables u(k), where Z(k)=[y(k− 1),…,
y(k− q1),u(k− 1),u(k− 2),…,u(k− q2)]T is the vector of regressors
associatedwiththe outputandinput variables. Then,the
predic-tioninterval is a function of thereal and/or prediction data, de-pendingonthenumberoffuturesteps (Sáezetal., 2015). Inthis paper,thespreadsfordevelopingthepredictionintervalaretuned
accordingtotherequiredstepsahead.Then, basedonthe
formu-lationdescribed inthe previous sectionsfor developingthe
pre-dictioninterval modelsandthemetrics forevaluatingthe
perfor-manceofthe predictioninterval, the spreadidentification
proce-dureconsistsofthesolutiontothefollowingoptimizationproblem
(24):
min
s (k +˜h ), ¯s(k +h ˜)PINAW
st.PICP=1−
α
(24)whereh˜∈
{
1,...,Np}
is thenumber of steps ahead and(1−α
)% isthedesiredcoverageprobability.Thepredictionintervalnormal-ized averagewidth (PINAW) andthe prediction interval coverage
probability(PICP)forNp stepsaheadaredefinedasfollows:
PICP=N1 N k =1
δ
k +h × 100% (25) PINAW = 1 N· R N k =1 ˆ yU(
k+h)
− ˆyL(
k+h)
× 100%∀
h=1,...,Np (26) whereδ
k +h =1 if y(
k+h)
∈[yˆL(
k+h)
,yˆU(
k+h)
]; otherwise,δ
k +h =0.Theparameters(
s(
k+h˜)
,¯s(
k+h˜))
arethedecisionvari-ables in the optimization problem, and the dimensionalities of
theseparametersdependonthemodelselected.Forfuzzymodels,
2pM parameters that correspond to the spreads
(
si j(
k+˜h)
,¯sj i(
k+ ˜h
))
and 2Lparameters for theneural network model thatcorre-sponds to the spreads
(
sj(
k+h˜)
,¯sj(
k+h˜))
should be identified. Then,the parameters(
s(
k+˜h)
,¯s(
k+h˜))
(Eq.(24)) mustbe com-putedsuchthati)PICPisgreaterthanorequaltothedesired cov-erageprobability (1−α
)%andii)PINAWisassmallaspossibleat futuresteps.TheequalityconstraintPICP=(1−α
)%inEq.(24)isa hardconstraintandisthereforeincludedintheoptimization prob-lem as a barrier function to relax this constraint. Therefore, thesolutionoftheminimization problem(24)iscomputedfollowing
theprocedurefortheunconstrainedminimizationproblem:
min
s (k +˜h ), ¯s(k +h ˜) J=
η
1PINAW+exp−η2(P ICP −(1−α)) (27)
In(27),
η1
isaweightingfactorandη2
isapenaltyfactor.These parametersarechosen such that,ifPICPislessthan(1−α
)%,the termexp−η2 (PICP −(1−α))isthedominantterminthecostfunction;otherwise,PINAWisdominant.Finally,thesolutiontothe
nonlin-ear optimizationproblem(27) iscomputed usingparticle swarm
optimization(PSO),asoutlinedinthenextsection.
5.2. Solutionmethod
TosolvethenonlinearoptimizationprobleminEq.(27),
tradi-tionalalgorithms,such asgradientdescentmethods, arenot
ade-quate.Thesemethods entailarisk offallingintoalocaloptimum
whensolving non-convexoptimizationproblems.Therefore,other
optimizationmethodsareneeded(Quanetal.,2014).Inthispaper,
PSOisusedtosolvetheproblembecauseitgenerallyoutperforms
other algorithms interms ofsuccessrateandsolutionquality,as
reportedinthework ofElbeltagi,Hegazy,andGrierson(2005).In PSO,thegeneratedsolutionsarecalledparticles,andeachparticle hasapositionvectorwithanassociatedvelocityvector(Tran, Wu & Nguyen, 2013). The first step in the algorithm consists of the initializationofparticlepositionsxi,j andvelocitiesvi,j forthej-th dimensionofthei-thparticle.Inthispaper,theparticlepositions are all the spread parameters
(
s(
k+h˜)
,¯s(
k+h˜))
required to de-velopthepredictionintervalmodel,asexplainedinSection4.Thevelocityvi,j andpositionxi,j inthej-th dimensionofevery
i-thparticleareupdatedaccordingtothefollowingrelations:
vi, j
(
t+1)
=Wvi, j(
t)
+c1rand()(
Pbesti, j(
t)
− xi, j(
t))
+c2rand()(
gbestj(
t)
− xi, j(
t))
xi, j
(
t+1)
=xi, j(
t)
+vi, j(
t+1)
(28)i=1,2,…,NPwhereNPisthenumberofparticles,andj=1,2,…,N0
isthetotalnumberofparameterstobeidentified,whichdepends
onthe type ofmodelused(fuzzy orneural). Wisan inertia
fac-tor,Pbestisthebestprevioussolutionoftheparticle,andgbestis
thebestsolution oftheswarm upto thecurrentstep.Theterms
c1 andc2arecalledthecognitiveandsocialaccelerationconstants,
andrand()isarandomnumberbetween0and1.Thetraining
ter-minationcriterionissetwhenaminimumerrororadefined
max-imumnumberofiterationsisachieved.Oncethetrainingprocess
terminates, the gbest value is chosen as thespread parameter to
generatethepredictionintervalmodel.
PINAWand PICP areused asmetrics for theevaluation ofthe
quality of the interval. Additionally, the root mean square error
(RMSE) and the mean absolute error (MAE) are included as
per-formanceindicestoevaluatetheaccuracyofthepredictionmodel
associated with the expectedvalue. All indices are evaluated for
several prediction horizons with the test dataset. In this paper,
thepredictionintervalmodelsbasedonfuzzysystemsandneural
networksare used to representthe nonlinear behaviour and
un-certainty derivedfromelectricitydemand;however,theproposed
methodologycanbeusedtodescribealargefamilyofuncertainty
nonlinearfunctions.
6. Results
Acomparativeanalysisbetweentheproposed prediction
inter-val models based oninterval fuzzynumbers (PI-IFN)and
covari-ance predictionintervalmodels ispresentedfollowingthe
defini-tionpresentedinRencherandSchaalje(2008)andŠkrjanc(2011).
Thepredictionintervalbasedonthecovarianceestablishesthe in-tervalbasedontheerrorbetweentheobserveddatay(k) andthe modeloutput yˆ
(
k)
. Thismethodisbased ontheassumption thatthenoiseisnormallydistributedwithazeromeanvalueand
vari-ance
σ
2 thatisexpressedase=N(0,σ
2)(Škrjanc,2011).As indicated in Eq. (18), the neural network model is a
linear model of the parameters. Therefore, the prediction
in-terval based on the covariance method can be developed
us-ing Eqs. (29) and (30), following the definition presented in
RencherandSchaalje(2008):
ˆ yU =Z˜∗TW0+b0+t α
σ
e 1+Z˜∗TZ˜T Z˜−1Z˜∗1/ 2 (29)
100 150 200 250 300 350 400 450 500 step k -6 -4 -2 0 2 4 6 y(k) y(k) noise(k) u(k)
Fig. 2. Modified Chen series for 400 training data .
ˆ yL =Z˜∗TW0+b0− t α
σ
e 1+Z˜∗TZ˜T Z˜−1Z˜∗1/ 2 (30)
where
σ
e isthevarianceoftheerror,tαistheparameterrelatedto theintervalwidth,Z˜∗isthenewdatumusedtopredictthefuture observationandZ˜isthematrixthatconsidersalldatausedinthetrainingprocessinwhichtheoutputweightsandoutputbiaswere
determined.
Finally,thefuzzypredictionintervalmodelbasedoncovariance
proposed inŠkrjanc(2011)isusedtoobtainthe upperandlower
boundsofthelocallinearmodelasfollows:
ˆ yU j =
ψ
∗T jθ
j +tασ
j 1+ψ
∗T jψ
jψ
T j −1ψ
∗ j1/ 2 j=1,...,M (31) ˆ yL j =
ψ
∗T jθ
j− tασ
j 1+ψ
∗T j(
ψ
jψ
T j)
−1ψ
∗j1/ 2 j=1,...,M (32)
j=1,…,M, where M is the total number of rules composing the
fuzzy system,
σ
j is the local variance of the error, andψ
T j =β
j(
Z)
[1ZT ]isthematrixthat considersallthevaluesusedinthe trainingprocess.Forallthemodels(fuzzyandneural),tα istunedusing exper-imental datato achievethe desiredcoverageprobability (1−
α
)%, asexplainedin Sáezetal.(2015).Inthe next section,theresultsofabenchmarkandtheloadforecastwiththeproposedprediction
intervalmodelsarepresented.
6.1. Benchmark
In this paper, the original Chen series in Chen, Billings, and
Grant (1990)is modifiedand usedto evaluate the prediction in-tervalmodels:
y
(
k)
=(
0.8− 0.5exp(
−y2(
k− 1)))
y(
k− 1)
−
(
0.3+0.9exp(
−y2(
k− 1)))
y(
k− 2)
+u(
k− 1)
+0.2u
(
k− 2)
+0.1u(
k− 1)
u(
k− 2)
+e(
k)
(33)wherethesystemnoisee(k)=0.5exp(− y2(k− 1))
γ
(k)dependsonthepreviousstateoftheoutputmodeland
γ
(k)iswhitenoise.The system inputu(k) is band-limitedGaussian white noise.Thesys-temis simulated,and10,000datapointsare generated.Thedata
aredividedintotraining,validationandtestingsetsaccountingfor
55%, 25% and 20% of the total dataset,respectively. Fig. 2 shows
theinput,outputandnoiseofthemodifiedChenseriessimulation
for400 trainingdata. Asshown inFig. 2,the noise level ishigh whentheoutputy(k)isclosetozero.
The regressors u(k− 1), u(k− 2), y(k− 1) and y(k− 2) are de-finedastheinputsforderivingthepredictionintervalmodels.
Re-garding the structure of the fuzzymodel, five rules are defined,
whereas eight hiddenlayer units are definedfor the neural
net-workmodel.Withthesestructuresdefined,theparameters
associ-atedwithprovidingtheexpectedvalue areobtainedasexplained
in Section 5. The PSO algorithm is used to identify the spread parameters for generatingthe prediction interval at futuresteps. Thedesiredcoverage probability (1−
α
)=90%, theweighting fac-torη1
=250andthepenaltyfactorη2
=150inEq.(27)aredefined. Aparticle sizeof 50andthe parameters c1=2.5 andc2=1.5areused.Finally, Wrunsfrom 0.9to 0.3duringoffline optimization.
ThenumberofiterationsforPSOissetto5000,theoptimizations areexecuted severaltimes,andthe bestsolution isselected.The cost function value (J) in Eq.(27) andthe developed metrics are
reportedinTable1 forthetest datasetusingvarious numbersof
stepsahead.
AsshowninTable1,thefuzzyandneuralnetworkmodels
pro-videcostfunctionvalues(J)lowerthanthoseofthelinearmodel,
whichisconsistentwiththenonlinearbenchmarkstructure.These
resultsareexpectedbecauseoftheabilityofthefuzzyandneural
network models to better fit the dynamics andnonlinearities of
thesystems,whicharemorenotableforlongerfuture-step
predic-tions.Additionally,itcanbeobservedthatthecostfunctionofthe
proposed method (Eq. (27)) is lower than that of the covariance
methodforallmodels(i.e.,linear,fuzzyandneuralnetwork).The
RMSE andMAE values are equal in the proposed and covariance
methodsbecausetheidentificationmethodisthe same.However,
the predictionerror increasesfor a largerhorizon prediction be-causetheaccumulativeerrorofthemodelislargerwhenthesteps ofthehorizonincrease,asshowninTable1.
Furthermore,it canbe observedthat thePICP termiscloseto
90%becausetheintervalmodelsaretrainedtomaintainPICPnear
thedesiredvalueforvariousstepsahead.Intermsofprediction
in-tervals,theproposedmethod(PI-IFN)provides narrowerintervals
for all step-ahead forecasts. While the covariancemethod
main-tains a constant width for the interval (see Figs. 3(a), 4(a) and
5(a)),theproposed methodachievesanarrowerinterval instates
withlittlenoiseandanintervalwithawidthsimilartothatofthe
covariancemethodinstateswithhighnoise.
Importantly,theinformationleveldeliveredbyaprediction in-tervalisdirectlyrelatedtoits width(Marínetal., 2016;Xuetal., 2017);thus,theproposedmethodyieldsabetterinformationlevel
regardingthecovariancemethod(smallerwidths).Widerintervals
couldproduceahigherPICP,buttheseintervalsprovidelessuseful
informationabouttheuncertaintyofthemodelledphenomena.In
thisrespect,theneuralnetworkmodelsexhibitsharperprediction
intervalsthanthelinearandfuzzymodels.
Figs.3–5showsixteen-step-aheadforecastsofthelinear,fuzzy
andneural network predictioninterval models. The figures show
thatnearlyall thedataare includedintheinterval;onlythe out-liersofthetime seriesare left outsidetheregion constructedby thepredictioninterval.Additionally,theintervalsproducedbythe
proposedmethod(PI-IFN)arenarrowerthanthoseobtainedbythe
methodusedforcomparison,asshowninthefigures.
6.2.Applicationforloadforecasting
Inthissection, twocasestudies involvingthe implementation
ofthepredictionintervalmodelsarepresentedtoaddressthe un-certaintyassociatedwithaload.Thefirstcaseisfromtheisolated
microgridinthevillageofHuatacondoinChile,andthesecond is
from20 residential dwellings in the town of Loughborough, UK.
Table 1
Performance indices.
Prediction horizon Performance indices Linear models Fuzzy models Neural models Covariance PI-IFN Covariance PI-IFN Covariance PI-IFN
One step ahead J 52.23 32.62 26.26 23.60 20.93 18.20
RMSE 0.4312 0.4312 0.3517 0.3517 0.2372 0.2372
MAE 0.2883 0.2883 0.2253 0.2253 0.1241 0.1241
PINAW (%) 20.89 12.40 10.09 9.37 7.90 6.85
PICP (%) 98.15 89.68 89.98 91.18 89.89 89.95
Four steps ahead J 57.71 47.47 40.27 35.56 36.58 24.36
RMSE 0.6124 0.6124 0.5266 0.5266 0.4076 0.4076
MAE 0.5010 0.5010 0.3765 0.3765 0.2446 0.2446
PINAW (%) 23.05 17.35 15.14 13.73 14.18 9.18
PICP (%) 91.62 89.06 89.41 89.86 89.92 89.77
Eight steps ahead J 57.98 49.32 43.39 39.48 39.94 32.62
RMSE 0.6761 0.6761 0.5621 0.5621 0.5105 0.5105
MAE 0.4647 0.4647 0.4015 0.4015 0.3127 0.3127
PINAW (%) 23.19 18.16 17.08 15.29 15.13 11.72
PICP (%) 93.26 89.09 90.25 89.85 89.50 89.20
Sixteen steps ahead J 58.16 50.75 45.21 41.40 45.93 36.15
RMSE 0.6814 0.6814 0.5839 0.5839 0.5937 0.5937
MAE 0.4717 0.4717 0.4167 0.4167 0.3685 0.3685
PINAW (%) 23.26 18.97 17.85 15.98 18.20 14.10
PICP (%) 93.19 89.20 90.36 89.75 90.57 90.07
Fig. 3. Sixteen-step-ahead linear prediction interval model: (a) covariance and (b) proposed methods .
on the concept of interval fuzzy numbers (PI-IFN) were used to
developthe loadforecastingmodelssupported bythe results
ob-tainedwiththebenchmark.
6.2.1. Huatacondomicrogrid
In thissection, the proposed prediction interval model based
ontheconceptofintervalfuzzynumbers(PI-IFN)isidentified us-ingdatafromanisolatedmicrogridinthevillageofHuatacondoin
theAtacama Desert,Chile.Thedatasetused correspondsto a
pe-riod of147 days, spanningNovember 1st of2012 to March27th
of2013,withasamplingtimeof15min,andthepeakpowerload
is27.54kW. Thetrainingdataset correspondsto theperiod
span-ningNovember1stof2012toJanuary19thof2013,thevalidation
datasetcorrespondstotheperiodspanningJanuary20thto
Febru-ary26thof 2013, andthe test datasetcorresponds to the period
spanningFebruary27thtoMarch27thof2013.
Based on the obtaineddata andusing the identification
pro-cedure describedin Section 5, an optimalstructure consisting of
threerulesandnineregressorsisobtainedforthefuzzymodel: ˆ
pL(k) = ff uzzy(pL(k− 1 ),pL(k− 2 ),pL(k− 3 ),pL(k− 4 ),pL(k− 92 ),...
pL(k− 93 ),pL(k− 95 ),pL(k− 96 ),pL(k− 100 )) (34)
Similarly,eightneuronsinthehiddenlayerandtenregressors
are obtained for the neural network model asan optimal
struc-ture: ˆ
pL(k) = fNN(pL(k− 1),pL(k− 2),pL(k− 3),pL(k− 4),pL(k− 92),...
pL(k− 93),pL(k− 95),pL(k− 96),pL(k− 97),pL(k− 100)) (35)
Notethat exogenous variables are not included inthe models
used to represent the behaviour of the load. During the model
identification stage, a prediction horizonof Np =1 is considered, asexplainedinSection5.However,thespreadparametersfor
gen-eratingthepredictionintervalmodels areidentifiedusingPSOby
consideringvarioussteps aheadwithadesiredcoverage