
Selection of Prototypes with the EkP System

by

Karol Grudziński

Institute of Physics
Kazimierz Wielki University
Bydgoszcz, Poland
e-mail: grudzinski.k@gmail.com

Abstract: A completely new system for the selection of reference instances, called EkP (Exactly k Prototypes), has been introduced by us recently. In this paper we study the suitability of the EkP method for training data reduction on seventeen datasets. As the underlying classifier the well known IB1 system (1-Nearest Neighbor classifier) has been chosen. We compare the generalization ability of our method to the performance of IB1 trained on the entire training data and to the performance of LVQ, for which the same number of codebooks has been chosen as the number of prototypes selected by the EkP system. The results indicate that, even with only a few prototypes chosen by the EkP method, results statistically indistinguishable from those attained with IB1 have been obtained on nearly all seventeen datasets. On many datasets the generalization ability of the EkP system has been higher than the one attained with LVQ.

1. Introduction

Data mining is commonly employed in many domains. A case-based way of data explanation is very popular among researchers. Such an approach to knowledge discovery and understanding is particularly often employed in medicine, where a medical doctor makes a diagnosis by referring to other similar cases in a database of patients.

Interesting instance vectors, known as reference cases, can either be selected from the training data or can be generated out of a training set. In the latter case the instances' features have in general different values than the ones stored in the original training set. Both techniques (i.e. instance selection and prototype generation) often lead to a significant training set size reduction.

This paper concerns the first above-mentioned problem, i.e. 'instance selection' ('training data compression, reduction or pruning'). The idea behind this machine learning paradigm is that only a small fraction of a usually much larger training set is sufficient for learning (Maloof M., Michalski R., 2000; Martinez T., Wilson D., 1997, 2000; Grochowski M., 2003; Grochowski M., Jankowski N., 2004; Duch W., Grudzinski K., 2000; Grudzinski K., 2004, 2008).

Prototype selection is an extremely important problem which has been frequently studied by machine learning and pattern recognition researchers. Selection of reference instances can significantly speed up subsequent classification and analysis of the data, usually leads to better data understanding and may lower the sensitivity of some classifiers to noise. Strong training set reduction may sometimes result in a statistically significant degradation of the classification accuracy attained on unseen samples; however, as many experiments illustrate, it is often the other way around, i.e. data pruning improves the generalization ability of classifiers. Samples selected with the EkP system can be used, for example, to build prototype-based rules, which have been introduced by Duch et al. (Duch W., Grudzinski K., 2001; Blachnik M., Duch W., 2004) and which are a very interesting alternative to classic logical rules.

The acronym EkP is short for Exactly-k-Prototypes. We want to stress here that our new system differs completely from our earlier model, PM-M (Grudzinski K., 2004).

2. Methodologies for Reference Instance Selection

Before we proceed to the presentation of the EkP system and the results obtained with this method, a very concise review of some of the known techniques employed in the selection of reference cases is provided. This presentation draws heavily on the excellent work of Grochowski contained in his M.Sc. thesis (Grochowski M., 2003).

2.1. Problem Formulation

The problem of selection of the reference instances can be defined as a process of finding the smallest set S of cases representing the same population as the original training set T and leading to correct classification of the samples not only from T but, more importantly, of the unseen cases, with minimal degradation of the generalization ability of the underlying classifier. In other words, reference selection is a method for selection or generation of the most informative samples from T and rejection of the noisy cases, or of those instances that degrade the generalization when the original training set T is used for learning. Thus, restricting ourselves to prototype selection, by which we understand selection of reference cases in which S is a subset of T, the problem is to find the optimal subset S, out of all possible 2^n - 1 subsets, with respect to the generalization ability of the underlying classifier. By n the number of samples of the original training set T is denoted.
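To give a feeling for the size of this search space, the following minimal sketch (ours, not part of any method discussed in this paper) scores a single candidate subset S with the 1-Nearest-Neighbor rule on T; repeating such an evaluation for all 2^n - 1 subsets is clearly infeasible for realistic n, which is why the heuristic approaches reviewed below are used in practice.

import java.util.List;

// Illustration only: score one candidate prototype subset S by classifying
// every sample of T with a 1-Nearest-Neighbor rule that uses only the
// instances kept in S.
public class SubsetEvaluation {

    // Euclidean distance between two feature vectors of equal length.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Label assigned by 1-NN using only the prototypes in S.
    static int classify(double[] x, List<double[]> sFeatures, List<Integer> sLabels) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < sFeatures.size(); i++) {
            double d = distance(x, sFeatures.get(i));
            if (d < bestDist) {
                bestDist = d;
                best = sLabels.get(i);
            }
        }
        return best;
    }

    // Number of errors made on the whole training set T when only S is kept.
    static int errorsOnT(List<double[]> tFeatures, List<Integer> tLabels,
                         List<double[]> sFeatures, List<Integer> sLabels) {
        int errors = 0;
        for (int i = 0; i < tFeatures.size(); i++) {
            if (classify(tFeatures.get(i), sFeatures, sLabels) != tLabels.get(i)) {
                errors++;
            }
        }
        return errors;
    }
}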

The reference vector selection algorithms can be divided into a few categories.

2.1.1. Noise Filters

This category of methods, also known as editing rules, is based on rejecting noisy cases or outliers from T. The rate of data pruning is usually low and these techniques are usually employed as the first data preprocessing step, which is then followed by other methods. ENN, RENN (Wilson D., 1972), All k-NN (Tomek I., 1976) and ENRBF (Jankowski N., 2000) are the key examples of the algorithms that belong to this group.

2.1.2. Data Condensation Algorithms

This group of methods is also known as data pruning or data compression techniques. The main idea behind this approach is to achieve the highest possible training data reduction without, or with a minimum, sacrifice of the generalization of the employed underlying classifiers. CNN (Hart P., 1968), RNN (Gates G., 1972), GA, RNGE (Bhattacharya B.K., Poulsen R.S., Toussaint G.T., 1981), ICF (Brighton H., Mellish C., 2002) and DROP1-5 (Martinez T., Wilson D., 2000) are the main systems that fall into this category.

2.1.3. Prototype Methods

The family of reference selection algorithms aimed at finding an extremely low number of highly informative super-vectors, carrying a particularly large amount of information and capable of representing a large number of cases, is known as prototype methods. Although the difference between data condensation algorithms and prototype methods is very subtle, in our understanding prototype selection and generation algorithms push the reduction of the training data to the extreme, sometimes taking the risk of a slightly larger degradation of the generalization of the underlying classifiers. Thus, although both groups of methods try to arrive at the smallest set S, the stress in data pruning techniques is put on generalization, whilst in the case of prototype algorithms it is on the extremely low number of samples that are selected. It should not be surprising that some of the algorithms, particularly those in which one has control over the number of samples selected, may be treated either as data pruning methods or as prototype selection models. LVQ (Kaski S., Kohonen T., Oja M., 2003), MC1 and RMHC (Skalak D., 1994), IB3 (Aha D., Albert M., Kibler D., 1991), ELH, ELGrow and Explore (Cameron-Jones R., 1995) and our own models PM-M (Grudzinski K., 2004) and EkP (Grudzinski K., 2008) can be included in the prototype selection group of methods.

3. The EkP System

The EkP system is based on a minimization of a cost function which returns the number of errors the classifier makes. Despite this, the EkP method is fast: in each cost function evaluation the training set is constructed out of only the preset number of k instances. It takes seconds for the EkP method to perform 10-fold cross-validation on most common UCI datasets. In our implementation we used the well known simplex method (Nelder J., Mead R., 1965) for function minimization, which we have taken from the Internet (Lampton M., 2004).
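In symbols (the notation below is ours and does not appear in the original text), the optimization performed by the simplex routine can be summarized as follows:

% Our notation, introduced only for illustration. P collects the
% numProtoPerClass * numClasses candidate prototypes (numAttributes values
% each) stored in the optimization parameter vector, and f_P denotes the
% underlying classifier (IB1) trained on these prototypes alone.
\[
  P^{*} \;=\; \arg\min_{P \in \mathbb{R}^{m}} E(P),
  \qquad
  E(P) \;=\; \bigl|\{ (\mathbf{x}, y) \in D : f_{P}(\mathbf{x}) \neq y \}\bigr|,
\]
\[
  m \;=\; \text{numProtoPerClass} \cdot \text{numClasses} \cdot \text{numAttributes},
\]
where D is either an internal cross-validation test partition or the entire training partition, depending on the cost function variant described below.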

The simplex must be initialized before the minimization procedure is started. The EkP system is very sensitive to the way in which the simplex is initialized and therefore we have decided to provide the EkP's initialization algorithm, which is given below. We have found the inclusion of this pseudocode very important for the replication of this method.

Algorithm 1 The EkP's simplex initialization algorithm
Require: A training set trainInstances
Require: A vector p[] of optimization parameters (numProtoPerClass * numClasses * numAttributes dimensional)
Require: A matrix simplex to construct a simplex
Let numPoints denote the number of points to build the simplex on
for i = 0 to numPoints - 1 do
  for j = 0 to numClasses * numProtoPerClass - 1 do
    for k = 0 to numAttributes - 1 do
      p[k + numAttributes * j] := trainInstances[i][k]
    end for
  end for
  simplex[i][numAttributes] := costFunction(p[])
end for
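One plausible reading of Algorithm 1 in plain Java is sketched below; it is our own reconstruction under assumptions about the data layout, not the author's Weka plugin code. Here the cost is evaluated once per simplex point and stored, together with the parameters, in the last column of the simplex row.

// A sketch of the simplex initialization described by Algorithm 1 (our own
// assumptions about the data layout; not the author's Weka plugin code).
// Each simplex point i copies the attribute values of training instance i
// into every prototype slot of the parameter vector p.
public class SimplexInit {

    public static double[][] initSimplex(double[][] trainInstances,
                                         int numPoints,
                                         int numClasses,
                                         int numProtoPerClass,
                                         int numAttributes,
                                         java.util.function.ToDoubleFunction<double[]> costFunction) {
        int dim = numProtoPerClass * numClasses * numAttributes;
        double[][] simplex = new double[numPoints][dim + 1]; // last column holds the cost
        double[] p = new double[dim];

        for (int i = 0; i < numPoints; i++) {
            for (int j = 0; j < numClasses * numProtoPerClass; j++) {
                for (int k = 0; k < numAttributes; k++) {
                    p[k + numAttributes * j] = trainInstances[i][k];
                }
            }
            System.arraycopy(p, 0, simplex[i], 0, dim);      // store the parameters
            simplex[i][dim] = costFunction.applyAsDouble(p); // store the associated cost
        }
        return simplex;
    }
}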

Two variants of the cost function algorithm have been implemented in our system. The first variant is based on internal cross-validation learning on training partitions, whilst in the second algorithm variant a classifier is trained by conducting a plain test (the pruned training partitions are used for learning and the test on the entire training partition is used for estimating the training accuracy). The details of both variants of the cost function algorithm are given in the pseudocode listings below.

Our implementation of the EkP method is not the simplest one, as our code will become a basis for an extended version of this algorithm. In order to give a short description of the algorithm in the text of the paper, it is worth mentioning that the array of optimization parameters is (numProtoPerClass * numClasses * numAttributes) dimensional, but the instances stored in this vector are not involved in any parameter modification. They are simply extracted from the parameter vector and added to the training partition in every cost function evaluation. In other words, the training partitions are built by extracting samples from a parameter vector which always contains numProtoPerClass examples from every class occurring in a problem domain. In a simpler implementation one could keep only the numProtoPerClass * numClasses vectors themselves in the parameter array. Note that numAttributes denotes the total number of attributes in a dataset, including the class attribute.

Algorithm 2 The EkP-1 cost function algorithm (learning via internal cross-validation)
Require: A training set trainInstances
Require: A vector p[] of optimization parameters (numProtoPerClass * numClasses * numAttributes dimensional)
for k = 1 to numCrossValidationLearningFolds do
  Create the empty training set vTrain
  Build the k-th test partition vTest
  for i = 0 to numClasses * numProtoPerClass - 1 do
    for j = 0 to numAttributes - 1 do
      Add the prototype stored in p[], starting from p[j + numAttributes * i] and ending at p[numAttributes - 1 + numAttributes * i], to vTrain
    end for
  end for
  Build (train) the classifier on vTrain and test it on vTest
end for
Remember the optimal p[] value and the associated lowest value of numClassificationErrors
return numClassificationErrors

Algorithm 3 The EkP-2 cost function algorithm (learning via a test on the entire training partition, taking the pruned training partition for building (training) a classifier)
Require: A training set trainInstances
Require: A vector p[] of optimization parameters (numProtoPerClass * numClasses * numAttributes dimensional)
Create the empty training set tmpTrain
for i = 0 to numClasses * numProtoPerClass - 1 do
  for j = 0 to numAttributes - 1 do
    Add the prototype stored in p[], starting from p[j + numAttributes * i] and ending at p[numAttributes - 1 + numAttributes * i], to tmpTrain
  end for
end for
Build (train) the classifier on tmpTrain and test it on trainInstances
Remember the optimal p[] value and the associated lowest value of numClassificationErrors
return numClassificationErrors
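To make the second variant concrete, a minimal Java sketch of an EkP-2-style cost function is given below. The data representation (plain double arrays, prototype slots grouped by class in the flat parameter vector p) and the helper names are our own assumptions, and a plain 1-Nearest-Neighbor rule stands in for the underlying IB1 classifier; this is not the author's Weka plugin code.

import java.util.ArrayList;
import java.util.List;

// Sketch of the EkP-2 cost function variant: extract the prototypes from the
// flat parameter vector p, use them as the pruned training set of a 1-NN
// classifier, and return the number of errors on the entire training partition.
public class EkP2Cost {

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    static int nearestPrototypeLabel(double[] x, List<double[]> protos, List<Integer> labels) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < protos.size(); i++) {
            double d = squaredDistance(x, protos.get(i));
            if (d < bestDist) {
                bestDist = d;
                best = labels.get(i);
            }
        }
        return best;
    }

    public static int cost(double[] p,
                           double[][] trainFeatures, int[] trainLabels,
                           int numClasses, int numProtoPerClass, int numAttributes) {
        List<double[]> protoFeatures = new ArrayList<>();
        List<Integer> protoLabels = new ArrayList<>();

        // Extract numProtoPerClass prototypes for every class from the flat vector p.
        for (int i = 0; i < numClasses * numProtoPerClass; i++) {
            double[] proto = new double[numAttributes];
            System.arraycopy(p, i * numAttributes, proto, 0, numAttributes);
            protoFeatures.add(proto);
            protoLabels.add(i / numProtoPerClass); // assumption: slots grouped by class
        }

        // The pruned training set consists of the prototypes only; the test is
        // performed on the entire training partition (EkP-2).
        int errors = 0;
        for (int n = 0; n < trainFeatures.length; n++) {
            if (nearestPrototypeLabel(trainFeatures[n], protoFeatures, protoLabels) != trainLabels[n]) {
                errors++;
            }
        }
        return errors;
    }
}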

Table 1. Datasets used in our experiments

4. Numerical Experiments

In order to verify the suitability of the EkP system for data analysis, classification experiments on seventeen real-world problems (mainly taken from the well-known UCI repository of machine-learning databases (Mertz C., Murphy P.)) have been performed. The information about the datasets used can be found in Table 1. The EkP system can be based on an arbitrary classifier, i.e. it can be a neural network, a support vector machine or a decision tree method, etc. In our experiments the IB1 (Aha D., Albert M., Kibler D., 1991) system has been used both as the underlying classifier for the EkP system and as the reference method. The reason for selecting the IB1 system is that this method requires very small training datasets, which may consist of just a few samples, in order to make classification possible. Other classifiers, including IBk (Aha D., Albert M., Kibler D., 1991), require slightly larger training sets in order to operate. Our aim when conducting the experiments for this paper was to show that even calculations with an extremely low number of prototypes selected may lead to excellent results on unseen samples. The well known LVQ method (Hyninen, Kangas, Kohonen, Laaksonnen, Torkolla, 1996; Kohonen T., 2001; Kaski S., Kohonen T., Oja M., 2003), which is however a prototype-generation system, has also been taken as a reference model in our experiments. The second reason for choosing the IB1 classifier as the underlying method for the EkP system is the fact that the LVQ method uses the k-Nearest Neighbor rule for classification.

The generalization ability of the EkP system with only one, two and three instances per class selected from a training set has been compared to the classification performance of LVQ, for which the same number of codebooks has been used. Additionally, the results obtained with the IB1 (1-Nearest Neighbor) system, which has been trained on the entire cross-validation training partitions (i.e. all training samples from every learning fold have been used), are provided.

A ten-fold stratified cross-validation test has been performed for all seventeen domains. In the experiments conducted with the EkP system, in each cross-validation fold the training partition has been pruned so that only the prototype cases remained, the EkP's underlying classifier has been trained, and its generalization ability has been estimated on the cross-validation test partition. After the completion of the calculation on all ten folds, the test has been repeated ten times and the average classification accuracy and its standard deviation, taken over all the available hundred partial results, have been reported.

The single corrected re-sampled t-test (Frank E., Witten I., 2000; Dobosz K., 2006) has been used to calculate the statistical significance of the results (with the factor of 0.05) in order to help make the decision whether the EkP system performed better, the same or worse than the reference models.
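For reference, the commonly used form of this statistic (the Nadeau-Bengio correction employed by Weka's paired tester; the notation below is ours, not the paper's) is:

% d_i is the difference in accuracy between the two compared classifiers on
% the i-th of the k train/test runs (here 10 x 10-fold cross-validation, so
% presumably k = 100 and n_2/n_1 = 1/9), and sigma_d^2 is the sample variance
% of the d_i.
\[
  t \;=\; \frac{\bar{d}}{\sqrt{\left(\dfrac{1}{k} + \dfrac{n_2}{n_1}\right)\sigma_d^{2}}},
  \qquad
  \bar{d} \;=\; \frac{1}{k}\sum_{i=1}^{k} d_i ,
\]
with the result compared against a Student t distribution with k - 1 degrees of freedom at the chosen significance level (0.05 here).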

The LVQ Weka implementation of the LVQ method that has been employed in our calculations was written by Jason Brownlee (Brownlee J., 2004). Finally, what remains to be mentioned is that the EkP system has been written by the author in the Java programming language as a plugin to the well known Weka machine learning workbench (Frank E., Witten I., 2000).

4.1. Experiment 1: Generalization Ability of EkP vs. IB1

In the first experiment our system under study has been compared to the performance of IB1 on all seventeen domains. The results of the statistical tests against the majority classifier, both for IB1 and EkP, have not been included in our paper. The base rate results, however, which are the values obtained by the majority classifier(1) on all tested datasets, are listed in Table 1. It is worth mentioning that IB1 appeared to outperform the majority classifier on thirteen domains. On the appendicitis, breast-cancer, german-credit and hepatitis datasets the results have been statistically insignificant.

(1) The majority classifier in the Weka system which had been used in our experiments is ZeroR.

The EkP system has been used mainly with the same default settings for all seventeen problems, because the calculations have been performed in a batch mode, which made performing the numerical experiments and collecting the results for the paper much easier. The simplex cost function tolerance has been set to 1E-16 and the maximum number of cost function evaluations has been restricted to 300 calls, excluding a certain number of target function evaluations required to initialize the simplex. This latter value is the parameter which is called the number of simplex points on which a simplex is spanned. Thus, the maximum number of cost function evaluations has to be increased by the number of simplex points in order to attain the total number of target function calls. For all experiments conducted in our paper we have set the number of simplex points to fifty. The upper limit on the value of this parameter is the number of samples in the training partition. Therefore, because the smallest problem out of the studied seventeen domains consists of hardly sixty samples, the value we selected for this parameter seems to be a good choice.

The maximum number of cost calls setting of 300 was taken as the default for datasets of a size of a couple of hundred cases, and this choice is based on our earlier experience with similar minimization-based learning systems we had been working on. As concerns the EkP's form of learning used for Experiment 1, both the first variant of the cost function algorithm, involving leave-one-out cross-validation learning, and the second variant have been employed. The IB1 classifier has been chosen as the EkP's classification engine. Tables 2 and 3 summarize the results of Experiment 1. It is easy to notice that the generalization ability of the EkP system trained with the first algorithm variant depends strongly on the number of prototypes selected. Choosing one prototype per class to be selected by EkP-1 statistically degraded the results with respect to the ones obtained with the IB1 system on only three out of all seventeen domains. This is an excellent result. When two prototypes per class have been selected, the number of times training data reduction degraded the results dropped to only two. With three prototypes per class chosen, the results have been statistically indistinguishable from those attained with IB1 on sixteen problems. The first variant of the EkP algorithm taken for our experiments was trained with leave-one-out cross-validation. The influence of the value of the cross-validation learning fold on the generalization has not yet been fully investigated. Leave-one-out cross-validation seems to lead to very stable models and the best generalization, at the expense of significantly lengthening the calculation time. In the case of the second algorithm version (EkP-2), a statistically significant degradation of the generalization results with respect to the ones attained with the IB1 system could be noted on three datasets, independently of the number of prototypes per class chosen.

4.2. Experiment 2: Generalization Ability of LVQ vs. IB1 and LVQ vs. EkP

For this experiment, LVQ version 1 with 'random training data proportional' as well as 'simple k-means' initialization, a learning rate of 0.3, a total of 1000 training iterations, a linear decay learning function and disabled voting has been used. The generalization ability of LVQ against IB1 has been tested first. Because the method of initialization of the positions of the codebooks seemed not to have any statistically significant influence on the generalization of the LVQ system, only the results obtained with the 'random training data proportional' initialization are reported.
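For completeness, the standard LVQ1 codebook update performed with these settings (the textbook form, e.g. Kohonen, 2001, not quoted from this paper) moves the nearest codebook towards a training sample when their classes agree and away from it otherwise:

% Standard LVQ1 update; m_c(t) is the codebook vector nearest to the training
% sample x(t), and alpha(t) is the learning rate, decaying linearly from 0.3
% over the 1000 iterations in the configuration used here.
\[
  m_c(t+1) \;=\;
  \begin{cases}
    m_c(t) + \alpha(t)\,\bigl[x(t) - m_c(t)\bigr], & \text{if } x(t) \text{ and } m_c \text{ belong to the same class},\\[4pt]
    m_c(t) - \alpha(t)\,\bigl[x(t) - m_c(t)\bigr], & \text{otherwise},
  \end{cases}
\]
while all other codebook vectors are left unchanged.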

Table 2. A comparison of the generalization results attained with the EkP system with one, two and three prototypes per class selected vs. the generalization obtained with the IB1 classifier. EkP has been trained with the first version of the cost function algorithm, which is denoted as EkP-1. Fifty simplex points have been used to train the EkP system. The statistical degradation of the results with respect to the reference ones (i.e. those of IB1) is marked with a bold font.

Table 3. A comparison of the generalization results attained with the EkP system with one, two and three prototypes per class selected vs. the generalization obtained with the IB1 classifier. EkP has been trained with the second version of the cost function algorithm, which is denoted as EkP-2. Fifty simplex points have been used to train the EkP system. The statistical degradation of the results with respect to the reference ones (i.e. those of IB1) is marked with a bold font.

Table 4. A comparison of the generalization results attained with the LVQ-1 system (with the linear decay learning and the training data proportional initialization settings) with 2, 4 and 6 codebooks set vs. the generalization results obtained with the IB1 classifier. The statistical degradation of the results with respect to the reference ones (i.e. those of IB1) is marked with a bold font.

As can be seen from Table 4, the LVQ system performed rather poorly: on the seventeen problems with two codebooks set, a statistically significant degradation of the results with respect to those attained with the IB1 classifier has been noted twelve times. Increasing the number of codebooks to four has led to a minor improvement of the generalization of the LVQ system, and on ten domains the results have still been worse than those obtained with IB1. Selection of six codebooks has led to a statistically significant degradation of the results with respect to the reference ones on nine problems out of the seventeen studied. In this experiment also no improvement over IB1's generalization ability could be observed.

In the second experiment in this section, the test estimating the generalization ability of LVQ against EkP has been performed. This test is made only on two-class problems to assure that the number of LVQ codebooks and of the prototypes selected by the EkP system is the same. Recall that EkP takes the number of prototypes per class as its adaptive parameter, whilst the LVQ system requires a total number of codebooks to be specified. Since all the calculations have been performed in a batch mode with the same settings for all classification domains, the list of datasets had to be restricted to two-class problems. What can be noted by taking a closer look at Table 5 is that the results of LVQ depend more strongly on the number of codebooks selected than is the case for EkP-1. The average classification accuracy of EkP-1 has been taken over all twelve domains; for LVQ with two codebooks the corresponding average is only 64%. Increasing the number of codebooks to four and six raises the average generalization ability of LVQ to about 70% and 72%, respectively. Similar trends can be observed when LVQ is put against EkP-2 (see Table 6).

Table 5. A comparison of the generalization results attained with the LVQ-1 system with two, four and six codebooks vs. the generalization obtained with the EkP classifier. EkP has been trained with the first version of the cost function algorithm, which is denoted as EkP-1. Fifty simplex points have been used to train the EkP system. The statistical degradation of the results of the LVQ system with respect to the reference ones is marked with a bold font.

Table 6. A comparison of the generalization results attained with the LVQ-1 system with two, four and six codebooks vs. the generalization obtained with the EkP classifier. EkP has been trained with the second version of the cost function algorithm, which is denoted as EkP-2. Fifty simplex points have been used to train the EkP system. The statistical degradation of the results of the LVQ system with respect to the reference ones is marked with a bold font.

4.3. Experiment 3: Time Requirements

The training times of the EkP system, which are however all statistically worse than those of IB1 (this is not a surprise), are quite short and on average equal about 1 s (EkP-1) and 0.2 s (EkP-2) for learning on a single partition of a typical UCI dataset of a size of a couple of hundred cases (see Tables 7 and 8).(2) The training times of LVQ are even shorter than those obtained with our system.

(2) The calculations have been performed on a laptop equipped with a 2.4 GHz Intel Core 2 Duo processor running the 64-bit Ubuntu Linux operating system under a 64-bit OpenJVM Java runtime.

As can be seen from Table 9, LVQ has beaten both variants of the EkP method completely on all seventeen classification problems. It turned out that the LVQ system can be trained in a time which is three orders of magnitude shorter than the one obtained by measuring the EkP's learning time. Fortunately, the EkP testing times are shorter than those of IB1 by three orders of magnitude. Table 10 contains the summary of the results of the measurements of the testing time. It is not hard to see that it takes much less than a minute for the entire 10-fold cross-validation test conducted with our system to complete on most common UCI datasets. This is an acceptable result. It should be noted that training the EkP method with lower-fold cross-validation than leave-one-out leads to a significant reduction of the time requirements of this algorithm.

5. Conclusions

We are lucky that we have managed to create quite a fast prototype selection system, despite employing the simplex minimization routine, which is usually expensive. The initial experiments indicate that the method may turn out to be competitive with other data pruning systems. In the preliminary calculations the method discussed in this paper has shown statistical insignificance of the differences in generalization ability with respect to IB1 on almost all classification problems, and has sometimes turned out to be superior to the LVQ system ver. 1. The EkP training times are longer than those of IB1 and of LVQ, but the testing times are shorter than the ones obtained by timing IB1. After all, one should remember the general idea lying behind the selection of prototypes: once the instances are initially found (the training sets are pruned), the tests on unseen samples, which are usually performed frequently, can be conducted much faster. Before the EkP system is confronted with many other prototype selection algorithms and before further experiments with our method are performed, it will be hard to estimate the real value of our contribution to the pattern recognition field.


Table 7. The training times of the EkP method attained on one cross-validation fold, in seconds. EkP has been trained with the first version of the cost function algorithm, which is denoted as EkP-1. Fifty simplex points have been used to train the EkP system. The statistical degradation of the results of the EkP system with two and three prototypes per class selected, with respect to the reference ones (i.e. those of EkP-1 with one reference instance per class chosen), is marked with a bold font.

Table 8. The training times of the EkP method attained on one cross-validation fold, in seconds. EkP has been trained with the second version of the cost function algorithm, which is denoted as EkP-2. Fifty simplex points have been used to train the EkP system. The statistical degradation of the results of the EkP system with two and three prototypes per class selected, with respect to the reference ones (i.e. those of EkP-2 with one reference instance per class chosen), is marked with a bold font.

Table 9. The training times of the EkP method attained on one cross-validation fold, in seconds. EkP has been trained with the first and the second version of the cost function algorithm, denoted as EkP-1 and EkP-2 respectively. Two codebooks / prototypes have been chosen. Fifty simplex points have been used to train the EkP system. The statistical degradation of the results of the EkP system with respect to the reference ones (i.e. those of LVQ) is marked with a bold font.

Table 10. The testing times of the EkP method attained on one cross-validation test fold, in seconds. EkP has been trained with the second version of the cost function algorithm, which is denoted as EkP-2. Fifty simplex points have been used to train the EkP system. The statistical improvement of the results of the EkP system with respect to the reference ones (i.e. those of IB1) is marked with a bold, italic font.

References

Aha D., Kibler D. and Albert M.: Instance-based learning algorithms. Machine Learning, 6, (1991), 37-66

Bhattacharya B., Poulsen R., Toussaint G.: Application of proximity graphs to editing nearest neighbor decision rule. In: International Symposium on Information Theory, Santa Monica, (1981)

Brighton H., Mellish C.: Advances in instance selection for instance-based learning algorithms. Data Mining and Knowledge Discovery 6, (2002), 153-172

Brownlee J.: A Java implementation of the SOM-LVQ PAK. http://www.it.swin.edu.au/personal/jbrownlee/

Cameron-Jones R.: Instance selection by encoding length heuristic with random mutation hill climbing. In: Proceedings of the Eighth Australian Joint Conference on Artificial Intelligence, (1995), 99-106

Dobosz K.: Statistical Significance Tests in Estimation of the Results Obtained with Various Systems that Learn. M.Sc. thesis, Nicolaus Copernicus University, Toruń, Poland, (2006) (In Polish)

Duch W., Blachnik M.: Fuzzy rule-based systems derived from similarity to prototypes. Lecture Notes in Computer Science, Vol. 3316, (2004), 912-917

Duch W., Grudzinski K.: Prototype based rules - a new way to understand the data. IEEE International Joint Conference on Neural Networks, Washington D.C., (2001), 1858-1863

Gates G.: The reduced nearest neighbor rule. IEEE Transactions on Information Theory 18, (1972), 665-669

Grochowski M.: Selecting Reference Vectors in Selected Methods for Classification. M.Sc. thesis, Nicolaus Copernicus University, Department of Applied Informatics, Toruń, Poland, (2003) (In Polish)

Grochowski M., Jankowski N.: Comparison of Instance Selection Algorithms II: Results and Comments. Artificial Intelligence and Soft Computing ICAISC 2004, in Lecture Notes in Artificial Intelligence (LNAI 3070), 580-585

Grudzinski K., Duch W.: SBL-PM: A Simple Algorithm for Selection of Reference Instances for Similarity-Based Methods. Intelligent Information Systems, Bystra, Poland, 2000, in Advances in Soft Computing, Physica-Verlag, (2000), 99-108

Grudzinski K.: SBL-PM-M: A System for Partial Memory Learning. Artificial Intelligence and Soft Computing ICAISC 2004, in Lecture Notes in Artificial Intelligence (LNAI 3070), 586-591

Grudzinski K.: EkP: A fast minimization-based prototype selection algorithm. Proceedings of the International IIS'08 Conference, Zakopane, Poland, 2008. In: Challenging Problems of Science, Computer Science. Academic Publishing House EXIT

Hart P.: The condensed nearest neighbor rule. IEEE Transactions on Information Theory 14, (1968), 515-516

Hyninen, Kangas, Kohonen, Laaksonnen, Torkolla: LVQ_PAK: The Learning Vector Quantization Program Package, (1996)

Jankowski N.: Data regularization. In: Rutkowski, L., Tadeusiewicz, R., eds.: Neural Networks and Soft Computing, Zakopane, Poland, (2000), 209-214

Jankowski N., Grochowski M.: Comparison of Instances Selection Algorithms I: Algorithms Survey. Artificial Intelligence and Soft Computing ICAISC 2004, in Lecture Notes in Artificial Intelligence (LNAI 3070), 598-603

Kohonen T.: Self-Organizing Maps. Third ed. Berlin Heidelberg, Springer-Verlag, (2001). (Thomas S. Huang, Teuvo Kohonen and Manfred R. Schroeder, eds., Springer Series in Information Sciences, 30)

Lampton M.: neldermead.java (http://www.cea.berkeley.edu/~mlampton/neldermead.java)

Maloof M., Michalski R.: Selecting Examples for Partial Memory Learning. Machine Learning, 41, (2000), 27-52

Mertz C., Murphy P.: UCI repository of machine learning databases. http://www.ics.uci.edu/pub/machine-learning-data-bases

Nelder J., Mead R.: A simplex method for function minimization. Computer Journal 7, (1965), 308-313

Oja M., Kaski S., Kohonen T.: Bibliography of Self-Organizing Map (SOM) Papers: 1998-2001 Addendum. Neural Computing Surveys, 3, (2003), 1-156

Skalak D.: Prototype and feature selection by sampling and random mutation hill climbing algorithms. In: International Conference on Machine Learning, (1994), 293-301

Tomek I.: An experiment with the edited nearest neighbor rule. IEEE Transactions on Systems, Man, and Cybernetics 6, (1976), 448-452

Wilson D.: Asymptotic properties of nearest neighbor rules using edited data. IEEE Transactions on Systems, Man, and Cybernetics 2, (1972), 408-421

Wilson D., Martinez T.: Instance Pruning Techniques. In: Fisher, D.: Machine Learning: Proceedings of the Fourteenth International Conference. Morgan Kaufmann Publishers, San Francisco, CA, (1997), 404-417

Wilson D., Martinez T.: Reduction Techniques for Instance-Based Learning Algorithms. Machine Learning, 38, (2000), 257-286

Witten I., Frank E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann Publishers, (2000)
