Some Remarks on the Choice of the Kernel Function in Density Estimation


FOLIA OECONOMICA 194, 2005

Aleksandra Baszczyńska*



The basic characteristic describing the behaviour of a random variable is its density function. Kernel density estimation is one of the most widely used nonparametric density estimation methods. In the process of constructing the estimator we have to choose two parameters of the method: the kernel function $K(u)$ and the smoothing parameter $h$ (bandwidth). In the paper, the kernel method is discussed in detail, with particular emphasis on the influence of the choice of the kernel function $K(u)$ on the amount of smoothing. A Monte Carlo study is presented, in which seven kernel functions (Gaussian, Uniform, Triangle, Epanechnikov, Quartic, Triweight, Cosinus) are used in density estimation.

Key words: density estimation, kernel function, smoothing parameter.


The density function is the basic characteristic describing the behaviour of a random variable. It is used to investigate the properties of a given data set and provides a way of showing its structure.

The oldest density estimator for the univariate case is the histogram. Its popularity stems from its simplicity, but this estimator has some drawbacks (for example, the placement of the bin edges influences the estimator, and every density is estimated by a step function). One of the best known and most widely used methods of density estimation is the kernel method. The motivation for kernel density estimation is the average shifted histogram, which averages several histograms based on shifts of the bin edges. The kernel estimator does not share the disadvantages of the histogram and provides a simple and effective method of showing structure in a data set at the beginning of an analysis.


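The kernel estimator of a density function $f(x)$ is defined by:

$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right), \tag{1}$$

where $X_1, \ldots, X_n$ is the random sample and $K(u)$ is a kernel function satisfying $\int_{-\infty}^{+\infty} K(u)\,du = 1$.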
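A minimal sketch of how such an estimator can be evaluated, assuming the standard form $\hat{f}(x) = \frac{1}{nh}\sum_i K((x - X_i)/h)$ and a Gaussian kernel. The function names, the seed, the evaluation grid and the value $h = 0.08$ are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, one admissible choice of K(u)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kernel_estimator(x, sample, h, kernel=gaussian_kernel):
    """Evaluate f_hat(x) = (1 / (n h)) * sum_i K((x - X_i) / h)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    sample = np.asarray(sample, dtype=float)
    if h <= 0:
        raise ValueError("smoothing parameter h must be positive")
    # Rows index evaluation points, columns index observations.
    u = (x[:, None] - sample[None, :]) / h
    return kernel(u).sum(axis=1) / (sample.size * h)

rng = np.random.default_rng(0)             # arbitrary seed (assumption)
sample = rng.normal(0.0, 0.2, size=128)    # illustrative sample of 128 draws
grid = np.linspace(-2.0, 2.0, 2001)        # illustrative evaluation grid
f_hat = kernel_estimator(grid, sample, h=0.08)

# Because K integrates to 1, the estimate should integrate to about 1 as well.
area = np.sum(f_hat) * (grid[1] - grid[0])
```

Since the kernel used here is itself a density, the estimate is non-negative and integrates to approximately one on a grid that covers the sample.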

The idea of the kernel estimator was introduced by Fix and Hodges in 1951 as a nonparametric version of discriminant analysis. Rosenblatt (1956) considered the kernel estimator with one special kernel function, and Parzen in 1962 introduced a general form of the kernel estimator.

In practice, the kernel function $K(u)$ is a density function (for example the normal density), and then estimator (1) is also a density function. Some of the best known kernel functions are presented in Domański, Pruska, Wagner (1998).
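The formulas of the individual kernels are not listed here, so the sketch below uses commonly quoted textbook forms for the seven kernels named in the paper. These forms are an assumption. In particular, all compactly supported kernels are written on $[-1, 1]$, although some texts use a rescaled Epanechnikov kernel on $[-\sqrt{5}, \sqrt{5}]$. The helper `_on_support` and the dictionary `KERNELS` are hypothetical names.

```python
import numpy as np

def _on_support(u, values, half_width=1.0):
    """Return values where |u| <= half_width and 0 elsewhere."""
    return np.where(np.abs(u) <= half_width, values, 0.0)

# Assumed textbook forms; the paper itself does not spell them out.
KERNELS = {
    "Gaussian": lambda u: np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi),
    "Uniform": lambda u: _on_support(u, 0.5 * np.ones_like(u)),
    "Triangle": lambda u: _on_support(u, 1.0 - np.abs(u)),
    "Epanechnikov": lambda u: _on_support(u, 0.75 * (1.0 - u**2)),
    "Quartic": lambda u: _on_support(u, 15.0 / 16.0 * (1.0 - u**2) ** 2),
    "Triweight": lambda u: _on_support(u, 35.0 / 32.0 * (1.0 - u**2) ** 3),
    "Cosinus": lambda u: _on_support(u, np.pi / 4.0 * np.cos(np.pi * u / 2.0)),
}

# Numerical check of the basic kernel properties on a fine grid.
u = np.linspace(-8.0, 8.0, 160001)
du = u[1] - u[0]
moments = {}
for name, K in KERNELS.items():
    w = K(u)
    moments[name] = {
        "mass": np.sum(w) * du,        # integral of K, should be close to 1
        "mean": np.sum(u * w) * du,    # first moment, close to 0 by symmetry
        "k2": np.sum(u**2 * w) * du,   # second moment, must be nonzero
    }
```

Each of these functions is a symmetric density, so the estimator built from any of them is itself a density.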

The parameter $h$ ($h > 0$) is a smoothing parameter, also called the window width or bandwidth.

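Expressions for $E(\hat{f}(x))$ and $D^2(\hat{f}(x))$ are the following:

$$E(\hat{f}(x)) = \frac{1}{h} \int_{-\infty}^{+\infty} K\left(\frac{x - y}{h}\right) f(y)\,dy, \tag{2}$$

$$D^2(\hat{f}(x)) = \frac{1}{nh^2} \int_{-\infty}^{+\infty} K\left(\frac{x - y}{h}\right)^2 f(y)\,dy - \frac{1}{n}\left[\frac{1}{h} \int_{-\infty}^{+\infty} K\left(\frac{x - y}{h}\right) f(y)\,dy\right]^2. \tag{3}$$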




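Let the kernel function be a symmetric function satisfying:

$$\int_{-\infty}^{+\infty} K(u)\,du = 1, \qquad \int_{-\infty}^{+\infty} uK(u)\,du = 0, \qquad \int_{-\infty}^{+\infty} u^2K(u)\,du = k_2 \neq 0,$$

and let the smoothing parameter $h = h(n) > 0$ satisfy

$$\lim_{n \to \infty} h(n) = 0 \quad \text{and} \quad \lim_{n \to \infty} n\,h(n) = \infty.$$

The bias and the asymptotic mean integrated squared error of kernel estimator (1) are then the following:

$$E(\hat{f}(x)) - f(x) \approx \frac{h^2}{2} k_2 f''(x), \tag{4}$$

$$AMISE = \frac{1}{4} h^4 k_2^2 \int f''(x)^2\,dx + \frac{1}{nh} \int K(u)^2\,du. \tag{5}$$

II. MONTE CARLO STUDY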

A Monte Carlo study was conducted to show how the choice of the kernel function influences the amount of smoothing in kernel density estimation. The properties of the estimator were analysed in three basic variants, depending on the distribution from which the data were drawn. These variants are as follows:

- variant I: normal distribution $N(0, 0.2)$,

- variant II: mixture of normal distributions: $f(x) = 0.25 f_1(x) + 0.75 f_2(x)$, where $f_1(x)$ is the density function of $N(0, 0.2)$ and $f_2(x)$ is the density function of $N(3, 0.5)$. Variant II represents a two-modal distribution,

- variant III: mixture of normal distributions: $f(x) = 0.5 f_1(x) + 0.25 f_2(x) + 0.25 f_3(x)$, where $f_1(x)$ is the density function of $N(0, 0.2)$, $f_2(x)$ is the density function of $N(3, 0.5)$ and $f_3(x)$ is the density function of $N(7, 0.5)$. Variant III represents a three-modal distribution.
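The three variants can be written down as mixture specifications, for instance as in the sketch below. It assumes that the second argument of $N(\cdot, \cdot)$ is a standard deviation (the text does not say whether it is a standard deviation or a variance). The names `VARIANTS`, `true_density` and `draw_sample` are hypothetical.

```python
import numpy as np

# (weight, mean, second parameter) per component. Treating the second
# parameter as a standard deviation is an assumption.
VARIANTS = {
    "I": [(1.0, 0.0, 0.2)],
    "II": [(0.25, 0.0, 0.2), (0.75, 3.0, 0.5)],
    "III": [(0.5, 0.0, 0.2), (0.25, 3.0, 0.5), (0.25, 7.0, 0.5)],
}

def true_density(x, components):
    """Mixture density f(x) = sum_j w_j * normal_pdf(x; m_j, s_j)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    total = np.zeros_like(x)
    for w, m, s in components:
        total += w * np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
    return total

def draw_sample(components, n, rng):
    """Draw n observations: pick a component by weight, then sample from it."""
    weights = np.array([c[0] for c in components])
    means = np.array([c[1] for c in components])
    sds = np.array([c[2] for c in components])
    labels = rng.choice(len(components), size=n, p=weights)
    return rng.normal(means[labels], sds[labels])

rng = np.random.default_rng(1)  # arbitrary seed (assumption)
samples = {name: draw_sample(comp, 128, rng) for name, comp in VARIANTS.items()}
```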

In the experiment we used the following measures:

- mean squared error

$$BSK = \frac{1}{n} \sum_{i=1}^{n} \left[\hat{f}(x_i) - f(x_i)\right]^2, \tag{6}$$

- maximum value

$$MR = \max_i \left|\hat{f}(x_i) - f(x_i)\right|, \tag{7}$$

- $P$, the number of cases where the value of the estimator is greater than the value of the density function at the given point (over smoothing),

- $N$, the number of cases where the value of the estimator is less than the value of the density function at the given point (under smoothing).
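The four measures can be computed from two vectors of equal length, the estimated and the true density values at the points $x_i$. The sketch below is illustrative: the function name `compare` is hypothetical, and the last count is called `N` to match the column header used in Tables 1-3.

```python
import numpy as np

def compare(f_hat, f_true):
    """Return BSK (6), MR (7) and the counts P and N for two density vectors."""
    f_hat = np.asarray(f_hat, dtype=float)
    f_true = np.asarray(f_true, dtype=float)
    diff = f_hat - f_true
    return {
        "BSK": float(np.mean(diff**2)),     # mean squared error
        "MR": float(np.max(np.abs(diff))),  # largest absolute deviation
        "P": int(np.sum(diff > 0)),         # estimator above the true density
        "N": int(np.sum(diff < 0)),         # estimator below the true density
    }

# Toy input with one point above, one below, one tie and one more above.
result = compare([0.5, 0.2, 0.1, 0.4], [0.4, 0.3, 0.1, 0.2])
```

Ties are counted in neither `P` nor `N`, so the two counts add up to the number of points only when no estimate coincides exactly with the true value.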

In the experiment, the true density functions (described as variants I, II and III) were compared, using the measures mentioned above, with the estimators of the density function. Kernel estimation was based on 128 random observations drawn from the populations (described as variants I, II and III), using one of the seven kernel functions (Gaussian, Uniform, Triangle, Epanechnikov, Quartic, Triweight, Cosinus) and the smoothing parameter which minimizes the mean squared error $BSK$ (6). Because $BSK$ is minimized, the kernel density estimator can be treated as optimal for this value of the parameter $h$. Analysing the values of the smoothing parameter that minimize $BSK$ for a particular kernel function allows us to compare the properties of the kernel functions used in the estimation. The results of this part of the study are presented in Tables 1, 2 and 3.
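One way to find a smoothing parameter that minimizes $BSK$ is a plain grid search, sketched below for a Gaussian kernel and a normal population with mean 0 and standard deviation 0.2. The search grid, the seed and the choice to evaluate the estimator at the observations themselves are assumptions. The paper does not describe its search procedure, so this is not a reproduction of it.

```python
import numpy as np

def normal_pdf(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def kde_at_sample(sample, h):
    """Gaussian-kernel estimate evaluated at the observations themselves."""
    u = (sample[:, None] - sample[None, :]) / h
    return normal_pdf(u, 0.0, 1.0).sum(axis=1) / (sample.size * h)

def bsk(sample, h, m=0.0, s=0.2):
    """Mean squared difference between the estimate and the true density."""
    return float(np.mean((kde_at_sample(sample, h) - normal_pdf(sample, m, s)) ** 2))

rng = np.random.default_rng(2005)          # arbitrary seed (assumption)
sample = rng.normal(0.0, 0.2, size=128)    # 128 observations, as in the study
h_grid = np.linspace(0.01, 0.50, 50)       # assumed search grid, step 0.01
scores = np.array([bsk(sample, h) for h in h_grid])
h_best = float(h_grid[np.argmin(scores)])  # grid point with the smallest BSK
```

The value found this way depends on the particular sample drawn, so it need not coincide with the figures reported in the tables.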

Table 1. Values of smoothing parameter h minimizing BSK for variant I

Kernel function   Value of parameter h   BSK        MR         P    N
Epanechnikov      0.0800                 0.028609   0.326695   40   88
Gaussian          0.0800                 0.030831   0.418698   43   85
Quartic           0.2100                 0.028370   0.355300   42   86
Triangle          0.2000                 0.028185   0.359394   41   87
Uniform           0.1300                 0.031556   0.453241   52   76
Triweight         0.2400                 0.028592   0.368761   42   86
Cosinus           0.1800                 0.028325   0.343490   41   87
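For orientation, the asymptotic criterion $AMISE = \frac{1}{4} h^4 k_2^2 \int f''(x)^2\,dx + \frac{1}{nh}\int K(u)^2\,du$ can be minimized in closed form, which gives $h = \left[\int K(u)^2\,du \,/\, (n\,k_2^2 \int f''(x)^2\,dx)\right]^{1/5}$. The sketch below evaluates this for a Gaussian kernel and a normal density, assuming that 0.2 in $N(0, 0.2)$ is a standard deviation. It is a cross-check under these assumptions, not the criterion used in the study, which minimizes $BSK$.

```python
import math

def amise_optimal_h(n, roughness_K, k2, roughness_f2):
    """Minimiser of (1/4) h^4 k2^2 R(f'') + R(K) / (n h) over h > 0."""
    return (roughness_K / (k2**2 * roughness_f2 * n)) ** 0.2

sigma, n = 0.2, 128
R_K = 1.0 / (2.0 * math.sqrt(math.pi))               # integral of K^2, Gaussian kernel
k2 = 1.0                                             # second moment of the Gaussian kernel
R_f2 = 3.0 / (8.0 * math.sqrt(math.pi) * sigma**5)   # integral of f''^2, normal density
h_amise = amise_optimal_h(n, R_K, k2, R_f2)

# For this kernel and density the expression reduces to (4 / (3 n))^(1/5) * sigma.
h_closed_form = (4.0 / (3.0 * n)) ** 0.2 * sigma
```

Under these assumptions the result is about 0.080, which is close to the Gaussian entry of Table 1.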

Table 2. Values of smoothing parameter h minimizing BSK for variant II

Kernel function   Value of parameter h   BSK        MR         P    N
Epanechnikov      0.1600                 0.004511   0.172078   45   83
Gaussian          0.1800                 0.004702   0.161273   42   86
Quartic           0.4200                 0.004522   0.168167   52   76
Triangle          0.4000                 0.004471   0.163618   56   72
Uniform           0.3000                 0.004993   0.169591   44   84
Triweight         0.4800                 0.004542   0.168551   53   75
Cosinus           0.3600                 0.004509   0.174501   49   79

Table 3. Values of smoothing parameter h minimizing BSK for variant III

Kernel function   Value of parameter h   BSK        MR         P    N
Epanechnikov      0.0900                 0.003292   0.116821   82   46
Gaussian          0.1100                 0.003353   0.108255   74   54
Quartic           0.2600                 0.003134   0.114355   79   49
Triangle          0.2500                 0.003139   0.109422   78   50
Uniform           0.1400                 0.004459   0.145988   84   44
Triweight         0.3000                 0.003137   0.112873   77   51
Cosinus           0.2200                 0.003192   0.117113   82   46


The study was also extended by calculating the smoothing parameter $h$ in the estimation of the following density functions: normal with parameters $\mu = 0$ and $\sigma = 1.3$, normal with parameters $\mu = 5$ and $\sigma = 0.2$, uniform on the interval $[-1, 1]$, uniform on the interval $[-3, 4]$, triangle on the interval $[1, 5]$, gamma with parameters $\lambda = 2$ and $\alpha = 0.5$ ($\chi^2$ with 2 degrees of freedom), gamma with parameters $\lambda = 0.5$ and $\alpha = 1$, gamma with parameters $\lambda = 0.5$ and $\alpha = 5$, gamma with parameters $\lambda = 2$ and $\alpha = 5$ ($\chi^2$ with 10 degrees of freedom). The values of the smoothing parameter $h$ minimizing $BSK$, together with $BSK$ itself (in brackets), are presented in Table 4.

On the basis of the results presented in Tables 1-4 we can divide the kernel functions under consideration into two groups. The Gaussian, Epanechnikov and Uniform kernel functions need smaller values of the smoothing parameter, while the second group, the Quartic, Triangle, Triweight and Cosinus kernels, needs greater values of the smoothing parameter in estimation. The first group of kernels is characterised by a higher degree of smoothing. This division occurs not only for variant I, but also for the two-modal and three-modal distributions (variant II and variant III).
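The grouping can be made visible by expressing each kernel's smoothing parameter relative to the Gaussian one. The sketch below uses the values of $h$ from Tables 1-3. The cut-off of 1.9 is an arbitrary illustrative threshold, not a quantity defined in the paper.

```python
# Smoothing parameters h copied from Tables 1-3 (variants I, II, III).
H_VALUES = {
    "Epanechnikov": (0.08, 0.16, 0.09),
    "Gaussian": (0.08, 0.18, 0.11),
    "Quartic": (0.21, 0.42, 0.26),
    "Triangle": (0.20, 0.40, 0.25),
    "Uniform": (0.13, 0.30, 0.14),
    "Triweight": (0.24, 0.48, 0.30),
    "Cosinus": (0.18, 0.36, 0.22),
}

# Ratio of each kernel's h to the Gaussian h, variant by variant.
ratios = {
    name: tuple(h / g for h, g in zip(hs, H_VALUES["Gaussian"]))
    for name, hs in H_VALUES.items()
}

CUT = 1.9  # arbitrary illustrative threshold (assumption)
small_h_group = sorted(name for name, r in ratios.items() if max(r) < CUT)
large_h_group = sorted(name for name, r in ratios.items() if min(r) >= CUT)
```

With this threshold every kernel falls into exactly one of the two lists in all three variants, and the lists coincide with the two groups named in the text.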

This division makes it possible to gauge the value of the smoothing parameter for a particular estimator with a particular kernel function.

Moreover, the values of the mean squared error ($BSK$) in the presented study do not differ significantly.

The results in Tables 1 and 4 indicate that:

1. A change of the location parameter in the normal distribution does not change the smoothing parameter $h$ which minimises $BSK$.

2. Increasing the scale parameter in the normal distribution and in unimodal gamma distributions ($\alpha > 1$) yields an estimator with a higher value of the smoothing parameter.

3. Increasing the shape parameter of the $\chi^2$ distribution raises the value of the smoothing parameter.

4. The exponential distribution ($\alpha = 1$) needs a small value of the smoothing parameter in comparison with distributions with the same scale parameter.
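Findings 1 and 2 agree with what the asymptotic criterion (5) predicts for a location-scale family. The short derivation below is a heuristic sketch only: it minimises AMISE, whereas the study minimises $BSK$, and the symbols $f_{\mu,\sigma}$, $f_0$, $h^{*}$ and $h_0^{*}$ are introduced here for illustration.

$$
f_{\mu,\sigma}(x) = \frac{1}{\sigma} f_0\left(\frac{x - \mu}{\sigma}\right)
\quad \Longrightarrow \quad
\int f_{\mu,\sigma}''(x)^2\,dx = \sigma^{-5} \int f_0''(t)^2\,dt,
$$

$$
h^{*} = \left[\frac{\int K(u)^2\,du}{n\,k_2^2 \int f_{\mu,\sigma}''(x)^2\,dx}\right]^{1/5}
= \sigma \left[\frac{\int K(u)^2\,du}{n\,k_2^2 \int f_0''(t)^2\,dt}\right]^{1/5}
= \sigma\,h_0^{*}.
$$

The minimiser does not depend on $\mu$ and is proportional to $\sigma$. In Table 4 the ratio of the smoothing parameters for $N(0, 1.3)$ and $N(5, 0.2)$ is $1.30/0.20 = 6.5$ for the Triangle kernel and $1.37/0.21 \approx 6.52$ for the Quartic kernel, against a ratio of scale parameters of $1.3/0.2 = 6.5$.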

On the basis of the above analysis we can state that the values of the smoothing parameter which minimise $BSK$ differ between kernel functions. This can be explained by the different smoothing properties of the kernels in density estimation.


Table 4. Values of smoothing parameter h minimizing BSK (expanded study)

Kernel        normal      normal      uniform     uniform     triangle    gamma       gamma       gamma       gamma
function      N(0, 1.3)   N(5, 0.2)   [-1, 1]     [-3, 4]     [1, 5]      λ=0.5, α=5  λ=2, α=5    λ=2, α=0.5  λ=0.5, α=1
Epanechnikov  0.50        0.08        0.12        0.40        1.19        0.46        1.23        0.011       0.029
              (0.000670)  (0.028609)  (0.011055)  (0.000901)  (0.009582)  (0.001128)  (0.000018)  (1.697906)  (0.061773)
Gaussian      0.55        0.08        0.16        0.56        1.30        0.51        1.31        0.014       0.032
              (0.000718)  (0.030831)  (0.010688)  (0.000873)  (0.010023)  (0.001201)  (0.000018)  (1.703034)  (0.05989)
Quartic       1.37        0.21        0.38        1.34        3.21        0.27        3.28        0.031       0.083
              (0.000672)  (0.028370)  (0.010978)  (0.000896)  (0.009737)  (0.001122)  (0.000017)  (1.707840)  (0.060569)
Triangle      1.30        0.20        0.40        1.39        3.00        1.19        3.05        0.033       0.079
              (0.000667)  (0.028185)  (0.010814)  (0.000883)  (0.009868)  (0.001105)  (0.000016)  (1.709881)  (0.058834)
Uniform       0.84        0.13        0.21        0.78        1.99        0.74        2.03        0.019       0.053
              (0.000725)  (0.031556)  (0.011178)  (0.000888)  (0.009188)  (0.001273)  (0.000027)  (1.6551001) (0.06167)
Triweight     1.57        0.24        0.44        1.54        3.70        1.45        3.77        0.038       0.094
              (0.000677)  (0.028592)  (0.010893)  (0.000889)  (0.009814)  (0.001131)  (0.000017)  (1.706436)  (0.060112)
Cosinus       1.16        0.18        0.27        0.93        2.74        1.07        2.81        0.025       0.067
              (0.000670)  (0.028325)  (0.011057)  (0.000903)  (0.009632)  (0.001123)  (0.000017)  (1.701697)  (0.061466)



The main conclusions resulting from the conducted experiments are the following:

1. The Gaussian and Epanechnikov kernels used in kernel estimation need smoothing parameters of nearly the same values. These kernels are characterized by strong smoothing properties.

2. Changing the location, scale or shape parameter in some types of distribution changes the value of the smoothing parameter minimizing the mean squared error in density estimation.


Aleksandra Baszczyńska

REMARKS ON THE CHOICE OF THE KERNEL FUNCTION IN DENSITY FUNCTION ESTIMATION

Summary

The density function is one of the basic characteristics describing the behaviour of a random variable. The most frequently used method of nonparametric estimation is kernel estimation. In constructing the estimator, two decisions concerning the parameters of the method are necessary: the choice of the kernel function $K(u)$ and the choice of the smoothing parameter $h$. The paper places emphasis on the influence of the choice of the kernel function on the size of the smoothing parameter. The Monte Carlo experiment concerns seven kernel functions (Gaussian, Uniform, Triangle, Epanechnikov, Quartic, Triweight and Cosinus) in kernel density estimation.


