Smoothing of X-ray diffraction data and K (alpha)2 elimination using penalized likelihood and the composite link model

(1)

Journal of Applied Crystallography ISSN 1600-5767 Received 15 November 2013 Accepted 14 March 2014

#2014 International Union of Crystallography

Smoothing of X-ray diffraction data and

Ka

2

elimination using penalized likelihood and the

composite link model

Johan J. de Rooi,a,b* Niek M. van der Pers,bRuud W. A. Hendrikx,bRob Delhez,b Amarante J. Bo¨ttgerband Paul H. C. Eilersa

a

Department of Biostatistics, Erasmus Medical Centre, Rotterdam, The Netherlands, andbFaculty 3mE, Department of Materials Science and Engineering, Delft University of Technology, The Netherlands. Correspondence e-mail: j.derooi@erasmusmc.nl

X-ray diffraction scans consist of series of counts; these numbers obey Poisson distributions with varying expected values. These scans are often smoothed and the K2component is removed. This article proposes a framework in which both

issues are treated. Penalized likelihood estimation is used to smooth the data. The penalty combines the Poisson log-likelihood and a measure for roughness based on ideas from generalized linear models. To remove the K doublet the model is extended using the composite link model. As a result the data are decomposed into two smooth components: a K1 and a K2 part. For both

smoothing and K2 removal, the weight of the applied penalty is optimized

automatically. The proposed methods are applied to experimental data and compared with the Savitzky–Golay algorithm for smoothing and the Rachinger method for K2stripping. The new method shows better results with less local

distortion. Freely available software in MATLAB and R has been developed.

1. Introduction

Smoothing of X-ray diffraction data and K2elimination are

often applied before evaluation of the data in software packages as supplied with commercially available diffract-ometers. The scans consist of series of counts which appear to follow Poisson distributions. Smoothing, for instance using the method of Savitzky & Golay (1964), is used to reduce random noise, by combining the counts in adjacent channels to get better local estimates of the expected values. The resulting smooth curve is better suited for estimating, for example, the locations and heights of peaks than the observed data.

In addition to the random noise, analysis is hampered by the fact that the peaks show a doublet structure, caused by the K1 and K2 emission lines of the X-ray source. The K2

reflection can be eliminated by utilizing information about the differences in wavelength, intensity and shape of the K1and

K2lines. The K2component is often removed by applying

the Rachinger (1948) correction.

In this article, we present a smoothing procedure using penalized likelihood estimation and extend it to include K2

elimination. This results in a decomposition into two smooth components: a K1and a K2part. The amount of smoothness

can be optimized automatically. Because the method is grounded in a well developed statistical theory, the general-ized linear model (GLM) (Nelder & Wedderburn, 1972), standard errors can also be computed, if desired.

The next section briefly introduces the instrumental setup and software implementations. The theory is introduced in several steps. x3 starts with an introduction to smoothing,

using penalized least squares. Because we are dealing with counts, we switch to penalized likelihood smoothing, which utilizes the Poisson distribution. In x4 we estimate the K1and

K2 components. We use a P-spline basis (Eilers & Marx,

1996) in combination with the composite link model (Thompson & Baker, 1981), explained in xx4.1 and 4.2, respectively. Penalized estimation requires tuning of the penalty parameter; an automatic procedure is presented in x5. To illustrate the proposed methodology, a number of appli-cations are presented. The article closes with a discussion of the main findings and possibilities for future research.

Although the proposed methodology is useful in a number of settings, we also note that for various programs to obtain a fit to the whole diffraction pattern (e.g. Rietveld analysis or MAUD; Lutterotti & Bortolotti, 2003) it is usually not appropriate to apply smoothing and/or K2elimination prior

to data analysis.

2. XRD experiments

All measurements were carried out using a Bruker D8

Advance diffractometer in Bragg–Brentano geometry,

equipped with a Vantec position-sensitive detector and a graphite monochromator in the diffracted beam. Data collection was carried out at room temperature using Co K radiation (Co K1= 0.179026 nm). Diffractometer scans were

made in –2 scanning mode with a fixed step size and a fixed counting time per step. The specimens were spun at 30 r min1.

(2)

Three specimens were used to test the smoothing and K2

elimination:

(a) LaB6. For this specimen, the standard reference material

for line position and line shape, SRM660a from NIST, has been used. The particle size of the LaB6powder was such that

the effect of crystal statistics is virtually absent. The actual specimen consists of a thin layer of LaB6on an Si single-crystal

substrate wafer with orientation (510). It was made by applying a sedimentation method (mass thickness 33 m, diameter 24 mm). Scan parameters: step size 0.007 _{2, time}

per step 2 s. This specimen produces sharp isolated reflections on a low background and it was especially used to check the K2elimination procedure.

(b) FeOx. This is a bulk specimen made from finely grained

powder, pressed into a pellet with thickness 2 mm, diameter 31 mm. The powder consists mainly of haematite. Scan para-meters: step size 0.041 _{2, time per step 1 s. This specimen}

produces reflections on a relatively high background and was especially used to check the residual background level after K2elimination.

(c) Ni-alumina. This is a thin layer of powder on an Si single-crystal substrate wafer with orientation (510). The powder contains boehmite, aluminum oxides and nickel oxides. Scan parameters: step size 0.035_{2, time per step 4 s.}

This specimen produces a mixture of sharp and broad reflec-tions.

Data evaluation was performed with the DIFFRACplus evaluation package program EVA (version 17.0.0.2; Bruker

AXS Inc., Madison, Wisconsin, USA). The proposed

smoothing algorithm and procedure for K2 elimination are

available in a MATLAB script (The Mathworks Inc., Natick, MA, USA) and the R programming language (R Develop-ment Core Team, 2011). In addition, functions to read and write .raw files from Bruker (version 3) are provided in the R language. R and MATLAB scripts are made available as supplementary material.1

The penalized likelihood smoother is compared with the Savitzky–Golay filter as implemented in EVA. The latter is controlled by one parameter, the width of the sliding interval on which the filter is applied.

The procedure for K2 stripping is compared with the

algorithm available in the EVA software, which is an imple-mentation of the Rachinger method. This method assumes that the K1and K2line profiles are identical in shape and

that both profiles are tied by a fixed intensity ratio : IK2=IK1 = 0.5.

3. Smoothing

Smoothing is a frequently applied tool and many different algorithms are available. Very popular in analytical chemistry is the Savitzky–Golay filter (Savitzky & Golay, 1964), which tunes smoothness by varying the order of the local polynomial and the number of basis functions. Other methods use

wave-lets or Fourier transforms, which are both reviewed by Artursson et al. (2000). Chen et al. (2005) used smooth prin-cipal components to exploit the shared variation in a set of different patterns. This can be valuable in some cases but also limits the applicability of the method, because often only a single scan is obtained. A more recent article (Hogg et al., 2012) proposes a Bayesian method for de-noising signals. The Poissonian nature of the data is accounted for by the use of the Anscombe transform. Long computation times make this procedure less practical.

In contrast, we present an automatic, fast, likelihood-based method. The method is presented in a stepwise manner, alongside applications to experimental XRD patterns. For didactic reasons, we start with the case where the sum of squares of differences is used to measure the fit of a model to data (the ‘least-squares’ principle) and subsequently turn to the likelihood approach.

3.1. Penalized least-squares smoothing

We indicate the counts by yi, for i ¼ 1 : m, with m the

number of data points of the X-ray pattern. We try to find a smooth series l of the same length that does not deviate too much from y. We have to balance two goals: (1) the fidelity of l to the data y, and (2) the roughness (the opposite of smoothness) of l. Intuitively it is clear that the smoother we make l, the larger will be the difference between y and l.

To measure the fidelity of l to y, we use the familiar sum of squares: S ¼P m i ðyiiÞ 2 : ð1Þ

A convenient way to measure the roughness of l is to take the sum of squared differences of neighbouring values:

R ¼P m i¼2 ð_i_i1Þ2¼P m i¼2 ð_iÞ2: ð2Þ

In the last term of (2), is an operator, a shorthand notation for taking differences. Actually a better measure of roughness uses second-order differences, differences of differences:

R ¼P

m

i¼3

½ð_i_i1Þ ð_i1_i2Þ2

¼P m i¼3 ð_i 2i1þi2Þ 2 ¼P m i¼3 ð2 iÞ 2 : ð3Þ

The elegance of the operator becomes clear this way: 2 means that it is applied two times.

The balancing act between fidelity and roughness can now be written as an objective function that has to be minimized:

Q ¼P m i ðyiiÞ 2 þP m i¼3 ð2 iÞ 2 : ð4Þ

This is known as the Whittaker (1923) smoother, which was more recently discussed by Eilers (2003). Because the second term works as a penalty on l, driving it away from the minimal sum of squares solution (but in the desirable direction of more smoothness), this procedure is also called penalized least 1

Supporting information for this article is available from the IUCr electronic archives (Reference: FS5067).

(3)

squares. The parameter tunes the balance. If is small the second term has little influence and l will be near to y, but it will be relatively rough. On the other hand, when is large, l will be (too) smooth, sacrificing fidelity to y.

The loss function Q can be rewritten using matrices and vectors as

Q ¼ jjy ljj2þjjDdljj 2

: ð5Þ

Here jjxjj2¼P_ix2

i is shorthand notation for the quadratic

norm, the sum of squares of elements, of any arbitrary vector x. Dd is the difference matrix: it computes the dth-order

differences over l. For a signal with length m ¼ 5, D2 is

defined as D2¼ 1 2 1 0 0 0 1 2 1 0 0 0 1 2 1 0 @ 1 A: ð6Þ

In the remainder of this article we drop the subscript d and only use second-order differences. The minimizing vector ^ll follows as the solution the linear system of equations

ðI þ PÞ ^ll ¼ y; ð7Þ

with P ¼ D0_{D and I the identity matrix. This is a large}

system: m linear equations with m unknowns. Yet it can be solved extremely easily and quickly, using sparse matrices, in both MATLAB and R.

3.2. Penalized likelihood smoothing

Penalized least squares, as explained above, is very successful in various applications (Eilers, 2003), but it is not a good choice when we are dealing with counts. The least-squares procedure does not guarantee that l will be positive everywhere. Also we do not account for the fact that higher counts have a larger variance. In addition, this smoother rounds off sharp peaks too much (strong smoothing) or gives a wavy baseline (lighter smoothing). We can solve these problems by switching to penalized likelihood.

The X-ray photon counts in a diffraction pattern are assumed to follow a Poisson distribution, with an expected value that varies with the diffraction angle. The photon counting sets a limit to the precision that can be obtained in a given amount of time. According to the Poisson distribution, the variance of the counts is equal to the expected value, l, so the standard deviation, , is equal to the square root of l. The relative error is thus equal to l1=2_.

The Poisson distribution states that the probability of observing yi, given the expected value i, is

PrðyijiÞ ¼ yi

i expðiÞ=yi!: ð8Þ

The logarithm of this probability, as a function of i, given yi,

is the log-likelihood

Lðij yiÞ ¼ yilog ii logðyi!Þ: ð9Þ

The last term is a fixed number once yiis given, so we will drop

it from here. If the elements of y, conditional on their expected values, l, are independent, we can sum the log-likelihood over all observations and use it as a measure of fit, replacing the

sum of squares in (1). Because a better fit gives a larger log-likelihood, we flip its sign, so that minimization becomes the goal again. Also, for technical reasons, to get simpler equa-tions later on, we multiply it by 2. We then have

G ¼ 2P

m

i

ði yilog iÞ; ð10Þ

which is known as the deviance. In principle it is possible to combine equation (10) with the old measure of roughness, R, as in (4). Instead we introduce a transformation of l,

g ¼ log l; ð11Þ

and insert that into the roughness expression. This step is inspired by the theory of generalized linear models (GLM) (Nelder & Wedderburn, 1972), where it is called the link function. It guarantees positive values of l. The objective function to minimize now becomes

Q ¼ 2P m i ½expð iÞ yi i þ Pm i¼3 ð2 iÞ 2 ; ð12Þ

with the second part being the roughness penalty. Minimiza-tion of (12) leads to the system of equaMinimiza-tions

Pg ¼ y l: ð13Þ

Estimation is usually performed using iteratively reweighted least squares (IRWLS). This boils down to iteratively solving the linear system

ð ~WW þ PÞ ~gg ¼ y ~ll þ ~WW ~gg; ð14Þ where a tilde as in ~gg indicates the current approximation to the solution. W ¼ diagðlÞ is a diagonal matrix with weights, for Poisson data, defined as

wii¼ 1 i @i @ i 2 ¼i: ð15Þ

Convenient starting values are obtained by taking

~

₀¼ logðyiþ 1Þ. Usually about five iterations are needed to

obtain estimates for g that are near enough to a solution for practical purposes (four significant figures or better in the solver residuals). More details on the theoretical properties of the GLM and its estimation routines have been given by McCullagh & Nelder (2000) and Dobson & Barnett (2008). 3.3. Comparison: penalized likelihood smoothing versus the Savitzky–Golay smoother fromEVA

Fig. 1 shows the results of smoothing the XRD pattern of a sample of Ni-alumina. The upper panel shows the complete scan, with the observed data (black) and the smoothed pattern (green) obtained using the penalized likelihood method (PL smoother). The optimal smoothing parameter according the Akaike information criterion (AIC) is 33932. The AIC is further explained in x5.

For a more detailed inspection of the Ni-alumina scan, we zoom in on two specific regions, which are presented in the lower two panels of Fig. 1. In both panels the penalized like-lihood approach gives satisfactory results. The Savitzky–Golay filter from EVA (SG filter), which is tuned manually, seems to

(4)

be too flexible in the right panel. For the likelihood smoother we included 95% confidence bands, which are depicted as the grey areas surrounding the fit of the smoother.

Given the smooth fit ^ll, we can check whether the Poisson assumption is reasonable. The variance of yishould be equal

to i, so the normalized (or Pearson) residuals ðyi^iÞ=^ 1=2 i

should have a standard deviation close to 1. In the case of the penalized likelihood smoother, it is 0.92, close to the expected value, showing that the Poisson assumption holds well for the present data. Checking the residuals of the pattern derived using the SG filter gives a standard deviation of 0.80 and might indicate too little smoothing is applied.

Figure 2

Comparison of the SG filter and the PL smoother using a simulated data set. The upper panel shows the complete scan and the observed data (black) and the true, noise-free, input data (orange). The two smaller panels show more details about two regions of the scan. In both cases the input, the output and the smooth curves of both procedures are provided.

Figure 1

Results of smoothing the Ni-alumina scan. The upper panel shows the complete pattern, with the observed data (black) and the smoothed data (green) obtained using the penalized likelihood method. The two lower panels provide more detail on two regions of the scan. They include the penalized likelihood smoother with 95% confidence intervals (grey shaded regions) and the results of applying the Savitzky–Golay filter.

(5)

For a more solid comparison of both smoothers, a simulation was conducted.

The simulated XRD pattern is

presented in the upper panel of Fig. 2. The noise-free (input) pattern is depicted in orange. The observed data, including Poisson noise, are shown in black. The pattern consists of a number of peaks on a drifting baseline.

Parts of the smoothed data are shown in the lower panels of Fig. 2. The smooth pattern from the penalized likelihood smoother (green) is shown together with the pattern coming from the SG filter using ! = 17 (red), the true profile (orange) and the observed data

(black). The optimal amount of

smoothness when using the penalized likelihood smoother is obtained using the AIC.

The SG filter is generally tuned manually. Fig. 3 shows the performance of this filter for different values of the window width. The upper panel shows the distance between the simulated data without noise and the obtained pattern applying the SG filter. The distance is expressed as the root mean squared error (RMSE). The window width is given on the x axis. The red horizontal line is the optimal value obtained using the PL smoother and the AIC. The lower panel is similar to the upper one, the only difference

being that it shows the distance towards the data with noise (the ‘observed data’). The performance of the optimal model of the SG filter is similar to that of the model obtained using the penalized likelihood smoother. However, in the absence of a proper criterion it is difficult to identify the optimal model. In contrast, the PL smoother is tuned automatically and in addition can be extended to K2elimination (see x4).

4. Ka2 elimination

X-ray diffraction patterns usually contain several different wavelength components: K1, K2, K and bremsstrahlung.

For analysis of X-ray diffraction patterns it is attractive to work with monochromatic patterns. Usually, physical filtering of the X-ray radiation is applied. Often a K filter or a monochromator is placed in the X-ray beam path in order to absorb the unwanted radiation. Also detectors with high energy resolution can be applied to distinguish between K and K, but at present these are not able to separate K1lines

from K2lines.

In (our) daily practice K is removed by physical filtering. The classical procedure for K2elimination is the Rachinger

(1948) correction. This method relies on three assumptions.

First, the peak shape is assumed to be the same for the K1

and K2components. Second, the 2line is expected to be half

the intensity of the 1 line. Third, the doublet distance is

assumed to be a known function of the diffraction angle 2 and the characteristic wavelengths of the anode material.

Here we take a similar approach: the X-ray diffraction pattern is considered to be the sum of two smooth latent components. Both components can be estimated using the penalized composite link model (Eilers, 2007). Smoothness is imposed using P-splines, which are explained below.

4.1. Penalized likelihood using P-splines

In the previous section, smoothness was obtained by using simple differences in the roughness penalty of the smoother. This is possible because the scans we use have data that are equally spaced on the diffraction angle (2) scale. Because the doublet distance does not coincide with the data points, estimating a smooth component for both K1 and K2 is

impossible using this approach. Instead, a model is required that enables interpolation. This can be realized using a basis representation, with the additional advantage that the

Figure 3

Simulation results, comparing the Savitzky–Golay filter with the penalized likelihood smoother. The upper panel shows the distance (expressed as the RMSE) between the noise-free simulated profile and the obtained pattern applying the SG filter. The window width of the SG filter is given on the x axis. The red horizontal line is the optimal value obtained using the PL smoother and the AIC. The lower panel is similar to the upper one but shows the differences towards the data with noise (the ‘observed data’).

(6)

dimensions of the estimating equations are reduced, which leads to faster calculations.

Here we use P-splines (Eilers & Marx, 1996), which are B-splines on a regular grid, combined with a penalty. The penalty is a natural extension of the difference penalty as discussed in the previous section, with the only difference that, instead of penalizing differences of adjacent data points, the penalty is imposed on adjacent spline coefficients. As advo-cated by Eilers & Marx (1996) we use a richly defined basis and smoothness is tuned with the penalty.

A B-spline of degree q is a smooth bell-shaped curve, composed of q + 1 polynomial segments each of degree q which join smoothly at the knots. The left panel of Fig. 4 shows the individual polynomial pieces, the resulting smooth spline function and the positions of the knots for a single B-spline. A simple example where a series of P-splines are fitted to the data is presented in the right panel of Fig. 4. Notice that it shows the general principle of P-splines using least squares and does not resemble the likelihood case, which involves a transformation of the data. We refer to Dierckx (1995) and de Boor (2001) for a thorough discussion of B-splines.

Using splines for smoothing the data (meaning no K2

elimination) the model can be written as

g ¼ Bb; ð16Þ

with B the B-spline basis and b the vector of coefficients. Here we again use the transformation of l: g = log l. Parameters are estimated by using the penalized deviance; adding a penalty to equation (10) results in

G þ jjDbjj2¼ G þ b0Pb: ð17Þ

Equation (17) can be solved using the iteratively reweighted least-squares algorithm (IRWLS) for estimating GLMs (Nelder & Wedderburn, 1972), with the penalty included:

ðB0WWB þ PÞb ¼ B~ 0_~

W

WB ~bb þ B0

ðy ~llÞ; ð18Þ

where ~bb and ~ll denote current approximations to the solution. 4.2. The penalized composite link model

Here we are interested in estimating not a single smooth component but a separate smooth pattern for K1and K2.

To do this we combine the P-spline framework with the composite link model (CLM) (Thompson & Baker, 1981). The CLM is an extension of the GLM, modelling so-called ‘indirect observations’. The combination of P-splines and the CLM is described by Eilers (2007), where it is termed the penalized composite link model (PCLM).

Using the generalized linear model the counts would be modelled as EðyÞ ¼ l. Assuming that the expected counts EðyÞ are the sum of two components, K1 and K2, we add a

composition matrix C to the GLM. The elements of C describe how the latent expectations c1and c2, pertaining to the K1

and K2 components, respectively, are combined. Here we

assume the same shape for both components, which leads to the composition matrix C ¼ ½I j I. Applying a PCLM, the expected counts are modelled as

EðyÞ ¼ l ¼ Cc: ð19Þ

This results in l ¼ c1þc2, where c1 pertains to the K1

reflection and c2to the K2reflection. Parameter defines the

relative intensity of the two components and is set to 0.5 (Rachinger, 1948). The expectations are modelled using P-splines: c1¼ expðB1bÞ ¼ exp P k kBkðjÞ ð20Þ and c2¼ expðB2bÞ ¼ exp P k kBkðjþjÞ ; ð21Þ Figure 4

(Left) A single cubic B-spline function, including the knots and the (shifted) segments. (Right) A series of P-splines fitted to some data, using penalized least squares.

(7)

with ¼ 2 and being the doublet distance. It is calculated as ¼ 2 arcsin 2

₁sin

2; ð22Þ

with 2and 1the wavelength of the K2and K1emission,

respectively.

Similarly to a GLM, the PCLM is estimated in an iterative manner solving

ð ~XX0WW ~~XX þ PÞb ¼ ~XX0ðy ~ll þ ~WW ~XX ~bbÞ; ð23Þ with ~bb indicating the current approximation to the solution. The latent expectations c are combined in ¼ diagðcÞ and the matrix X is calculated as

X ¼ ~WW1C ~Bc; ð24Þ

Figure 5

K2elimination using the PCLM, applied to the FeOxdata. The two panels show the estimated K components and the reconstructed data at two different diffraction angles.

Figure 6

K2elimination applied to the LaB6data. All panels present the observed data (grey), the estimated K1part using EVA (red) and the same component estimated using the PCLM (blue). The lower-left panel shows a low-angle part of the data, while the right panel is taken at the high-angle side.

(8)

where B0 c¼ ½B

0 1j B

0

2: the composition of the two B-spline

bases.

The result is illustrated in Fig. 5, which shows two parts of an FeOx diffraction pattern, a low-angle part and the

high-angle part. In each panel, four curves are displayed: the observed data (black), the estimated smooth curve (orange), and the 1(red) and 2(blue) components. Irrespective of the

angle, the peaks are well separated, while the resulting components and the fitted total curve are smooth.

4.2.1. Comparison: PCLM versus the EVA–Rachinger

correction. The top panel in Fig. 6 shows the complete scan of the LaB6sample. In grey, the observed data are presented;

the red line shows the estimated K1using the EVA software,

while the blue line is the K1component obtained using the

PCLM. At this scale it is difficult to observe differences between the two procedures. Zooming in on one of the peaks (as done in the lower panel of Fig. 6) does unveil some differences. Especially on the high-angle side, EVA generates unwanted artefacts.

5. Automated selection of the penalty parameter

In the previous sections we have presented a method for smoothing and subsequently we have extended the method to enable K2 elimination. Both models can be tuned by visual

inspection, yet it is attractive to have a method for automatic tuning of the amount of smoothing. We applied two fitting criteria to determine the optimal penalty parameter, the AIC proposed by Akaike (1974) and the L-curve, as introduced by Hansen (1992).

In the case of the AIC one searches for the value of that minimizes the criterion. This minimum is supposed to be the best trade-off between fidelity to the data and model complexity. The AIC is defined as

AIC ¼ G þ 2ED: ð25Þ

The first part of the equation is the deviance, which is described in equation (10). The ED is the effective model

dimension, a measure of the effective number of parameters in a model. The ED is computed as

ED ¼ tr½ðX ^WWX þ PÞ1ðX ^WWXÞ; ð26Þ where ^WW ¼ diagð ^llÞ and tr denotes the trace of a matrix, which is the sum of the elements on its diagonal. A simple grid search can be used to find the value of that minimizes the criterion. The left panel of Fig. 7 shows, for K2 removal from the

LaB6data, how the AIC changes with . Note that a linear grid

has been used for the logarithm of , because a range of several orders of magnitude has to be investigated.

Although the AIC performs satisfactorily, the computation of the effective dimension is costly, because a complete matrix of m rows by m columns is to be computed. For large m this takes much memory and computation time. A simple solution is to use only a segment of a scan and use that to search for the optimal value for . Once has been determined in this way, it can be used for processing the whole scan.

A more elegant solution is to use a fast alternative fitting criterion: the L-curve, presented for ridge regression by Hansen (1992) and recently used in the setting of P-splines by Frasso & Eilers (2012). Here the algorithm is successfully applied to Poisson data. The basic idea is simple: one plots the logarithm of the size of the penalty term, ’, against the logarithm of the residuals, :

’ ¼ logðjjDbjj2_Þ; _{¼ logðGÞ:}

ð27Þ Notice that both quantities are a function of . The figure will show a more or less L-shaped curve. The optimal penalty parameter is obtained in the corner of the L shape, where the curvature is largest. A simple alternative, presented by Frasso & Eilers (2012), is to search for the minimum of the squared step size ð Þ2þ ð’Þ2 and take the average of the two values related to that step. The right panel of Fig. 7 shows these steps as a function of the penalty parameter, in the case of the LaB6data.

Figure 7

The AIC curve (left) and the sequence of step sizes connected to the L-curve (right), used to obtain the optimal penalty parameter for estimating the K1 component of the LaB6data.

(9)

6. Discussion

Although our results are gratifying, we see areas that may

need improvement. We assumed that the K1 and K2

components have the same shape. In the applications presented this assumption appears to hold well, but in the literature examples exist in which this is not the case. Currently we are investigating ways to incorporate these broadening effects into the PCLM.

The PCLM method is insensitive to the point spacing, meaning that one can always estimate the two components. However, when the observed data are very coarse, the results will not be useful, because too many details are lost. Conse-quently it is preferable to use a relatively small step size and at the same time keep the scanning times small as well. The increased noise level is compensated by the smoother.

Using the penalized likelihood procedure, smooth deriva-tives can be determined for peak finding purposes. In the case of very coarse data the P-splines can be used to obtain smoother results.

In the current setup, baseline removal is not required. The present baseline component is distributed over both parts. This is theoretically incorrect but does not cause any practical problems. After K2removal, baseline estimation can still be

performed on the individual components for further analysis. One of the aims for future investigations is to estimate the baseline as part of the model.

7. Conclusion

We have presented a new method for smoothing and for smoothing combined with K2 elimination. The general

procedure is grounded in the generalized linear model, which enables Poisson statistics to be used for modelling the counts. The optimal smoothing parameter is determined using the Akaike information criterion or the L-curve. If desired, manual tuning of the model is possible as well.

An attractive property of the procedure for K2removal is

that the pattern is modelled non-parametrically. No functional

form for the peaks has to be assumed. In essence only the characteristic wavelengths of the anode material are required. Results obtained for XRD scans from our laboratory show satisfactory results, both for smoothing and for smoothing combined with K doublet removal. This is confirmed by inspection of the estimated components.

References

Akaike, H. (1974). IEEE Trans. Autom. Control, 19, 716–723. Artursson, T., Hagman, A., Bjo¨rk, S., Trygg, J., Wold, S. & Jacobsson,

S. (2000). Appl. Spectrosc. 54, 1222–1230.

Boor, C. de (2001). A Practical Guide to Splines. Applied Mathematical Sciences, Vol. 21. New York: Springer.

Chen, Z. P., Morris, J., Martin, E., Hammond, R. B., Lai, X., Ma, C., Purba, E., Roberts, K. J. & Bytheway, R. (2005). Anal. Chem. 77, 6563–6570.

Dierckx, P. (1995). Curve and Surface Fitting with Splines. Mono-graphs on Numerical Analysis. Oxford University Press.

Dobson, A. J. & Barnett, A. (2008). An Introduction to Generalized Linear Models. Boca Raton: CRC Press.

Eilers, P. H. (2003). Anal. Chem. 75, 3631–3636. Eilers, P. H. C. (2007). Stat. Modell. 7, 239–254.

Eilers, P. H. C. & Marx, B. D. (1996). Stat. Sci. 11, 89–102.

Frasso, G. & Eilers, P. H. C. (2012). Proceedings of the 27th International Workshop on Statistical Modelling, pp. 402–405. Utrecht University.

Hansen, P. C. (1992). SIAM Rev. 34, 561–580.

Hogg, C. II, Mullen, K. & Levin, I. (2012). J. Appl. Cryst. 45, 471–481. Lutterotti, L. & Bortolotti, M. (2003). IUCr Commission on

Crystallographic Computing Newsletter, No. 1, pp. 43–50.

McCullagh, P. & Nelder, J. (2000). Generalized Linear Models. London: Champman and Hall/CRC.

Nelder, J. A. & Wedderburn, R. W. M. (1972). J. R. Stat. Soc. Ser. A, 135, 370–384.

Rachinger, W. A. (1948). J. Sci. Instrum. 25, 254–255.

R Development Core Team (2011). R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing.

Savitzky, A. & Golay, M. J. E. (1964). Anal. Chem. 36, 1627–1639. Thompson, R. & Baker, R. J. (1981). Appl. Stat. 30, 125–131. Whittaker, E. T. (1923). Proc. Edinb. Math. Soc. 41, 63–75.