Bounded self-weights estimation method for non-local means image denoising using minimax estimators

(1)

Bounded self-weights estimation method for non-local means image denoising using

minimax estimators

Nguyen, Minh Phuong; Chun, Se Young DOI

10.1109/TIP.2017.2658941

Publication date 2017

Document Version

Accepted author manuscript Published in

IEEE Transactions on Image Processing

Citation (APA)

Nguyen, M. P., & Chun, S. Y. (2017). Bounded self-weights estimation method for non-local means image denoising using minimax estimators. IEEE Transactions on Image Processing, 26(4), 1637-1649.

[7833063]. https://doi.org/10.1109/TIP.2017.2658941 Important note

To cite this publication, please use the final published version (if applicable). Please check the document version above.

Copyright

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons. Takedown policy

Please contact us and provide details if you believe this document breaches copyrights. We will remove access to the work immediately and investigate your claim.

This work is downloaded from Delft University of Technology.

(2)

IEEE Proof

Bounded Self-Weights Estimation Method for

Non-Local Means Image Denoising

Using Minimax Estimators

Minh Phuong Nguyen and Se Young Chun, Member, IEEE

Abstract— A non-local means (NLM) filter is a weighted 1

average of a large number of non-local pixels with various image

2

intensity values. The NLM filters have been shown to have

3

powerful denoising performance, excellent detail preservation

4

by averaging many noisy pixels, and using appropriate values

5

for the weights, respectively. The NLM weights between two

6

different pixels are determined based on the similarities between

7

two patches that surround these pixels and a smoothing

para-8

meter. Another important factor that influences the denoising

9

performance is the self-weight values for the same pixel. The

10

recently introduced local James-Stein type center pixel weight

11

estimation method (LJS) outperforms other existing methods

12

when determining the contribution of the center pixels in the

13

NLM filter. However, the LJS method may result in excessively

14

large self-weight estimates since no upper bound is assumed,

15

and the method uses a relatively large local area for estimating

16

the self-weights, which may lead to a strong bias. In this

17

paper, we investigated these issues in the LJS method, and then

18

propose a novel local self-weight estimation methods using direct

19

bounds (LMM-DB) and reparametrization (LMM-RP) based on

20

the Baranchik’s minimax estimator. Both the LMM-DB and

21

LMM-RP methods were evaluated using a wide range of natural

22

images and a clinical MRI image together with the various levels

23

of additive Gaussian noise. Our proposed parameter selection

24

methods yielded an improved bias-variance trade-off, a higher

25

peak signal-to-noise (PSNR) ratio, and fewer visual artifacts when

26

compared with the results of the classical NLM and LJS methods.

27

Our proposed methods also provide a heuristic way to select a

28

suitable global smoothing parameters that can yield PSNR values

29

that are close to the optimal values.

30

Index Terms— James-Stein estimator, minimax estimator, 31

non-local means, center pixel weight, bounded self-weight, image

32

denoising.

33

I. INTRODUCTION

34

I

MAGE denoising is a fundamental task in image process-35

ing, low-level computer vision, and medical imaging algo-36

Manuscript received March 9, 2016; revised October 26, 2016 and December 31, 2016; accepted January 21, 2017. This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning under Grant NRF-2014R1A1A1007928. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Oleg V. Michailovich. (Corresponding author: Se Young Chun)

Minh Phuong Nguyen was with the School of Electrical and Computer Engineering (ECE), Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, South Korea. Now he is with Section of Radiation, Detection and Medical Imaging, Delft University of Technology, Mekelweg 15, 2629 JB Delft, The Netherlands (e-mail: m.p.nguyen@tudelft.nl).

Se Young Chun is with the School of ECE, UNIST, Ulsan, 44919, South Korea (e-mail: sychun@unist.ac.kr).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIP.2017.2658941

rithms. The goal of denoising is to suppress image noise 37 when restoring desired details using prior information about 38 the images. For example, based on prior information regarding 39 “smooth images”, a simple filter, such as a Gaussian filter, can 40 be designed as a weighted average of the image intensities 41 of the pixels in the local neighborhood with non-adaptive 42 weights. However, this type of filter blurs the edges and 43 details of images because these features are not captured in 44 the assumed prior information. Many edge-preserving denois- 45 ing methods have been proposed, including bilaterial fil- 46 ters [1], [2], anisotropic diffusion [3], non-local means (NLM) 47 filters [4], [5], collaborative filters (BM3D) [6], and total 48 variation filters [7]. Many filters, including bilaterial filters, 49 anisotropic diffusion, and NLM filters (but, not BM3D, 50 see [8]), can be represented as the weighted averages of 51 adaptive weights or adaptive smoothing [9]. It should be noted 52 that it is important to select appropriate weights in these 53 types of filters in order to obtain improved denoised image 54

quality [8]. 55

Classical NLM filters use the similarities between two local 56 patches in a noisy image to determine the weights in non- 57 local adaptive smoothing [4]. The NLM weights are obtained 58 by first calculating the Euclidean distance between the two 59 local patches, which is denoted d, and then by evaluating 60 exp(−d2_/h2_{), where h is a smoothing parameter. This method}

61 allows higher weights to be assigned to pixels with similar 62 patches so that edges and details can be preserved through 63

non-local weighted averaging. 64

There are four different factors that determine the output 65 image quality of a NLM filter in terms of weights. 1) The 66 first factor is the similarity measure d. The Euclidean distance 67 is a usual choice [4], but other similarity measures have also 68 been proposed, such as hypothesis testing with adaptive neigh- 69 borhoods [10], principal component analysis (or the subspace 70 based method) [11], [12], blockwise aggregation [13], rotation- 71 invariant measures [14]–[16], shape-adaptive patches [17], and 72 patch-based similarities with adaptive neighborhoods [18]. 73 In multimodal medical imaging, inaccurate weights for noisy 74 molecular images were enhanced by using additional high 75 quality anatomical images [19], [20]. 2) The second factor 76 is the strategy for determining the smoothing parameter h. 77 Optimization strategies have been developed based on Stein’s 78 unbiased risk estimation (SURE) method for NLM with 79 Gaussian noise [21], [22], NLM with Poisson noise [23], and 80 blockwise NLM with Gaussian noise [24]. 3) The third factor 81 is in selecting the function to use to determine the weights, 82

1057-7149 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

(3)

IEEE Proof

such as exp(−x2_{). Other functions have also been proposed}

83

to calculate the weights, such as compact support func-84

tions [25], [26] and statistical distance functions [27], [28]. 85

4) The last factor, which is the focus of this article, is in how 86

best to determine the self-weights for the same pixel in the 87

input and output images. 88

The NLM weights for two different pixels are essentially 89

determined by the distance between the two noisy local 90

patches around these pixels. However, the weights for the 91

same pixel, or the self-weights, are not affected by the 92

noise in the patches and the distance is always 0. For an 93

extremely noisy image, the self-weights will be relatively too 94

large when compared to the other weights, which will cause 95

the filter output to be almost the same as the input noisy 96

image. Therefore, the use of appropriate self-weight values can 97

significantly affect the quality of the denoised image. Many 98

researchers have investigated strategies for determining the 99

self-weights, which are also known as center pixel weights, 100

in order to alleviate the so-called “rare patch effect.” For 101

the classical NLM filter proposed by Buades et al., the self-102

weights were set to be either one or the maximum weight 103

in a neighborhood [4]. This strategy guaranteed that at least 104

one or two of the largest weights would be the same. Doré 105

and Cheriet also used the maximum weight in a neighborhood 106

as the self-weight, but only if that maximum weight was 107

large [29]. Brox and Cremers proposed a method to have 108

at least n number of the weights to be the same [30], and 109

Zimmer et al. considered the self-weight to be a free para-110

meter during the estimation process [31]. Salmon developed 111

a SURE-based method for determining the self-weights that 112

accounted for the noise [32]. 113

Recently, Wu et al. proposed a method to determine the 114

self-weights using a James-Stein (JS) type estimator [33]. 115

The idea of that work was to use a JS estimator to determine 116

the reparametrized self-weight in a local neighborhood (called 117

the local JS estimator (LJS)). The LJS method yielded the 118

best peak signal-to-noise ratio (PSNR) results when compared 119

to other existing self-weight selection methods [4], [32]. 120

However, the method had some limitations. First, the LJS 121

could yield self-weights that were theoretically much larger 122

than 1 because no upper bound for the self-weights was 123

assumed, and this may lead to severe rare patch artifacts. The 124

JS estimator does not guarantee its optimality for bounded 125

shrinkage parameters. Second, the original LJS method was 126

tested with a relatively large local neighborhood when deter-127

mining a weight because it was assumed that the self-128

weights were the same in the local neighborhood. However, 129

the problem is that the selection of a local neighborhood size 130

that is too large may introduce a strong bias into the resulting 131

denoised images. 132

In this article, we investigate the original LJS method in 133

terms of the local neighborhood size for self-weight esti-134

mation and the potential for excessive self-weight estima-135

tion when no upper bound is applied on the self-weight. 136

We then propose novel self-weight estimation methods for 137

NLM that account for bounded self-weights using Baranchik’s 138

minimax estimator [34], called local minimax self-weight esti-139

mation with direct bound (LMM-DB) and with reparametriza-140

tion (LMM-RP). We evaluated our proposed methods using 141 performance criteria including PSNR, the bias-variance trade- 142 off curve and visual quality assessment with a wide range 143 of natural images and a real patient MRI image with various 144 noise levels. We compared the performance of our proposed 145 methods with a classical NLM filter using self-weights of 1 [4] 146 and the state-of-the-art LJS method, which has already been 147 shown to be the best among all other previous self-weight 148

determination methods [33]. 149

This article is an extension of a work that was presented 150 at the 2016 IEEE International Symposium on Biomedical 151 Imaging (ISBI) [35], and goes into more depth regarding the 152 theory of the minimax estimator and provides an evaluation 153 of the methods using a significantly larger image dataset. 154 This paper is organized as follows. Section II reviews the 155 classical NLM filter and revisits the LJS method. Section III 156 investigates the LJS method in terms of the local neighborhood 157 size for self-weight estimation and the potential for exces- 158 sively large self-weight estimates. Then, Section IV proposes 159 novel LMM-DB and LMM-RP methods using Baranchik’s 160 minimax estimator in order to overcome two limitations of 161 the LJS method. Section V illustrates the performance of our 162 proposed methods by providing our simulation results. Lastly, 163 Sections VI and VII discuss and then conclude this paper, 164

respectively. 165

II. REVIEW OF THELOCALJAMES-STEINSELF-WEIGHT 166 ESTIMATIONMETHOD FOR THENLM FILTER 167 In this section, we will briefly review both the classical 168 NLM method proposed by Buades et al. [4] and the LJS self- 169 weight selection method proposed by Wu et al. [33]. 170 A. Reviewing the Classical Non-Local Means Filter 171 Let us assume that an image x is contaminated by noise n, 172

which produces a noisy image y: 173

y= x + n (1) 174

where n is zero-mean white Gaussian noise with standard 175 deviation σ. The NLM filtered value at the pixel i is the 176 weighted average of all pixels in a search regioni: 177

ˆxi = j∈iwi, jyj j∈iwi, j (2) 178 where yi is the i th element of y,wi, j is the weight between 179 the i th and j th pixels, and i is the set of all pixels in an 180 area around the i th pixel, which could be an entire image. 181 The similarity weight of the classical NLM is defined as: 182

wi, j = exp −Piy− Pjy2 2|P| h2 (3) 183 where Pi is an operator used to extract a square-shaped patch 184 centered at the i th pixel,· is an l2norm,|P| is the number of 185 pixels within a patch, and h is a global smoothing parameter. 186 Equation (3) implies that the self-weights wi,i are always 187 equal to 1. Previous works on self-weights have shown that 188 good strategies for determining the self-weights also affect the 189 image quality of the NLM filtering [4], [29], [32], [33]. 190

(4)

IEEE Proof

B. Reviewing Local James-Stein Self-Weight Estimation

191

The LJS method was proposed in order to determine wi,i

192

as follows [33]. First, (2) was decomposed into two terms: 193 ˆxi = Wi Wi + wi_,iˆzi + wi_,i Wi+ wi_,i yi (4) 194 where Wi = j∈i\{i}wi, j and 195 ˆzi = j∈i\{i}wi, j yj/Wi. (5) 196

The terms ˆzi do not contain wi,i. Then, the LJS method

197 reparametrized (4) using 198 pi = w i_,i Wi+ wi_,i (6) 199 so that (4) became: 200 ˆxi = (1 − pi) ˆzi+ piyi. (7) 201

The problem of estimating the self-weights wi,i became the

202

problem of estimating pi. Lastly, the JS estimator [36], [37]

203

for pi was proposed:

204 pLJSi = 1 − (|B| − 2) σ2 Biy− Biˆz 2 (8) 205

where Bi is an operator used to extract a square-shaped

206

neighborhood centered at the i th pixel, |B| is the number of 207

pixels within that neighborhood, and σ is the known noise 208

level. 209

Equation (8) implies that pLJS_i ∈ (−∞, 1]. Since 210

the weights are non-negative, it was proposed to use the 211

zero-lower bound for pLJS_i as follows [33]: 212 ˆxLJS₊ i = 1− pLJS_i + ˆzi+ p_iLJS+yi (9) 213 where 214 pLJS_i +:=[piLJS]+= 1− (|B| − 2) σ 2 Biy− Biˆz 2 + (10) 215

and[s]₊:= max (s, 0). Wu et al. also mentioned that a user-216

defined upper bound for pi can be used, but did not investigate

217

further [33]. It should be noted that the JS estimator does not 218

guarantee its optimality when bounding p_iLJSin (8). 219

III. LIMITATIONS OF THELOCALJAMES-STEIN

220

SELF-WEIGHTESTIMATION FOR

221

THENLM FILTER

222

We now investigate two limitations of the original LJS 223

method [33] in terms of the size of local neighborhoods for 224

weight estimation, and the potential for excessive self-225

weight estimation. 226

A. Size of Local Neighborhood for Self-Weight Estimation 227

In the method described in [33], there are two implicit steps 228

required in order to obtain the LJS self-weight estimator (10). 229

The first step is to choose a local set of pixels around the i th 230

pixel, referred to as set_iB, that correspond to the operator Bi,

231

and assume that: 232

ˆxj = (1 − pi) ˆzj+ piyj, j ∈ iB. (11)

233

Fig. 1. Bias-variance curves (cameraman example) for the classical NLM and LJS methods (LJS₊) for different sizes of local neighborhoods (B). The curves were plotted while varying the smoothing parameter h (log2h ∈ [1.8, 3.2]). Based on the works of Stein [36] and James and Stein [37], 234 if|B| ≥ 3, then for a neighborhood _iB extracted using Bi, 235

ˆxj =

1− p_iLJS+

ˆzj+ pLJSi +yj, j ∈ iB. (12) 236 is a dominant estimator for xj “locally” in _iB. The LJS 237 method used the zero lower bound when estimating pi in 238 order to obtain a realistic non-negative self-weight value. This 239 was also a good choice in terms of the estimator perfor- 240 mance since the positive part of the JS estimator is dominant 241 over the original JS estimator, according to the works of 242 Baranchik [34], [38] and Efron and Morris [39]. 243 The second implicit step is to assign the resulting p_iLJSto pi 244

in (7) for only the single pixel i so that: 245

ˆxLJS i = 1− p_iLJS ˆzi + pLJSi yi. (13) 246 Wu et al. evaluated the LJS method with|B| = 15 × 15 [33], 247

which seems relatively large. 248

Based on this implicit two-step interpretation, we can sur- 249 mise that using a smaller size of |B| may be more desirable 250 for obtaining a less biased estimate of pi since the assumption 251 of having the same pi in _iB is less likely to be true for 252 larger sizes of _iB. Figure 1 confirms our conjecture. The 253 bias-variance curves of the LJS method yielded better bias- 254 variance trade-offs than those in the classical NLM method for 255 both large local neighborhoods with a half window size B= 7 256 (|B| = 15 × 15) and small local neighborhoods with B = 2 257 (|B| = 5 × 5). However, using larger local neighborhood sizes 258 for estimating pi yielded stronger biases than those estimated 259 using smaller sizes for the same level of variance. 260

B. Excessively Large Self-Weight Estimates 261

In the LJS method for determining the self-weights by 262 estimating values for pi [33], it is theoretically possible that 263 the self-weights have excessively high values. For example, 264 (6) suggests that if pi = 1 and Wi > 0, then wi,i 1. 265 Slight artifacts were observed in [33] in the background area 266 that were potentially caused by excessive self-weight estimates 267 when a relatively larger neighborhood size|B| = 15 × 15 was 268 used. We observed a significantly higher degree of degradation 269 in the visual image quality in the background area when the 270

(5)

IEEE Proof

Fig. 2. Denoised image of the cameraman example using the original LJS method [33] with no upper bound for the self-weights (top left), estimated pi

values (top right), calculated Wi(bottom left), and resulting self-weights (wi,i) showing excessive self-weights (bottom right). B= 2 and σ = 10.

size of|B| in (8) was small, as shown in the image in the top 271

left figure of Fig. 2. 272

We investigated this issue using an example of the camera-273

man image that was denoised using the LJS method [33], but 274

with a smaller neighborhood size |B| = 5 × 5. For areas with 275

more details, such as edges and textures, large pi values were

276

estimated and yielded large self-weights, as shown in the top 277

right figure of Fig. 2. However, since the values for Wi were

278

also very small in these areas, as shown in the bottom left 279

image in Fig. 2, the resulting self-weight map yielded values 280

close to 1 in the areas with details as shown in the bottom 281

right image in Fig. 2. 282

In contrast, for areas with almost no details, such as those 283

with a flat intensity background, relatively smaller pi values

284

were estimated, some of which were much larger than 0 285

while the rest were closer to 0, as shown in the top right 286

image in Fig. 2. However, since the Wi values for the flat

287

areas were relatively large, as shown in the bottom left image 288

in Fig. 2, some of the estimated pi values obtained using the

289

LJS method (LJS+) were estimated to yield excessively large 290

self-weights that were much larger than 1, as shown in the 291

bottom right image of Fig. 2. Consequently, these excessively 292

large self-weights caused severe rare patch artifacts in the 293

filtered image, which resulted in visual quality degradation, 294

as observed in the top left image of Fig. 2. 295

IV. LOCALMINIMAXESTIMATIONMETHODS FORUPPER

296

BOUNDEDSELF-WEIGHTS IN ANLM FILTER

297

In this section, we propose two local upper bounded self-298

weight estimation methods that use Baranchik’s minimax 299

estimator [34]. 300

A. Bounded Self-Weights 301

It is usually assumed that the self-weights satisfy wi,i ∈

302

[0, 1]. However, there are many possible upper bounds for the

303

self-weights, including 1 [4] or some positive value that is 304

possibly less than 1 based on SURE [32]. In this article, two 305 different upper bound values wmax_i_,i for the self-weights were 306 evaluated such that 0 ≤ wi,i ≤ w_imax_,i . One upper bound 307

was: 308

wmax−one

i,i = 1, (14) 309 which is the usual choice for the self-weights in the classical 310

NLM method [4]. The other upper bound was: 311

wmax−stein

i_,i = exp

−σ2_/h2_,

(15) 312

which was motivated by the SURE-based NLM 313

self-weights [32]. We assume that σ is known and h is 314 pre-determined, which means that the upper bound for 315 the self-weights can also be determined in advance. 316 Equation (15) takes the noise level into account. As σ is 317 smaller, the maximum self-weight in (15) is closer to one. 318 It should be noted that the difference between (15) and (14) 319

will be greater at higher noise levels. 320

Since pi is estimated instead ofwi,i, it is necessary to derive 321 the range of pi that corresponds to 0 ≤ wi,i ≤ wmax_i_,i . 322 From (6), the derivative of pi with respect to wi,i is non- 323

negative as follows: 324 d dwi,i pi = Wi (Wi + wi,i)2 ≥ 0 325

since Wi ≥ 0. Therefore, pi is a non-decreasing function of 326 wi,i and for 0 ≤ wi,i ≤ wmax_i_,i , the range of pi will be 327

0≤ pi ≤ wmax i,i Wi + wmaxi,i =: pmax i ≤ 1. 328 Note that if Wi = 0, then pmaxi = 1. The estimator 329 p_iLJS+ in (10) automatically guarantees that 0 ≤ pi ≤ 1 if 330

|B| ≥ 2. However, since Wi > 0 generally holds for most real 331 images with noise, it is necessary to constrain pi to be less 332 than or equal to the upper bound pmax_i , which is usually less 333

than one. 334

B. Local Minimax Self-Weight Estimation With Direct Bound 335 Enforcing the upper limit pmax_i on the estimated pi in (10) 336 using min(pLJS₊

i , pmaxi ) breaks the optimality of the JS estima- 337 tor if p_imax< 1. In this article, we propose using Baranchik’s 338 minimax estimator [34] to incorporate bounded self-weights 339 into the estimator (see Baranchik [34], Erfon and Morris [39], 340 and Strawderman [40] for more details on this minimax 341

estimator). 342

Theorem 1 (Baranchik): For y ∼ Nr

x, σ2I, r ≥ 3, and 343 loss L(x, ˆx) = x− ˆx, an estimator of the form ˆx = qy 344

where 345 q = 1− c (y)σ 2_{(r − 2)} y2 (16) 346

is the minimax, provided that: 347

(i) 0≤ c (y) ≤ 2 and 348

(ii) the function c(·) is nondecreasing. 349

350 Here y shrinks toward 0 which is the initial estimate of x. 351

(6)

IEEE Proof

Fig. 3. Graphical illustrations of the original and positive part JS estimators without upper bounds, and the proposed minimax self-weight estimators with upper bounds in terms of c(s) vs. s. (a) Original and positive-part JS estimators. (b) Proposed minimax estimators with bounds.

The original JS estimator and its positive part are special 352

cases of Baranchik’s minimax estimator. For the original JS 353

estimator (8): 354

c(s) = 1, (17)

355

where s = Biy − Biˆz so that both conditions (i, ii) of

356

the Baranchik’s theorem are satisfied. In the positive part 357

estimator (9), it can be shown that: 358 c(s) = ⎧ ⎨ ⎩ s2 σ2_{(r − 2)}, 0 ≤ s ≤ Y1 1, otherwise (18) 359 where Y1 := σ √

r− 2. The original and positive part 360

JS estimators are illustrated in Fig. 3 (a). 361

We propose a new local minimax self-weight estimation 362

method that uses a direct bound with a specific upper-bound 363

value, as follows: 364

p_iLMM−DB:= min(p_iLJS+, pmaxi ). (19)

365

This estimator is minimax under certain conditions that can 366

be derived using Baranchik’s minimax estimator theorem. 367

According to this theorem, this operation can be interpreted 368 as follows: 369 c(s) = ⎧ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎩ s2 σ2_{(r − 2)}, 0≤ s ≤ Y1 1, Y1< s ≤ Y2 s2_{(1 − p}max₎ σ2_{(r − 2)} , Y2< s (20) 370 where Y2:= σ √

(r − 2)/(1 − pmax_). _We _call _this

371

a local minimax self-weight estimator using direct 372

bound (LMM-DB), which is illustrated in Fig. 3 (b) 373

where Y4:= σ

√

2(r − 2)/(1 − pmax_).

374

However, note that LMM-DB is not minimax fors > Y4. 375 Fortunately,s = Biy− Biˆz can be limited by adjusting 376 the smoothing parameter h by making it smaller so that all 377

s ≤ Y4 and c

Biy− Biˆz

≤ 2. Then, the LMM-DB 378 becomes “practically” a minimax estimator. Let us denote the 379

maximum h that satisfiess ≤ Y4as hmax. 380

In this case, a question can be raised: will the optimal value 381 for h fall into the range of h that satisfiess ≤ Y4? Interest- 382 ingly, our simulations with many natural images showed that 383 the optimal smoothing parameter h∗based on the true images 384 is very close to hmax. This is because the LMM-DB yielded 385 pmax→ 1 so that Y2→ ∞, and almost allBiy− Biˆz were 386 less than or equal to Y4. Therefore, pLMMi −DBis “practically” 387 a minimax value based on Baranchik’s theorem for many 388 natural images. Moreover, the LMM-DB method may provide 389 a way to choose the optimal global smoothing parameter value 390 h without knowing the underlying true image. We empirically 391

investigate this issue in Section V. 392

C. Local Minimax Self-Weight Estimation With 393

Re-Parametrization 394

The LMM-DB algorithm set p to be the same pmax for a 395 wide range ofBiy− Biˆz values. We now propose another 396 new method, called the local minimax self-weight estimation 397 with reparametrization (LMM-RP) method, that assigns dif- 398 ferent p values for differentBiy− Biˆz. 399 We reparametrized pi in (7) in the following way: 400

ˆxi = ˆzi(pi/pimax)p max i (yi− ˆzi) 401 = ˆzi+ pTi(yiT− ˆzTi) (21) 402 = (1 − pmax i )ˆzi + ˆzTi + pTi(yiT− ˆzTi) (22) 403 where ˆz_iT= pmax_i ˆzi, y_iT= p_imaxyi, and 404

piT= 1 pmax_i wi,i Wi+ wi,i. (23) 405 Note that for 0 ≤ wi,i ≤ wmaxi_,i , pTi is an increasing function 406 of wi,i and the range of p_iT is 0 ≤ p_iT ≤ 1. We propose 407 to use the positive part of the JS estimator to estimate the 408

reparametrized p_iT, as follows: 409 p_iT,LJS+ = 1−(|B| − 2) (p max i )2σ2 _B_i_yT_{− B} iˆzT2 + 410 = 1− (|B| − 2) σ 2 Biy− Biˆz2 + = pLJS₊ i . (24) 411 This method is equivalent to using a multiplicative factor pmax_i 412

for the original JS shrinkage (9): 413

ˆxLMM−RP i = (1 − p LMM−RP i )ˆzi+ piLMM−RPyi (25) 414 where 415 pLMM_i −RP= pmax_i 1− (|B| − 2) σ 2 _B_i_y_{− B}_i_ˆz2 + . (26) 416

This proposed LMM-RP estimator is not dominant when 417 estimating xi, but rather is dominant when estimating pmaxi xi, 418

(7)

IEEE Proof

as shown in (22). Thus, the positive part JS estimator does not

419

guarantee that the LMM-RP is dominant. 420

Baranchik’s minimax estimation theorem can be used to 421

analyze the LMM-RP estimator as follows: 422 c(s) = ⎧ ⎪ ⎪ ⎨ ⎪ ⎪ ⎩ s2 σ2_{(r − 2)}, 0≤ s ≤ Y1 s2_{(1 − p}max₎ σ2_{(r − 2)} + p max_{, Y} 1< s (27) 423

where if s is Y3:= σ√(2 − pmax)(r − 2)/(1 − pmax), then

424

c(s) = 2. The LMM-RP method is also illustrated in Fig. 3 425

(b), and is minimax if s ≤ Y3. The global smoothing

426

parameter h can be adjusted so that this condition is satisfied 427

for different images. As in the case of the LMM-DB, it turns 428

out that the optimal global smoothing parameter h∗ and 429

the upper bound h that satisfies s ≤ Y3 are also very

430

close to each other when the LMM-RP method is applied 431

to many natural images. Therefore, the LMM-RP method 432

is “practically” a minimax. The following table summarizes 433

the LJS self-weight estimation method and our proposed 434

LMM-based self-weight estimation methods. 435

Summary of Self-Weight Estimation Methods LJS₊[33] : p_iLJS+ = 1− (|B| − 2) σ2/Biy− Biˆz 2 + ˆxLJS₊ i = (1 − p LJS₊ i )ˆzi+ p LJS₊ i yi LMM− DB : p_iLMM−DB= min(p_iLJS+, pmaxi ) ˆxLMM−DB i = (1 − p LMM−DB i )ˆzi+ p LMM−DB i yi LMM− RP : p_iLMM−RP= pLJS_i +pmax_i ˆxLMM−RP i = (1 − p LMM−RP i )ˆzi + p LMM−RP i yi V. SIMULATIONRESULTS 436 A. Simulation Setup 437

Ten natural images1 (cameraman, lena, montage, house, 438

pepper, barbara, boat, hill, couple, fingerprint) and five images 439

from the SUN database2 (abbey, airplane cabin, airport ter-440

minal, alley, amphitheater) were used in our study as noise-441

free images (128× 128, 256 × 256, or 512 × 512 pixels, 442

8 bits). A real patient MRI (512× 512 pixels, 8 bits) that was 443

acquired and processed under institutional review board (IRB) 444

approved protocols was also used. White Gaussian noise was 445

added to each input image with various standard deviations 446

σ ∈ {10, 20, 40, 60}. 447

All algorithms were implemented using MATLAB

448

R2015b (The Mathworks, Inc., Natick, MA, USA). The patch 449

size and search window size of the NLM filter were chosen 450

to be 7× 7 and 31 × 31, respectively, which were the same 451

1_{Available online at: http://www.cs.tut.fi/~foi/GCF-BM3D/BM3D_images.}

zip as the date of 16 Nov. 2015.

2_{Available online at: http://vision.princeton.edu/projects/2010/SUN/ as the}

date of 16 Sep. 2016.

as those used in [33]. Both the state-of-the-art LJS algorithm 452 and the proposed algorithms were tested using B= 1, · · · , 9 453 where|B| = (2B + 1)2_{> 3.}

454 The global smoothing parameter h was chosen empirically 455

to yield the best PSNR: 456

PSNRˆx= 10 log10

2552

ˆx− x2/N, (28)

457

where N is the size of the image. In addition to the PSNR, the 458 mean bias vs. the mean variance trade-off curves were used as 459 performance measures for the different smoothing parameter 460

values h: 461 bias2 = 1 N N i=1 ( ¯xi− xi)2, (29) 462 var= 1 N N i=1 1 k− 1 k j=1 ˆxi j− ¯xi 2 , (30) 463

where k is the number of realizations (20 in our simulation), 464

ˆxi j is the j th estimation at the i th pixel, and ¯xi is the mean 465

of ˆxi j, as given by: 466 ¯xi = 1 k k j=1 ˆxi j. 467

A visual quality assessment was also performed. 468

B. Performance Studies Using the PSNR 469

In order to estimate values of pi for a fixed neighborhood 470 size B, the optimal NLM smoothing parameter h∗ was deter- 471 mined such that the PSNR was maximized. In our proposed 472 methods, the two maximum self-weights in (14) and (15) were 473 used. The LMM-DB and LMM-RP methods given by (14) 474 are denoted LMM− DBone and LMM− RPone, while the 475 LMM-DB and LMM-RP methods given by (15) are denoted 476 LMM− DBsteinand LMM− RPstein. Table I summarizes the 477 quantitative PSNR results for the 16 images with 4 different 478 noise levels. When B = 7, our proposed LMM-DB and 479 LMM-RP methods based on Baranchik’s minimax estima- 480 tor yielded much better PSNR results than did setting the 481 self-weight to one in the classical NLM method [4], and 482 comparable PSNR values to the LJS method based on the 483 JS estimator [33]. When B = 2, our proposed LMM-DB 484 and LMM-RP methods yielded better PSNR values than did 485

the LJS. 486

For the five examples of lena, house, peppers, barbara, 487 boat with σ = 20, PSNRs of our proposed methods (global 488 smoothing parameter and fixed neighborhoods, but adaptive 489 self-weight) were 0.72 ∼ 0.97 dB better than classical 490 NLM. In [10], it is reported that for the same five examples 491 with the same level of noise, the work of Kervrann et al. 492 (fixed self-weight, but local smoothing parameters and adap- 493 tive neighborhoods) yielded 0.99 ∼ 1.55 dB better PSNR 494 than classical NLM. Self-weights, local smoothing parameter, 495 neighborhoods size are important factors in the NLM filter to 496

(8)

IEEE Proof

TABLE I

PSNR (dB) SUMMARY(MEAN± STANDARDDEVIATION)FORVARIOUSNATURALIMAGES

C. Performance Studies With Bias-Variance Trade-Off 498

The bias-variance trade-off was investigated using many 499

natural images. As shown in Fig. 1, a neighborhood size B was 500

used to estimate pi using the LJS method [33], and this was a

501

significant factor when determining the bias. This tendency 502

was also observed for the other different natural images, 503

as illustrated in Fig. 4. Increasing B in the LJS method moved 504

the bias-variance trade-off curves in the bottom right direction, 505

meaning that the bias increased and the variance decreased. 506

However, the role of the smoothing parameter h changed in 507

the LJS method. Unlike in classical NLM method (see the 508

NLM bias-variance curve in Fig. 1), increasing the smoothing 509

parameter h beyond a certain point in the LJS method did not 510

further decrease the variance in any of the natural images that 511

we tested. This is because increasing h will also increases the 512 pi values so that the resulting LJS estimator becomes closer 513 to the noisy input image yi due to the lack of bounds for the 514

self-weights. 515

Our proposed methods (LMM-DB, LMM-RP) yielded 516 trade-off curves that have decreased variances for increasing 517 values of the smoothing parameter h. Figure 5 shows the 518 trade-off curves for the cameraman example for different 519 methods (LMM-DB, LMM-RP), different neighborhood sizes 520 (B = 2, 7), and different noise levels (σ = 10, 40). Our 521 proposed methods yielded bias-variance curves that were less 522 than or equal to those in the LJS method for fixed B andσ. 523 This tendency was also observed with other natural images, 524 as illustrated in Fig. 4. It was important to choose appropriate 525

(9)

IEEE Proof

Fig. 4. Bias-variance curves for natural images using LJS+[33] and our proposed LMM− DBoneand LMM− RPonemethods with a noise level ofσ = 10. (a) couple. (b) montage. (c) lena. (d) pepper. (e) house. (f) M R I .

neighborhood sizes B in order for the LJS method to obtain 526

a certain level of bias, but our proposed methods were able 527

to achieve that same level of bias by adjusting the smoothing 528

parameter h, which was the same as in classical NLM. Based 529

on our results, it appears that the use of LMM-RP has slightly 530

more advantages than using LMM-DB in terms of the PSNR, 531

as shown in Table I, and the bias-variance trade-off curves, as 532

shown in Fig. 5, for high noise levels. 533

D. Performance Studies With Visual Quality Assessment 534

The most important improvements in our proposed 535

LMM-DB and LMM-RP methods when compared to the 536

LJS method were achieved in terms of the visual quality. 537

Figure 6 (a) shows the true cameraman image (left) and the 538

noisy image (right) with a noise level ofσ = 10. Figure 6 (b) 539

presents the filtered images using the LJS method [33] with 540

B = 2 and B = 7. Severe artifacts were observed in the 541

background areas when using B= 2, and these artifacts were 542

reduced when using B = 7. However, there were still some 543

artifacts near the edges of objects. Our proposed LMM-DB 544

and LMM-RP methods exhibited fewer image artifacts than 545

were observed in the images processed using the LJS method 546

for both B= 2, 7. This tendency was observed in many of the 547

natural images, as shown in Fig. 7, especially in the high inten-548

sity flat areas. PSNR improvements in the LJS method were 549

achieved with severe (when B = 2) or mild (when B = 7) 550

artifacts; however, our proposed methods achieved both a high 551

Fig. 5. Bias-variance curves for LMM− DBone and LMM− RPone for comparison with LJS+for two neighborhood sizes B= 2, 7 and two noise levelsσ = 10, 40. (a) LMM − DBone(σ = 10). (b) LMM − DBone(σ = 40). (c) LMM− RPone(σ = 10). (d) LMM − RPone(σ = 40).

PSNR and significantly reduced visual artifacts. This ability 552 to reduce the number of visual artifacts in a denoised image 553 is important in some applications, such as diagnostic medical 554

(10)

IEEE Proof

Fig. 6. True, noisy (σ = 10), and filtered images using LJS+ [33] , and the proposed LMM− DBone and LMM− RPone. (a) True and noisy images (σ = 10). (b) LJS+[33]. (c) Proposed LMM− DBone. (d) Proposed LMM− RPone.

E. Maximum Self-Weights: One vs. Stein’s 556

Two maximum self-weights were proposed for use: the 557

value one in (14) that was proposed in [4], and Stein’s 558

in (15) that was proposed in [32]. Figure 8 shows that the 559

LMM− DBone method yielded an improved bias-variance 560

curve and PSNR than did the LMM− DBstein method when 561

the noise levels were low. For high noise levels σ = 40, 562

the LMM− DBstein method yielded an improved PSNR and 563

bias-variance curve than did the LMM− DBone method. 564

However, these differences were not significant, as also illus-565

trated in terms of the PSNR in Table I. In terms of the visual 566

quality, no significant differences were observed between the 567

two methods. 568

TABLE II

PERCENTAGE(%)OFc(s) THATEXCEED2 USINGLMM− DB

ANDLMM− RP METHODS,σ = 10, B = 2

F. “Practical” Minimax Estimator 569

The proposed LMM-DB and LMM-RP methods are min- 570 imax with respect to s ≤ Y4 and s ≤ Y3, respectively, 571 as shown in Fig. 3. However, these conditions impose upper 572 bounds for the smoothing parameters h and the optimal 573 h∗, which means that the smoothing parameter values that 574 yield the best PSNR may not be achievable. We empirically 575 investigated this issue using many natural images. 576 Table II shows the ratio (percentage unit) of the number of 577 pixels for which c(s) > 2 to the total number of pixels 578 in the cameraman, fingerprint, and MRI images when the 579 optimal h∗for the highest PSNR was chosen based on the true 580 images for the proposed LMM-DB and LMM-RP methods. 581 For most of the pixels, the LMM-DB and LMM-RP values 582 were minimax. The relationship between the percentage of pix- 583 els with c(s) > 2 and the root mean squared error (RMSE) 584 is illustrated in Fig. 9 for the cameraman and MRI images. 585 Surprisingly, the optimal global smoothing parameters h for 586 the lowest RMSE point (or the highest PSNR) of the LMM- 587 DB and LMM-RP methods are very close to the smoothing 588 parameters h such that the percentage of c(s) > 2 is 0.1%. 589 This phenomenon was not only observed in these two images. 590 As shown in Table III, the pixel percentage of c(s) > 2 that 591 do not require knowledge of the true image can still determine 592 smoothing parameters that are able to yield comparable PSNR 593 values to the best PSNR values obtained by using the optimal 594 smoothing parameters calculated based on knowledge of the 595 true image. This was observed in all of the natural images 596 used in our simulations, with different noise levels, and when 597 B = 2 was used. However, the criteria of using the pixel 598 percentage of c(s) > 2 did not work very well for B = 7 in 599 our simulations. These criteria can be potentially used when 600 choosing a global smoothing parameter with our proposed 601 methods as a heuristic approach without knowing the true 602

image. 603

G. Computation Time for Algorithms 604

Table IV reports the computation time of the proposed 605 methods in comparison with the classical NLM and LJS₊. 606

(11)

IEEE Proof

Fig. 7. Filtered results using LJS₊[33] and the proposed LJS− RPonemethod with a noise level ofσ = 10 and neighborhood size B = 2. (a) couple. (b) montage. (c) lena. (d) pepper. (e) house. (f) M R I .

Fig. 8. Bias-variance curves and PSNR vs. varying neighborhood sizes (B) using classical NLM (only in the PSNR figure), LJS, and the proposed LMM− DBstein vs. LMM− DBonefor the cameraman example.

We used 8 threads (Intel Core i7 2.8 GHz) when computing 607

the patch distances for all methods. The local block size was 608

B = 2, the patch size was 7 × 7, and the window size 609

was 31× 31. All parameters were fixed for all of results pre- 610 sented in this section. Adjusting these parameters can greatly 611 reduce the running time. For example, setting B= 4, the patch 612 size to 5× 5, and the window size to 13 × 13 reduces the 613 computation time of the proposed methods to 0.60, 1.12, and 614 2.91 seconds (s) for 1282, 2562, and 5122images, respectively. 615 However, analytically, the classical NLM requires 3|P||| + 616 4|| − 1 operations per pixel where || is the number of 617 elements ini and LJS+ requires 3|P||| + 4|| + 3|B| + 5 618 operations per pixel. It is reported in [33] that the additional 619 operations for LJS₊ (3|B| + 6 operations) were negligible 620 compared to the NLM filtering computation (3|P|||+4||− 621 1 operations). Analytically, the additional computation for 622 LMM− DB and LMM − RP is 3|B| + 7 operations, which 623 is almost the same as the additional computation for LJS₊. 624 Therefore, further implementation optimization is possible by 625 exploiting the redundant computation of the patch distances 626

for the minimax estimator and NLM weights. 627

VI. DISCUSSION 628

The classical NLM method was a significant work in image 629 denoising [4], and required the determination of two important 630 parameters for good denoising performance: a smoothing 631 parameter and a self-weight value. The LJS method proposed 632

(12)

IEEE Proof

TABLE III

THEPSNR VALUES(dB)OF THEPROPOSEDMETHODSWITHB= 2 WHENCHOOSING THESMOOTHINGPARAMETER SO AS TOYIELD

THEHIGHESTPSNR USING THETRUEIMAGE(TRUE),

ANDWHENCHOOSING THESMOOTHINGPARAMETER SO AS TOYIELD THEPERCENTAGE OFc(s) > 2

TO BE0.1% (ESTIMATED)FOR

DIFFERENTNOISELEVELS

by Wu et al. [33] developed a state-of-the-art method for self-633

weight determination using JS estimation [37] and yielded 634

superior results in terms of the PSNR compared to the other 635

existing methods. However, since the LJS method did not 636

impose an upper bound for self-weight estimation, the bias 637

could no longer be controlled by the smoothing parameter, 638

which resulted in visual quality degradation. Our proposed 639

methods based on the Baranchik’s minimax theorem [34] 640

yielded comparable PSNR results to the state-of-the-art LJS 641

method. By imposing upper bounds for the self-weights, 642

Fig. 9. Comparison plots of the RMSE vs. the smoothing parameter h and the percentage of c(s) > 2 vs. the same smoothing parameter when using LMM-DB and LMM-RP with B = 2 and σ = 10. (a) cameraman LMM− DBone. (b) MRI LMM− DBone. (c) cameraman LMM− RPone. (d) MRI LMM− RPone.

TABLE IV

EXECUTION TIME(S) COMPARISON. THISWILLVARY

WITHPARAMETERSELECTION

the bias-variance trade-off was able to be controlled by a 643 smoothing parameter, and substantial visual artifact reduction 644

was achieved. 645

The focus of this article was self-weight parameter selec- 646 tion in the classical NLM filter with theoretical justification. 647 As discussed in the Introduction, there are other factors that 648 affect the performance of NLM based filters, and we expect 649 that our proposed methods would not be able to achieve 650 state-of-the-art denoising performance if there were no other 651 optimizations performed except the self-weights. Indeed, our 652 proposed methods with one patch size (non-adaptive neighbor- 653 hood) and one global smoothing parameter were not able to 654 achieve the level of denoising performance of the state-of-the- 655 art denoising methods such as BM3D [6]. However, when our 656 proposed methods have incorporated some of the other factors 657 into the NLM filters, such as local smoothing parameters 658 and adaptive neighborhoods [10], they have great potential to 659 achieve significantly improved denoising performance. 660 The minimax property of our proposed methods depends on 661 the choice of smoothing parameters. When using sufficiently 662 small smoothing parameters, the LMM-DB and LMM-RP 663 methods are “practically” minimax according to Baranchik’s 664

(13)

IEEE Proof

theorem [34]. However, when large smoothing parameters are

665

used, there may be some pixels that are not minimax for self-666

weight estimation. More empirical investigation showed that 667

the optimal global smoothing parameter h that yielded the 668

best PSNR only resulted in a very small portion of the pixels 669

that did not have minimax self-weight estimators. In fact, 670

this can be used as a useful heuristic when choosing a good 671

smoothing parameter since testing the minimax properties of 672

our proposed methods does not require the true image. More 673

theoretical analysis for this observation, or a statistical analysis 674

using many natural images as shown in [41], are potential 675

extensions of this work. Therefore, our proposed methods do 676

not only provide an optimal way to determine self-weights, but 677

also provide a heuristic way to determine a good smoothing 678

parameter. 679

VII. CONCLUSION

680

We proposed two methods, LMM-DB, LMM-RP, to deter-681

mine the self-weights of NLM filters that are “practically” 682

minimax, and this methods yielded a comparable PSNR, better 683

bias-variance trade-offs, and reduced visual quality artifacts 684

when compared to the results obtained using the state-of-the-685

art LJS method. Our methods also provide a potentially useful 686

heuristic way to determine a global smoothing parameter 687

without knowledge of the original image. 688

REFERENCES

689

[1] C. Tomasi and R. Manduchi, “Bilateral filtering for gray and color

690

images,” in Proc. 6th Int. Conf. Comput. Vis., Mumbai, India, Jan. 1998,

691

pp. 839–846.

692

[2] M. Elad, “On the origin of the bilateral filter and ways to improve it,”

693

IEEE Trans. Image Process., vol. 11, no. 10, pp. 1141–1151, Oct. 2002.

694

[3] P. Perona and J. Malik, “Scale-space and edge detection using

695

anisotropic diffusion,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 12,

696

no. 7, pp. 629–639, Jul. 1990.

697

[4] A. Buades, B. Coll, and J.-M. Morel, “A review of image denoising

698

algorithms, with a new one,” Multiscale Model. Simul., vol. 4, no. 2,

699

pp. 490–530, 2005.

700

[5] S. P. Awate and R. T. Whitaker, “Unsupervised, information-theoretic,

701

adaptive image filtering for image restoration,” IEEE Trans. Pattern

702

Anal. Mach. Intell., vol. 28, no. 3, pp. 364–376, Mar. 2006.

703

[6] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising

704

by sparse 3-D transform-domain collaborative filtering,” IEEE Trans.

705

Image Process., vol. 16, no. 8, pp. 2080–2095, Aug. 2007.

706

[7] C. Louchet and L. Moisan, “Total variation as a local filter,” SIAM

707

J. Imag. Sci., vol. 4, no. 2, pp. 651–694, Jan. 2011.

708

[8] P. Milanfar, “A tour of modern image filtering: New insights and

709

methods, both practical and theoretical,” IEEE Signal Process. Mag.,

710

vol. 30, no. 1, pp. 106–128, Jan. 2012.

711

[9] D. Barash, “Fundamental relationship between bilateral filtering,

adap-712

tive smoothing, and the nonlinear diffusion equation,” IEEE Trans.

713

Pattern Anal. Mach. Intell., vol. 24, no. 6, pp. 844–847, Jun. 2002.

714

[10] C. Kervrann and J. Boulanger, “Optimal spatial adaptation for

patch-715

based image denoising,” IEEE Trans. Image Process., vol. 15, no. 10,

716

pp. 2866–2878, Oct. 2006.

717

[11] T. Tasdizen, “Principal components for non-local means image

denois-718

ing,” in Proc. 15th IEEE Int. Conf. Image Process. (ICIP), Oct. 2008,

719

pp. 1728–1731.

720

[12] T Tasdizen, “Principal neighborhood dictionaries for nonlocal means

721

image denoising,” IEEE Trans. Image Process., vol. 18, no. 12,

722

pp. 2649–2660, Dec. 2009.

723

[13] P. Coupé, P. Yger, S. Prima, P. Hellier, C. Kervrann, and C. Barillot,

724

“An optimized blockwise nonlocal means denoising filter for 3-D

725

magnetic resonance images,” IEEE Trans. Med. Imag., vol. 27, no. 4,

726

pp. 425–441, Apr. 2008.

727

[14] Z. Ji, Q. Chen, Q.-S. Sun, and D.-S. Xia, “A moment-based nonlocal- 728

means algorithm for image denoising,” Inf. Process. Lett., vol. 109, 729

nos. 23–24, pp. 1238–1244, Nov. 2009. 730

[15] S. Grewenig, S. Zimmer, and J. Weickert, “Rotationally invariant simi- 731

larity measures for nonlocal image denoising,” J. Vis. Commun. Image 732

Represent., vol. 22, no. 2, pp. 117–130, Feb. 2011. 733

[16] R. Yan, L. Shao, S. D. Cvetkovi´c, and J. Klijn, “Improved nonlocal 734

means based on pre-classification and invariant block matching,” J. Dis- 735

play Technol., vol. 8, no. 4, pp. 212–218, Apr. 2012. 736

[17] C. A. Deledalle, V. Duval, and J. Salmon, “Non-local methods with 737

shape-adaptive patches (NLM-SAP),” J. Math. Imag. Vis., vol. 43, no. 2, 738

pp. 103–120, Jun. 2012. 739

[18] P. Chatterjee and P. Milanfar, “Patch-based near-optimal image denois- 740

ing,” IEEE Trans. Image Process., vol. 21, no. 4, pp. 1635–1649, 741

Apr. 2012. 742

[19] S. Y. Chun, J. A. Fessler, and Y. K. Dewaraja, “Post-reconstruction non- 743

local means filtering methods using CT side information for quantitative 744

SPECT,” Phys. Med. Biol., vol. 58, no. 17, pp. 6225–6240, 2013. 745

[20] C. Chan, R. Fulton, R. Barnett, D. Feng, and S. Meikle, “Postre- 746

construction nonlocal means filtering of whole-body PET with an 747

anatomical prior,” IEEE Trans. Med. Imag., vol. 33, no. 3, pp. 636–650, 748

Mar. 2014. 749

[21] D. Van De Ville and M. Kocher, “SURE-based non-local means,” IEEE 750

Signal Process. Lett., vol. 16, no. 11, pp. 973–976, Nov. 2009. 751

[22] D. Van De Ville and M. Kocher, “Nonlocal means with dimensionality 752

reduction and SURE-based parameter selection,” IEEE Trans. Image 753

Process., vol. 20, no. 9, pp. 2683–2690, Sep. 2011. 754

[23] C.-A. Deledalle, F. Tupin, and L. Denis, “Poisson NL means: Unsuper- 755

vised non local means for Poisson noise,” in Proc. 17th IEEE Int. Conf. 756

Image Process. (ICIP), Sep. 2010, pp. 801–804. 757

[24] Y. Wu, B. H. Tracey, P. Natarajan, and J. P. Noonan, “Fast block- 758

wise SURE shrinkage for image denoising,” Signal Process., vol. 103, 759

pp. 45–59, Oct. 2014. 760

[25] B. Goossens, Q. Luong, A. Pižurica, and W. Philips, “An improved 761

non-local denoising algorithm,” in Proc. Int. Workshop Local Non-Local 762

Approx. Image Process., Lausanne, Switzerland, 2008, pp. 143–156. 763

[26] V. Duval, J.-F. Aujol, and Y. Gousseau, “A bias-variance approach for 764

the nonlocal means,” SIAM J. Imag. Sci., vol. 4, no. 2, pp. 760–788, 765

2011. 766

[27] C. Kervrann, J. Boulanger, and P. Coupé, “Bayesian non-local means 767

filter, image redundancy and adaptive dictionaries for noise removal,” 768

in Scale Space and Variational Methods in Computer Vision. Berlin, 769

Germany: Springer, 2007, pp. 520–532. 770

[28] Y. Wu, B. Tracey, P. Natarajan, and J. P. Noonan, “Probabilistic non- 771

local means,” IEEE Signal Process. Lett., vol. 20, no. 8, pp. 763–766, 772

Aug. 2013. 773

[29] V. Dore and M. Cheriet, “Robust NL-means filter with optimal pixel- 774

wise smoothing parameter for statistical image denoising,” IEEE Trans. 775

Signal Process., vol. 57, no. 5, pp. 1703–1716, May 2009. 776

[30] T. Brox and D. Cremers, “Iterated nonlocal means for texture restora- 777

tion,” in Scale Space and Variational Methods in Computer Vision. 778

Berlin, Germany: Springer, 2007, pp. 13–24. 779

[31] S. Zimmer, S. Didas, and J. Weickert, “A rotationally invariant block 780

matching strategy improving image denoising with non-local means,” 781

in Proc. Int. Workshop Local Non-Local Approx. Image Process., 2008, 782

pp. 135–142. 783

[32] J. Salmon, “On two parameters for denoising with non-local means,” 784

IEEE Signal Process. Lett., vol. 17, no. 3, pp. 269–272, Mar. 2010. 785

[33] Y. Wu, B. Tracey, P. Natarajan, and J. P. Noonan, “James–Stein type 786

center pixel weights for non-local means image denoising,” IEEE Signal 787

Process. Lett., vol. 20, no. 4, pp. 411–414, Apr. 2013. 788

[34] A. J. Baranchik, “A family of minimax estimators of the mean of a 789

multivariate normal distribution,” Ann. Math. Statist., vol. 41, no. 2, 790

pp. 642–645, 1970. 791

[35] M. P. Nguyen and S. Y. Chun, “Center pixel weight estimation for non- 792

local means filtering using local James-Stein estimator with bounded 793

self-weights,” in Proc. IEEE 13th Int. Symp. Biomed. Imag. (ISBI), 794

Apr. 2016, pp. 82–85. 795

[36] C. Stein, “Inadmissibility of the usual estimator for the mean of a 796

multivariate normal distribution,” in Proc. 3rd Berkeley Symp. Math. 797

Statist. Probab., vol. 1. 1956, pp. 197–206. 798

[37] W. James and C. Stein, “Estimation with quadratic loss,” in Proc. 4th 799

Berkeley Symp. Math. Statist. Probab., vol. 1. 1961, pp. 361–379. 800

[38] A. J. Baranchik, “Multiple regression and estimation of the mean of a 801

multivariate normal distribution,” Dept. Statist., Stanford Univ., Stanford, 802

(14)

IEEE Proof

[39] B. Efron and C. Morris, “Families of minimax estimators of the mean of

804

a multivariate normal distribution,” Ann. Statist., vol. 4, no. 1, pp. 11–21,

805

1976.

806

[40] W. E. Strawderman, “Proper Bayes minimax estimators of the

multivari-807

ate normal mean vector for the case of common unknown variances,”

808

Ann. Statist., vol. 1, no. 6, pp. 1189–1194, 1973.

809

[41] X. Li, H. He, R. Wang, and D. Tao, “Single image superresolution via

810

directional group sparsity and directional features,” IEEE Trans. Image

811

Process., vol. 24, no. 9, pp. 2874–2888, Sep. 2015.

812

Minh Phuong Nguyen received the B.S.E. degree in

813

biomedical electronics from the School of

Electron-814

ics and Telecommunications, Hanoi University of

815

Science and Technology, Hanoi, Vietnam, in 2013,

816

and the M.S. degree in electrical engineering from

817

the Ulsan National Institute of Science and

Tech-818

nology, Ulsan, South Korea. He is currently

pursu-819

ing the Ph.D. degree with the Delft University of

820

Technology, The Netherlands. His research interests

821

include image reconstruction and image filtering for

822

SPECT, image registration between histology images

823

and MRI, and electrocardiogram signal processing for authentication tasks.

824

Se Young Chun (M’10) received the B.S.E. degree 825

from the School of Electrical Engineering, Seoul 826

National University, Seoul, South Korea, in 1999, 827

and the M.S.E. degree in electrical engineering 828

systems, the M.S. degree in mathematics, and the 829

Ph.D. degree in electrical engineering: systems from 830

the University of Michigan, Ann Arbor, MI, USA, 831

in 2005 and 2009, respectively. 832

From 2003 to 2005, he was a recipient of the two 833

year scholarship from the Ministry of Information 834

and Communications, South Korea. From 2009 to 835

2011, he was a Research Fellow with Massachusetts General Hospital and 836

Harvard Medical School, Boston, MA, USA, and from 2011 to 2013, he 837

was a Research Fellow with Electrical Engineering and Computer Science 838

and Radiology, University of Michigan. Since 2013, he has been an Assis- 839

tant Professor with the School of Electrical and Computer Engineering, 840

Ulsan National Institute of Science and Technology, Ulsan, South Korea. 841

His research interests include image reconstruction, image denoising, image 842

registration for multi-modal imaging systems, motion compensated image 843

reconstruction, and statistical bio signal processing and biometrics. 844

Dr. Chun was a recipient of the 2010 Society of Nuclear Medicine Computer 845

and Instrumentation Young Investigator Award, 2nd place. He was also a 846

recipient of the 2015 Bruce Hasegawa Young Investigator Medical Imaging 847

Science Award from the IEEE Nuclear and Plasma Sciences Society. He is 848

a member of IEEE Computational Imaging Special Interest Group and 849

an Editorial Board Member of Nuclear Medicine and Molecular Imaging 850

(15)

IEEE Proof

Bounded Self-Weights Estimation Method for

Non-Local Means Image Denoising

Using Minimax Estimators

Minh Phuong Nguyen and Se Young Chun, Member, IEEE

Abstract— A non-local means (NLM) filter is a weighted 1

average of a large number of non-local pixels with various image

2

intensity values. The NLM filters have been shown to have

3

powerful denoising performance, excellent detail preservation

4

by averaging many noisy pixels, and using appropriate values

5

for the weights, respectively. The NLM weights between two

6

different pixels are determined based on the similarities between

7

two patches that surround these pixels and a smoothing

para-8

meter. Another important factor that influences the denoising

9

performance is the self-weight values for the same pixel. The

10

recently introduced local James-Stein type center pixel weight

11

estimation method (LJS) outperforms other existing methods

12

when determining the contribution of the center pixels in the

13

NLM filter. However, the LJS method may result in excessively

14

large self-weight estimates since no upper bound is assumed,

15

and the method uses a relatively large local area for estimating

16

the self-weights, which may lead to a strong bias. In this

17

paper, we investigated these issues in the LJS method, and then

18

propose a novel local self-weight estimation methods using direct

19

bounds (LMM-DB) and reparametrization (LMM-RP) based on

20

the Baranchik’s minimax estimator. Both the LMM-DB and

21

LMM-RP methods were evaluated using a wide range of natural

22

images and a clinical MRI image together with the various levels

23

of additive Gaussian noise. Our proposed parameter selection

24

methods yielded an improved bias-variance trade-off, a higher

25

peak signal-to-noise (PSNR) ratio, and fewer visual artifacts when

26

compared with the results of the classical NLM and LJS methods.

27

Our proposed methods also provide a heuristic way to select a

28

suitable global smoothing parameters that can yield PSNR values

29

that are close to the optimal values.

30

Index Terms— James-Stein estimator, minimax estimator, 31

non-local means, center pixel weight, bounded self-weight, image

32

denoising.

33

I. INTRODUCTION

34

I

MAGE denoising is a fundamental task in image process-35

ing, low-level computer vision, and medical imaging algo-36

Manuscript received March 9, 2016; revised October 26, 2016 and December 31, 2016; accepted January 21, 2017. This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning under Grant NRF-2014R1A1A1007928. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Oleg V. Michailovich. (Corresponding author: Se Young Chun)

Minh Phuong Nguyen was with the School of Electrical and Computer Engineering (ECE), Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, South Korea. Now he is with Section of Radiation, Detection and Medical Imaging, Delft University of Technology, Mekelweg 15, 2629 JB Delft, The Netherlands (e-mail: m.p.nguyen@tudelft.nl).

Se Young Chun is with the School of ECE, UNIST, Ulsan, 44919, South Korea (e-mail: sychun@unist.ac.kr).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIP.2017.2658941

rithms. The goal of denoising is to suppress image noise 37 when restoring desired details using prior information about 38 the images. For example, based on prior information regarding 39 “smooth images”, a simple filter, such as a Gaussian filter, can 40 be designed as a weighted average of the image intensities 41 of the pixels in the local neighborhood with non-adaptive 42 weights. However, this type of filter blurs the edges and 43 details of images because these features are not captured in 44 the assumed prior information. Many edge-preserving denois- 45 ing methods have been proposed, including bilaterial fil- 46 ters [1], [2], anisotropic diffusion [3], non-local means (NLM) 47 filters [4], [5], collaborative filters (BM3D) [6], and total 48 variation filters [7]. Many filters, including bilaterial filters, 49 anisotropic diffusion, and NLM filters (but, not BM3D, 50 see [8]), can be represented as the weighted averages of 51 adaptive weights or adaptive smoothing [9]. It should be noted 52 that it is important to select appropriate weights in these 53 types of filters in order to obtain improved denoised image 54

quality [8]. 55

Classical NLM filters use the similarities between two local 56 patches in a noisy image to determine the weights in non- 57 local adaptive smoothing [4]. The NLM weights are obtained 58 by first calculating the Euclidean distance between the two 59 local patches, which is denoted d, and then by evaluating 60 exp(−d2_/h2_{), where h is a smoothing parameter. This method}

61 allows higher weights to be assigned to pixels with similar 62 patches so that edges and details can be preserved through 63

non-local weighted averaging. 64

There are four different factors that determine the output 65 image quality of a NLM filter in terms of weights. 1) The 66 first factor is the similarity measure d. The Euclidean distance 67 is a usual choice [4], but other similarity measures have also 68 been proposed, such as hypothesis testing with adaptive neigh- 69 borhoods [10], principal component analysis (or the subspace 70 based method) [11], [12], blockwise aggregation [13], rotation- 71 invariant measures [14]–[16], shape-adaptive patches [17], and 72 patch-based similarities with adaptive neighborhoods [18]. 73 In multimodal medical imaging, inaccurate weights for noisy 74 molecular images were enhanced by using additional high 75 quality anatomical images [19], [20]. 2) The second factor 76 is the strategy for determining the smoothing parameter h. 77 Optimization strategies have been developed based on Stein’s 78 unbiased risk estimation (SURE) method for NLM with 79 Gaussian noise [21], [22], NLM with Poisson noise [23], and 80 blockwise NLM with Gaussian noise [24]. 3) The third factor 81 is in selecting the function to use to determine the weights, 82

1057-7149 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.