Towards accuracy and scalability

(1)

Delft University of Technology

Towards accuracy and scalability

Combining Isogeometric Analysis with deflation to obtain scalable convergence for the

Helmholtz equation

Dwarka, V.; Tielen, R.; Möller, M.; Vuik, C. DOI

10.1016/j.cma.2021.113694 Publication date

2021

Document Version Final published version Published in

Computer Methods in Applied Mechanics and Engineering

Citation (APA)

Dwarka, V., Tielen, R., Möller, M., & Vuik, C. (2021). Towards accuracy and scalability: Combining Isogeometric Analysis with deflation to obtain scalable convergence for the Helmholtz equation. Computer Methods in Applied Mechanics and Engineering, 377, [113694]. https://doi.org/10.1016/j.cma.2021.113694 Important note

To cite this publication, please use the final published version (if applicable). Please check the document version above.

Copyright

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons. Takedown policy

Please contact us and provide details if you believe this document breaches copyrights. We will remove access to the work immediately and investigate your claim.

This work is downloaded from Delft University of Technology.

(2)

ScienceDirect

Comput. Methods Appl. Mech. Engrg. 377 (2021) 113694

www.elsevier.com/locate/cma

Towards accuracy and scalability: Combining Isogeometric Analysis

with deflation to obtain scalable convergence for the Helmholtz

equation

V. Dwarka

∗

, R. Tielen

1

_{, M. Möller}

1

_{, C. Vuik}

1

Delft University of Technology, Delft Institute of Applied Mathematics, Delft, The Netherlands Received 20 October 2020; received in revised form 21 January 2021; accepted 22 January 2021

Available online xxxx

Abstract

Finding fast yet accurate numerical solutions to the Helmholtz equation remains a challenging task. The pollution error (i.e. the discrepancy between the numerical and analytical wave number k) requires the mesh resolution to be kept fine enough to obtain accurate solutions. A recent study showed that the use of Isogeometric Analysis (IgA) for the spatial discretization significantly reduces the pollution error. However, solving the resulting linear systems by means of a direct solver remains computationally expensive when large wave numbers or multiple dimensions are considered. An alternative lies in the use of (preconditioned) Krylov subspace methods. Recently, the use of the exact Complex Shifted Laplacian Preconditioner (CSLP) with a small complex shift has shown to lead to wave number independent convergence while obtaining more accurate numerical solutions using IgA.

In this paper, we propose the use of deflation techniques combined with an approximated inverse of the CSLP using a geometric multigrid method. Numerical results obtained for one- and two-dimensional model problems, including constant and non-constant wave numbers, show scalable convergence with respect to the wave number and approximation order p of the spatial discretization. Furthermore, when kh is kept constant, the proposed approach leads to a significant reduction of the computational time compared to the use of the multigrid-approximated or exact inverse of the CSLP with a small shift, in particular for three-dimensional model problems.

c

⃝2021 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Keywords:Helmholtz; Pollution; Numerical dispersion; Isogeometric Analysis; GMRES; Deflation

1. Introduction

The Helmholtz equation has been widely studied in various fields of physics ranging from biomedical physics to geo- and nuclear physics. The electromagnetic scattering problem thus finds many applications in engineering practices. Many efforts have been made to find fast yet accurate numerical solutions to the Helmholtz problem. The latter remains a challenging topic in research due to the pollution error and the resulting linear system having undesirable properties. In particular, the pollution error results from a discrepancy between the analytical and

∗ _{Correspondence to: van Mourik Broekmanweg 6, 2628 XE, Delft, The Netherlands.}

E-mail address: v.n.s.r.dwarka@tudelft.nl(V. Dwarka).

1 _{Present address: van Mourik Broekmanweg 6, 2628 XE, Delft, the Netherlands.}

https://doi.org/10.1016/j.cma.2021.113694

0045-7825/ c⃝2021 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons. org/licenses/by/4.0/).

(3)

numerical wave number [1–3]. Consequently, the mesh resolution has to be kept fine enough to obtain accurate numerical solutions. If we let k denote the wave number, Ndofthe number of degrees of freedom in one dimension

and p the order of a finite difference or standard finite element scheme, then Ndof=C k

(_p+1

p

)

,

where C is a constant that only depends on the accuracy achieved [4]. In practice, this has led to the rule of thumb kh ≈ 2₁₀π, where 10 denotes the number of degrees of freedom per wavelength and h the mesh width. However, the resulting numerical solution still suffers from pollution, unless the resolution is kept at C(kp+1hp) ≤ 1, for a general p-th order scheme. While this minimizes the pollution error, the resulting linear systems are too large for direct solution methods. This exacerbates in higher-dimensions, which opens the door to the use of iterative solution methods. Due to the resulting linear systems being indefinite and non-Hermitian, Krylov subspace methods are necessary. In fact, even using standard multigrid as a stand-alone solver diverges for the Helmholtz equation [5,6]. Moreover, for Krylov subspace methods, the number of iterations until convergence grows with the wave number k. Thus, the difficulty in solving Helmholtz-type problems can be reduced to optimizing the trade-off between having accurate numerical solutions, while using a scalable solver.

One potential way to mitigate this problem is to adopt Isogeometric Analysis (IgA) [7] as a discretization technique. IgA can be considered as the natural extension of the finite element method (FEM) to higher-order B-splines and has become widely accepted as a viable alternative to standard FEM. The use of high-order B-splines or Non-Uniform Rational B-splines (NURBS) enables a highly accurate representation of complex geometries and bridges the gap between computer-aided design (CAD) and computer-aided engineering (CAE) tools. Furthermore, a higher accuracy per degree of freedom can be achieved compared to standard FEM [8]. A new branch of studies has demonstrated that IgA furthermore helps to control the pollution error while keeping the size of the resulting linear system moderate [9–13]. In [14], the authors investigated the obtained accuracy for several Helmholtz-type problems using a non-constant wave number and documented increased accuracy. Thus, while the use of IgA for Helmholtz-type problems becomes more established, the process of solving the underlying discretized systems remained fairly untouched. Until recently, a study by Diwan et al. [15] covered this for the Helmholtz equation and researched the use of IgA together with an iterative solver. There, the resulting linear systems are solved using the Generalized Minimum Residual Krylov method (GMRES) preconditioned with the Complex Shifted Laplacian Preconditioner (CSLP) using a small complex shift. The results show wave number independent convergence of the iterative solver and, at the same time, higher accuracy of the numerical solution.

The well-known CSLP has been the industry standard for many years [16]. While this has accelerated the convergence dramatically, the number of iterations increases with the wave number k, which is why in order to obtain wave number independent convergence, the complex shift has to be kept at O(k−1_{) [}₁₇_{]. One drawback}

of keeping the shift very small is that the resulting preconditioner starts resembling the original matrix and exact inversion puts a heavy tax on the computational resources. Therefore, a few multigrid cycles are often used to approximate the inverse of the CSLP, which amounts to O(N ) FLOPs [16]. However, in order to prevent multigrid from diverging, the complex shift has to be kept as large as possible O(1) [18].

As a consequence, recent developments have led to a broad range of preconditioners such as domain de-composition based preconditioners [19–25], sweeping preconditioners [26–30] and (multilevel) deflation based preconditioners [31–33]. One of these new preconditioners is the Adapted Deflation Preconditioner (ADP) [33], which uses quadratic B ´ezi er curves to construct the deflation space. There, second order finite differences was used to construct the linear systems on the fine-level. The construction of Z should balance the computational cost of the coarse-grid solve and the k-independent convergence. Although cubic and higher-order schemes are expected to result in more accuracy on the coarse-level, the associated computational costs might outweigh the benefits of using such schemes. For finite difference discretizations, the two-level deflation preconditioner has shown to be simple yet competitive to the small-shift and exact inversion of CSLP in terms of wave number independent convergence and computational complexity for large wave numbers k. In essence, the deflation preconditioner projects the near-zero eigenvalues of the CSLP-preconditioned system onto zero. These near-zero eigenvalues are known to interfere with fast convergence of the Krylov subspace solver.

Consequently, our aim in this paper is to extend the research direction set out in [14,15], by combining state-of-the-art iterative solvers with IgA to obtain both accurate and computationally efficient numerical solutions. In particular, we propose the use of deflation techniques combined with an approximated inverse of the CSLP using

(4)

multigrid to obtain scalable and faster convergence with respect to the wave number k and the order p. We study one-, two- and three-dimensional model problems containing both a constant wave number k and a variable wave number k(x, y). In the latter case, we focus on the performance of the solver in the presence of sharp discontinuities in the wave number and the underlying solution. For the two-dimensional model problems, we report the number of iterations and the CPU-timings to show that the use of deflation combined with a multigrid-approximated CSLP allows for tremendous gain in computational efficiency while keeping scalable convergence in terms of the number of iterations. The method outperforms the exact inversion of the CSLP with a small complex shift in terms of number of iterations and CPU-timings when a large constant or non-constant wave number is used. For the three-dimensional model problems, the combined use of deflation and multigrid-approximated CSLP leads to a significant speedup compared to the use of both a multigrid-approximated and exact inversion of the CSLP. In case of a curved geometry, the use of deflation (without CSLP) as a preconditioner further decreases the CPU-timings.

The paper is organized as follows. We start with the variational formulation and spatial discretization of the Helmholtz equation in Section 2. In Section 3 we discuss the deflation preconditioning technique for the Krylov subspace method. Here we introduce the use of higher-order B ´ezi er curves as a basis for the deflation space. We then proceed by performing a spectral analysis of the preconditioned systems and various numerical experiments in Section 4 in order to determine the convergence behavior. We provide CPU-timings in order to assess the computational time complexity. We conclude our results in Section5.

2. Problem definition

We start by presenting the variational formulation and B-spline discretization of the two-dimensional inhomoge-neous Helmholtz equation. In Section 4 instances of this general problem will be considered to assess the quality of our proposed solution method.

2.1. Variational formulation

To illustrate the variational formulation, we consider the inhomogeneous Helmholtz equation in two dimensions adopting inhomogeneous Robin boundary conditions:

−∆u(x, y) − k2(x, y)u(x, y) = f (x, y), (x, y) ∈ Ω ⊂ R2, (1) ( ∂

∂n−i k(x, y) )

u(x, y) = g(x, y), (x, y) ∈ ∂Ω. (2)

Here, Ω is a connected Lipschitz domain, f ∈ L2_{(Ω ), g ∈ L}2₍_{∂Ω) and k(x, y) a non-constant wave number. Let}

us define V as the first order Sobolev space H1_{(Ω ). The variational formulation of}₍₁₎_{is obtained by multiplication}

with a test functionv ∈ V and application of integration by parts: Find u ∈ V such that

a(u, v) = ( f, v), ∀v ∈ V, (3) where a(u, v) = ∫ Ω ∇u · ∇v dΩ − ∫ Ω k2uv dΩ − i ∫ ∂Ω kuv dΓ ( f, v) = ∫ Ω fv dΩ + ∫ ∂Ω gv dΓ . (4) A geometry function F is then defined to parameterize the physical domain Ω by describing an invertible mapping to connect the parameter domain Ω0=(0, 1)2 with the physical domain Ω .

F = Ω0→Ω, F(ξ, η) = (x, y). (5)

The considered geometries throughout this paper can be described by a single geometry function F, that is, the physical domain Ω is topologically equivalent to the unit square. In case of more complex geometries, a family of functions F(m)_{(m = 1}_{, . . . , K ) is defined and we refer to Ω as a multipatch geometry consisting of m patches. For}

a more detailed description of multipatch constructions, the authors refer to chapter 2 of [34]. 2.1.1. B-spline basis functions

To discretize Eq.(1), univariate B-spline basis functions are defined on the parameter domain Ω0by an underlying

(5)

Fig. 1. Linear and quadratic B-spline basis functions based on the knot vectors Ξ1 = {0, 0, 1, 2, 3, 3} and Ξ2 = {0, 0, 0, 1, 2, 3, 3, 3},

respectively. For interpretation of the color references in this and upcoming figures, the reader is referred to the web version of this article.

functions. Based on this knot vector, the basis functions are defined recursively by the Cox-de Boor formula [35], starting from the constant ones

φj,0(ξ) =

{

1 if ξj≤ξ < ξj +1,

0 otherwise. (6)

Higher-order B-spline basis functions of order p> 0 are then defined recursively φj,p(ξ) = ξ − ξj ξj + p−ξj φj,p−1(ξ) + ξj + p+1−ξ ξj + p+1−ξj +1 φj +1_,p−1(ξ). (7)

The resulting B-spline basis functions φj,p are non-zero on the interval [ξj, ξj + p+1) and possess the partition

of unity property. Furthermore, the basis functions are Cp−mj_{-continuous, where m}

j denotes the multiplicity of

knotξj. Throughout this paper, we consider a uniform knot vector with knot span size h, where the first and last

knot are repeated p + 1 times. As a consequence, the resulting B-spline basis functions are Cp−1 _{continuous and}

interpolatory at both end points.Fig. 1illustrates both linear and quadratic B-spline basis functions based on such a knot vector.

For the multi-dimensional case, the tensor product of univariate B-spline basis functions is adopted for the spatial discretization. Let Ndofdenote the total number of multivariate basis functions Φj,p. The spline space Vh,pcan then

be written as follows

Vh,p=span{Φj,p◦F−1}j =1,...,Ndof. (8)

The Galerkin formulation of (3)now becomes: Find uh,p∈Vh,p such that

a(uh,p, vh,p) = ( fh,p, vh,p), ∀vh,p∈Vh,p. (9)

The discretized problem in (9)can be written as a linear system

(Sh,p−Mh,p−iNh,p) uh,p=fh,p. (10)

Here, (Sh,p

)

i, j =

∫

Ω∇Φi,p· ∇Φj,p dΩ is the stiffness matrix,(Mh,p

)

i, j =

∫

Ωk 2_Φ

i,pΦj,p dΩ the mass matrix

and (Nh,p

)

i, j =

∫

∂ΩkΦi,pΦj,p dΓ the boundary mass matrix. Next, by defining Ah,p =Sh,p−Mh,p−iNh,p we

can write

Ah,puh,p=fh,p. (11)

For the ease of notation, we will proceed with the notation Au = f, and drop the subscript (h, p). 3. Preconditioned Krylov subspace methods

For Helmholtz-type problems, the number of degrees of freedom grows with the wave number k. Consequently, for larger values of k the linear systems become very large, especially in two and three dimensions. As a result, direct solvers become unattractive and computationally expensive due to fill-in. Thus, in order to solve the model problems, an iterative method is considered. For normal matrices (i.e. AA∗ = A∗A), the convergence of Krylov subspace

(6)

methods is closely related to the underlying distribution of the eigenvalues. The more clustered the eigenvalues, the better and faster the method converges. For MP 1-B, we can easily deduce the analytical eigenvalues which are given byλj = j2π2−k2. It is easy to see that the resulting systems will have both positive and negative eigenvalues,

rendering it indefinite. This limits our choice of Krylov subspace methods, where often GMRES is chosen as the underlying iterative solver. Throughout the years, the performance of GMRES for the Helmholtz equation has been studied widely, with an extensive overview given in [5,29,36]. One of these preconditioners is the CSLP, which is defined by taking the original coefficient matrix A and adding a complex shift. Thus, in the one-dimensional case, CSLP B is given by

B = A +βik2I, (12)

and the resulting preconditioned system becomes

B−1Au = B−1f. (13)

Here, I denotes the identity matrix andβ ∈ R the shift. In practice, the CSLP is often included by applying a fixed number of V-cycles of a (geometric) multigrid method to approximate B−1. As a smoother within the multigrid method, we adopt damped Jacobi (with damping parameter ω = 0.6). Note that the use of standard smoothers (i.e. Jacobi or Gauss–Seidel) within a multigrid solver [37] in IgA results in p-dependent convergence. This has led to the development of non-standard smoothers to obtain p-independent convergence rates [38–43]. Their application within a multigrid method to approximate B−1is, however, out of the scope of this paper. In order for B−1to remain a good preconditioner, the shift β should not be too small as otherwise multigrid will diverge [5,20]. On the other hand, the preconditioner should still remain close enough to the original coefficient matrix A, which is also whyβ should not be too large.

While the complex shift transfers part of the unwanted spectrum onto the complex axis, unless the shift is kept very small, near-zero eigenvalues start appearing around the origin as the wave number increases [31,44,45]. This effect accumulates in higher-dimensions. Especially the real part of these near-zero eigenvalues is known to have a detrimental effect on the convergence behavior of the Krylov solver. One simple yet effective way to get rid of these unwanted near-zero eigenvalues is to use deflation. By using an orthogonal projection, the deflation operator, which we will denote by P projects these unwanted eigenvalues onto zero. Thus, for a general symmetric linear system, we can define the projection matrix ˆP and its complementary projection P as

ˆ

P = AQ where Q = ZE−1ZTand E = ZTAZ, (14)

A ∈ Rn×n, Z ∈ Rn×m, P = I − AQ.

Here the matrix Z is the deflation matrix whose columns consist of the deflation vectors and E denotes the coarse-grid variant of the original coefficient matrix A. The performance of the deflation preconditioner depends on the choice of Z. In principle, the deflation matrix is defined as the prolongation and restriction matrix from a multigrid setting using a first-order linear interpolation scheme [31,46–50]. While this improves the convergence significantly, the near-zero eigenvalues start reappearing for very large wave numbers k. Consequently, it has been shown recently that the use of a quadratic interpolation scheme results in close to wave number independent convergence for the two-level deflation preconditioner [33]. In fact, the use of these higher-order deflation vectors results in a smaller projection error compared to the case where a linear interpolation scheme is used. To construct the stencil for the deflation matrix Z, we start by introducing the rational B ´ezi er curve.

Definition 1 (B ´ezi er Curve). A B ´ezi er curve of degree n is a parametric curve defined by B(t ) =

n

∑

j =0

bj,n(t )Pj, 0 ≤ t ≤ 1, (15)

where the polynomials bj,n(t ) =

(n j )

tj(1 − t )n− j, j = 0, 1, . . . , n, (16)

are known as the Bernstein basis polynomials of order n. The points Pj are called control points for the B ´ezi er

(7)

Definition 2 (Rational B ´ezi er Curve). A rational B ´ezi er curve of degree n with control points P0, P1, . . . , Pnand

scalar weights w0, w1, . . . , wn∈ R is defined as

C(t ) = ∑n j =0wjbj,n(t )Pj ∑n j =0wjbj,n(t ) . (17)

The motivation for using the rational B ´ezi er curve is that the latter formulation allows for the weights to be adjusted in order to account for the higher requested accuracy at the even degrees of freedom. To see why this is necessary, note that for large k the solutions become very oscillatory and the use of linear basis functions to construct the deflation vectors is not sufficiently accurate to map the underlying eigenvectors to its coarse-grid counterpart. In particular, when using a linear scheme to construct the grid-transfer operators, the even degrees of freedom j are copied onto the alternate grid and the value at the odd degrees of freedom j are determined by taking a weighted value of its neighbors. In particular for the one-dimensional case, if we define the coarse grid function with respect to the degree of freedom j by [u2h]j, then the linear scheme is given by

Definition 3 (Linear Interpolation). Let [u2h]( j −1)/2 and [u2h]( j +1)/2, be the neighboring degrees of freedom of

[u2h]j. Then the prolongation scheme for the even nodes can be characterized by a Rational B ´ezi er curve of

degree 1 with polynomials b0,1(t ) = 1 − t,

b1,1(t ) = t,

whenever j is odd by taking the weightsw0=w1=1 and t = 1₂. Note that in casew0=w1 and non-rational we

obtain the original B ´ezi er curve. C(1 2) = 1 2[u2h]( j −1)/2+(1 − 1 2)[u2h]( j +1)/2 1 2+(1 − 1 2) , (18) =1 2([u2h]( j −1)/2+[u2h]( j +1)/2 ) . (19)

When j is even, we take the middle component [u2h]j/2, which itself gets mapped onto the fine grid.

We thus consider a quadratic rational B ´ezi er curve as this allows us to replace the even component by a weighted average as well. The main difference is that the stencil to construct the grid-transfer operators now has a larger support and can thus provide more accuracy, as an increase in the number of iterations can be attributed to a loss of accuracy on the coarse-level. The quadratic scheme is given by

Definition 4 (Quadratic Approximation). Let [u2h]( j −2)/2and [u2h]( j +2)/2, be the neighboring degrees of freedom of

[u2h]j. Then the prolongation operator can be characterized by a Rational B ´ezi er curve of degree 2 with polynomials

b0,2(t ) = (1 − t )2,

b1,2(t ) = 2t (1 − t ),

b2,2(t ) = t2,

and [u2h]j/2, whenever j is even. Because we wish to add more weight whenever j is even, we take weights

w0 =w2 =1₂,w1= 3₂ and t =1₂ to obtain C(t ) = 1 2(1 − t ) 2_[u 2h]j −1+3₂2t (1 − t )[u2h]j+1₂(t )2[u2h]j +1 1 2(1 − t ) 2₊3 22t (1 − t ) + 1 2(t ) 2 = 1 2(1 − 1 2) 2_[u 2h]j −1+3₂(2)(1₂)(1 − 1₂)[u2h]j+1₂(1₂)2[u2h]j +1 1 2(1 − 1 2) 2₊1 2(2)( 1 2)(1 − 1 2) + 1 2( 1 2) 2 = 1 8[u2h]j −1+ 3 4[u2h]j+ 1 8[u2h]j +1 1 = 1 8([u2h]j −1+6[u2h]j+[u2h]j +1 ) . 6

(8)

When j is odd, [u2h]( j −1)/2 and [u2h]( j +1)/2 are associated to an even degree of freedom and the resulting stencil

leads to the standard linear interpolation scheme.

Thus, with respect to the coarse-grid function u2h at degree of freedom j , we can define the stencil for the

prolongation and restriction operator as [Zu2h]j = {1 8([u2h]( j−2)/2+6 [u2h]( j)/2+[u2h]( j+2)/2 ) if j is even, 1 2([u2h]( j−1)/2+[u2h]( j+1)/2 ) if j is odd } , (20)

for j = 1, . . . , Ndof and

[ZT uh ] j = 1 8([uh](2 j−2)+4 [uh](2 j+1)+6 [uh](2 j)+4 [uh](2 j+1)+[uh](2 j+2) )_, (21) for j = 1, . . . ,Ndof

2 . Additionally, a weight-parameter ε can be included to further increase the accuracy of the

prolongation and restriction operator [33]. In this case, the stencil for the prolongation and restriction operator is given by [Zu2h]j = {1 8([u2h]( j−2)/2+(6 −ε) [u2h]( j)/2+[u2h]( j+2)/2 ) if j is even, 1 2([u2h]( j−1)/2+[u2h]( j+1)/2 ) if j is odd } , (22)

[ZT_u h ] j = 1 8([uh](2 j−2)+4 [uh](2 j+1)+(6 −ε) [uh](2 j)+4 [uh](2 j+1)+[uh](2 j+2) )_, (23) for j = 1, . . . ,Ndof

2 . The value of ε is constant with respect to k and kh and is chosen such that the

projection error is minimized. In [33] this value has been determined analytically for the one-dimensional case (see Equation 5.13) and is adopted throughout this paper unless stated otherwise. Similarly, a cubic rational B ´ezi er curve can be constructed in an analogous way, which for the coarse-grid function u2hat degree of freedom j leads

to the stencil [Zu2h]j = { 1 32(6 [u2h]( j−2)/2+20 [u2h]( j)/2+6 [u2h]( j+2)/2 ) if j is even, 1 32([u2h]( j−3)/2+15 [u2h]( j−1)/2+15 [u2h]( j−1)/2+[u2h]( j+3)/2 ) if j is odd } , (24)

[ ZTuh ] j = 1 32([uh](2 j−3)+6 [uh](2 j−2)+15 [uh](2 j+1)+20 [uh](2 j)+15 [uh](2 j+1)+6 [uh](2 j+2)+[uh](2 j+3) ) , (25) for j = 1, . . . ,Ndof

2 . In Section 4.6.1, we will also use the cubic scheme in the one-dimensional numerical

experiments to explore the performance of both schemes with respect to the number of iterations to reach convergence. As the use of cubic and higher-order schemes is currently an active topic of research, no analytical weight-parameter ε has been constructed yet.

The aforementioned construction of Z using order rational B ´ezi er curves naturally extends to higher-dimensions by taking the Kronecker product of Z with itself to construct the 2- and 3D-variants respectively. This approach is widely used in the multigrid context [51].

Now that we have a stencil to construct Z, we can use Eq. (14) to construct the deflation preconditioner. The resulting linear system to be solved becomes

PTAu = PTf. (26)

Often, the deflation preconditioner P is combined with the CSLP B to accelerate convergence. The motivation for combining these methods comes from the observation that the eigenvalues of the CSLP preconditioned system move to zero, leading to an increase in the number of iterations as the wavenumber grows [52]. Thus, by first preconditioning and then deflating, the deflation preconditioner is applied to the preconditioned coefficient matrix B−1_{A. Given that the CSLP and the coefficient matrix A share the same eigenvectors, rigorous Fourier analysis}

(RFA) has been used to analytically determine the spectrum of the preconditioned operator [33,50]. Combined with CSLP, the preconditioned system appears to have better clustering properties. Thus, combining the deflation operator P with the CSLP B leads to the following linear system which needs to be solved

(9)

where, as mentioned previously, B−1 is generally approximated using a multigrid method. Note that the operator PT is never formed explicitly but is constructed through matrix vector products with Z and A. Moreover, we will refer to P based on the higher-order quadratic approximation as the ’Adapted Deflation Preconditioner’ (ADP) to distinguish between the standard deflation preconditioner using linear interpolation and the higher-order deflation scheme.

4. Numerical results

To assess the quality of the proposed iterative solver, this section starts with defining a variety of one-, two-and three-dimensional model problems based on the general problem described in Section 2.1. Then, we study the pollution error for our one-dimensional model problem when adopting high-order B-spline basis functions for the spatial discretization. In [15], a detailed first application of IgA for Helmholtz problems has been given. We therefore only show the pollution reduction for the model problem MP 1-A in this paper. We proceed by conducting a spectral analysis in one dimension (MP 1-B) to investigate the effect of the proposed preconditioning techniques on the spectrum of the preconditioned operator. Finally, the convergence of the iterative solver is studied in terms of both iteration numbers and CPU timings. These are obtained for the proposed deflation based preconditioner and compared to the use of the (exactly inverted) CSLP.

4.1. One-dimensional model problems

In this subsection, we present the considered one dimensional model problems. These model problems will be adopted in the remainder of this section to investigate the pollution error and the number of iterations needed to reach convergence with the proposed solution strategy.

4.1.1. MP 1-A

The first one-dimensional model problem, MP 1-A, is given below −d 2_u(x) d x2 −k 2_{u(x) = 0}_{, x ∈ Ω = (0, 1),} ₍₂₈₎ u(x) = 1, x = 0, u′(x) − i ku(x) = 0, x = 1.

Here, homogeneous Dirichlet and Sommerfeld boundary conditions are applied on the left and right boundary, respectively. The exact solution for MP1-A is given by u(x) = ei k x_{. Model problem MP 1-A will be adopted to}

investigate the pollution error for various values of the approximation order p of the B-spline basis functions. 4.1.2. MP 1-B

Model problem MP1-B involves an inhomogeneous source term. Furthermore, Dirichlet boundary conditions are applied on both boundaries, resulting in the following model problem

−d 2_u(x) d x2 −k 2_{u(x) =}_{δ(x −} 1 2), x ∈ Ω = (0, 1), (29) u(x) = 0, x =0, u(x) = 0, x =1.

Here,δ denotes the Dirac delta function. The analytic solution of MP1-B is based on the Green’s function of this model problem and is given by

u(x) = 2 ∞ ∑ j =1 sin( jπx) sin ( jπ1₂) j2π2₋_k2 , x ∈ Ω = [0, 1], k2 ̸= j2π2, j = 1, 2, 3, . . . .

Note that, for k2= j2π2, the eigenfunction expansion would become defective as this would imply resonance and unbounded oscillations in the absence of dissipation. Therefore, we explicitly impose the extra condition k2̸= j2π2 asserting that our Green’s function exists.

(10)

By imposing Dirichlet boundary conditions, the resulting system matrix exhibits the most unfavorable distribution of the eigenvalues [45]. Note that the inclusion of Sommerfeld radiation conditions instead would slightly shift the eigenvalues away from the origin due to the natural occurring damping.

4.2. Two-dimensional model problems

Both two-dimensional model problems will be presented in this subsection. In particular, we consider model problems involving a constant and non-constant wave number.

4.2.1. MP 2-A

In two dimensions, we consider as MP 2-A the natural extension of MP 1-B to two dimensions: −∆u(x, y) − k2u(x, y) = δ(x −1 2, y − 1 2), (x, y) ∈ Ω = (0, 1) 2_, ₍₃₀₎ u(x, y) = 0, (x, y) ∈ ∂Ω, Again, the analytic solution is given by the Green’s function:

u(x, y) = 4 ∞ ∑ i =1 ∞ ∑ j =1

sin(iπx) sin (iπ1₂) sin( jπy) sin ( jπ1 2 ) i2π2₊ _j2π2₋_k2 , (x, y) ∈ Ω = [0, 1] 2_, ₍₃₁₎ k2 ̸=i2π2+j2π2, i, j = 1, 2, 3, . . . . 4.2.2. MP 2-B

As model problem MP 2-B, we consider a non-constant wave number k = k(x, y), an inhomogeneous source function and Dirichlet boundary conditions on the entire boundary∂Ω.

−∆u(x, y) − k2(x, y)u(x, y) = δ(x −1 2, y − 1 2), (x, y) ∈ Ω = (0, 1) 2_, ₍₃₂₎ u(x, y) = 0, (x, y) ∈ ∂Ω.

Here, k(x, y) is chosen to be a two-dimensional step function consisting of 16 different values. For a fixed value of k, the values vary between 1₂k and 3₂k. Fig. 2 shows the considered field k(x, y) for k = 100. This model problem uses various horizontal layers in order to test the performance of the solver when a variable wave number k(x, y) is used. This is particularly important to investigate as in certain cases for Helmholtz-type problems the underlying solver might diverge. This has been reported for domain decomposition based preconditioners using inexact factorizations [29].

Fig. 2. Wave number distribution for k(x, y). k has been set to have a base value of 100. The figure shows the step-function to illustrate the variation profile of the wave number with respect to the x- and y-direction.

(11)

4.3. Three-dimensional model problems

Finally, two three-dimensional model problems are considered, involving a non-curved and curved geometry. The resulting linear systems for these model problems have been obtained using the open source package GeoPDEs [53]. 4.3.1. MP 3-A

We consider MP 3-A the natural extension of MP 2-A to three dimensions: −∆u(x, y, z) − k2u(x, y, z) = δ(x −1 2, y − 1 2, z − 1 2), (x, y, z) ∈ Ω = (0, 1) 3_, ₍₃₃₎ u(x, y, z) = 0, (x, y, z) ∈ ∂Ω, Again, the analytic solution is given by the Green’s function:

u(x, y, z) = 4 ∞ ∑ i =1 ∞ ∑ j =1 ∞ ∑ l=1

sin(iπx) sin (iπ1₂) sin( jπy) sin ( jπ1

2) sin(lπz) sin (lπ 1 2 ) i2π2₊_j2π2₊_l2π2₋_k2 , (x, y, z) ∈ Ω = [0, 1]3, (34) k2 ̸=i2π2+j2π2+l2π2, i, j, l = 1, 2, 3, . . . . 4.3.2. MP 3-B

As a final model problem, we consider a constant wave number on a curved (single patch) geometry −∆u(x, y, z) − k2u(x, y, z) = δ(x −1 2, y − 1 2, z − 1 2), (x, y, z) ∈ Ω, (35) u(x, y, z) = 0, (x, y, z) ∈ ∂Ω,

Here, Ω is a quarter annulus in three dimensions. Fig. 3shows the considered geometry and the obtained solution for MP 3-B.

Fig. 3. Illustration of the considered geometry (left) and solution (right) for MP 3-B.

4.4. Pollution error

In this section we will briefly discuss the effects of using IgA on the pollution error for the Helmholtz equation. As mentioned previously, the h-version of the error studies have shown that as the wave number k increases, the numerical solution suffers from dispersion errors [54,55]. While in 1D, one can define an exact modified wavenumber which is able to minimize and bound the pollution error, this is not possible in 2D and 3D as this relies on the direction of the waves [54,55]. Thus, instead of resorting to very fine meshes, it has been shown that higher-order methods suffer from less dispersion error and provide a viable alternative to obtaining accurate solutions while keeping the problem size economical [56,57]. In particular, Corollary 4.6 of [56] provides us with the following h-error estimate (given ∥u∥_L2 ∼1)

∥uex−uh∥L2 ≤C(hp+k−1(kh)p) h (1 + k(kh)p−1). (36)

(12)

Fig. 4. L2_{-error under h-refinement for MP 1-A using p = 1 (left) and p = 2 (right) for different wave numbers.}

Note that for p> 1, the error decreases asymptotically faster compared to p = 1 where the error scales at best with k. In order to illustrate these properties, we plot the L2_{-error under mesh refinement for our one-dimensional}

MP 1-A. Fig. 4shows the L2_{-error under mesh refinement for different values of k obtained for p = 1 (left) and}

p =2 (right). Note that, the k-dependence for p = 1 significantly differs from p = 2, as predicted in Corollary 4.6 in [56]. In fact, the numerical results presented in [56] (see Figure 2), showing the relative L2_{-error under mesh}

refinement, are in agreement with the results presented in Fig. 4.

Fig. 5. L2-error for MP 1-A using p = 1 to p = 5 for various wave numbers k. The solid line uses 10 degrees of freedom per wavelength (kh = 0.625) and the dashed line uses 7.5 degrees of freedom per wavelength (kh = 0.825).

While the use of IgA significantly reduces the pollution error, they do not remain pollution-free as the wavenumber becomes very large [57]. We illustrate this using the ’rule of thumb’, where the waves are resolved using 10 degrees of freedom per wavelength. Note that this has been used widely in practice and lies within the pre-asymptotic range for p = 1. InFig. 5we observe that, using kh = 0.625 for p = 2 to p = 5, the L2_{-error with}

respect to the analytical solution decreases. While this leads to significant more accurate solutions, we do observe that as the wave number increases, the L2_{-error increases accordingly. Moreover, as k increases the advantage}

of using p = 5 over p = 4 decreases as both lead to similar accuracy. For standard FEM, this was already observed [58]. Furthermore, decreasing the number of degrees of freedom per wavelength from 10 (solid line) to

(13)

7.5 (dashed line) already results in lower accuracy. In fact, the achieved accuracy for p = 4 and p = 5 with 7.5 degrees of freedom per wavelength is similar to the obtained accuracy for p = 3 when 10 degrees of freedom per wavelength are used. In this work, we will proceed by keeping kh = 0.625 and increasing the order p as we want to examine the extent of the iterative solver within this pre-asymptotic range. However, note that for engineering practices, the error can be bounded in the 0.1 to 1% range, where IgA can provide more accurate solutions using smaller linear systems [57].

4.5. Spectral analysis

We now proceed by analyzing the spectrum of the preconditioned system of MP 1-B. By using Dirichlet boundary conditions, we have the most unfavorable distribution of eigenvalues, allowing us to fully examine the potency of the preconditioner. It is widely known that the near-zero eigenvalue distribution strongly affects the resulting convergence factor of Krylov subspace methods. In general, these eigenvalues close to the origin hamper the convergence of such methods. It should be noted, however, that results for MP 1-A, involving both Dirichlet and Sommerfeld boundary conditions, do not significantly differ from the ones presented in this section. With respect to CSLP, many studies have confirmed that unless the complex shift is kept very small and the inversion is performed exactly, the eigenvalues cluster near the origin [17,20,45]. In this work, we are not inverting the CSLP exactly and we thus need to derive a proxy of the multigrid iteration used to approximate the inverse. This can be done by using the two-grid iteration matrix from a multigrid setting [59]. This leads to the following approximation for B

˜ B−1 _≈_{(I − (}_ωD)−1_B)ν (I − ZB−1 2hZ ⊤_{) (I − (}_ωD)−1_B)ν ,

where B2h denotes the coarse-grid variant of the CSLP, D the diagonal of B and ν denotes the smoothing

steps. Additionally, we use damped Jacobi as a smoother with damping parameter ω = 0.6. Note that for the multigrid cycle, Z is now the standard geometric multigrid prolongation and restriction operator based on the linear interpolation scheme. Using this approximation for B−1_{, we study the eigenvalues of the linear system P}T_B˜−1_A,

where P denotes the adapted deflation preconditioner based on the quadratic B ´ezi er scheme.

Fig. 6 shows the spectra of the preconditioned linear system for k = 50 (left) and k = 500 (right) for different values of p. The complex shift has been set toβ = 1 and one pre- and post-smoothing step has been used. Note that half of the eigenvalues of the preconditioned system will be projected onto the origin. The other half of the eigenvalues will therefore be non-zero. For k = 50 (left), all eigenvalues for a fixed value of p have a spiral shape, apart for the case p = 1. Furthermore, the angle between the eigenvalues and the real-axis in Quadrant 2 becomes smaller for higher values of p. Therefore, we can expect a p-dependency for small values of k for p ≥ 2. For k = 500 this becomes even more obvious visually, as the higher number of degrees of freedom leads to more eigenvalues. As the preconditioned operator becomes too large to determine all eigenvalues, it remains unsure how the spectra will further develop for large values of k.

Fig. 6. Spectrum of the preconditioned operator P ˜B−1_{A for different values of p, where k = 50 (left) and k = 500 (right) for MP 1-B.}

(14)

Next, in Fig. 7, we fix p = 2 (left) and p = 5 (right) and let k increase from k = 50 to k = 250. Here we can clearly observe that for p = 2, the eigenvalues remain fairly clustered in a semi-circular shape. Increasing k leads to a larger radius of this semi-circle and therefore a larger spread of the eigenvalues. If we focus on the small box containing a detailed illustration of what is occurring near the origin, we observe that for larger k more and more eigenvalues are starting to move closer towards the origin. Closest to the origin we can clearly see the eigenvalues for k = 250 (purple) and k = 200 (red) appearing. Although the eigenvalues seem less clustered for p = 5, the same general behavior can be observed. Classically, deflation based preconditioners are combined with the CSLP in order

Fig. 7. Spectrum of the preconditioned operator P ˜B−1_{A for different values of k, where p = 2 (left) and p = 5 (right) for MP 1-A. No}

weight-parameter has been included.

to obtain faster GMRES-convergence. Note that the projection matrix P projects a certain part of the spectrum of the coefficient matrix A onto zero. The addition of the CSLP ensures that the remaining non-zero eigenvalues are shifted towards the complex axis, which gives it the typical circular spectrum in the complex plane. However, for finite differences discretizations, the use of the CSLP combined with higher-order deflation is often redundant as wave number independent convergence can already be attained by using deflation without the CSLP [33]. An interesting point of investigation would be to study the spectrum of the preconditioned system PA. In Fig. 8, we study the spectrum of PA where we use the weight-parameterε in order to construct accurate higher-order deflation vectors. We indeed observe that half of the eigenvalues are mapped onto zero and the remaining part of the eigenvalues remains clustered. The eigenvalues no longer cross the negative real axis, which results in the preconditioned system PA being positive semi-definite. Apart from a scaling factor, the spectrum of k = 50 looks similar to the spectrum of k = 250 and illustrative of the k-independent convergence. However if we compare p = 2 (left) to p = 5 (right), we observe that for p = 5 the eigenvalues of PA are closer to zero and have a larger spread between the smallest and largest eigenvalue. For example for k = 250, the eigenvalues for p = 2 lie in the ballpark of 450 to 550, whereas for p = 5 the eigenvalues lie between 50 and 250.

For illustration purposes, we study the effect of interpolating and restricting the fine-grid systems with low accuracy. In Fig. 9, we have plotted the spectrum of PA, where we deliberately set the weight-parameter to a value which lowers the accuracy of the interpolation scheme to construct the deflation matrix Z. It immediately becomes apparent that the resulting preconditioned system is again indefinite as some eigenvalues are still negative. Moreover, if we compare p = 2 (left) to p = 5 (right), we observe a larger spread for p = 2 compared to p = 5. This is the opposite of what we observed inFig. 8. In both cases, the example is illustrative of the fact that having a low-order interpolation scheme to construct the prolongation and restriction operator, will lead to an ineffective mapping of the underlying eigenvalues and eigenvectors. As the wave number increases and the solutions become more oscillatory, the accurate mapping of the fine- and coarse-space becomes of increasing importance. Therefore, we chose a weight-parameter such that the projection error with respect to the eigenvectors is minimized [33].

(15)

Fig. 8. Real part of the spectrum of the preconditioned operator PA for different values of k, where p = 2 (left) and p = 5 (right) for MP 1-B. No weight-parameter has been included.

Fig. 9. Real part of the spectrum of the preconditioned operator PA for different values of k, where p = 2 (left) and p = 5 (right) for MP 1-B. Here we have used the weight-parameterε.

4.6. Numerical experiments

We will now present the convergence results for our model problems using the preconditioners described above. Unless stated otherwise, we set the grid resolution at kh ≈ 0.625, which is equivalent to using 10 degrees of freedom per wavelength. We use GMRES as the underlying Krylov subspace method and use a stopping criterion on the relative residual of 10−7. A serial implementation is considered on an Intel(R) i7-8665 CPU @ 1.90 GHz using 8 GB of RAM.

For the sake of completeness and clarity, we briefly introduce the notation of the preconditioners used in the experiments.

• D := Adapted Deflation Preconditioner (ADP) + GMRES.

• D_ε := Adapted Deflation Preconditioner (ADP) + GMRES using the shift-parameter ε to construct the deflation matrix. The value has been taken from [33] and is constant throughout the use of the numerical experiments.

• Cex :=CSLP (exactly inverted) + GMRES.

• DC_{M G}j := A D P preconditioner + GMRES using j number of multigrid V-cycles combined with (damped) Jacobi smoothing.

• D_εC_{M G}j := A D_εPpreconditioner + GMRES using j number of multigrid V-cycles combined with (damped) Jacobi smoothing.

(16)

4.6.1. One-dimensional model problems

We start by numerically solving MP 1-B using the deflation preconditioner together with the multigrid approximation of the CSLP. We differentiate between deflation with and without the weight-parameter ε and we vary the number of V-cycles between 1 and 10 iterations to obtain a fair approximation of the inverse of the CSLP.

Table 1 shows the number of GMRES iterations for the three different combinations. Starting with DC1_{M G} (first column) we observe that the number of iterations both grow with k and p. These results are in line with the spectral analysis from Section4.5, in particularFigs. 6and7. There we observed that the angle the eigenvalues make with the real axis becomes smaller for increasing p, anticipating some p-dependent convergence. Similarly, in Fig. 6, the radius of the circular shape of the eigenvalues grows with k, leading to the expectation that the number of iterations could grow with k. However, for very large wave numbers such as k = 104_{, we observe that the number}

of iterations is inversely related to p. Note that the spectrum of such large wave numbers has not been examined in this work.

For D_εC1

M G (second column) we solely observe p-dependent and k-independent convergence. In order to obtain

p-independent convergence as well, we allow for more iterations within multigrid in order to construct a more accurate approximation of the inverse of the CSLP. The results reported in the third column confirm that a better approximation of the inverse of the CSLP is indeed able to reduce the p-dependent convergence. Once we add the weight parameter ε to the deflation preconditioner we obtain k-independent convergence up to 106_{. Finally,}

increasing the number of V-cycles to 10 for D_εC10_{M G} (third column) leads to p-independent convergence and shows identical results to inverting CSLP exactly; see Table 3. Note, however, that the application of D_εC10_{M G} is more expensive compared to the application of D_εC1_{M G}. This result is in line with the literature as regards the p-dependent convergence observed for IgA discretizations combined with multigrid. Generally speaking, more smoothing steps and/or intricate smoothers are needed in order to counteract the increasing number of iterations for higher-order IgA schemes.

Table 1

Number of (preconditioned) GMRES iterations to reach convergence for MP 1-B. Here we combine the two-level deflation (D) using quadratic B ´ezi er curves with the CSLP. The shift β has been set to 1. CSLP has been inverted using C1

M G and C 10 M G respectively. k =102 _{k =}₁₀3 _{k =}₁₀4 _{k =}₁₀5 _{k =}₁₀6 N =161 N =1601 N =16 001 N =160 001 N =1 600 001 DC_{M G}1 D_εC1_{M G} D_εC10_{M G} DC1_{M G} D_εC1_{M G} D_εC10_{M G} DC1_{M G} D_εC1_{M G} D_εC_{M G}10 DC1_{M G} D_εC_{M G}1 D_εC10_{M G} DC1_{M G} D_εC1_{M G} D_εC10_{M G} p =1 7 7 5 7 7 5 13 7 5 50 7 5 ∗ 10 5 p =2 5 5 5 6 5 5 10 5 5 28 5 5 ∗ 5 5 p =3 6 6 5 6 6 5 8 6 5 22 6 5 ∗ 6 5 p =4 9 9 5 9 9 5 10 10 5 19 9 5 74 9 5 p =5 16 16 5 16 16 5 13 15 5 21 15 5 46 15 5

Alternatively, cubic B ézi er curves can be adopted as well to construct the deflation matrix Z; seeTable 2. For DC1_{M G}, the use of cubic B ézi er curves leads to a lower number of iterations when higher values of k are considered compared to the use of quadratic B ézi er curves. Compared to D_εC1_{M G} and D_εC10_{M G} (using quadratic interpolation), the number of iterations is similar for most values of k. Note that the use of cubic B ézi er curves, even when combined with CSLP and multiple V-cycles, does not lead to k- and p-independent convergence. Moreover, for k =106_{, the use of DC}10

M G leads a significant higher number of iterations. It should be noted that an optimal weight

parameter has only been determined analytically in literature for the quadratic case. The use of cubic (and higher-order interpolation schemes) is currently an active topic of research. Considering the computational costs associated with a higher-order interpolation scheme, we will adopt quadratic B ´ezi er curves throughout the remainder of this paper.

As mentioned previously, for a finite difference scheme, it has been shown that the deflation preconditioner without CSLP could also lead to close to wave number independent convergence. Thus, analogously, we perform a similar test to examine how well the deflation preconditioner performs with no other preconditioner. We will distinguish two cases; ADP without weight parameter D and ADP with weight parameter D_ε. Results are reported inTable 3, where we compare the number of iterations to the number of iterations obtained by using the (exactly inverted) CSLP with shift k−1 (Cex). Note that, the exactly inverted CSLP leads to iteration numbers independent

(17)

Table 2

Number of (preconditioned) GMRES iterations to reach convergence for MP 1-B. Here we combine the two-level deflation (D) using cubic B ´ezi er curves with the CSLP. The shiftβ has been set to 1. CSLP has been inverted using C1M G and C10M G respectively.

k =102 _{k =}₁₀3 _{k =}₁₀4 _{k =}₁₀5 _{k =}₁₀6 N =161 N =1601 N =16 001 N =160 001 N =1 600 001 DC1 M G DC1M G DC10M G DC1M G DCM G1 DC10M G DC1M G DCM G1 DCM G10 DC1M G DC1M G DC10M G DCM G1 DC1M G DC10M G p =1 7 5 5 7 5 5 7 5 5 8 7 7 16 16 16 p =2 5 5 5 5 5 5 5 5 5 6 6 6 11 11 11 p =3 6 5 5 6 5 5 6 5 5 6 5 5 11 9 6 p =4 10 5 5 10 5 5 10 5 5 10 5 5 12 8 8 p =5 18 6 5 17 5 5 17 5 5 15 6 5 16 6 6

of both k and p. In absence of the weight parameter, the number of GMRES iterations preconditioned with D increases with k and p for wave numbers k< 105_{. These results are similar to the ones reported in}_{Table 1}_{, where}

we observed a similar effect for DC1

M G. The observed number of iterations is also in agreement with the spectral

analysis from Fig. 9in Section4.5. It has been shown that as the accuracy of ADP decreases, the projection error increases, and the eigenvalues are not accurately projected onto the origin. As a result, the number of iterations is expected to increase with k. However, we did observe that this effect is less pronounced for larger values of p, which is why we obtain better convergence for larger values of k when p ≥ 4.

Adding the weight parameter significantly improves the convergence of the GMRES method with respect to k-dependent convergence. In particular, wave number independent convergence is observed for values of k up to 106. This is in line with the spectral analysis from Fig. 8 in Section 4.5. There, we observed that an accurate interpolation scheme ensures that half of the eigenvalues are mapped onto the origin and the spectrum remains as clustered as possible. However, for p = 5 we observed that the smallest and largest eigenvalue lie further away, which could explain the p-dependent convergence, and in particular the higher number of iterations observed for p =5. Thus, similar to multigrid solvers, deflation based solvers are also subjected to p-dependent convergence. The effect can be circumvented by combining both methods and increasing the number of V-cycles.

Table 3

Number of (preconditioned) GMRES iterations to reach convergence for MP 1-B. Here we use GMRES with either two-level deflation (D and Dε) using quadratic B ´ezi er curves or exact inverse of CSLP Cex usingβ = k−1. * indicates that the number of max iterations (100)

has been reached without convergence.

k =102 _{k =}₁₀3 _{k =}₁₀4 _{k =}₁₀5 _{k =}₁₀6 N =161 N =1601 N =16 001 N =160 001 N =1 600 001 D D_ε Cex D Dε Cex D Dε Cex D Dε Cex D Dε Cex p =1 9 9 5 8 9 5 13 9 5 49 9 5 ∗ 11 5 p =2 7 5 5 6 5 5 10 5 5 28 5 5 ∗ 5 5 p =3 8 8 5 8 8 5 10 8 5 20 8 5 ∗ 8 5 p =4 13 13 5 13 13 5 13 13 5 20 11 5 68 13 5 p =5 19 20 5 19 20 5 16 20 5 25 19 5 48 20 5

4.6.2. Two-dimensional model problems

In the previous subsection, it was observed that combining the deflation preconditioner D_εwith the approximated CSLP C_{M G}j yields the best results in terms of iteration numbers. In this subsection, we apply this preconditioner to MP 2-A, the natural extension of MP 1-B to two dimensions. In particular, CPU timings are determined to obtain a fair comparison in terms of computational costs.

Table 4compares DC1_{M G} and D_εC_{M G}12 with the exactly inverted CSLP Cex. For DεCM G12 , we obtain close to k- and

p- independent convergence. Only for p = 5, the number of iterations increases. Here, 3 pre- and post-smoothing steps and a shift ofβ = 4.2 are adopted. It has been shown that, when CSLP is combined with deflation, the shift β can be as large as 10 without negatively affecting the number of iterations [60]. In our experiments, we have

(18)

foundβ = 4.2 to be the optimal value, leading to the least number of iterations. However, letting β vary does not impact the number of iterations significantly [60].

For the Cex preconditioner, a shift of (3k)−1 has been adopted. Both the shift k−1 as well the shiftβ = (3k)−1

does not lead to wave number independent convergence. In fact, Cex uses more iterations for p< 5 in most cases.

This can be explained by the fact that we are using Dirichlet boundary conditions, which are known to cause a less favorable distribution of the eigenvalues compared to the use of Sommerfeld radiation conditions [45]. In particular, keeping the shift k−2 results in wave number independent convergence but leads to very uneconomical systems, which are close to the original coefficient matrix.

Table 4

Number of (preconditioned) GMRES iterations to reach convergence for MP 2-A. Here we combine the two-level deflation (D) using quadratic B ´ezi er curves with CSLP. CSLP has been inverted using C1

M G and C12M G respectively where the shift has been set toβ = 1 and

β = 4.2 respectively. When using Cex, the shift has been set toβ = 3k−1.

k =50 k =100 k =150 k =200 k =250 N =6241 N =25 281 N =57 121 N =101 761 N =159 201 DC1 M G DεC12M G Cex DC1M G DεC12M G Cex DC1M G DεCM G12 Cex DC1M G DεC12M G Cex DC1M G DεC12M G Cex p =1 7 7 7 8 7 8 12 12 10 8 8 9 12 9 10 p =2 10 7 7 10 7 8 10 7 8 11 8 11 12 8 10 p =3 18 6 6 20 9 8 18 7 7 20 7 11 19 7 10 p =4 36 7 6 36 7 8 36 7 7 36 7 11 37 7 10 p =5 85 20 7 86 21 8 87 21 7 86 21 11 21 21 10

Fig. 10 shows the corresponding CPU times to reach convergence with the GMRES method when applying D_εC12

M G and Cex as a preconditioner. The CPU-timings have been obtained using the Matlab 2019b ’tic toc’

command and include the computations needed to build and apply the different preconditioners as well as the GMRES iterations. For k = 50, inverting the CSLP preconditioner exactly leads to the lowest CPU times for all values of p considered. However, from k = 150 already, the opposite holds: D_εC12

M G is computationally more

efficient compared to the exact CSLP preconditioner. This effect becomes more pronounced as k increases. Thus, the larger k, the larger the computational speedup of the deflated preconditioned solver relative to the solver using the exact inversion of the CSLP combined with a small complex shift.

Fig. 10. CPU-time in seconds (s) for p = 2 to p = 5 for MP 2-A. The plot contains the timings for k = 50, 100, 150, 200 and k = 250. DC stands for D_εC12

M G and C stands for Cex usingβ = (3k)−1.

Next, we consider model problem MP 2-B, where the wave number is non-constant and given by a two-dimensional step function. This is an important benchmark as some solvers only perform successfully when a constant wave number is used. Moreover, it allows for testing whether the numerical solver can deal with sharp disruptions in the underlying velocity, which is the main focus of this section. InFig. 11we have plotted the constant

(19)

(left) and variable (right) solution for MP 2-A and MP 2-B respectively using k = 100 as a base wave number. The step-function used to vary k throughout the numerical domain is observed to disrupt the symmetric pattern observed for k = 100 (right).

Fig. 11. Real part of the two-dimensional numerical solution for the constant (left) and non-constant (right) wave number k(x, y) where k =100.

Table 5shows the number of GMRES iterations needed to reach convergence when DC12

M G and Cexare applied as

a preconditioner. With respect to p-dependent convergence, the number of iterations slightly varies with p for both preconditioned systems. In contrast to MP 2-A, however, we also observe a small increase in the number of iterations as k increases for both preconditioned systems. However, in terms of iterations, the deflated preconditioned system needs less iterations compared to the system using the exact inversion of the CSLP and a very small complex shift. Unlike the results from the constant wave number model problem, we therefore report weakly dependent convergence on k. However, note that for p = 5, the convergence appears to resemble wave number independent convergence. We do note that using the deflation preconditioner combined with the multigrid approximation of the CSLP, the number of iterations could be improved by using more V-cycles. These are relatively cheap in terms of computational costs as they are of order O(Ndof) FLOPs and given that the diagonal scaled Jacobi smoother is used.

Table 5

Number of (preconditioned) GMRES iterations to reach convergence for MP 2-B. Here we combine two-level deflation using quadratic B ´ezi er curves with CSLP (DC12

M G). For p< 5 we use 3 pre- and post smoothing steps, whereas for p = 5 we use 2 pre- and post

smoothing steps. CSLP has been inverted using C_{M G}12 where the shift has been set toβ = 4.2. When using Cex, the shift has been set to

β = (3k)−1 _{and CSLP is inverted exactly.}

k =50 k =100 k =150 k =200 k =250 N =6241 N =25 281 N =57 121 N =101 761 N =159 201 DC12 M G Cex DC12M G Cex DC12M G Cex DC12M G Cex DCM G12 Cex p =1 13 12 16 19 22 24 25 27 29 28 p =2 13 13 16 20 20 24 25 29 32 36 p =3 10 13 11 16 14 23 15 28 20 39 p =4 10 13 13 20 12 22 13 26 19 38 p =5 18 13 19 16 17 23 21 29 20 39

The corresponding CPU timings are provided in Fig. 12. The combination of deflation and the approximated deflation preconditioner (DC12

M G) is cheaper for all values of p and k. Hence, already for moderate values of

k, applying the CSLP preconditioner exactly is more expensive. Note that, for higher values of k, the difference between both approaches also becomes more visible in terms of CPU timings. This effect will only be magnified in 3D-applications.

(20)

Fig. 12. CPU-time in seconds (s) for p = 2 to p = 5 for MP 2-B. The plot contains the timings for k = 50, 100, 150, 200 and k = 250. DC stands for DεC12_{M G} and C stands for Cex usingβ = (3k)−1.

4.6.3. Three-dimensional model problems

Finally, we consider model problems MP 3-A and MP 3-B to investigate the performance of our solution strategy in three dimensions on a curved and non-curved geometry. Fig. 13shows the number of (preconditioned) GMRES iterations and CPU times to reach convergence for both MP 3-A and MP 3-B for k = 10, where for p = 5 the coefficient matrix has size 9261 × 9621. For MP 3-A, the number of iterations needed with DC12

M G is higher

compared to the use of Cex. Despite needing more iterations as p increases, the CPU times are significantly lower

when adopting both deflation and the approximated CSLP, with the exception of p = 5. Note that, the higher number of iterations needed for p = 5 to reach convergence has also been observed for the two-dimensional model problems.

As approximating the CSLP is known to reduce the computational costs compared to an exact inversion (in particular in higher dimensions and for larger values of k), results with an approximated CSLP have been added as well. Here, the CSLP is approximated with a single multigrid cycle, usingβ = 1 and applying 25 smoothing steps. Both the value of β and the number of smoothing steps have been determined such that the computational costs are minimized. For all values of p, combining two-level deflation with CSLP leads to lower iteration numbers and CPU times.

For MP 3-B, the use of the approximated or exact CLSP as a preconditioner leads to a substantially higher number of iterations for all values of p compared to DC12_{M G}. Hence, the CSLP is less effective for the model problem involving a curved geometry, leading to significantly higher CPU times.

As the results for MP 3-A and MP 3-B indicate that CSLP is less efficient when applied on a curved geometry, the use of two-level deflation using quadratic B ´ezi er curves as a preconditioner has been investigated as well.

Fig. 14 shows the number of GMRES iterations and CPU times needed for MP 3-B when two-level deflation D_ϵ is applied as a preconditioner for different values of k. For k = 10, the number of iterations to reach convergence lies (roughly) between the number of iterations needed with C1_{M G} and Cex. In terms of CPU times, however, the

use of two-level deflation as preconditioner significantly improves the results. In fact, on the curved geometry for example, we obtain a speedup factor of more than 100× when using two-level deflation without CSLP compared the exact inverted CSLP for k = 10 and p = 5.

(21)

Fig. 13. Number of (preconditioned) GMRES iterations and CPU times to reach convergence for MP 3-A and MP 3-B, where k = 10. Here we combine two-level deflation using quadratic B ´ezi er curves with CSLP (DC12

M G). For p< 5 we use 3 pre- and post smoothing steps,

whereas for p = 5 we use 2 pre- and post smoothing steps. CSLP has been inverted using C12

M G where the shift has been set toβ = 4.2.

When using C1

M G and Cex, the shift has been set toβ = 1 and β = (k)−1, respectively, and CSLP is inverted using a single multigrid cycle

(adopting 25 smoothing steps) or exactly.

Fig. 14. Number of (preconditioned) GMRES iterations and CPU times to reach convergence for MP 3-B. Here we apply two-level deflation using quadratic B ´ezi er curves (Dϵ) as a preconditioner.

5. Conclusion

In this work, we focus on the combination of IgA discretized linear systems with a state-of-the-art iterative solver using deflation and a geometric multigrid method. In particular, we extend the line of research set out by [15], where it was shown that the use of IgA reduces the pollution error significantly compared to p-order FEM. The authors have shown that the use of the exact inverse of the CSLP preconditioner with a small complex shift, yields wave number independent convergence for moderate values of k. Instead of inverting the CSLP exactly and using a small complex shift, we use a standard multigrid method to approximate its inverse and combine it with a two-level

(22)

deflation preconditioner to accelerate the convergence of GMRES. We use a large complex shift in order to ensure that the multigrid algorithm does not diverge.

The use of deflation techniques is motivated by studying the spectrum of the preconditioned systems. Deflation projects the unwanted negative and near-zero eigenvalues corresponding to the smooth eigenmodes onto zero, thereby accelerating the convergence of GMRES. Our spectral analysis shows that for increasing k and p, the spectrum remains well-clustered. This is supported by the numerical results in 1D as the number of iterations remains k- and p-independent for kh constant. If we exclude the CSLP, we obtain k independent convergence and the number of iterations increases slightly with p.

When deflation is combined with CSLP, the number of iterations weakly depends on k and p for kh constant in the 2D case. Starting from k = 150, the deflation based preconditioner combined with the approximate inverse of the CSLP outperforms the exact inversion of the CSLP with shiftβ = (3k)−1_{in terms of CPU-timings. The obtained}

speed-up becomes more significant as the wave number k increases. Results for the highly varying non-constant wave number model show a slight dependence on k but an inversely related dependence on p as the wave number increases. Even for this model problem, the proposed solver outperforms in terms of number of iterations and CPU-timings, when compared to the use of the exact inversion of the CSLP with a small complex shift.

For the three-dimensional model problems, the deflation based preconditioner combined with the approximate inverse of the CSLP outperforms the exact inversion of the CSLP with shift β = (k)−1 in terms of CPU-timings already for k = 10. Interestingly, the CSLP applied on the curved geometry performs worse compared to the use of two-level deflation in terms of CPU times. On this geometry, we obtain a speedup factor of more than 100 when using two-level deflation without CSLP compared to the exact CSLP. Future research will focus on further analyzing this behavior in more detail.

Although the considered model problems in this paper all involve single patch geometries, our solution strategy can (in principle) also be applied to multipatch geometries. Here, the resulting system matrix has a block structure, as is the case with Domain-Decomposition methods. As practical applications often involve multiple patches, future research will focus on exploring the use of the approximate inverse of the CSLP and/or deflation to solve the Helmholtz equation on multipatch geometries, possibly adopting strategies as often employed in Domain-Decomposition methods.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] F. Ihlenburg, I. Babuška, Dispersion analysis and error estimation of Galerkin finite element methods for the Helmholtz equation, Internat. J. Numer. Methods Engrg. 38 (22) (1995) 3745–3774.

[2] F. Ihlenburg, I. Babuška, Finite element solution of the Helmholtz equation with high wave number part II: the h − p version of the FEM, SIAM J. Numer. Anal. 34 (1) (1997) 315–358.

[3] F. Ihlenburg, I. Babuška, Solution of Helmholtz problems by knowledge-based FEM, Comput. Assist. Mech. Eng. Sci. 4 (1997) 397–416.

[4] E. Turkel, D. Gordon, R. Gordon, S. Tsynkov, Compact 2D and 3D sixth order schemes for the Helmholtz equation with variable wave number, J. Comput. Phys. 232 (1) (2013) 272–287.

[5] O.G. Ernst, M.J. Gander, Why it is difficult to solve Helmholtz problems with classical iterative methods, in: I. Graham, T. Hou, O. Lakkis, R. Scheichl (Eds.), Numerical Analysis of Multiscale Problems, in: Lecture Notes in Computational Science and Engineering, Springer, 2011, pp. 325–363.

[6] O.G. Ernst, M.J. Gander, Multigrid methods for Helmholtz problems: A convergent scheme in 1D using standard components, in: I. Graham, U. Langer, J. Melenk, M. Sini (Eds.), Direct and Inverse Problems in Wave Propagation and Applications, De Gruyter, 2013, pp. 135—186.

[7] T. Hughes, J. Cottrell, Y. Bazilevs, Isogeometric analysis: CAD, finite elements, NURBS, exact geometry and mesh refinement, Comput. Methods Appl. Mech. Engrg. 194 (2005) 4135–4195.

[8] T. Hughes, A. Reali, G. Sangalli, Duality and unified analysis of discrete approximations in structural dynamics and wave propagation: Comparison of p-method finite elements with k-method NURBS, Comput. Methods Appl. Mech. Engrg. 197 (2007) 4104–4124.

[9] A. Buffa, G. Sangalli, R. Vázquez, Isogeometric analysis in electromagnetics: B-splines approximation, Comput. Methods Appl. Mech. Engrg. 199 (17–20) (2010) 1143–1152.

[10] A. Buffa, R. Vázquez, Isogeometric analysis for electromagnetic scattering problems, in: 2014 International Conference on Numerical Electromagnetic Modeling and Optimization for RF, Microwave, and Terahertz Applications, NEMO, IEEE, 2014, pp. 1–3.