• Nie Znaleziono Wyników

Frontal and multi-frontal solvers: orderings, elimination trees, refinement trees

N/A
N/A
Protected

Academic year: 2021

Share "Frontal and multi-frontal solvers: orderings, elimination trees, refinement trees"

Copied!
61
0
0

Pełen tekst

(1)

Maciej Paszynski

Department of Computer Science

AGH University of Science and Technology, Krakow, Poland maciej.paszynski@agh.edu.pl

http://www.ki.agh.edu.pl/en/research-groups/a2s

Frontal and multi-frontal solvers:

orderings, elimination trees,

refinement trees

(2)

Find temperature distribution such that

Finite difference disretization

(3)

(4)

Front matrix

Frontal matrix focuses on forward elimination of the first row

(5)

Front matrix

2nd row = 2nd row – 1/h2 * 1st row

(6)

Front matrix

2nd row = 2nd row – 1/h2 * 1st row

(7)

Front matrix

The first row has been eliminated

Frontral matrix focuses on the elimination of the second row

(8)

Front matrix

3rd row = 3nd row – [1/h2] / [-2/h2] * 2nd row

(9)

Front matrix

3rd row = 3nd row – [1/h2] / [-2/h2] * 2nd row

(10)

Construct multiple frontal matrices in such a way so they sum up to the full matrix

Variables must be split into parts

(11)

First all frontal matrices are constructed

(12)

First and second frontal matrices are sum up into new 3x3 frontal matrix

Its first and second rows are fully assembled +

(13)

+

First column is eliminated by using first row 2nd row = 2nd row – [1/h2] * 1st row

(14)

+

2nd row = 2nd row – [1/h2] * 1st row

(15)

Third and fourth frontal matrices are sum up into new 3x3 frontal matrix

Now only the second row is fully assembled

+ +

(16)

Change of the ordering

+ +

(17)

+ +

Eliminate entries below the diagonal

2nd row = 2nd row – [1/h2] / [-2/h2] * 1st row 3rd row = 3rd row – [1/h2] / [-2/h2] * 1st row

(18)

+ +

Why can I substract fully assembled 1st row from not fully assemled rows 2nd and 3rd ?

Eliminate entries below the diagonal

2nd row = 2nd row – [1/h2] / [-2/h2] * 1st row 3rd row = 3rd row – [1/h2] / [-2/h2] * 1st row

(19)

+ +

This is because substraction and addition are interchangable (now I am substructing from the 2nd not fully assembled row, and later I will add the remaining part)

Moreover the 1st row which is utilized for the substractions in other columns contains only zeros

Eliminate entries below the diagonal

2nd row = 2nd row – [1/h2] / [-2/h2] * 1st row 3rd row = 3rd row – [1/h2] / [-2/h2] * 1st row

(20)

+ +

Eliminate entries below the diagonal

2nd row = 2nd row – [1/h2] / [-2/h2] * 1st row 3rd row = 3rd row – [1/h2] / [-2/h2] * 1st row

(21)

Fifth and sixth frontal matrices are sum up into new 3x3 frontal matrix

Now the second and third rows are fully assembled

+ + +

(22)

+ +

+

Continue until root node

+ node

+ root

(23)

All frontal matrices are generated at the same time

Procesor 1 Procesor 2 Procesor 3 Procesor 4 Procesor 5 Procesor 6

(24)

Summing up and elimination are executed at the same time over different pairs of frontal matrices

+ + +

Procesor 1 Procesor 2 Procesor 3 Procesor 4 Procesor 5 Procesor 6

(25)

+ +

+

Procesor 1 Procesor 2 Procesor 3 Procesor 4 Procesor 5 Procesor 6 Summing up and elimination are executed at the same time

over different pairs of frontal matrices

(26)

+ +

+

The agorithm is recursively repeated until we reach the root of the tree

The algorithm results in upper trianguler matrix stored in distributed manner Computational complexity = height of the tree = logN (where N = #unknowns-1)

Procesor 1 Procesor 2 Procesor 3 Procesor 4 Procesor 5 Procesor 6 +

node

+ root

(27)

a ( x ) u ' ( x )' b ( x ) u ' ( x ) c ( x ) u ( x ) f ( x )

  l

C

u

2

0 ,

Find such that

0 ) 0 (  u

 

 ( ) )

( ' )

( l u l u l

a

Strong formulation

Weak formulation

' ' '( ) ( ) ( )

0 0

l v dx

v f l

v l u dx

cuv v

bu v

au

l l

  

 

Find such that

uV

1

  0 , : ( 0 ) 0

v V v H l v

  u v l   v

b , 

(28)

Finite element method disretization

   

e e l e j N

b

a i j j

N

i

i , 1,...,

1

N

i

i i

e a u

1

e

j

v

   

 

1 , 5 . 0 0

5 . 0 , 0 2

1

1 for x

x for x x

e

   

 

1 , 5 . 0 0

5 . 0 , 0

1 2

x for

x for dx

x de

   

 

1 , 5 . 0 2

2

5 . 0 , 0 2

2 x for x

x for x x

e

   

 

1 , 5 . 0 2

5 . 0 , 0

2 2

x for

x for dx

x de

   

 

1 , 5 . 0 1

2

5 . 0 , 0 0

3 x for x

x x for

e

   

 

1 , 5 . 0 2

5 . 0 , 0

3 0

x for

x for dx

x de

Exemplary basis functions for [0,l] = [0,1], for two finite elements

  

, ' ' '

( ) ( )

0

l e l e dx

e ce e

be e

ae e

e

b i j

l

j i j

i j

i j

i

  

 

( )

0

l e dx

e f e

l j

l

j

j



(29)

     

     

     

   

 



















3 2 1

3 2 1

3 3 3

2 3

1

2 3 2

2 2

1

1 3 1

2 1

1

, ,

,

, ,

,

, ,

,

e l

e l

e l

a a a

e e b e

e b e

e b

e e b e

e b e

e b

e e b e

e b e

e b

N

i

i i

e a u

1

e

j

v

Exemplary basis functions for [0,l] = [0,1], for two finite elements

  

, ' ' '

( ) ( )

0

l e l e dx

e ce e

be e

ae e

e

b i j

l

j i j

i j

i j

i

   

 

( )

0

l e dx

e f e

l j

l

j

j



1

1 1

e K

1 2

2 2 1

K

e  K   2

3 2

e  K

,

 

1 ' ' '

1( ) 3( ) 0

0

3 1 3

1 3

1 3

1 e

ae e be e ce e dx e l e l e

b

  

0

.5 

0

2 1 2

1 2

1 1

1 1

0

2 1 2

1 2

1 2

1 1 1 1

1 1

1

) ( ) ( '

' '

, c dx

dx b d dx

d dx a d l

e l e dx

e ce e

be e

ae e

e

b K K K

K K

K     

 

  

1  

5 . 0

2 1 2

1 2

1 3

2 1

0

3 2 3

2 3

2 3

2 2 2 2

2 2

2

) ( ) ( '

' '

, c dx

dx b d dx

d dx a d l

e l e dx

e ce e

be e

ae e

e

b K K K

K K

K     

 

(30)

     

     

     

   

 



















3 2 1

3 2 1

3 3 3

2 3

1

2 3 2

2 2

1

1 3 1

2 1

1

, ,

,

, ,

,

e l

e l a

a e

e b e

e b e

e b

e e b e

e b e

e

b

N

i

i i

e a u

1

e

j

v

Exemplary basis functions for [0,l] = [0,1], for two finite elements

  

, ' ' '

( ) ( )

0

l e l e dx

e ce e

be e

ae e

e

b i j

l

j i j

i j

i j

i

   

 

( )

0

l e dx

e f e

l j

l

j

j



1

1 1

e K

1

2 2

2 1

K

e  K   2

3 2

e  K

    





5 . 0

0

2 1 1

1 1

1 1

1 1

0

1 1 1

1 1 1 1

1 1 1

1 1

1

) ( ) ( '

' '

, c dx

dx bd dx

d dx ad l

e l e dx

e ce e

be e ae e

e

b K K

K K

K

   

   









1

5 . 0

2 1 1

1 1

1 5

. 0

0

2 2 2

2 2

2

2 2 1

0

2 2 2

2 2 2 2

2

2 2

2 2

2 1

1 1 1

1

) ( ) ( '

' ' ,

dx dx c

bd dx d dx ad dx

dx c bd dx d dx ad

l e l e dx e ce e

be e ae e

e b

K K

K K

K K

K K K

K

   

 





 

1

5 . 0

2 2 2

2 2

2 3

3 1

0

3 3 3

3 3

3 3

3 2 2

2 2

2

) ( ) ( '

' '

, c dx

dx b d dx

d dx a d l

e l e dx

e ce e

be e

ae e

e

b K K

K K

K

(31)

   

       

   

     

 

2 2 1

1

2 2 2

2

2 2 2

2 1

1 1

1

1 1 1

1

2 1 2

1

3 2 1

2 2 2

1

1 2 1

1 2

2 2

1

1 2 1

1

, ,

0

, ,

, ,

0 ,

,

K K K

K

K K K

K

K K K

K K

K K

K

K K K

K

l l l

l a

a a b

b

b b

b b

b b

   

     

 









K K K

K K

K K

K

K K K

K

l l x

x b

b

b b

2 1 2

1 2

2 2

1

1 2 1

1

, ,

, ,

     

     

     

   

 







 











3 2 1

3 2 1

3 3 3

2 3

1

2 3 2

2 2

1

1 3 1

2 1

1

, ,

,

, ,

,

, ,

,

e l

e l

e l

a a a

e e b e

e b e

e b

e e b e

e b e

e b

e e b e

e b e

e b

Notice that when we switch from finite difference to finite elements, it only changes the local systems of equations at tree nodes

(32)

,

 

a ' ' b ' c

dx (l) (l)

b iK iK

K

K j K i K

j K i K

j K i K

j K

i          

 

  

 

e f dx (l)

l Kj

K

K j

j

  

1 2

2 2 1

K

e  K 

Global basis functions are composed with local shape functions, e.g.

   

     

  



 

 



 



 

 

 

K K K

K K

K K

K

K K

K K

l l x

x b

b

b b

2 1 2

1 2

2 2

1

1 2

1 1

, ,

, ,

Local system of equations generated over the element K

,

 

a ' ' b ' c

dx (l) (l)

b iK iK

K

K j K i K

j K i K

j K i K

j K

i          

 

  

 

e f dx (l)

l Kj

K

K j

j

  

(33)

   

     

 







K K K

K K K K K

K K K K

l l x x b

b

b b

2 1 2

1 2 2 2 1

1 2 1

1

, ,

, ,

   

     

 







K K K

K K K K

K

K K K

K

l l x x b

b

b b

2 1 2

1 2 2 2

1

1 2 1

1

, ,

, ,

   

     

 









K K K

K K K K

K

K K K

K

l l x

x b

b

b b

2 1 2

1 2 2 2

1

1 2 1

1

, ,

, ,

   

     

 







K K K

K K K K K

K K K K

l l x x b

b

b b

2 1 2

1 2 2 2 1

1 2 1 1

, ,

, ,

   

     

 







K K K

K K K K K

K K K K

l l x x b

b

b b

2 1 2

1 2 2 2 1

1 2 1

1

, ,

, ,

   

     

 









K K K

K K K K

K

K K K

K

l l x x b

b

b b

2 1 2

1 2 2 2

1

1 2 1

1

, ,

, ,

Notice that when we switch from finite difference to finite elements, it only changes the local systems of equations at tree nodes

(34)

1

e

hp

e

hp3

e

hp5

7

e

hp

e

hp9

8

e

hp

1 i

i hp i

hp

e u

u

We seek the solution u of some weak form of PDE as a linear

combination of shape functions spread over finite element mesh

e

hpi

4

e

hp 2

e

hp

(35)

1

ehp ehp3 ehp5

2

ehp ehp4

7

ehp ehp9

8

ehp

,    1 ,..., 15

15

1

 

j e

l e

e b

u

hpj

i

j hp i

hp i

hp

The coefficients

(called „degrees of freedom” d.o.f.) are obtained by solving

system of linear equations – finite element disecretization

of a weak (variational) form of PDE

i

u

hp

e

hpi

e

hpj

b , l   e

hpj

where and

are some integrals of shape functions

j hp i

hp

e e ,

e hp i e hp j

b ,

(36)

Generates frontal matrix for the first element, eliminates fully assembled degrees of freedom

SOLUTION BASED ON LINEAR ORDER OF ELEMENTS

Partial

forward elimination O(6*9^2)

(37)

Generates frontal matrix for the second element, merges with the current frontal matrix

eliminates fully assembled degrees of freedom

SOLUTION BASED ON LINEAR ORDER OF ELEMENTS

Full

forward elimination

(38)

SOLUTION BASED ON THE ELIMINATION TREE

Partial

forward elimination

(39)

Partial

forward elimination

SOLUTION BASED ON THE ELIMINATION TREE

(40)

Full forward elimination

of the interface problem matrix

SOLUTION BASED ON THE ELIMINATION TREE

(41)

Number of operations for partial forward elimination

(42)

BASED ON THE HISTORY OF MESH REFINEMENTS

For any initial mesh, the elimination tree can be created based on nested dissection algorithm.

(43)

BASED ON THE HISTORY OF MESH REFINEMENTS

The elimination tree created for the initial mesh is updated when the mesh is refined (elimination tree is constructed dynamically, during mesh refinements)

(44)

BASED ON THE HISTORY OF MESH REFINEMENTS

The elimination tree created for the initial mesh is updated when the mesh is refined (elimination tree is constructed dynamically, during mesh refinements)

(45)
(46)
(47)
(48)
(49)

• Local matrices for active elements – leaves of the elimination tree – are created

Level of

initial mesh elements

Level of

refinement trees

(50)

• Interior degrees of freedom are eliminated at every leaf

Level of

initial mesh elements

Level of

refinement trees

(51)

• Schur complement contributions are merged at parent level

Level of

initial mesh elements

Level of

refinement trees

(52)

• Fully assembled degrees of freedom are eliminated at parent nodes level (degrees of freedom related to common edges shared by both son elements can be eliminated now)

Level of

initial mesh elements

Level of

refinement trees

(53)

• Contributions to Schur complement are merged again at the next level

Level of

initial mesh elements

Level of

refinement trees

(54)

• Fully assembled degrees of freedom are eliminated

Level of

initial mesh elements

Level of

refinement trees

(55)

• Finally, Schur complement contributions are merged at the tree root node

Level of

initial mesh elements

Level of

refinement trees

(56)

• The common interface problem is solved at the tree root node

(The size of the common interface problem corresponds to the size of crossection of the entire domain)

Level of

initial mesh elements

Level of

refinement trees

(57)

Level of

initial mesh elements

Level of

refinement trees

• The solution obtained at root node is utilized at son nodes.

• The backward substitution is executed

(58)

Level of

initial mesh elements

Level of

refinement trees

• The solution utilized at the second level nodes is utilizes at their sone nodes.

• The backward substitution is executed

(59)

Level of

initial mesh elements

Level of

refinement trees

• The solution obtained at the third level nodes is utilized at leaf nodes.

• The backward substitution is executed on the level of leaf nodes

(60)

MEMORY MULTI-FRONTAL PARALLEL SOLVER

- Embeded into hp code

+ Outperforms MUMPS for large number of processors - Slower than MUMPS for low number of processors

Profiling showed that the time consuming part for low number of processors was actually process of merging of matrices –

performed on the level of unknowns, with moving of matrix entries.

Switch to the level of nodes, do not touch matrix entries –work with pointers 1,482,570 degrees of freedom (68,826,475 non-zeros)

(61)

http://home.agh.edu.pl/~paszynsk/Publications.html

Maciej Paszyński, David Pardo, Carlos Torres-Verdin, Leszek Demkowicz, Victor Calo

A PARALLEL DIRECT SOLVER FOR SELF-ADAPTIVE HP FINITE ELEMENT METHOD

Journal of Parallel and Distributed Computing, 70, 3 (2013) 270-281

Anna Paszynska, Maciej Paszynski, Konrad Jopek, Maciej Wozniak, Damian Goik, Piotr Gurgul, Hassan AbouEisha, Mikhail Moshkov, Victor Calo, Andrew Lenharth, Donald Nguyen, Keshav Pingali

QUASI-OPTIMAL ELIMINATION TREES FOR 2D GRIDS WITH SINGULARITIES

Scientiffic Programming, (2015) Article ID 303024, 1-18

Maciej Paszynski, David Pardo, Anna Paszynska, Leszek Demkowicz OUT-OF-CORE MULTI-FRONTAL SOLVER FOR MULTI-

PHYSICS HP ADAPTIVE PROBLEMS

Procedia Computer Science, 4 (2011) 1788-1797

Cytaty

Powiązane dokumenty

Research Center for Systems and Technologies (SYSTEC), Faculty of Engineering, University of Porto We consider continuous-time optimal control and model.. predictive control

This model aims to present the number of downloaders and seeds watching a video i and the average downloading speed at a peer as a function..

PERFORMANCE OF DIRECT SOLVERS ON H-ADAPTED GRIDS submitted to SIAM Journal of Numerical

Computational cost for a leaf = cost of elimination of interior degrees of freedom Computational cost for a node = cost for son1 + cost for son2 +.. cost of elimination of

(A1)-(A) 4 -(AN) (generation of frontal matrices at leaves of the elimination trees) (A2) 3 (merging contributions at father nodes). (E2) 3 (elimination of fully

For continuous data (e.g. metric in analytic form or gathered from the curvature of the surface) all nodes of the current control space structure are initialized with the available

Wat betreft de maximale golfsteilheid, is er vooral gekeken naar de resultaten van de SWAN berekeningen (de ééndimensionale sommen, §4.2.2, en de nieuwe SWAN resultaten voor de

W ustępie czwartym ustanowiono, że w celu ułatwienia ochrony oznaczeń geograficznych dla win zostaną podjęte w Radzie TRIPS negocjacje dotyczące ustanowienia wielostronnego systemu