
Frontal and multi-frontal solvers: Graph grammar based model of concurrency


(1)

Maciej Paszynski

Department of Computer Science

AGH University of Science and Technology, Krakow, Poland

maciej.paszynski@agh.edu.pl

http://home.agh.edu.pl/paszynsk

http://www.ki.agh.edu.pl/en/staff/paszynski-maciej

http://www.ki.agh.edu.pl/en/research-groups/a2s

Main collaborators:

Victor Calo (KAUST)

Leszek Demkowicz (ICES, UT)

David Pardo (IKERBASQUE)

Frontal and multi-frontal solvers: Graph grammar based model of concurrency

(2)

GENERATION OF 1D ELIMINATION TREE

1D elimination tree obtained by executing productions (P1)-(P2)^2-(P2)^2-(P3)^6, where the exponent is the number of times a production is fired at that stage

(3)

GRAPH GRAMMAR PRODUCTIONS AS ATOMIC TASKS

We assign indices to the grammar productions in order to localize the places where the productions were fired:

(P1)-(P2)1-(P2)2-(P2)3-(P2)4-(P3)1-(P3)2-(P3)3-(P3)4-(P3)5-(P3)6

(4)

TRACE THEORY BASED SCHEDULER

Dependency relation for construction of the elimination tree:

(P1) D {(P2)1, (P2)2}

(P2)1 D {(P2)3, (P2)4}

(P2)3 D {(P3)1, (P3)2}

(P2)4 D {(P3)3, (P3)4}

(P2)2 D {(P3)5, (P3)6}

Alphabet:

A = {(P1), (P2)1, (P2)2, (P2)3, (P2)4, (P3)1, (P3)2, (P3)3, (P3)4, (P3)5, (P3)6}

(5)

TRACE THEORY BASED SCHEDULER

Dependency graph


(7)

TRACE THEORY BASED SCHEDULER

Scheduling according to Foata Normal Form:

The sequence of productions

(P1)-(P2)1-(P2)2-(P2)3-(P2)4-(P3)1-(P3)2-(P3)3-(P3)4-(P3)5-(P3)6

is grouped into the Foata classes

[(P1)] [(P2)1 (P2)2] [(P2)3 (P2)4 (P3)5 (P3)6] [(P3)1 (P3)2 (P3)3 (P3)4]

Thus, the execution of the solver consists of several steps in which independent tasks are executed concurrently, separated by synchronization barriers.

    

 

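Foata Normal Form:

$$[a^1_1 a^1_2 \cdots a^1_{l_1}]\,[a^2_1 a^2_2 \cdots a^2_{l_2}] \cdots [a^n_1 a^n_2 \cdots a^n_{l_n}]$$

where

$$a^k_i \in A \quad \text{(alphabet)},$$

$$a^k_i \, I \, a^k_j \quad \text{for } i \neq j,\ i, j = 1, \dots, l_k, \quad \text{where } I = A \times A \setminus D,$$

$$\forall\, a^{k+1}_i \ \exists\, a^k_j : \ a^k_j \, D \, a^{k+1}_i .$$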
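The grouping into Foata classes can be reproduced mechanically from the dependency relation. The following is a minimal sketch, not the scheduler used by the authors: it assumes that each task belongs to the class whose index equals the length of the longest chain of predecessors leading to it. The names `DEPENDS` and `foata_classes` and the identifiers such as `P2_1` (standing for (P2)1) are illustrative.

```python
from collections import defaultdict

# Dependency relation for the tree generation: task -> tasks that depend on it.
# "P2_1" stands for (P2)1, and so on (naming is an assumption of this sketch).
DEPENDS = {
    "P1":   ["P2_1", "P2_2"],
    "P2_1": ["P2_3", "P2_4"],
    "P2_2": ["P3_5", "P3_6"],
    "P2_3": ["P3_1", "P3_2"],
    "P2_4": ["P3_3", "P3_4"],
}


def foata_classes(depends):
    """Group tasks so that class k holds the tasks whose longest
    chain of predecessors has exactly k links."""
    preds = defaultdict(list)
    tasks = set(depends)
    for src, dsts in depends.items():
        for dst in dsts:
            preds[dst].append(src)
            tasks.add(dst)

    level = {}

    def depth(task):
        # Tasks without predecessors get level 0.
        if task not in level:
            level[task] = 1 + max((depth(p) for p in preds[task]), default=-1)
        return level[task]

    classes = defaultdict(list)
    for task in sorted(tasks):
        classes[depth(task)].append(task)
    return [classes[k] for k in sorted(classes)]


for foata_class in foata_classes(DEPENDS):
    print(foata_class)
```

Tasks inside one printed class have no dependencies between them, so they can be fired concurrently; consecutive classes are separated by a barrier.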

(8)

PROCESS OF THE ELIMINATION EXPRESSED BY GRAPH GRAMMAR PRODUCTIONS

Graph grammar production constructing the local matrix for the first sub-interval

Graph grammar production constructing the local matrix for the i-th sub-interval

Graph grammar production constructing the local matrix for the last sub-interval

(9)

PROCESS OF THE ELIMINATION EXPRESSED BY GRAPH GRAMMAR PRODUCTIONS

Generation of frontal matrices at the leaves of the elimination tree, expressed as the execution of graph grammar productions (A1)-(A)^4-(AN), that is, (A1) once, (A) four times and (AN) once

(10)

PROCESS OF THE ELIMINATION EXPRESSED BY GRAPH GRAMMAR PRODUCTIONS

Graph grammar production expressing the merging process

Exemplary merging of two internal contributions

(11)

ASSEMBLING AT PARENT LEVEL

Expression of the solver execution by graph grammar productions:

(A1)-(A)^4-(AN) (generation of frontal matrices at the leaves of the elimination tree)

(A2)^3 (merging contributions at father nodes)

(12)

PROCESS OF THE ELIMINATION EXPRESSED BY GRAPH GRAMMAR PRODUCTIONS

After merging the two internal contributions, the i-th equation is fully assembled and can be eliminated.

Graph grammar production expressing the elimination process

Expression of the solver execution by graph grammar productions:

(A1)-(A)^4-(AN) (generation of frontal matrices at the leaves of the elimination tree)

(A2)^3 (merging contributions at father nodes)

(E2)^3 (elimination of fully assembled nodes)
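The merge-then-eliminate step can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the matrices from the slides: each contribution is assumed to be a 2x2 matrix with a right-hand side over two neighbouring nodes, the merged frontal matrix is ordered [shared node, left outer node, right outer node], and the example values are made up. The function names `merge` and `eliminate` are hypothetical stand-ins for the roles of productions (A2) and (E2).

```python
def merge(left, right):
    """(A2)-like step: overlap two 2x2 contributions on their shared node.

    `left` covers nodes (left_outer, shared); `right` covers (shared, right_outer).
    The result is ordered [shared, left_outer, right_outer].
    """
    (la, lb), (ra, rb) = left, right
    A = [
        [la[1][1] + ra[0][0], la[1][0], ra[0][1]],
        [la[0][1],            la[0][0], 0.0],
        [ra[1][0],            0.0,      ra[1][1]],
    ]
    b = [lb[1] + rb[0], lb[0], rb[1]]
    return A, b


def eliminate(A, b):
    """(E2)-like step: eliminate the first row, which is fully assembled,
    and return the 2x2 Schur complement with its right-hand side."""
    pivot = A[0][0]
    S = [[A[i][j] - A[i][0] * A[0][j] / pivot for j in (1, 2)] for i in (1, 2)]
    g = [b[i] - A[i][0] * b[0] / pivot for i in (1, 2)]
    return S, g


# Illustrative element contribution (values are assumptions, not from the slides).
element = ([[1.0, -1.0], [-1.0, 1.0]], [0.5, 0.5])

frontal, rhs = merge(element, element)
schur, schur_rhs = eliminate(frontal, rhs)
print(schur, schur_rhs)
```

The Schur complement has the same shape as a leaf contribution, which is why the same pair of steps can be repeated at the next level of the elimination tree.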

(13)

PROCESS OF THE ELIMINATION EXPRESSED BY GRAPH GRAMMAR PRODUCTIONS

Finally, we reach the root of the elimination tree.

At the root node, all three equations are fully assembled, and the local system can now be solved.

(14)

ELIMINATION OF FULLY ASSEMBLED NODES

Expression of the solver execution by graph grammar productions:

(A1)-(A)^4-(AN) (generation of frontal matrices at the leaves of the elimination tree)

(A2)^3 (merging contributions at father nodes)

(E2)^3 (elimination of fully assembled nodes)

(A2) – (E2) (merging at parent node followed by elimination)

(Aroot) – (Eroot) (merging at root node followed by full forward elimination)

(15)

PROCESS OF THE BACKWARD SUBSTITUTIONS EXPRESSED BY GRAPH GRAMMAR PRODUCTIONS

At the last stage of the solver execution, we execute partial backward substitutions

(16)

TRACE THEORY BASED SCHEDULER

Dependency relation for the solver algorithm:

{(A1), (A)1} D (A2)1

{(A)2, (A)3} D (A2)2

{(A)4, (AN)} D (A2)3

(A2)1 D (E2)1

(A2)2 D (E2)2

(A2)3 D (E2)3

{(E2)1, (E2)2} D (A2)4

(A2)4 D (E2)4

{(E2)3, (E2)4} D (Aroot)

(Aroot) D (Eroot)

(Eroot) D {(BS)1, (BS)2}

(BS)1 D {(BS)3, (BS)4}

Alphabet:

A = {(A1), (A)1, (A)2, (A)3, (A)4, (AN), (A2)1, (A2)2, (A2)3, (E2)1, (E2)2, (E2)3, (A2)4, (E2)4, (Aroot), (Eroot), (BS)1, (BS)2, (BS)3, (BS)4}

(17)

TRACE THEORY BASED SCHEDULER

Dependency graph


(19)

TRACE THEORY BASED SCHEDULER

Scheduling according to Foata Normal Form:

The sequence of productions

(A1)-(A)1-(A)2-(A)3-(A)4-(AN)-(A2)1-(A2)2-(A2)3-(E2)1-(E2)2-(E2)3-(A2)4-(E2)4-(Aroot)-(Eroot)-(BS)1-(BS)2-(BS)3-(BS)4

is grouped into the Foata classes

[(A1) (A)1 (A)2 (A)3 (A)4 (AN)] [(A2)1 (A2)2 (A2)3] [(E2)1 (E2)2 (E2)3] [(A2)4] [(E2)4] [(Aroot)] [(Eroot)] [(BS)1 (BS)2] [(BS)3 (BS)4]

Thus, the execution of the solver consists of several steps in which independent tasks are executed concurrently, separated by synchronization barriers.
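The alternation of concurrent steps and barriers can be sketched with a thread pool. This is a minimal illustration assuming Python threads stand in for the solver's workers; it is not the GPU implementation used in the experiments. The task bodies are placeholders that only record their names, and identifiers such as `A2_1` (for (A2)1), `run_schedule` and `record` are hypothetical. The class list follows the dependency relation of the solver, with a single (Eroot) class placed after (Aroot).

```python
from concurrent.futures import ThreadPoolExecutor
import threading

# Foata classes of the solver; "A2_1" stands for (A2)1, and so on.
FOATA_CLASSES = [
    ["A1", "A_1", "A_2", "A_3", "A_4", "AN"],
    ["A2_1", "A2_2", "A2_3"],
    ["E2_1", "E2_2", "E2_3"],
    ["A2_4"],
    ["E2_4"],
    ["Aroot"],
    ["Eroot"],
    ["BS_1", "BS_2"],
    ["BS_3", "BS_4"],
]


def run_schedule(classes, task, max_workers=4):
    """Run the tasks of each Foata class concurrently, one class after another."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for foata_class in classes:
            # list() waits for every task of this class to finish,
            # which plays the role of the synchronization barrier.
            list(pool.map(task, foata_class))


log = []
lock = threading.Lock()


def record(name):
    # Placeholder task body: a real solver would do the matrix work here.
    with lock:
        log.append(name)


run_schedule(FOATA_CLASSES, record)
print(log)
```

The order inside one class may vary between runs, but a task never starts before all tasks of the previous class have finished.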

    

 


(20)

NUMERICAL EXPERIMENTS

NVIDIA GeForce 8800 GT with 16 multiprocessors, each having 8 cores (128 cores total)

1D solver: O(log N)

2D solver: O(N log N)

When the number of leaves n is larger than the number of processors p, the execution time must be multiplied by n/p.
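The scaling rule for the 1D solver can be written as a small estimate. This is a rough sketch under stated assumptions: N is taken to equal the number of leaves n, the hidden constant in O(log N) is set to 1, and the result counts abstract parallel steps rather than seconds. The function name `estimated_steps_1d` is hypothetical.

```python
import math


def estimated_steps_1d(n_leaves, n_processors):
    """Rough step count for the 1D solver: about log2(n) tree levels,
    multiplied by n/p when the leaves outnumber the processors."""
    levels = max(1, math.ceil(math.log2(n_leaves)))
    scale = max(1.0, n_leaves / n_processors)
    return levels * scale


# 128 cores, as on the GeForce 8800 GT mentioned above.
print(estimated_steps_1d(128, 128))
print(estimated_steps_1d(1024, 128))
```

With 128 leaves on 128 cores the estimate is governed by the tree depth alone; with 1024 leaves every level costs n/p = 8 times more.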

(21)

PAPERS

Paweł Obrok, Paweł Pierzchała, Arkadiusz Szymczak, Maciej Paszyński

GRAPH GRAMMAR BASED MULTI-THREAD MULTI-FRONTAL PARALLEL SOLVER WITH TRACE THEORY BASED SCHEDULER

Procedia Computer Science, 1, 1 (2010) 1993-2001
