(1) AGH University of Science and Technology, Faculty of Metals Engineering and Industrial Computer Science, Department of Applied Computer Science and Modelling. Development of computationally efficient cellular automata model for recrystallization. PhD thesis. MSc Mateusz Sitko. Supervisor: Professor Łukasz Madej. Cracow 2019.

(3) Acknowledgements
Financial assistance of the NCN project "Evaluation of high performance computing capabilities during modelling of microstructure evolution based on cellular automata method", no. 2016/21/N/ST8/00194, is acknowledged. This research was supported in part by PL-Grid Infrastructure.

(4) I am grateful to my supervisor Professor Łukasz Madej for his guidance and encouragement during the last few years. His extraordinary knowledge and permanent support were invaluable to the completion of the thesis. Moreover, the friendly atmosphere that he creates is always a motivation in everyday hard work. I am also grateful to Professor Krzysztof Banaś for his valuable and constructive comments on my work related to parallelization. Sincere thanks go to my colleagues from the Faculty, especially Krzysztof Muszka and Konrad Perzyński, for their help in the completion of this thesis. Finally, I would like to thank my family for their support and constant motivation during my work. And last but not least, I would like to thank my beloved wife Magda for her continuous support and, most of all, understanding. Without her, this work could never have been realized.
The work is dedicated to Magda.

(5) Contents
Acknowledgements 3
1 Introduction 10
2 Numerical modelling of the static recrystallization 12
  2.1 Fundamentals of the static recrystallization 12
  2.2 Mean field static recrystallization models 13
  2.3 Full field static recrystallization models 16
3 Cellular automata static recrystallization models 24
4 Parallelization of cellular automata algorithms 35
5 Aim of the work 41
6 Cellular automata static recrystallization model 43
  6.1 Preprocessing module 44
  6.2 Computation module 47
  6.3 Postprocessing module 54
  6.4 Implementation details 56
7 Evaluation of the CA SRX model robustness 58
  7.1 Evaluation of representative CA cell size 58
  7.2 Evaluation of representative CA time step length 63
    7.2.1 CA SRX time step length adaptation algorithm 65
  7.3 CA SRX model robustness analysis 66
8 Identification of the CA SRX model 68
9 Mapping the SRX CA model to parallel execution 77
  9.1 Computation domain decomposition 77
    9.1.1 Master – slave parallelization – mode 1 79
    9.1.2 Slave – slave parallelization – mode 2 80
    9.1.3 Combined parallelization – mode 3 82
  9.2 Communication mechanisms 82
  9.3 CA SRX algorithm modifications to allow parallel computations 84
  9.4 Validation of the parallel version of the CA SRX model 86
10 Performance analysis of the parallel CA SRX code 91
  10.1 Computation speedup and parallelization efficiency 91
  10.2 Role of CA space decomposition schemes 93
  10.3 Communication and synchronization overheads 94
  10.4 Scalability investigations 95
11 Conclusions 97
  11.1 CA SRX model - specific conclusions 97
  11.2 Parallelization - specific conclusions 98
12 References 100
Appendix 105
Figure list 107
Table list 113

(6) List of symbols
a, b, m_C, A_C – cellular automata (CA) model parameters
a_x, b_x, c_sp, c_x3, c_x4 – internal state variable (ISV) model parameters
a_1, a_7 – maximum and minimum mobility for misorientation above specified region <θ_1:θ_2>
a_2 : a_6 – polynomial coefficients
A, k, n, p, q – JMAK model material parameters
b_B – Burgers vector
c_i – number of CA cells within the i-th grain neighbourhood
c_i^SRX – number of recrystallized CA cells within the i-th grain neighbourhood
C_0 – nucleation parameter
C_1 – recovery parameter
CS – CA cell size
d – average grain size
difT – threshold values difference
D_0 – diffusion coefficient
D_gr – calculated grain size
D_init – measured grain size
E_i^SRX – energy of a particular cell in the Monte Carlo (MC) algorithm
f_p – volume fraction of precipitates
h_sm – smoothing length
H_i^0 – initial stored energy in the CA cell
H_i – energy stored in the CA cell
H_C – critical amount of energy required for nucleation
id_i – grain id
J_gb – coefficient related to grain boundary energy
k_B – Boltzmann constant
Kink – number of cells in the neighbourhood with the same id as considered grain i
m_1, m_2 – mobility function parameters
minPC – percentage of cells with recrystallization volume fraction below assigned level
minT – threshold value to increase time step
maxPC – percentage of cells with recrystallization volume fraction above assigned level

(7) maxT – threshold value to reduce time step
ms – number of experiments in inverse analysis
M – matrix with orientation relationship between un-recrystallized and recrystallized grains
M_0 – mobility parameter
M_D – rotation matrix of deformed grains
M_R – rotation matrix of recrystallized grains
M_G – grain boundary mobility
M_m – high angle grain boundary mobility
M_θ – mobility parameter related to crystallographic orientation
MCS – Monte Carlo step
\vec{n}_ij – unit vector in level set method
N_CA – number of CA cells in the computational domain
N_NUC – number of nuclei in the CA model
N_ss, N_const, N_max, N_init – parameters in nucleation models
N – nucleation rate
N′ – number of CA/MC neighbours
p(∆E) – probability of the change in the cell state
P_NUC – nucleation probability
P – net pressure
P_E – net pressure related to accumulated energy
P_GB – net pressure related to grain boundary curvature
P_i – id of computational subdomain
P_Z – net pressure related to precipitates
P_ρ – net pressure related to dislocation density
P_ψ,i – cell phase (austenite, ferrite, pearlite, …)
Q – number of available states in the MC algorithm
Q_def – activation energy of deformation
Q_G – activation energy for grain boundary motion
Q_N – activation energy for nucleation
Q_rec – activation energy for static recrystallization
r – position in phase field method
r_1, r_2 – principal radii of the boundary segment

(8) r_p – precipitate radius
rx – number of recrystallized neighbours
RX_fraction – calculated recrystallization volume fraction
RX_init – measured recrystallization volume fraction
RX^fraction_{i,t−1} – recrystallization volume fraction of the i-th cell in the t−1 time step
R – universal gas constant
S_N – volume in which the nucleus can appear
t – current simulation time
t_0.5 – reference time for 50% of recrystallization
tmod – percentage of a time step modification
t_step – length of a time step in the CA iteration
tstepInit – initial time step length
T – temperature
T_R – recrystallization temperature
T_m – melting temperature
v_max – maximum velocity
v_i – velocity of growing grains
\vec{v}_i – velocity of i-th vertex
V_j – material volume related to the j-th particle
w_0, w_1 – scaling parameters for inverse analysis
w_switch – hybrid CA model parameter
W_ij – kernel function
x – random number from the range <0:1>
X – recrystallized volume fraction
Y_i^t – state of the i-th cell in a particular time step t
Y_j^t – state of the j-th neighbouring cell in a particular time step t
Z – Zener-Hollomon parameter
γ – grain boundary energy
γ_HAGB – high angle grain boundary energy
γ_LAGB – low angle grain boundary energy
δ_SiSj – Kronecker delta
δE_ij – stored energy difference across grain boundary

(9) ∆G_ij – driving pressure for the transformation
ε_i – equivalent plastic strain
ε_C – critical equivalent plastic strain value for nucleation
ε̇_i – equivalent plastic strain rate
η – grain boundary thickness
θ – misorientation angle
θ_i – crystallographic orientation
θ_m – high angle grain boundary threshold
θ_max – maximum value of crystallographic orientation
κ – grain boundary curvature coefficient
κ_X – isotropic hardening parameter
µ – shear modulus
µ_ij – interfacial mobility
σ_ij – interfacial energy
τ_0 – initial time of recovery
φ(x, t) – level set function
ϕ_1, Φ, ϕ_2 – Euler angles
ϕ_1′, Φ′, ϕ_2′ – Euler angles of recrystallized grains
⟨⟩ – approximation
α_X – equivalent quantity of recrystallization dependent kinematic hardening tensor

(10) 1 Introduction
Increasing client demand for products with complex shapes and sophisticated properties is the major driving force for changes in the production chains of various metallic components. Nowadays, to obtain the required final material properties at the macro scale, e.g. strength, hardness, corrosion resistance or fatigue resistance, it is not enough to consider process parameters such as temperature, deformation path, deformation rate or number of deformation cycles. Micro scale features such as grain size inhomogeneities, crystallographic texture, volume fraction of different phases or even atomic bonds should also be taken into account. Finding and controlling the correlations between all these elements at various length and time scales can lead to the production of metallic materials for a wide range of present and future practical applications. Thermo-mechanical treatment is one of the possibilities to provide the required range of options to control the final microstructure morphology and in-use properties of products. A combination of deformation and varying process temperatures can be used to initiate and control three major groups of phenomena responsible for the microstructure evolution. The texture evolution is the first one [1–3]. The second group involves phase transformations in metallic materials during both heating and cooling stages [4,5], and the last group is represented by the thermally activated phenomena of recrystallization (static, dynamic and metadynamic) [6–8]. The latter has been experimentally and numerically investigated for many decades, increasing the level of understanding of interactions between processing conditions and microstructural changes [9–11].
Modelling of recrystallization started in the mid-20th century, and works by Jonas in Canada [12], Sellars in Europe [13], Sakai in Japan [14] and Hodgson in Australia [15] can be considered the most important contributions to the development of microstructure evolution models, which today are generally classified as mean field approaches. These mean field models are based on closed-form or differential equations and provide valuable results in the form of average data on flow stress values and process kinetics. Presently, they are still commonly used in practical industrial applications as they are fast and do not require large computing facilities. However, in recent years more advanced approaches called full field models have been intensively developed. They can accurately predict the evolution of microstructure morphology in an explicit way, which significantly increases the amount of information obtained from numerical modelling. Good examples are the explicit full field models based on the phase field (PF) [16] and the level set (LS) methods [17,18]. Both approaches are based on computational continuum mechanics. A conceptually different full field approach to modelling grain growth with explicit consideration of microstructure is represented by the discrete Monte Carlo (MC) and cellular

(11) automata (CA) techniques. Both approaches can also take into account stochastic aspects of deformation. The MC method performs calculations within an arbitrary units framework, and the governing rule controlling changes of cell states is based on energy minimization of the investigated system. The lack of physical units is a limiting factor for practical applications in this case. In contrast, the CA technique is free of this limitation, as it is based on physical units and the defined transition rules can also be based on analytical equations. As a result, the cellular automata technique allows the recrystallization front to be tracked precisely, as the simulation step is related to real time. Complex numerical models simulating microstructure development in metallic materials based on the cellular automata method are published more and more frequently in the scientific literature [10,19–25]. Therefore, the latter technique was selected for investigation within the dissertation. Unfortunately, the computing time of these advanced models is several times longer than that of mean field models, and this is presently the major factor limiting their practical application at the industrial scale. However, the computational power of modern computers has significantly increased in recent years, and it is currently possible to perform even a large number of calculations in an acceptable simulation time [26–29]. In the past, simulation of recrystallization or phase transformation in a 3D domain took weeks if not months, and now it can be done in only a few hours. This progress has created new possibilities for the practical use of cellular automata models. Unfortunately, mapping a conventional CA solution to a CA model that can take advantage of modern computer architectures is not trivial and requires a series of fundamental investigations.
Therefore, issues related to the development of a reliable CA model of static recrystallization (SRX) and to increasing its computational efficiency by application of modern programming techniques and computer capabilities are addressed within the dissertation.

(12) 2 Numerical modelling of the static recrystallization
As mentioned, a large variety of approaches for modelling static recrystallization, with different levels of complexity, can be found in the literature. Both mean field and full field models used for SRX simulations are summarized below; however, particular attention is paid to the progress in cellular automata based models.
2.1. Fundamentals of the static recrystallization
During static recrystallization, which takes place during heat treatment of plastically processed microstructures, highly deformed grains are replaced with new ones, which are free of deformation defects (Fig. 2.1). This is possible due to the nucleation and growth stages of newly formed grains. In contrast to dynamic recrystallization, the static one occurs after deformation. Metal forming processes carried out under cold deformation conditions, such as rolling, forging, drawing or bending, result in a significant increase in the distortion of the sample crystal lattice. This distortion is caused by different types of defects such as dislocations, vacancies, prismatic dislocation loops of vacancies, self-interstitial atoms (SIAs), vacancy clusters, voids or SIA-clusters. Their multiplication and accumulation lead to further work hardening of the deformed component. Then, during subsequent heat treatment, the recrystallization process along with the recovery phenomena (Fig. 2.1) reduces the amount of stored energy and causes an increase in formability and a decrease in material hardness and strength. In this way, high deformation degrees can be obtained during multiple cycles of cold deformation and heat treatment operations.
Fig. 2.1 Schematic diagram of the annealing processes occurring during heat treatment after cold deformation: a) deformed, b) recovered, c) partially recrystallized, d) fully recrystallized microstructures and e) grain growth, f) abnormal grain growth stages [30].

(13) When the deformed material reaches a specified temperature during the heat treatment operation, the grain structure is rebuilt. This temperature is called the recrystallization temperature and is approximately T_R ≈ 0.4 T_m for most metals and alloys (T_m – melting temperature). Above that temperature, rebuilding of the crystallographic lattice takes place as a result of the consumption of stored energy, which is seen in Fig. 2.1c – Fig. 2.1f. The recrystallization temperature T_R depends mostly on the material deformation degree and the addition of alloying elements. It can be reduced by decreasing the amount of alloying elements and increasing the deformation level. When this temperature is reached, the recrystallization process, composed of two phases, nucleation and grain growth, can start. Experimental research shows that a new nucleus appears when the following conditions are fulfilled:
• a large amount of deformation energy is accumulated in a specific area of the material,
• large gradients in the energy distribution are observed, which lead to local instabilities,
• significant lattice curvatures are present.
It can be assumed that when the material is deformed by less than approximately 35%, the mentioned conditions are only fulfilled near primary grain boundaries and at the cross-sections of twins. That is why, when a small deformation level is considered, the final grain size and grain size distribution depend mostly on the grain sizes before deformation. For higher deformation levels, around 50%, new recrystallized nuclei also appear in shear regions. However, primary grain boundaries are always the most probable nucleation sites. The mechanism of nucleation is in general described as a migration of primary grain boundaries caused by deformation or as a result of the growth of sub-grains. After a new nucleus appears in the material, a growth stage can be observed.
The driving force for the grain growth is related to differences in dislocation density between the recrystallized grain and the surrounding deformed grains. Additionally, the grain crystallographic orientation as well as the grain boundary curvature play important roles [31]. As a result, a decrease in dislocation density is observed during recrystallization. Therefore, prediction of recrystallization progress is crucial for designing a new material with sophisticated in-use properties [30]. That is why the next chapter of this work is dedicated to the literature review of different methods developed for simulation of the SRX, starting with the simplest, but fastest, mean field approaches and finishing with complex full field models.
2.2. Mean field static recrystallization models
Conventional mean field models of microstructure evolution during SRX have been commonly applied in simulation of metal forming processes for at least five decades [32–37]. The most widely used representative of mean field approaches for prediction of static recrystallization kinetics is the Johnson, Mehl, Avrami and Kolmogorov (JMAK) model [32–34], which in its basic form can be described as:

(14)
$$X = 1 - \exp\left(-k t^{n}\right), \tag{2.1}$$
where: k, n – material parameters, X – recrystallized volume fraction, t – time. Introducing $k = -\ln(0.5)/t_{0.5}^{n}$ into eq. (2.1), so that $X(t_{0.5}) = 0.5$, provides a slightly different form of the JMAK equation [38]:
$$X = 1 - \exp\left[\ln(0.5)\left(\frac{t}{t_{0.5}}\right)^{n}\right], \tag{2.2}$$
The reference time for 50% of recrystallization can be described as [39]:
$$t_{0.5} = A\,\varepsilon_i^{-p}\,Z^{-q}\exp\left(\frac{Q_{rec}}{RT}\right), \tag{2.3}$$
where: $Z = \dot{\varepsilon}_i \exp\left(Q_{def}/RT\right)$ – Zener-Hollomon parameter, A, q, p – material parameters, R – universal gas constant, T – temperature in K, Q_rec – activation energy of static recrystallization, Q_def – activation energy of deformation, ε_i – equivalent plastic strain, ε̇_i – equivalent plastic strain rate. Along with the volume fraction evolution during SRX, the final grain size d is often calculated as well [30]:
$$d = f(\zeta_i)\,Z^{q}, \tag{2.4}$$
where: q – exponent, f(ζ_i) – function of a set of parameters including the initial average grain size and the value of accumulated plastic strain. The predictive capabilities of different modifications of the JMAK equation for a variety of materials are reviewed in detail in [40–42]. Models based on the JMAK equation are the simplest solutions to obtain the average behaviour of material microstructure during static recrystallization. Due to their simplicity the computation cost is low, and that is why these approaches are commonly used in numerical simulations of industrial processing conditions. Moreover, mean field models for recrystallization and grain growth are often coupled with different computational methods to extend their predictive capabilities. One such combination, of a mean field SRX model with a cluster dynamics model for the evolution of neutron radiation defects, is presented in [43].
For each global time increment, the set of coupled first order nonlinear differential equations of the cluster dynamics model is solved to evaluate the defect evolution, taking into account generation, evolution and annihilation of lattice vacancies as well as self-interstitial defects and their clusters (Fig. 2.2a). Based on this information, changes in the dislocation density are evaluated. After that, recrystallization and grain growth are simulated as seen in Fig. 2.2b.
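As an illustration of the JMAK kinetics described by eqs. (2.1)–(2.3), the following minimal Python sketch evaluates the reference time t_0.5 and the recrystallized fraction X(t). It is only a sketch under stated assumptions: the function names and all parameter values are hypothetical and are not taken from the thesis, and the fraction is written in the normalised form that gives X(t_0.5) = 0.5, i.e. with the exponent ln(0.5)·(t/t_0.5)^n.

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol K)


def zener_hollomon(strain_rate, q_def, temperature):
    """Z = strain_rate * exp(Q_def / (R T)), temperature in K."""
    return strain_rate * math.exp(q_def / (R_GAS * temperature))


def time_for_half_recrystallization(strain, strain_rate, temperature,
                                    a, p, q, q_rec, q_def):
    """Eq. (2.3): t_0.5 = A * strain^-p * Z^-q * exp(Q_rec / (R T))."""
    z = zener_hollomon(strain_rate, q_def, temperature)
    return (a * strain ** (-p) * z ** (-q)
            * math.exp(q_rec / (R_GAS * temperature)))


def jmak_fraction(t, t_half, n):
    """Recrystallized fraction X(t), normalised so that X(t_half) = 0.5."""
    return 1.0 - math.exp(math.log(0.5) * (t / t_half) ** n)


# Hypothetical parameter values, chosen only to exercise the functions.
t_half = time_for_half_recrystallization(
    strain=0.3, strain_rate=1.0, temperature=1173.0,
    a=1.0e-9, p=2.0, q=0.1, q_rec=250.0e3, q_def=300.0e3)

for t in (0.0, 0.5 * t_half, t_half, 2.0 * t_half):
    print(f"t = {t:10.4g} s  X = {jmak_fraction(t, t_half, n=1.5):.3f}")
```

Because the model reduces to a few closed-form evaluations per material point, its cost is negligible compared with the full field approaches discussed later.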

(15) Fig. 2.2. Concept of the multiscale model of microstructure evolution based on the JMAK equation coupled with the cluster dynamics model: a) lattice defect generation, b) grain growth and recrystallization [43].
Another interesting mean field approach for SRX predictions is based on an internal state variable (ISV) [44] constitutive model:
$$\frac{dX_S}{dt} = C_{xs}(P,T)\left[\frac{\kappa_X^{2} + \alpha_X^{2}}{\mu(P,T)}\right] X^{a_x}\,(1 - X)^{b_x}, \tag{2.5}$$
$$C_{xs}(P,T) = c_{x3}\exp\left(-\frac{c_{x4} + P c_{sp}}{T}\right), \tag{2.6}$$
where: a_x, b_x, c_sp, c_x3, c_x4 – parameters, P – hydrostatic pressure, κ_X – isotropic hardening parameter, α_X – equivalent quantity of recrystallization dependent kinematic hardening tensor, µ – shear modulus. Eq. (2.5) is integrated over time, while recrystallization volume fraction increments are multiplied by the inelastic strain and time increments. Mean field models taking into account crystallographic texture evolution during SRX should also be mentioned. One of them, based on nucleation and crystal growth kinetics, is presented by Sebald and Gottstein in [45]. Another interesting one takes into account the crystallographic transformation of deformed samples as an orientation distribution function [46]. Since computing times for mean field models are very short, these models can be solved at each integration (Gauss) point of the FE mesh without causing a noticeable increase in the computing cost (Fig. 2.3). However, at the same time these approaches provide only average information on the state of the material, e.g. the average recrystallized volume fraction or average grain sizes. In contrast, application of full field models, e.g. phase field, level set, Monte Carlo or cellular automata, provides very detailed information on the local material state, but causes a significant increase in computing costs. Therefore, these models are usually attached only to a few selected points of the macro model, as presented in Fig. 2.3.

(16) Fig. 2.3 Schematic illustration of the multiscale approach based on mean (JMAK) and full field RVE (Representative Volume Element) models [47].
A review of the most frequently used full field models for static recrystallization is given in the following section.
2.3. Full field static recrystallization models
The SRX mean field models presented above are based on closed-form or differential equations and provide valuable results. However, more advanced approaches called full field models can also accurately predict the evolution of microstructure morphology in an explicit way. As mentioned, good examples are full field models based on the phase field (PF) and the level set (LS) methods. The major assumption of the phase field model is that the investigated polycrystalline sample is represented in an integral manner. A set of continuous variables φ_i(r, t) representing individual grains has to be defined as a function of location r and time t. Each variable takes a constant value inside a grain (1 within the particular grain and 0 outside it) and changes gradually (from 1 to 0) across a diffuse boundary. At the same time, the sum of all phase field variables is 1 at each time step and at each point in the simulation domain. In accordance with the physical background, the recrystallization model assumes two stages, nucleation and growth. The growth of new nuclei is controlled by the phase field equations. The interface mobilities, interfacial energies and the driving pressure for recrystallization are parameters of these equations, and they determine the evolution of a phase field φ_i:
$$\frac{d\varphi_i}{dt} = \sum_{j} \mu_{ij}\left[\sigma_{ij}\left(\varphi_i \nabla^{2}\varphi_j - \varphi_j \nabla^{2}\varphi_i + \frac{\pi^{2}}{2\eta^{2}}\left(\varphi_i - \varphi_j\right)\right) + \frac{\pi}{\eta}\,\varphi_i \varphi_j\,\Delta G_{ij}\right], \tag{2.7}$$

(17) where: µ_ij – interfacial mobilities, σ_ij – interfacial energies, ∆G_ij – driving pressure for the transformation, which is a function of the temperature T and the local composition, η – grain boundary thickness, φ_i∇²φ_j and φ_j∇²φ_i – gradient terms, (π²/2η²)(φ_i − φ_j) – stabilizing term.
A recent review of phase field approaches was given in [48], where the authors pointed out two commonly used models to simulate grain growth phenomena. The first is a continuum-field model [49] and the second a multi-phase-field model [50,51]. As presented in [16,52], the phase field method can be successfully applied to simulation of the static recrystallization phenomenon. More importantly, a combination with crystal plasticity finite element (CPFE) simulation is possible, with precise tracking not only of the energy value (Fig. 2.4a) but also of changes in crystallographic orientations (Fig. 2.4b) during deformation and heat treatment [51]. Examples of simulation results obtained from the mentioned approach can be seen in Fig. 2.4c and Fig. 2.4d.
Fig. 2.4 Example of the phase field simulation results: a) initial stored energy calculated during deformation by the CPFE model, b) microstructure after deformation, c) microstructure evolution during SRX, d) evolution of crystallographic orientations during SRX [51].
As mentioned, another full field approach is the LS method. An overview of the method assumptions can be found in [53]. Generally, the level set group of methods is based on computing and tracking the position of a moving interface I in two or three dimensions. The interface I is represented by the zero level of the level set function φ(x, t):
$$\begin{cases} \varphi(x,t) > 0 & \text{for } x \text{ inside } I, \\ \varphi(x,t) = 0 & \text{for } x \text{ at } I, \\ \varphi(x,t) < 0 & \text{for } x \text{ outside } I. \end{cases} \tag{2.8}$$
The moving interface can be described by the following formula:
$$\frac{d\varphi}{dt} + \dot{x}\cdot\nabla\varphi = 0, \quad \text{given } \varphi(x, t = 0), \tag{2.9}$$
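The transport of the level set function in eq. (2.9) can be illustrated with a short one-dimensional Python sketch. It assumes a constant, positive interface velocity and a first-order upwind (backward difference) scheme; neither the discretization nor the names `advect_level_set` and `interface_position` come from the cited works, they are chosen here only for illustration.

```python
import numpy as np


def advect_level_set(phi, velocity, dx, dt, steps):
    """Advance phi by `steps` explicit upwind steps of d(phi)/dt + v * d(phi)/dx = 0.

    Assumes velocity > 0, so the backward difference is the upwind one,
    and velocity * dt / dx <= 1 for stability.
    """
    phi = phi.copy()
    for _ in range(steps):
        grad = np.empty_like(phi)
        grad[1:] = (phi[1:] - phi[:-1]) / dx  # backward (upwind) difference
        grad[0] = grad[1]                     # simple extrapolation at the inflow edge
        phi -= dt * velocity * grad
    return phi


def interface_position(x, phi):
    """Locate the zero level (phi > 0 inside, phi < 0 outside) by linear interpolation."""
    k = np.where((phi[:-1] > 0.0) & (phi[1:] <= 0.0))[0][0]
    return x[k] + phi[k] * (x[k + 1] - x[k]) / (phi[k] - phi[k + 1])


# Signed-distance initial condition: the grain occupies x < 0.3.
x = np.linspace(0.0, 1.0, 101)
phi0 = 0.3 - x
phi1 = advect_level_set(phi0, velocity=0.5, dx=0.01, dt=0.01, steps=40)
print(interface_position(x, phi0), interface_position(x, phi1))
```

The interface is never stored explicitly; its position is recovered from the sign change of φ, which is the property that makes topological changes easy to handle in level set models.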

(18) In the digital representation of the investigated microstructure, each grain is associated with one independent level set function φ_i(x, t). During the simulation, the subsequent level set functions change independently. Because this kind of solution can lead to voids and overlaps inside the microstructure, correction algorithms should be implemented in the model, see e.g. [54]. The LS method has found a wide range of applications in simulation of recrystallization phenomena [55–57]. Connecting the level set method with the FE framework [17] made it possible to use adaptive remeshing around moving grain boundaries and to obtain an even better description of moving interfaces. Moreover, with this approach an efficient link between SRX and crystal plasticity models can be achieved. Examples of LS simulations with different nucleation types are presented in Fig. 2.5.
Fig. 2.5 Different nucleation types in SRX simulation based on the level set method with heterogeneous energy distribution between grains: a) random nucleation, b) necklace-type nucleation, c) volumetric nucleation [56].
However, resolving the rapid change of the field variable across the diffuse interface causes a significant increase in computational time for both the level set and the phase field methods, especially in 3D space. A limitation of these two approaches is also associated with the relatively small volume of material (on the order of a few hundred µm³) that can be replicated during an SRX simulation. As presented in [56], a simulation in a 3D domain, after many LS algorithm improvements, required around one day of computation on 24 Intel Xeon CPUs. Without any improvements of the standard LS SRX code, solving the same problem took around 1 month and 20 days. Another full field method for SRX modelling that has recently gained a lot of attention is the Vertex approach, which is based on the front tracking method. The microstructure in this case is again represented explicitly by points r (vertices) and their velocities v.
Grain boundaries between two vertices (ij) are defined as a vector r_ij (Fig. 2.6).

(19) Fig. 2.6 Representation of microstructure within the Vertex method [58].
As can be seen in Fig. 2.6, besides triple point junctions, grain boundary curvature can be represented by virtual vertices. In the literature, the Vertex method is used by many researchers for simulation of grain growth, texture evolution or recrystallization phenomena [58–61]. During simulation of recrystallization, each grain can additionally be described by an individual orientation g_n and a stored energy value E_n. Motion of a vertex is controlled by [62]:
$$D_i \vec{v}_i = \vec{f}_i - \frac{1}{2}\sum_{j}^{(i)} D_{ij}\,\vec{v}_j, \tag{2.10}$$
where: \vec{v}_i – velocity of the i-th vertex, \vec{v}_j – velocity of the j-th vertex. D_ij, D_i and \vec{f}_i are calculated as:
$$D_{ij} = \frac{1}{3 m_{ij}\,\lVert\vec{r}_{ij}\rVert}\begin{bmatrix} y_{ij}^{2} & -x_{ij}y_{ij} \\ -x_{ij}y_{ij} & x_{ij}^{2} \end{bmatrix}, \tag{2.11}$$
$$D_i = \sum_{j}^{(i)} D_{ij}, \tag{2.12}$$
$$\vec{f}_i = -\sum_{j}^{(i)} \gamma_{ij}\,\frac{\vec{r}_{ij}}{\lVert\vec{r}_{ij}\rVert} + \sum_{j}^{(i)} \delta E_{ij}\,\vec{n}_{ij}, \tag{2.13}$$
where: m_ij – grain boundary mobility, γ_ij – energy of the grain boundary, δE_ij – stored energy difference across the grain boundary, \vec{n}_ij – unit vector perpendicular to the \vec{r}_ij vector, which is calculated from:
$$\vec{r}_{ij} = \vec{r}_i - \vec{r}_j, \tag{2.14}$$
An example of simulation results obtained with the Vertex method is presented in Fig. 2.7.
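To make the role of D_ij, D_i and f_i more concrete, the Python sketch below evaluates them for a single triple junction. It is a simplified illustration under explicit assumptions rather than the formulation of [62]: the coupling term with the neighbouring vertex velocities in eq. (2.10) is neglected (the neighbours are treated as fixed), the stored energy term of eq. (2.13) is omitted, all boundaries share one mobility and one energy, and the function names are hypothetical.

```python
import numpy as np


def edge_matrix(r_ij, m_ij):
    """D_ij of eq. (2.11) for the boundary vector r_ij = r_i - r_j."""
    x, y = r_ij
    length = np.hypot(x, y)
    return np.array([[y * y, -x * y],
                     [-x * y, x * x]]) / (3.0 * m_ij * length)


def vertex_velocity(r_i, neighbours, mobility, gamma):
    """Solve D_i v_i = f_i for one vertex with fixed neighbours.

    Assumptions: no coupling to neighbour velocities, no stored energy term.
    """
    d_i = np.zeros((2, 2))
    f_i = np.zeros(2)
    for r_j in neighbours:
        r_ij = r_i - r_j
        d_i += edge_matrix(r_ij, mobility)           # eq. (2.12)
        f_i -= gamma * r_ij / np.linalg.norm(r_ij)   # curvature part of eq. (2.13)
    return np.linalg.solve(d_i, f_i), f_i


# Three neighbours placed 120 degrees apart on the unit circle.
angles = np.deg2rad([90.0, 210.0, 330.0])
neighbours = [np.array([np.cos(a), np.sin(a)]) for a in angles]

v_balanced, _ = vertex_velocity(np.zeros(2), neighbours, mobility=1.0, gamma=1.0)
v_shifted, f_shifted = vertex_velocity(np.array([0.1, 0.0]), neighbours,
                                       mobility=1.0, gamma=1.0)
print(v_balanced, v_shifted)
```

In the balanced configuration the boundary tensions cancel and the junction does not move, whereas a junction displaced from that position is driven back towards it.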

(20) Fig. 2.7 Progress of recrystallization simulation using the Vertex method [61].

As presented in [58], driving forces related to the stored energy and grain boundary curvature, as well as the particle pinning effect, can be included in the vertex formalism. Moreover, combination with crystal plasticity is also straightforward. However, front-tracking methods have some limitations. In particular, handling topological events such as grain shrinkage or the appearance of new nuclei in three-dimensional space is complicated. The implementation complexity of these models is also high, mainly because of the complex topological transformations that must be implemented. Another group of full field methods, which is based on random sampling of a solution space Ω with a finite number of elements, is generally called Monte Carlo (MC). This group of methods is commonly used in many different material science applications, from the generation of microstructures during deposition [63] to the simulation of material evolution during dynamic recrystallization processes [64]. MC algorithms are also commonly used and extended in many research papers to simulate static recrystallization [9,65–67]. All MC SRX models are based on the same set of steps: preparation of the initial microstructure, nucleation, and growth of recrystallized grains. Because nucleation is very complex and hard to measure and investigate experimentally, there is no single model that can describe the nucleation phenomenon, and many different approaches to nucleation have been presented in the literature [68,69]. The second stage of static recrystallization is grain growth, where the nuclei of new recrystallized grains grow into the deformed matrix. Usually, the energy stored in the material due to deformation is the main driving force of that process [70]. MC models based

(21) on arbitrary units of energy are widely used [9,65,71]. SRX calculations are in this case based on the evaluation of the energy value at each MC step with the following formula:

$$E_i^{SRX} = J_{gb} \sum_{j=1}^{N'} \left( 1 - \delta_{S_i S_j} \right) + H_i, \tag{2.15}$$

where: $J_{gb}$ – coefficient related to grain boundary energy, $\delta_{S_i S_j}$ – Kronecker delta, $N'$ – number of neighbours, $H_i$ – stored energy in a particular cell. After the energy computations, a selected lattice site is reoriented to the temporary recrystallized state $Q_i$. The energy value after the reorientation is again calculated with eq. (2.15), but $H_i$ is omitted because the stored energy is reduced during recrystallization. After the change of state, the system energy difference is calculated as:

$$\Delta E^{SRX} = E_{i+1}^{SRX} - E_i^{SRX}. \tag{2.16}$$

The calculation of the probability of state acceptance in the SRX model is extended in comparison to the simple grain growth model. The probability term in this case incorporates the correlation between the misorientation angle and the grain boundary mobility [72]:

$$M_G = M_m \left[ 1 - \exp\left( -m_1 \left( \frac{\theta}{\theta_m} \right)^{m_2} \right) \right], \tag{2.17}$$

where: $M_m$ – high angle grain boundary mobility (a value of 1 is commonly assumed [9,65]), $m_1$, $m_2$ – parameters (e.g. values of 1 and 3 can be used, respectively [30,73]), $\theta$ – misorientation angle between two lattice sites, $\theta_m$ – misorientation angle of a high angle grain boundary (a value of 15° is assumed [49]). Thus, the probability of state acceptance in the MC cell is a function of both the mobility term $M_G$ and the energy difference $\Delta E^{SRX}$ and is calculated as [74]:

$$p = \begin{cases} 1 & \text{if } M_G > x \wedge \Delta E^{SRX} \le 0, \text{ where } x \in \langle 0, 1 \rangle \\ \exp\left( \dfrac{-\Delta E^{SRX}}{kT} \right) & \text{if } \Delta E^{SRX} > 0 \end{cases} \tag{2.18}$$

When the mobility is higher than the threshold value $x$ and the energy difference is lower than or equal to zero, the new state of the investigated MC cell is accepted.
Otherwise there is still some probability of reorientation acceptance, which eliminates the so-called lattice pinning effect [75]. With successive MC steps, the energy of the entire system is reduced. Examples of MC SRX calculations are shown in Fig. 2.8.
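The quantities in eqs. (2.15)–(2.18) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of any of the cited MC models: the function names are hypothetical, the default parameter values follow the examples quoted in the text, and the case that eq. (2.18) leaves undefined ($\Delta E^{SRX} \le 0$ with $M_G \le x$) is assumed here to give a probability of zero.

```python
import math


def site_energy(state, neighbour_states, j_gb=1.0, stored_energy=0.0):
    """Eq. (2.15): J_gb times the number of unlike neighbours, plus H_i."""
    unlike = sum(1 for s in neighbour_states if s != state)
    return j_gb * unlike + stored_energy


def boundary_mobility(theta_deg, theta_m_deg=15.0, m_max=1.0, m1=1.0, m2=3.0):
    """Eq. (2.17): misorientation-dependent grain boundary mobility."""
    ratio = theta_deg / theta_m_deg
    return m_max * (1.0 - math.exp(-m1 * ratio ** m2))


def acceptance_probability(delta_e, mobility, threshold, kt):
    """Eq. (2.18). `threshold` is the random number x drawn from <0, 1>.

    Assumption of this sketch: for delta_e <= 0 and mobility <= threshold,
    which eq. (2.18) does not cover, the reorientation is rejected.
    """
    if delta_e <= 0.0:
        return 1.0 if mobility > threshold else 0.0
    return math.exp(-delta_e / kt)
```

In a full MC step the energy would be evaluated before and after the trial reorientation (with $H_i$ omitted afterwards), and the new state would be kept if a second random number is smaller than the returned probability.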

(22) Fig. 2.8 Progress of a recrystallization simulation using the MC method.

Due to the stochastic nature of the MC method, its major limitation is a long computational time [76]. To accelerate the algorithm, several modifications to the standard approach can be implemented, e.g. selection of a new cell state for reorientation from the states directly adjacent to the analysed cell, analysis of cells located only along the grain boundary, and selection of each cell only once during a particular MC step [77]. All these modifications accelerate computations, but in general the computations are still quite time consuming. Another weakness of this method is its dependence on the random number generator. Simple and fast generators provide a relatively poor distribution of random numbers, while more advanced generators extend the computational time. The cellular automata method, mentioned earlier, can also be classified in the same group as the MC approaches, as it performs calculations within computational spaces with a finite number of cells. In recent years a lot of papers dedicated to the simulation of recrystallization by cellular automata have been published [78,79]. The first mentions of the method can be found in the work presented by von Neumann in 1966 [80]. He defined a deterministic cellular automaton as a model with three major components:

• CA space – a finite set of cells, where each cell is represented by a set of internal and state variables describing the state of the cell.

• Neighbourhood – describes the closest neighbours of a particular cell. It can be defined in 1D, 2D and 3D spaces. Examples of neighbourhood definitions are presented in Fig. 3.3.

• Transition rules f – the state of each cell in the lattice is determined by the previous states of its neighbours and of the cell itself through the transition function f. In general a transition rule can be expressed as:

$$Y_i^{t+1} = f\left( Y_j^t \right), \quad j \in N(i), \tag{2.19}$$

where: $N(i)$ – neighbourhood of the i-th cell, $Y_i$ – state of the i-th cell.
The main assumption of the CA method is that the size of the computing space is fixed during the simulation and cells change their states synchronously in discretely defined time steps. Another assumption concerns long range interactions between cells, which are in general neglected, so that only the nearest neighbours are considered in a particular time step. The CA method has found a wide range of applications in the material science area, e.g. simulation of solidification

(23) [81], crack propagation [82], phase transformation [83] or, as investigated within this dissertation, static recrystallization [77]. The first work focused on cellular automata static recrystallization (SRX) modelling was a paper by Hesselbarth and Göbel [84]. In their approach they assumed that cells can be in two different states: recrystallized and unrecrystallized. The authors also adopted the view that the recrystallization process is composed of two main steps, nucleation and grain growth. The model could predict not only the kinetics of SRX but also provided data on microstructure evolution during subsequent steps. This was a precursor of a series of CA SRX models reviewed in detail in the following chapter of this dissertation. Besides the classical approach to the CA algorithm, an interesting solution that can reduce computational cost is the frontal cellular automata (FCA) concept [85–87]. The frontal cellular automata method is based on the same CA space and neighbourhood definition as the classical method, but the direction of information flow is different. Models based on the classical CA method iterate through the CA space cell by cell, applying transition rules to each of them. In this case the cell changes its state based on the information from neighbouring CA cells, as presented in Fig. 2.9a. In the FCA method three different groups of cells with different interaction mechanisms (Fig. 2.9b) can be considered:

• cells belonging to grain boundaries; all these cells participate in the simulation and in the evaluation of transition rules,

• cells inside the grain waiting for a state change,

• cells which have already changed their state and for which further changes are not expected.

Fig. 2.9. Concept of an information flow in the a) classical CA, b) frontal CA.

The FCA approach was successfully used for numerical modelling of the SRX phenomenon, e.g. [86,88], as it makes it possible to reduce computational time, especially in 3D.
However, restricting calculations only to CA cells located at the grain boundary may fail to replicate important phenomena that also occur in the CA cells within the grains, e.g. recovery or precipitation. That is why the classical approach to the CA method was selected for further investigation. The state of the art in the area of SRX modelling based on the classical cellular automata method is presented in the following chapter.
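The difference between the classical and the frontal information flow described above can be sketched as follows. The transition rule used here (an unrecrystallized cell becomes recrystallized as soon as any von Neumann neighbour is recrystallized) is a deliberately simple placeholder chosen for illustration; it is not the rule of [84] or of the FCA models [85–87], and the function names are hypothetical.

```python
def neighbours(i, j, rows, cols):
    """Von Neumann neighbours of cell (i, j) in a non-periodic 2D CA space."""
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < rows and 0 <= nj < cols:
            yield ni, nj


def classical_step(grid):
    """Classical CA: visit every cell and read the states of its neighbours.

    States: 0 = unrecrystallized, 1 = recrystallized. A copy of the grid is
    written so that all cells change state synchronously, as in eq. (2.19).
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] == 0 and any(
                grid[ni][nj] == 1 for ni, nj in neighbours(i, j, rows, cols)
            ):
                new[i][j] = 1
    return new


def frontal_step(grid, front):
    """Frontal CA: only cells on the moving front push their state outwards.

    `front` is the set of cells that changed state in the previous step.
    The grid is updated in place and the new front is returned.
    """
    rows, cols = len(grid), len(grid[0])
    new_front = set()
    for i, j in front:
        for ni, nj in neighbours(i, j, rows, cols):
            if grid[ni][nj] == 0:
                new_front.add((ni, nj))
    for i, j in new_front:
        grid[i][j] = 1
    return new_front
```

For this simple rule both variants give the same microstructure, but the frontal variant touches only the cells on the recrystallization front, which is where the reduction of computational time comes from. Processes acting inside the grains would not be visited by `frontal_step`, which mirrors the limitation noted in the text.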

(24) 3 Cellular automata static recrystallization models

There is a wide diversity of CA SRX models available in the scientific literature [75,79,89–94]. They are characterized by different levels of complexity related to the definitions of e.g. dimensions of the computational domain (Fig. 3.1) [95,96], CA cell shapes (Fig. 3.2) [97] or neighbourhood types (Fig. 3.3, Fig. 3.4) [98,99].

Fig. 3.1 Different computational domains in a) 2D, b) 3D spaces.

Fig. 3.2 CA spaces with different CA cell shapes a) circular, b) hexagonal, c) square, d) triangular.

(25) Fig. 3.3 Different neighbourhood definitions in the CA space a) elementary, b) von Neumann, c) Moore, d) extended Moore, e) pentagonal random (x ∈ <0,1>), f) hexagonal random (x ∈ <0,1>), g) combination of Moore and von Neumann.

Fig. 3.4 Different neighbourhood definitions in the 3D CA space a) von Neumann, b) Moore, c) extended Moore, d) pentagonal random (x ∈ <0,1>), e) hexagonal random (x ∈ <0,1>).

(26) However, the most important component of the cellular automata method, and the one that differentiates the complexity and predictive capabilities of the developed models, is the set of physical mechanisms considered in the definition of transition rules. Stochastic properties of the system can be incorporated into CA microstructure evolution models mainly through the introduction of non-deterministic transition rules or through the definition of a random type of neighbourhood [100]. The complexity of CA SRX models is also differentiated by the definition of: the initial state of the microstructure morphology, the accumulated energy and its distribution, the nucleation type, and the phenomena influencing grain growth. Progress in the development of SRX models and their predictive capabilities is presented in the following part of the chapter. Over the years different approaches to providing the initial microstructure morphology for SRX calculations have been proposed, e.g. Voronoi tessellation [101] (Fig. 3.5), statistically representative approaches [102] (Fig. 3.6), Monte Carlo [103] (Fig. 3.7) or cellular automata [104] (Fig. 3.8).

Fig. 3.5 Initial 500×500×500 µm microstructure obtained by the Voronoi tessellation [101].

Fig. 3.6 Initial 128×128×128 µm microstructure obtained by the statistically representative approach [102].

(27) Fig. 3.7 Initial 200×200×200 µm microstructure obtained by the Monte Carlo grain growth algorithm: a) 10 Monte Carlo Steps (MCS), b) 100 MCS, c) 1000 MCS.

Fig. 3.8 Initial 200×200×200 µm microstructure obtained by the cellular automata grain growth algorithm with different neighbourhood definitions: a) Moore, b) von Neumann, c) random pentagonal, d) random hexagonal.

To eliminate unphysical artefacts in the digital microstructure morphologies, image analysis methods were also used [105]. In this case the input data are in the form of light or electron microscopy images. The main advantage of image processing approaches is that even very complex microstructures of multiphase materials can be easily digitized for further SRX calculations, see e.g. [70] for an investigation of a ferritic-pearlitic microstructure (Fig. 3.9). This approach is more demanding when 3D input data have to be generated, as 3D computed tomography results based on e.g. synchrotron radiation techniques (Fig. 3.10) have to be used [106–108]. This procedure makes it possible to track the evolution of more than 5000 grains during multi-stage deformation and provides precise information on the evolution of the crystallographic orientation of grains, but it is very expensive.

(28) Fig. 3.9 Digital material representation of a two phase ferritic-pearlitic microstructure [70].

Fig. 3.10 a) Schematic illustration of an experimental apparatus for the near-field high-energy X-ray diffraction microscopy (nf-HEDM) measurements under in-situ tensile loading [108], b) microstructure obtained via nf-HEDM procedure.

Therefore, for the 3D generation of microstructure morphology, the numerical approaches mentioned above are used more often [109]. Microstructures generated with these approaches represent, in a statistical sense, the grain size distribution, phase fractions or crystallographic texture observed during metallographic investigations. After generation of the initial microstructure morphology, the distribution across the entire sample of the energy accumulated due to plastic deformation has to be addressed. This energy field provides one of the key driving forces for grain growth during SRX. Again, there is a large variety of approaches with different levels of complexity to handle this task. The simplest and most commonly used ways of introducing energy into the CA computational domain are based on a homogeneous or heterogeneous artificial energy distribution [110]. Some of the approaches assume a homogeneous distribution of energy after deformation, neglecting any influence of microstructure heterogeneities (Fig. 3.11a) [111]. As a result, these models can introduce inaccuracies into the calculations. A solution to this is an artificial heterogeneous distribution of accumulated energy that distinguishes between grain boundaries and grain interiors (Fig. 3.11b). Results obtained from both approaches were compared in [77], which showed a major influence of the energy distribution on SRX progress, as presented in Fig. 3.12.

(29) Fig. 3.11 Artificial energy distribution in a a) homogeneous, b) heterogeneous manner.

Fig. 3.12 Recrystallization fraction for a a) homogeneous, b) heterogeneous initial energy distribution.

Thus, to properly reproduce the distribution of energy accumulated during deformation prior to CA SRX calculations, a combination of the digital material representation (DMR) concept and finite element simulation can be used, see e.g. [77]. In that work, during the first step of the simulation the material morphology is generated, discretized with an FE mesh and subjected to finite element modelling of the deformation conditions. As a result, detailed information on the energy distribution across various microstructural features and on the grain geometry can be replicated, as in Fig. 3.13. This approach was successfully used in the literature to provide input data for single-phase metallic microstructures [10]. The concept was then extended to multiphase microstructures in one of the author's earlier works, see [112].

(30) Fig. 3.13 Concept of an initial microstructure generation for SRX calculations based on FE calculations.

The class of models mentioned above significantly increases the quality of the results obtained from SRX simulations. However, they still neglect a very important factor influencing grain growth kinetics during SRX, namely the crystallographic orientation of grains (texture effect). To overcome this limitation, crystal plasticity finite element models were proposed to provide information on the deformed microstructure morphology, energy accumulation and texture [92,96]. However, providing input data for the CA simulation is not the only step where differences in model complexity can be noticed. The other two important stages where a series of simplifications in the model assumptions can be introduced or eliminated are nucleation and grain growth. Nucleation is the most critical element of recrystallization models and is very difficult to replicate properly, because it is hard to observe directly during experimental research. This is related to the fact that nucleation usually takes place at a lower length scale than the grain growth stage. As a consequence, researchers often treat nucleation as a separate module, which only communicates with the CA grain growth model. The nucleation algorithm presented in e.g. [113] assumes that new nuclei are created only at the beginning of the simulation and are treated as regions with a specified size (one nucleus may be represented by many CA cells), crystallographic orientation and location in the computational space. In this case eq. (3.1) determines the number of nuclei which appear at a specified time and temperature as:

$$N_{NUC} = f\left( T, t_i \right), \tag{3.1}$$

(31) When the number of new nuclei is established, they can be distributed across the CA space. Different types of nucleation algorithms are available in the scientific literature:

• Site saturated nucleation – takes place only at the beginning of the simulation [6]. This scheme is often modified to also include thermodynamic conditions which must be fulfilled to allow nucleation [90].

• Nucleation based on region selection – can be divided into two groups: homogeneous (all cells in the CA space are potential nuclei candidates) and heterogeneous (only cells at the primary grain boundaries are considered), as presented in [6].

• Continuous nucleation – additional nuclei are added during subsequent steps of the simulation. Nuclei can be added in the same, increased or decreased amount in the following time steps [75].

An example of the basic, site saturated homogeneous nucleation scheme is presented in Fig. 3.14.

Fig. 3.14 Site saturated homogeneous nucleation scheme and subsequent grain growth.

More advanced nucleation models based on probability factors were also proposed, e.g. [93]. In this case the probability of nucleation is calculated as:

$$P_N = \frac{\dot{N} \, dt}{N_{CA}}, \tag{3.2}$$

where: $\dot{N}$ – nucleation rate, $N_{CA}$ – number of CA cells in the computational domain. Then, a nucleation rate was introduced, which may be a function of the time step or of the physical size of CA cells [114]:

$$\dot{N} = C_1 \left( H_i - H_C \right) \exp\left( -\frac{Q_N}{RT} \right), \tag{3.3}$$

where: $C_1$ – scaling parameter, $H_i$ – amount of energy in a particular cell, $H_C$ – critical amount of energy which is necessary to trigger nucleation, $Q_N$ – activation energy of nucleation. The influence of recovery, leading to a reduction in the amount of accumulated energy in a particular CA cell, was also addressed in e.g. [115]. Finally, once all the initial data and the nucleation model are set, various approaches to grain growth can be found in the scientific literature. However, most of them are based on the classical velocity equation:

(32)

$$v = MP, \tag{3.4}$$

where: $M$ – mobility of the grain boundary, $P$ – driving pressure for grain boundary movement. The driving pressure on a grain boundary can be considered in the form of the dislocation density difference $\Delta\rho$ between neighbouring grains [30,111]:

$$P_\rho = c \mu b^2 \Delta\rho, \tag{3.5}$$

where: $c$ – constant, $\mu$ – shear modulus, $b$ – magnitude of the Burgers vector. Another solution is based on the stored energy value [116]:

$$P_H = H = \frac{\varepsilon_i}{a \varepsilon_i + b} \gamma_{LAGB}, \tag{3.6}$$

where: $a$, $b$ – material constants, $\varepsilon_i$ – equivalent plastic strain, $\gamma_{LAGB}$ – low angle grain boundary energy. However, different modifications were also proposed. For example, based on the general knowledge that both the mobility $M$ and the pressure $P$ are local quantities, probabilistic CA transition rules were introduced e.g. in [117]. That work presented a “hybrid” model based on an additional local variable $w_{switch}$, which is calculated from the maximum velocity $v_{max}$ occurring in the computational domain as:

$$w_{switch} = \frac{v}{v_{max}}. \tag{3.7}$$

Deterministic transition rules, on the other hand, were presented in [84]. The interaction of a growing grain with precipitation, which often occurs during thermo-mechanical processing, was then introduced by Raabe in [90]. This interaction is called the Zener drag effect and is taken into account by defining a driving pressure term:

$$P_Z = \frac{3}{2} \gamma \frac{f_p}{r_p}, \tag{3.8}$$

where: $r_p$ – precipitate radius, $f_p$ – volume fraction of precipitates, $\gamma$ – grain boundary energy, which can be calculated from its misorientation angle dependence using the Read-Shockley approximation [30] (angles below 15°):

$$\gamma = \gamma_{HAGB} \frac{\theta}{\theta_m} \left[ 1 - \ln\left( \frac{\theta}{\theta_m} \right) \right], \tag{3.9}$$

where: $\gamma_{HAGB}$ – high angle grain boundary energy, $\theta$ – misorientation angle. Finally, the effect of grain boundary curvature on a growing grain is usually described by:

$$P_{GB} = \kappa \gamma_{HAGB}, \tag{3.10}$$

where: $\kappa$ – grain boundary curvature:

$$\kappa = \frac{1}{r_1} + \frac{1}{r_2}, \tag{3.11}$$

where: $r_1$, $r_2$ – principal radii of the boundary segment.
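The nucleation probability (3.2)–(3.3) and the driving pressure terms (3.4), (3.5) and (3.8)–(3.11) translate directly into small functions. The sketch below is illustrative only: all function names are hypothetical, $Q_N$ is assumed to be given in J/mol so that the universal gas constant can be used, the nucleation rate is assumed to be zero below the critical energy $H_C$, and the boundary energy is assumed to equal $\gamma_{HAGB}$ for misorientations above $\theta_m$. How the individual pressures are combined into a net pressure is model specific and is not shown.

```python
import math

R_GAS = 8.314462618  # universal gas constant, J/(mol K)


def nucleation_probability(h_i, h_c, c1, q_n, temperature, dt, n_ca):
    """Eqs. (3.2)-(3.3). Assumption: no nucleation when h_i <= h_c."""
    rate = c1 * max(h_i - h_c, 0.0) * math.exp(-q_n / (R_GAS * temperature))
    return rate * dt / n_ca


def read_shockley_energy(theta, theta_m, gamma_hagb):
    """Eq. (3.9) for 0 < theta < theta_m.

    Assumptions: zero energy at zero misorientation (the limit of eq. (3.9))
    and gamma_hagb for theta >= theta_m.
    """
    if theta <= 0.0:
        return 0.0
    if theta >= theta_m:
        return gamma_hagb
    ratio = theta / theta_m
    return gamma_hagb * ratio * (1.0 - math.log(ratio))


def dislocation_pressure(c, mu, b, delta_rho):
    """Eq. (3.5): pressure from the dislocation density difference."""
    return c * mu * b ** 2 * delta_rho


def zener_pressure(gamma, f_p, r_p):
    """Eq. (3.8): Zener drag exerted by precipitates."""
    return 1.5 * gamma * f_p / r_p


def curvature_pressure(r1, r2, gamma_hagb):
    """Eqs. (3.10)-(3.11): pressure from grain boundary curvature."""
    return (1.0 / r1 + 1.0 / r2) * gamma_hagb


def boundary_velocity(mobility, pressure):
    """Eq. (3.4): v = M P."""
    return mobility * pressure
```

Note that eq. (3.9) is continuous at $\theta = \theta_m$, where it returns exactly $\gamma_{HAGB}$, so the cap used in `read_shockley_energy` does not introduce a jump.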

(33) Another solution for calculating the grain boundary curvature is based on a kink-template and an extended Moore neighbourhood [96]:

$$\kappa = \frac{A_C}{C_S} \frac{Kink - c_i}{N'}, \tag{3.12}$$

where: $C_S$ – cell physical size, $A_C$ – topological coefficient (mostly 1.28), $N'$ – total number of the first and second nearest neighbours for a square lattice (in a 2D model: 24, in 3D: 124), $c_i$ – number of cells within the neighbourhood belonging to the investigated grain, $Kink$ – number of cells that create a flat interface (in a 2D model: 15, in 3D: 75). As presented, a lot of research dedicated to the simulation of material morphology evolution during SRX by the CA method can be found in the literature. The major advantage of these models is good predictability of complex material behaviour, including multiphase materials with heterogeneous grain distribution. However, there is still no comprehensive model that takes into account all the mentioned phenomena influencing SRX. At the same time, almost all of the CA SRX models are designed for sequential calculations. Researchers focus mostly on the reduction of computational time, for example by using simplifications (e.g. removing parts of the algorithm dedicated to a sophisticated nucleation mechanism or to orientation influence, see Fig. 3.15) or modifications of the CA algorithm (e.g. considering only cells located at the moving recrystallization front [86]).

Fig. 3.15 Extension of a computational time with respect to a model complexity.
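A minimal 2D sketch of the kink-template curvature of eq. (3.12) is given below. It is an illustration, not the implementation of [96]: the function name is hypothetical, the CA space is assumed to be periodic, and $c_i$ is assumed to be counted over the full 5×5 window including the investigated cell itself, so that a flat interface yields $c_i = Kink = 15$ and zero curvature.

```python
def kink_template_curvature(grain_ids, row, col, cell_size,
                            a_c=1.28, kink=15, n_prime=24):
    """Eq. (3.12) in 2D on an extended Moore (5x5) neighbourhood.

    Assumptions of this sketch: periodic boundary conditions, and c_i counts
    the cells of the 5x5 window (centre included) that carry the same grain
    id as the investigated cell.
    """
    rows, cols = len(grain_ids), len(grain_ids[0])
    own = grain_ids[row][col]
    c_i = 0
    for dr in range(-2, 3):
        for dc in range(-2, 3):
            if grain_ids[(row + dr) % rows][(col + dc) % cols] == own:
                c_i += 1
    return (a_c / cell_size) * (kink - c_i) / n_prime
```

With this convention a cell on a flat boundary gives zero curvature, a cell of a small convex grain gives a positive value (the grain tends to shrink), and a cell that is mostly surrounded by its own grain gives a negative value.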

(34) Nevertheless, because of excessive computational times, simulations based on the CA method are still strongly limited in their potential applications under industrial conditions. However, in recent years an enormous increase in the computational power of single computational units as well as of powerful servers has been observed. This makes it possible to run simulations based on the CA method much faster and, at the same time, to simulate large microstructures in 3D. Therefore, more of the elements presented in Fig. 3.15 can be considered during SRX modelling. However, to take proper advantage of novel computer architectures, different implementation concepts have to be introduced into the standard code to enable its parallel execution. A literature review of the available general approaches to the parallelization of CA algorithms is given in the first part of the following chapter. Then, in the second part, the available parallel SRX models based on the CA method are discussed.

(35) 4 Parallelization of cellular automata algorithms

According to Moore’s law [118], computer chip density (as well as frequency and energy consumption) should double every two years. Unfortunately, in reality this assumption faces architectural problems. In the mid-2000s engineers decided to stop boosting clock speed (which had reached a level of approximately 3.5 GHz) and selected a different path, towards the development of so-called multicore processors. This made it possible to accelerate computations through the paradigm of parallelism. Currently, different programming languages for various computational architectures are being intensively developed, as presented in [119–121]. In the scientific literature many works using different parallelization techniques can be found in the area of cellular automata calculations. In general they can be divided into three main groups:

• CPU (Central Processing Unit) based computations [122].

• GPGPU (General-Purpose computing on Graphics Processing Units) based computations [123].

• Heterogeneous platform computations [124].

The first group of approaches, based on CPU parallelization, generally relies on the division of a large computational problem into a few smaller ones executed within different threads, as presented in Fig. 4.1.

Fig. 4.1 CPU parallelization concept [125].

The number of working threads is limited to the number of CPU cores (independent processing units which read and execute instructions) available on a computing unit. It should also be mentioned that some processors are enhanced by hyper-threading, which offers an additional increase in performance of approximately 30% [126]. When a single computation unit is concerned, the OpenMP technique is often used for the parallelization of cellular

(36) automata simulations, e.g. [122,127,128]. These approaches were dedicated to Symmetric Multiprocessing (SMP) architectures. The solution provides an efficient way to use a single computational unit with the high number of cores offered by new processors, but it can only use the amount of memory available within this unit. This can be problematic when simulations of large 3D computational domains are considered. Therefore, approaches based on distributed memory and the message passing interface (MPI) standard started to be applied to CA algorithms [129,130]. In these approaches multiple computational units called nodes (based mostly on a 2-CPU platform) are connected within a local area network (LAN) and are used to perform the same tasks. Computations performed on such a cluster can be controlled by a middleware software layer that allows an end user to treat the whole cluster as a single computing unit (e.g. via the single system image concept). In this concept a combination of the OpenMP application programming interface and the MPI standard is used. These approaches provide an efficient way to use supercomputer capabilities for scientific calculations based on the CA method. Presently, the topic of mapping scientific calculations onto high performance computers (HPC) and grid networks combining a series of HPC computers (Fig. 4.2) is at the centre of interest of major scientific institutions around the world [122,131]. When grid computing is considered, multiple nodes can be distributed across different geographical locations. In this case the MPI standard is most often used, because it is compatible with the C and Fortran programming languages [132]. MPI is not a language but an open standard, which can be used by software developers and scientists alike.

Fig. 4.2 Supercomputer concept.

On the other hand, simulations based on the GPU are also being developed and adapted to the CA method using e.g. OpenCL [122] or CUDA [123].
In this case, the situation is rather

(37) different from that of CPU calculations, because scientists have access to an enormous number of small processing units (Fig. 4.3). Finally, recent solutions based on heterogeneous clusters and cloud computing should be mentioned, as they offer significant possibilities for the simulation of complex problems with the CA method [133].

Fig. 4.3 Comparison of different architectures a) CPU – small number of big cores, b) GPU – large number of small cores [134].

As described in Chapter 3, the conventional approach to the CA method is based on a regular shape of the CA space, which provides enormous possibilities for efficient parallelization. Because a CA space consists of a finite number of cells, it is fairly straightforward to change the mode of computation from sequential (Fig. 4.4a) to parallel (Fig. 4.4b).

Fig. 4.4 Different types of computations: a) sequential execution, b) parallel execution.

When CA space decomposition schemes are investigated, approaches based on so-called “ghost cells” are often used [129]. As presented in Fig. 4.5, in this case the CA space is extended with additional vectors of cells at each application instance. When a layer decomposition is considered, additional vectors are added to the left and right part of the CA

(38) computational domain. Then, at each simulation time step, the information stored in these “ghost cells” is updated based on information received from neighbouring processes.

Fig. 4.5 Concept of “ghost-cells” in the layer space decomposition scheme [129].

All the presented methods and techniques for the parallelization of cellular automata models are widely used in computational science for the simulation of different problems, but usually not for complex microstructure evolution phenomena, e.g. [127,135–137]. In the case of SRX simulations, only simple parallel models with significant simplifications of the considered physics can be found in the scientific literature. In [138], the authors performed a simulation with an enormous 3D computational domain consisting of 2560×640×640 CA cells with randomly placed nuclei. The presented model was limited to a simple nucleation mechanism and did not take into account grain boundary curvature effects. Euler angles were also not taken into account during evaluation of the mobility term. Therefore, the predictive capabilities of the model are strongly limited. As for parallelization, the work was based on a static domain decomposition with a 3D von Neumann subdomain neighbourhood. Each computational subdomain contains additional cells, as shown in Fig. 4.5, and after each simulation time step the information in those cells is updated according to a selected communication scenario. It is worth noticing that the obtained speedup of the simulation is almost ideal up to 128 MPI processes (Fig. 4.6), and beyond that the speedup drops dramatically. This behaviour is related to the number of cells within the CA space and also to the number of cells which communicate with each other during a simulation. Additionally, different communication mechanisms were presented in this work: full coupling, gradual coupling and an isolated solution.
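The ghost-cell layer decomposition described above can be emulated in a single process to show the data flow. The sketch below is not MPI code and not the implementation of [129] or [138]: the "processes" are simply entries of a Python list, the exchange is a direct copy instead of message passing, the CA rule is a simple placeholder (a cell recrystallizes if any von Neumann neighbour is recrystallized), the domain is non-periodic, and the number of columns is assumed to be divisible by the number of parts. All names are hypothetical.

```python
def ca_step(grid):
    """One synchronous CA step on a non-periodic 2D grid (0/1 states)."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] == 0:
                for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                    if 0 <= ni < rows and 0 <= nj < cols and grid[ni][nj] == 1:
                        new[i][j] = 1
                        break
    return new


def split_with_ghosts(grid, parts):
    """Layer decomposition by columns; one ghost column on each side."""
    width = len(grid[0]) // parts  # assumes cols is divisible by parts
    return [
        [[0] + row[p * width:(p + 1) * width] + [0] for row in grid]
        for p in range(parts)
    ]


def exchange_ghosts(subs):
    """Fill ghost columns from the edge columns of neighbouring subdomains.

    In an MPI code this would be a send/receive pair per neighbour.
    """
    last = len(subs) - 1
    for p, sub in enumerate(subs):
        for r, row in enumerate(sub):
            row[0] = subs[p - 1][r][-2] if p > 0 else 0
            row[-1] = subs[p + 1][r][1] if p < last else 0


def parallel_step(subs):
    """Exchange ghost cells, then update every subdomain independently."""
    exchange_ghosts(subs)
    return [ca_step(sub) for sub in subs]


def merge(subs):
    """Strip the ghost columns and glue the subdomains back together."""
    return [sum((sub[r][1:-1] for sub in subs), []) for r in range(len(subs[0]))]
```

Running the same number of steps with `ca_step` on the whole grid and with `parallel_step` on the decomposed grid gives identical results, because the ghost columns always carry the neighbour states from the previous time step. The amount of data copied in `exchange_ghosts` relative to the number of cells updated in `ca_step` is what limits the speedup when subdomains become small.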

(39) Fig. 4.6 Speedup of main simulation loop for different communication and periodicity protocols [138].

Another interesting work was presented in [79], where the authors decided to use the SVE (Statistical Volume Element) concept for the simulation of SRX on a heterogeneous HPC infrastructure. Unfortunately, in this case microstructure features were treated in a statistical manner during the simulation. The parallelization concept assumes that at the beginning of the simulation random samples are cut from the initial microstructure, as seen in Fig. 4.7. After that, simple, separate microstructure evolution simulations with the same number of randomly placed nuclei (the same random positions were assigned to each simulation) were computed on parallel computers. After each simulation was completed, all results were aggregated for statistical evaluation.

Fig. 4.7 Setup for simulations based on the SVE approach [79].

Therefore, based on the information presented above, the current thesis is focused on the development of an advanced CA SRX model taking into account all relevant physical mechanisms controlling grain growth, as presented in Chapter 3, and then on the adaptation of the model to massively parallel computations. Thus, the possibilities provided by supercomputers

(40) will be applied to develop a complex microstructure evolution model based on 2D/3D cellular automata spaces within multicore and grid platforms. Unfortunately, these modern architectures force significant changes in general implementation concepts, and even the redevelopment and reimplementation of the simplest algorithms. All the developments undertaken are presented in the following chapters of the dissertation.

(41) 5 Aim of the work

The literature review clearly pointed out the capabilities and limitations of the cellular automata method in the area of modelling static recrystallization. The primary identified issue is the high computing cost of complex CA SRX models that take into account the major physical phenomena controlling microstructure evolution. Therefore, the following hypothesis of the dissertation was formulated:

Taking into account all major physical mechanisms of static recrystallization during implementation of a parallel version of the CA model dedicated to grid environments is possible and results in high predictive capabilities of the model as well as a reduction in computing times.

To confirm the hypothesis, a set of detailed objectives was specified as follows:

1. Implementation of the complex CA recrystallization model as a core element for further parallelization: a) development of the theoretical basis of all model components, b) implementation of the sequential version of the model taking into account all major driving forces associated with SRX progress.

2. Evaluation of the CA SRX model robustness and elimination of unphysical artefacts associated with implementation aspects.

3. Identification of the model parameters for the specific investigated steel based on experimental data and the inverse analysis technique: a) realization of cold rolling experiments and heat treatment operations, b) preparation of a coupled cellular automata Finite Element simulation of cold rolling, c) identification of the CA SRX model parameters.

4. Mapping the CA SRX model to modern hardware architectures: a) distribution of calculations across multiple computation nodes (MPI) with different computational domain decomposition schemes, b) development and implementation of different communication mechanisms.

5. Evaluation of the parallel CA SRX model capabilities: a) evaluation of the influence of model and algorithm setups on computing efficiency, b) performing functional and performance tests.

6. Adaptation of the implemented solution to perform simulations of large CA spaces with the number of CA cells exceeding one billion.

The realization of the defined goals is presented in the next four chapters. First, Chapter 6 presents the development of the advanced and complex CA model taking into account all major physical mechanisms associated with static recrystallization as well as their interactions.

(42) During this stage of research, a sequential version of the code was developed to ensure the quality of the model and the reliability of its predictions. The model robustness is confirmed in Chapter 7 with a series of tests performed with various initial model setups. That way, all unphysical artefacts were recognised and eliminated from the CA SRX model. Then, the parameters of the developed CA SRX model are identified based on the series of experimental tests described in Chapter 8. Cold rolling experiments combined with heat treatment were used to provide data for the inverse analysis stage used to identify model parameters for the investigated austenitic model alloy Fe30%Ni. Finally, the parallelization of the fully operational and reliable CA SRX code is presented in Chapter 9, together with details of the developed parallel implementation. Additionally, the appropriate division of the CA space between processes and the synchronization of work between all nodes are presented, with particular focus on the communication mechanism. The last part of the work, presented in Chapter 10, is dedicated to a series of model tests evaluating computational speedup, efficiency and scalability. Finally, the work is summarized with comments and conclusions in Chapter 11.

(43) 6 Cellular automata static recrystallization model

The predictive capabilities of a model are always related to the physical mechanisms considered during its development. The more sophisticated the model is, the better the predictive capabilities it can offer, which in turn widens its potential future applications. Therefore, the first part of the present work is dedicated to the development of a physically based static recrystallization model within the cellular automata framework, which takes into account interactions of all major mechanisms controlling the microstructure evolution under specific heat treatment conditions. The model was designed, developed and implemented during the PhD work based on the three-stage workflow presented in Fig. 6.1. This concept was created according to the literature knowledge on the static recrystallization phenomenon presented in Chapter 3.

Fig. 6.1 Concept of the developed SRX CA model.

The developed CA model is based on a series of assumptions:
• CA space: regular grid of rectangular CA cells arranged into 2D/3D computational domains with dimensions m×n / m×n×k,
• CA state variables: two CA states representing un-recrystallized (uSRX) and recrystallized (SRX) CA cells,
• CA internal variables: $id_i$ – grain id, $\theta_i$ – crystallographic orientation, $(\varphi_1, \Phi, \varphi_2)_i$ – Euler angles, $Y_i^t$ – cell state, $RX_i^{fraction}$ – recrystallization volume fraction, $P_{\psi,i}$ – cell phase (austenite, ferrite, pearlite, …), $H_i$ – accumulated energy value, $f_i$ – volume fraction of precipitates, $H_i^0$ – initial stored energy value,

(44) • CA neighbourhood: 2D/3D Moore neighbourhood for state evaluation, extended Moore neighbourhood for grain boundary curvature evaluation (Fig. 3.3, Fig. 3.4),
• CA transition rules: defined for nucleation and grain growth, respectively.

As seen in Fig. 6.1, the first stage of the developed algorithm requires a specific set of input data that replicates the material state after deformation and prior to heat treatment. The microstructure morphology and the accumulated deformation energy are the crucial data in this case. Details of the developed preprocessing module are presented in the following section.

6.1 Preprocessing module

To obtain accurate input data for the subsequent calculation stage, regarding both the microstructure morphology and the energy accumulated after deformation, two different approaches were used. The first combines the digital material representation (DMR) concept with FE simulations. As mentioned in Chapter 3, the undeformed DMR can be generated with different numerical approaches based on, e.g., Monte Carlo or cellular automata grain growth models [103,139] or Voronoi-type tessellations [101]. On the other hand, it can also be based on experimental data in the form of a microstructure image from light or electron microscopy. In the latter case, these data have to be additionally preprocessed by image analysis algorithms to obtain a proper DMR, in which each grain has a unique identification number [109]. In the present work, both approaches were considered at the preprocessing stage to provide input data for further SRX calculations. First, the grain growth algorithm within a combined CA/MC framework was implemented according to [140]. That way, digital microstructures in both 2D and 3D computational domains can be easily obtained (Fig. 6.2).

Fig. 6.2 Examples of microstructures obtained with a) cellular automata and b) Monte Carlo methods.
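As a rough illustration of how a digital microstructure with a unique identification number per grain can be generated on a 2D CA space, the sketch below grows grains from random nuclei through the Moore neighbourhood. It is a minimal Python sketch and not the combined CA/MC algorithm of [140]: the function name generate_microstructure, the periodic boundary conditions and the "first occupied neighbour wins" transition rule are assumptions made purely for illustration.

```python
import random


def generate_microstructure(m, n, n_grains, seed=0):
    """Fill an m x n CA space with grain ids 1..n_grains (0 marks an empty cell)."""
    rng = random.Random(seed)
    grid = [[0] * n for _ in range(m)]

    # Place nuclei in distinct random cells; each nucleus gets a unique grain id.
    nuclei = rng.sample(range(m * n), n_grains)
    for grain_id, cell in enumerate(nuclei, start=1):
        grid[cell // n][cell % n] = grain_id

    # Offsets of the 2D Moore neighbourhood (the 8 surrounding cells).
    offsets = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
               if (di, dj) != (0, 0)]

    # Synchronous CA steps: all cells are updated from the previous state.
    while any(0 in row for row in grid):
        new_grid = [row[:] for row in grid]
        for i in range(m):
            for j in range(n):
                if grid[i][j] != 0:
                    continue  # occupied cells never change their grain id
                for di, dj in offsets:
                    # Periodic boundary conditions (assumption of this sketch).
                    neighbour = grid[(i + di) % m][(j + dj) % n]
                    if neighbour != 0:
                        # Assumed rule: adopt the id of the first occupied neighbour.
                        new_grid[i][j] = neighbour
                        break
        grid = new_grid
    return grid
```

Calling generate_microstructure(20, 30, 5) returns a list of 20 rows with 30 grain ids each; every cell then belongs to exactly one of the five grains, which is the form of input expected by a grain-id-based DMR. A fixed seed keeps the generated morphology reproducible between runs.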
To obtain undeformed digital material representations based on experimental data, an image analysis code previously developed in [141] was adapted. This approach provides a very accurate representation of the investigated microstructures; however, it is limited to 2D computational domains (Fig. 6.3). It can operate on images acquired both from light and

(45) scanning electron microscopy. In the latter case, not only the grain morphology but also the crystallographic orientations of the grains, in the form of Euler angles $(\varphi_1, \Phi, \varphi_2)$, can be reconstructed.

Fig. 6.3 Example of a) an optical microscopy image of the microstructure and b) the corresponding digital microstructure after image processing.

The digital microstructures, either obtained from experimental investigation or numerically generated, are then discretized with the dedicated mesh generator DMRmesh [140,142,143] and incorporated into finite element software to perform numerical simulations of material behaviour under the required loading conditions. That way, detailed information on the energy accumulated during deformation as a function of local microstructural heterogeneities can be obtained without extensive experimental effort. The concept of the approach is presented in Fig. 6.4.

Fig. 6.4 Generation of the CA SRX model input data based on the DMR FE calculations.

It should be pointed out that this approach is especially useful for 3D simulation case studies, where the capabilities of experimental analysis are particularly limited. In the FE model, the hardening behaviour of the investigated microstructure features can be described by conventional flow stress models or by more advanced crystal plasticity approaches, which additionally