Delft University of Technology

On-the-fly construction of surrogate constitutive models for concurrent multiscale mechanical analysis through probabilistic machine learning

Rocha, I. B. C. M.; Kerfriden, P.; van der Meer, F. P.

DOI: 10.1016/j.jcpx.2020.100083
Publication date: 2021
Document version: Final published version
Published in: Journal of Computational Physics: X

Citation (APA): Rocha, I. B. C. M., Kerfriden, P., & van der Meer, F. P. (2021). On-the-fly construction of surrogate constitutive models for concurrent multiscale mechanical analysis through probabilistic machine learning. Journal of Computational Physics: X, 9, [100083]. https://doi.org/10.1016/j.jcpx.2020.100083
On-the-fly construction of surrogate constitutive models for concurrent multiscale mechanical analysis through probabilistic machine learning

I.B.C.M. Rocha a,∗, P. Kerfriden b,c, F.P. van der Meer a

a Delft University of Technology, Faculty of Civil Engineering and Geosciences, P.O. Box 5048, 2600 GA Delft, the Netherlands
b Mines ParisTech (PSL University), Centre des matériaux, 63-65 Rue Henri-Auguste Desbruères BP 87, F-91003 Évry, France
c Cardiff University, School of Engineering, Queen's Buildings, The Parade, Cardiff, CF24 3AA, United Kingdom
Article history: Received 10 June 2020; Received in revised form 5 October 2020; Accepted 21 December 2020; Available online 28 December 2020

Keywords: Concurrent multiscale; Surrogate modeling; Probabilistic learning; Gaussian Processes (GP); Active learning

Abstract
Concurrent multiscale finite element analysis (FE2) is a powerful approach for high-fidelity modeling of materials for which a suitable macroscopic constitutive model is not available. However, the extreme computational effort associated with computing a nested micromodel at every macroscopic integration point makes FE2 prohibitive for most practical applications. Constructing surrogate models able to efficiently compute the microscopic constitutive response is therefore a promising approach in enabling concurrent multiscale modeling. This work presents a reduction framework for adaptively constructing surrogate models for FE2 based on statistical learning. The nested micromodels are replaced by a machine learning surrogate model based on Gaussian Processes (GP). The need for offline data collection is bypassed by training the GP models online, based on data coming from a small set of fully-solved anchor micromodels that undergo the same strain history as their associated macroscopic integration points. The Bayesian formalism inherent to GP models provides a natural tool for online uncertainty estimation, through which new observations or the inclusion of new anchor micromodels are triggered. The surrogate constitutive manifold is constructed with as few micromechanical evaluations as possible by enhancing the GP models with gradient information, and the solution scheme is made robust through a greedy data selection approach embedded within the conventional finite element solution loop for nonlinear analysis. The sensitivity to model parameters is studied with a tapered bar example with plasticity, and the framework is further demonstrated with the elastoplastic analysis of a plate with multiple cutouts and with a crack growth example for mixed-mode bending. Although not able to handle non-monotonic strain paths in its current form, the framework is found to be a promising approach in reducing the computational cost of FE2, with significant efficiency gains being obtained without resorting to offline training.

© 2020 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
∗ Corresponding author. E-mail address: i.rocha@tudelft.nl (I.B.C.M. Rocha).
1. Introduction
There is a growing demand for high-fidelity numerical techniques capable of describing material behavior across spatial scales. With recent advances in additive manufacturing allowing for the development of novel materials with highly-tailored microstructures [1], multiscale modeling techniques will become increasingly relevant in the design of novel materials and structures. One popular approach for concurrent multiscale modeling is the so-called FE2 approach [2,3], in which macroscopic material response is directly upscaled from embedded micromodels without introducing additional constitutive assumptions. FE2 is a powerful and versatile technique used in a number of solid mechanics applications, from continuous [4] and discontinuous [5] mechanical equilibrium to multiphysics problems involving heat and mass transfer [6] and material degradation due to aging [7,8]. However, FE2 has the major drawback of being associated with extreme computational costs, hindering its application in actual design scenarios. Enabling the use of FE2 in many practical applications that would benefit from its accuracy and versatility is therefore highly contingent on being able to reduce its computational cost to tractable levels.
A promising approach in accelerating FE2 models consists in constructing surrogate models that take the place of the original high-fidelity micromodels at each macroscopic integration point. When building surrogates, the goal is to maintain as much of the generality offered by the original micromodel while eliminating as much computational complexity as possible. One option is to employ unsupervised learning on a number of full-order solution snapshots in order to define lower-dimensional solution manifolds for both displacements [9,10] and internal forces [11,12] at the microscale [13–15]. Alternatively, a supervised learning approach can be taken by using snapshots of the homogenized micromodel response to directly define a data-driven regression model for the macroscopic constitutive behavior [16–21]. Although resulting in models of distinct natures, both approaches rely on the existence of an observation database on the behavior of the original micromodel that is usually obtained offline (before deployment in a multiscale setting) and should cover every possible scenario the surrogate is expected to approximate online.
However, building such a database of model snapshots can be a challenging task (see [20,22] for interesting approaches based on Design of Experiments and [14] for a data-driven training framework based on Bayesian Optimization). For micromodels employing path-dependent materials, this offline training process entails sampling a highly-complex constitutive manifold that depends on an arbitrarily long strain history and can therefore be excessively sensitive to small changes in boundary conditions (e.g. strain localization and crack propagation problems). Furthermore, training a surrogate with such a complex dataset often requires additional partitioning techniques in order to avoid computationally inefficient reduced models [23,24]. An alternative to the conventional offline-online approach that has been gaining popularity is the use of adaptive reduction frameworks that either preclude the need for offline training altogether [25,26] or combine different reduction techniques into a single framework with only limited offline effort, while employing online error indicators to continuously assess the quality of the approximation and trigger a refinement of the surrogates when necessary. Nevertheless, obtaining consistent adaptivity criteria for either hyper-reduced models [25] or machine learning models based on least-squares solutions [19,27] is not straightforward. At the other end of the spectrum, works dealing with constitutive models based on Bayesian regression techniques, which provide a natural way to estimate error [28–30], do not often take advantage of their potential for creating adaptive frameworks (see [31,32] for an interesting framework specifically tailored to crystal plasticity). There is a need, therefore, for the development of fully-online approaches for general FE2 models with reliable adaptivity strategies based on sequential learning techniques and on error estimation methods with robust probabilistic foundations.
This work presents an adaptive probabilistic framework for constructing surrogate constitutive models for nonlinear concurrent multiscale analysis. The approach is based on substituting the original models associated to macroscopic integration points with a machine learning surrogate. In order to bypass the need for an offline training phase, a small number of fully-solved models associated to representative macroscopic integration points are used to generate constitutive data online. Because these models are not directly used to make predictions at every macroscopic iteration but only provide an indirect coupling between the scales, we denote them as anchor models (Fig. 1). Stress and stiffness data resulting from subjecting the anchor models to the same strain histories seen by their respective anchoring points are used to build a surrogate model based on the Gaussian Process (GP) regression technique. The framework preserves the generality of FE2 by not assuming the existence of any internal variables at the macroscale and builds surrogates that take only strains as input. However, this implies a one-to-one mapping between stresses and strains, making the framework in its current form unsuitable for treating non-monotonic load paths. This can be addressed in future extensions of the framework by employing autoregressive GPs [33,34].

The accuracy of this reduced-order solution is controlled with uncertainty information that arises naturally from the Bayesian formalism of the GP models. The resultant material model is embedded within a conventional finite element solution in a way that ensures the macroscopic solution is numerically robust and limits the sampling of new data as much as possible in order to maximize efficiency. The framework is demonstrated with a set of numerical tests including both bulk strain localization and crack propagation in order to assess its accuracy, efficiency and versatility.
Fig. 1. Schematic representation of the online adaptive reduction framework presented in this work. A small number of anchor models (in this case a matrix reinforced with circular inclusions) are used to train a machine learning surrogate that evolves as the macroscopic structure is loaded. Adaptive sampling guarantees that anchor models operating at novel regions in parameter space (illustrated here by plastic strain bands shown in orange and red) will populate the dataset D more often.

2. Concurrent multiscale analysis
2.1. Macroscopic problem
We begin by briefly introducing the concurrent multiscale equilibrium problem that we seek to accelerate. Let Ω define the macroscopic domain being modeled. We wish to find the displacement field u resulting from a set of Dirichlet and Neumann boundary conditions applied to the surface Γ that bounds Ω. Under the assumption of small strains, the equilibrium solution is found by satisfying:

\mathrm{div}\,\sigma = 0, \qquad \varepsilon = \tfrac{1}{2}\left(\nabla u + \nabla u^{T}\right) \qquad (1)

where div(·) is the divergence operator, σ and ε are respectively the stress and strain tensors, and body forces have been neglected. In order to solve for u, a constitutive model M that relates σ and ε must be introduced:

\sigma = \mathcal{M}\left(\varepsilon, \varepsilon^{h}\right) \qquad (2)

where ε^h is a history term that accounts for strain path dependency. In the context of multiscale analysis, the model M can be seen as a homogenization operator that lumps all physical processes happening at scales lower than Ω into a homogeneous medium with equivalent behavior. Depending on how complex the microscopic behavior is, M can range from having a relatively simple form (e.g. linear elasticity) to being next to impossible to formulate explicitly.

2.2. Microscopic problem
In a concurrent multiscale approach, we do not formulate M directly and instead opt for upscaling microscopic behavior from micromodels embedded at each macroscopic material point. Let ω be a Representative Volume Element (RVE) of the microscopic material features whose behavior we would like to upscale. Assuming the principle of separation of scales is valid (ω ≪ Ω), we can link the two scales by enforcing:

u_{\omega} = \varepsilon\, x_{\omega} + \tilde{u} \qquad (3)

where the response is decomposed into a linear displacement field imposed by the presence of macroscopic strains (x_ω are the microscopic spatial coordinates) and a fluctuation field ũ that accounts for microscopic inhomogeneities.

In order to obtain σ, we first find an equilibrium solution for u_ω by satisfying:

\mathrm{div}\,\sigma_{\omega} = 0, \qquad \varepsilon_{\omega} = \tfrac{1}{2}\left(\nabla u_{\omega} + \nabla u_{\omega}^{T}\right) \qquad (4)

where we see that once more the need arises for the definition of a constitutive model, this time relating ε_ω with σ_ω. Underlying the choice for a multiscale approach is the assumption that constitutive behavior can be represented by models of decreasing complexity as one descends to lower scales. It is therefore common to employ regular constitutive models (e.g. (visco)elasticity, (visco)plasticity, damage) for material components at the microscale [4,6–8]. However, the framework is flexible in allowing for models on an even lower third scale to be embedded at material points in ω (at the cost of even higher computational effort).

2.3. Bulk homogenization
With a solution for u_ω under the constraint that the fluctuation field ũ must be periodic, we can use the Hill-Mandel principle [35] to obtain homogenization expressions for the macroscopic stresses σ and consistent tangent stiffness D:

\sigma = \frac{1}{|\omega|}\int_{\omega}\sigma_{\omega}\,\mathrm{d}\omega, \qquad D = \mathcal{P}\left(K_{\omega}\right) \qquad (5)

from which we obtain the intuitive result that the macroscopic stresses are simply the volume average of the microscopic ones. Finally, the macroscopic constitutive tangent stiffness is computed through a probing operator P applied on the global microscopic tangent stiffness matrix K_ω [5].

2.4. Cohesive homogenization
Although the preceding formulation allows for general microscopic constitutive behavior to be upscaled, the response loses objectivity with respect to the RVE size after the onset of global microscopic softening [36], i.e. when the determinant of the acoustic tensor is zero along a given direction n:

\det\left(n^{T} D\, n\right) = 0 \qquad (6)

This non-objectivity arises because the volume ω_d of the strain localization band that causes the softening does not scale with the RVE size, an observation that motivates the use of a modified version of the Hill-Mandel principle [5,36]:

\frac{1}{w}\,\tau\cdot\delta v = \frac{1}{w h}\int_{\omega_{d}}\sigma_{\omega}\,\delta\varepsilon_{\omega}\,\mathrm{d}\omega \qquad (7)

where w and h are geometric RVE parameters that depend on the localization band orientation, τ is a macroscopic traction and v is a shifted displacement jump that allows for an initially-rigid cohesive response. Note that the homogenization is now performed towards a cohesive traction acting on a macroscopic surface that defines a discontinuity in u.
The development of a consistent strategy for continuous-discontinuous scale linking in FE2 is an open issue that is left out of the scope of the present discussion, with a number of different approaches being found, for instance, in [5,36–39]. For the purpose of building surrogate models for the RVE response, it suffices to acknowledge that two distinct models should be trained for bulk and cohesive responses, as the underlying constitutive manifolds have dimensions with different physical interpretations (strain/stress versus jump/traction).
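The localization check of Eq. (6) reduces to a scan over candidate band orientations. A minimal sketch for a 2D tangent in Voigt notation (the helper names and the angular resolution are illustrative, not part of the paper's implementation):

```python
import numpy as np

def acoustic_tensor(D, n):
    """n^T D n for a 2D tangent D in Voigt notation (xx, yy, xy)."""
    N = np.array([[n[0], 0.0],
                  [0.0, n[1]],
                  [n[1], n[0]]])   # maps the band normal to Voigt form
    return N.T @ D @ N

def localization_detected(D, n_angles=180):
    """Eq. (6): softening onset when det(n^T D n) <= 0 for some direction n."""
    for theta in np.linspace(0.0, np.pi, n_angles, endpoint=False):
        n = np.array([np.cos(theta), np.sin(theta)])
        if np.linalg.det(acoustic_tensor(D, n)) <= 0.0:
            return True
    return False
```

For an isotropic elastic tangent the determinant stays positive for every direction, while a tangent with a softening shear component fails the check along the corresponding band normal.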
2.5. Acceleration strategy
Assuming FEM is used to solve the mechanical problems at both scales, we can approximate the computational cost of a single macroscopic iteration as:

\mathrm{Cost} \approx N_{\mathrm{dof}}^{x} + N_{\mathrm{ip}}\, N_{\mathrm{iter}}^{\omega}\left(N_{\mathrm{dof}}^{\omega}\right)^{x} \qquad (8)

where N_dof represents the number of degrees of freedom of the model, N_ip is the number of macroscopic integration points and N_iter^ω is the number of iterations necessary for convergence of the microscopic BVP. We assume for simplicity that the bulk of the effort comes from solving the linearized systems of equations involving K and K_ω for nodal displacements, where the complexity exponent x depends on the solver used. It can be seen that the second term, associated with solving the microscopic equilibrium problems, quickly outweighs the first and becomes a performance bottleneck as N_dof^ω increases, especially since the number of macroscopic integration points N_ip increases together with N_dof. Constructing a surrogate that replaces the original constitutive model M of Eq. (2) is therefore an effective approach to accelerating FE2. However, constructing such a surrogate offline is a challenging undertaking; were it not, there would have been no need for a multiscale approach in the first place. From Eq. (2) we see that the constitutive manifold to be reproduced can have an arbitrarily high dimensionality due to the dependency on ε^h. This is equivalent to stating that the shape of the ε-σ manifold can change after each load step. Sampling this high-dimensional input space offline in order to have a surrogate that is accurate for arbitrary strain histories seems to be an intractable problem that, to the best of our knowledge, has still not been tackled in a satisfactory way.
This issue can be avoided by exploiting the fact that ε (and therefore ε^h) are often highly constrained by the geometry and boundary conditions of the macroscopic structure being modeled [24]. We therefore opt for constructing a highly-tailored surrogate model S online based on a dataset D of observations coming from a small number of fully-solved micromodels:

\sigma = \mathcal{S}\left(\varepsilon, \mathcal{D}\right) \qquad (9)

which can then be used to compute the constitutive response for a fraction of the cost. When trying to keep D as small as possible for the case at hand, it is crucial to have a means for quantifying the uncertainty in probing S for any given ε. In this work, a Bayesian approach is adopted to assess online whether D is large enough to provide the desired level of confidence in S at a given ε.

3. Bayesian surrogate modeling
In this section we introduce the Bayesian regression approach used to construct surrogate constitutive models, beginning from parametric versions of the surrogate model S (i.e. by encapsulating the constitutive information in D into a set of parameters w) and eventually moving to a non-parametric model based on Gaussian Processes (GP) that uses the data in D directly in order to make predictions. Our goal here is to appeal to the reader who might be unfamiliar with probabilistic regression models by starting from classical least-squares regression and gradually moving towards a Bayesian approach. Nevertheless, the discussion is kept as brief and focused as possible. The interested reader can find richer discussions on the subject in [40,41].

3.1. Least-squares regression
We start by building a parametric model y(w, x) that approximates a scalar target response t by fitting w with a dataset D of N observations t_o at X_o = [x_o1 ··· x_oN]. We therefore assume that the target t can be written as:

t = y(w, x) + \epsilon \quad\text{with}\quad y(w, x) = \sum_{j}^{M} w_{j}\,\phi_{j}(x) \quad\text{and}\quad p(\epsilon) = \mathcal{N}\left(\epsilon\,|\,0, \sigma_{n}^{2}\right) \qquad (10)

where the noise ϵ is given by a zero-mean Gaussian distribution with variance σ_n² and y is linear with respect to its M weights, gathered in the vector w. Note that the assumption of a linear model does not limit its fitting capabilities, since the basis functions φ define a feature space that can be nonlinear in x. In the context of this work, t can be a single stress or traction component, but the discussion is equally applicable to the problem of modeling individual model components (e.g. yield parameters).

In order to find values for w, we compute the likelihood function of the model, i.e. how likely the model is to produce the values t_o in D given w:

p\left(\mathbf{t}_{o}\,|\,w, \sigma_{n}^{2}\right) = \prod_{i}^{N}\mathcal{N}\left(t_{oi}\,|\,w^{T}\phi\left(x_{oi}\right), \sigma_{n}^{2}\right) \qquad (11)

which is a product of the probabilities of each point in isolation because we assume that each sample t_oi is sampled from the conditional distribution p(t | y) independently. We can find an optimum data fit for w by maximizing the likelihood, assuming that the shapes of φ are fixed a priori:

\nabla_{w}\, p\left(\mathbf{t}_{o}\,|\,w, \sigma_{n}^{2}\right) = 0 \;\Rightarrow\; w_{\mathrm{ML}} = \left(\Phi^{T}\Phi\right)^{-1}\Phi^{T}\mathbf{t}_{o} \quad\text{and}\quad \sigma_{n}^{2} = \frac{1}{N}\sum_{i}^{N}\left(t_{oi} - w_{\mathrm{ML}}^{T}\phi\left(x_{oi}\right)\right)^{2} \qquad (12)

where Φ ∈ R^{N×M} is a matrix with basis function values Φ_ij = φ_j(x_oi) evaluated at each point in D. Note that this is equivalent to minimizing the sum of squared differences between y and t, so w_ML is the same parameter vector obtained by the classical least-squares approach and σ_n² quantifies the spread of t around y.
Here we do not opt for a least-squares approach for three reasons. Firstly, it can suffer from severe overfitting when the dataset D is small, which is the case in the present work since we have no offline training and only a small number of fully-solved micromodels to sample from. Secondly, the uncertainty associated with σ_n² is constant throughout the input space x and does not provide an indication that the model is being used at a location far from data points (which could then be used to trigger a refinement of D). Finally, performing model selection in the least-squares framework, using cross-validation for instance, is rather tedious compared to performing model selection in the context of Bayesian regression methods [40].

3.2. Bayesian parametric regression
In the Bayesian approach to regression, we not only assume an uncertainty over the target t but also over the weights w. We initially assume a prior probability over w that represents our initial model assumptions before any data is encountered:

p(w) = \mathcal{N}\left(w\,|\,0, \sigma_{w}^{2} I\right) \qquad (13)

where σ_w² is the variance parameter associated with the uncertainty over values of w. Information from D is incorporated by using Bayes' theorem to obtain a posterior probability distribution for w:

p\left(w\,|\,\mathbf{t}_{o}\right) = \frac{p\left(\mathbf{t}_{o}\,|\,w\right) p(w)}{p\left(\mathbf{t}_{o}\right)} \quad\text{with}\quad p\left(\mathbf{t}_{o}\right) = \int p\left(\mathbf{t}_{o}\,|\,w\right) p(w)\,\mathrm{d}w \qquad (14)

where p(t_o | w) is the likelihood function of Eq. (11) and p(t_o) is the marginal likelihood of the model (i.e. the probability of producing the dataset D). Because both w and t|w are Gaussian variables, an analytical solution exists for p(w | t_o) [40]:

p\left(w\,|\,\mathbf{t}_{o}\right) = \mathcal{N}\left(w\,|\,w_{N}, S_{N}\right) \quad\text{with}\quad w_{N} = \frac{1}{\sigma_{n}^{2}} S_{N}\Phi^{T}\mathbf{t}_{o} \quad\text{and}\quad S_{N} = \left(\frac{1}{\sigma_{w}^{2}} I + \frac{1}{\sigma_{n}^{2}}\Phi^{T}\Phi\right)^{-1} \qquad (15)

Note that the expected value of w now depends both on the data points in D and on our initial beliefs about w represented by the prior distribution. Given this posterior, the best guess for w is the one with the highest p(w | t_o). This is the so-called Maximum A Posteriori (MAP) value and, for a Gaussian posterior, w_MAP = w_N. This estimation for w is equivalent to a least-squares prediction with a quadratic regularization term proportional to σ_w², which helps reducing the negative effects of overfitting.
However, in a fully Bayesian treatment of linear regression, we do not choose one specific value for w but rather make a prediction for a new target value t_* by averaging over all possible values of w:

p\left(t_{*}\,|\,\mathbf{t}_{o}, \sigma_{n}^{2}, \sigma_{w}^{2}\right) = \int p\left(t_{*}\,|\,w, \sigma_{n}^{2}\right) p\left(w\,|\,\mathbf{t}_{o}, \sigma_{w}^{2}, \sigma_{n}^{2}\right)\mathrm{d}w \qquad (16)

In order to set the stage for Gaussian Processes, it is also interesting to represent the expectation of t_* as:

\mathbb{E}\left[t_{*}\,|\,x_{*}\right] = \sum_{i}^{N} k\left(x_{oi}, x_{*}\right) t_{oi} \quad\text{with}\quad k\left(x_{p}, x_{q}\right) = \frac{1}{\sigma_{n}^{2}}\phi^{T}\left(x_{p}\right) S_{N}\,\phi\left(x_{q}\right) \qquad (17)

where w has now vanished and the new prediction is cast as a linear combination of values from D, where k(x_p, x_q) is a kernel that measures the similarity between two points in input space.

The Bayesian approach to the (generalized) linear model circumvents the problem of overfitting and provides confidence intervals. However, the linear model has the drawback of having a fixed number of basis functions φ whose shapes need to be chosen before D is observed. This can be circumvented either by allowing φ to change in shape during training (e.g. in a neural network, where φ are neuron values coming from the last hidden layer) or by explicitly choosing a kernel k(x_p, x_q) and adopting a distribution over functions instead of over weights, as in Gaussian Process regression models.

3.3. Gaussian Process (GP) regression
Since the regression function y is a function of w, adopting a prior distribution over w (Eq. (13)) implicitly leads to a distribution over function values y. This is the basis of Gaussian Process (GP) models. Here we assume these function values are jointly Gaussian:

p(\mathbf{y}) = \mathcal{N}\left(\mathbf{y}\,|\,0, K(X, X)\right) \qquad (18)

where a zero mean is assumed without loss of generality and K(X, X) is the so-called Gram matrix that defines the covariance between the input values X associated with y through a kernel k(x_p, x_q):

K_{pq} = k\left(x_{p}, x_{q}\right) = \sigma_{f}^{2}\exp\left(-\frac{1}{2\ell^{2}}\left\|x_{p} - x_{q}\right\|^{2}\right) \qquad (19)

where instead of first defining a prior over weights and deriving an equivalent kernel as in Eq. (17) we directly assume a kernel, in this case the squared exponential kernel [41] defined by a variance σ_f² and a length scale ℓ, which determines the level of correlation between points in input space. It can be shown that adopting the squared exponential kernel is equivalent to formulating a parametric model with an infinite number of basis functions φ or a Bayesian neural network with one hidden layer with an infinite number of neurons [40,41]. With this definition for the covariance structure between function values, we can obtain the joint distribution of training targets t_o by marginalizing (averaging) over all possible values of y_o:
Fig. 2. GP predictions for the function t = sin(x) constructed with two observations. Inclusion of derivative observations improves predictions around training points and reduces uncertainty.
p\left(\mathbf{t}_{o}\right) = \int p\left(\mathbf{t}_{o}\,|\,\mathbf{y}_{o}\right) p\left(\mathbf{y}_{o}\right)\mathrm{d}\mathbf{y}_{o} = \mathcal{N}\left(\mathbf{t}_{o}\,|\,0, K\left(X_{o}, X_{o}\right) + \sigma_{n}^{2} I\right) \qquad (20)

which allows us to incorporate information from D by defining a joint distribution between training values and new predictions for the function value:

p\begin{pmatrix}\mathbf{t}_{o}\\ \mathbf{y}_{*}\end{pmatrix} = \mathcal{N}\left(\begin{pmatrix}\mathbf{t}_{o}\\ \mathbf{y}_{*}\end{pmatrix}\,\middle|\,0, \begin{pmatrix}K\left(X_{o}, X_{o}\right) + \sigma_{n}^{2} I & K\left(X_{o}, X_{*}\right)\\ K\left(X_{*}, X_{o}\right) & K\left(X_{*}, X_{*}\right)\end{pmatrix}\right) \qquad (21)

and subsequently find the conditional probability p(y_* | t_o) of the new function values given that values coming from D are known:

p\left(\mathbf{y}_{*}\,|\,\mathbf{t}_{o}\right) = \mathcal{N}\left(\mathbf{y}_{*}\,|\,\mathbf{m}_{*}, S_{*}\right) \qquad (22)

which is again a Gaussian distribution with mean and covariance given by [40,41]:

\mathbf{m}_{*} = K\left(X_{*}, X_{o}\right)\left(K\left(X_{o}, X_{o}\right) + \sigma_{n}^{2} I\right)^{-1}\mathbf{t}_{o} \qquad (23)

S_{*} = K\left(X_{*}, X_{*}\right) - K\left(X_{*}, X_{o}\right)\left(K\left(X_{o}, X_{o}\right) + \sigma_{n}^{2} I\right)^{-1} K\left(X_{o}, X_{*}\right) \qquad (24)

Defining k_* = K(X_o, x_*) ∈ R^{N×1} and K_o = K(X_o, X_o) ∈ R^{N×N}, we can arrive at shorter expressions for the expectation and variance of the function value at a single new point x_*:

\mathbb{E}\left[y_{*}\,|\,x_{*}\right] = \mathbf{k}_{*}^{T}\left(K_{o} + \sigma_{n}^{2} I\right)^{-1}\mathbf{t}_{o} \qquad (25)

\mathbb{V}\left[y_{*}\,|\,x_{*}\right] = k\left(x_{*}, x_{*}\right) - \mathbf{k}_{*}^{T}\left(K_{o} + \sigma_{n}^{2} I\right)^{-1}\mathbf{k}_{*} \qquad (26)

Note that here we opt for directly using the function value y_* instead of its noisy version t_* to define our surrogate models. This effectively makes the adaptive components of the acceleration framework, which are driven by increases in the variance given by Eq. (26), less sensitive to changes in σ_n².
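The prediction equations (25)-(26) amount to a kernel evaluation and a few linear solves. A minimal one-dimensional sketch (hyperparameter values are arbitrary placeholders):

```python
import numpy as np

def sqexp(xp, xq, sigma_f2=1.0, ell=1.0):
    """Squared exponential kernel of Eq. (19) for 1D inputs."""
    return sigma_f2 * np.exp(-0.5 * (xp[:, None] - xq[None, :])**2 / ell**2)

def gp_predict(Xo, to, Xs, sigma_f2=1.0, ell=1.0, sigma_n2=1e-8):
    """Posterior mean and variance of Eqs. (25)-(26) for a zero-mean GP."""
    Ko = sqexp(Xo, Xo, sigma_f2, ell) + sigma_n2 * np.eye(len(Xo))
    Ks = sqexp(Xo, Xs, sigma_f2, ell)   # columns are k_* for each new point
    mean = Ks.T @ np.linalg.solve(Ko, to)
    # k(x_*, x_*) = sigma_f2 for the squared exponential kernel
    var = sigma_f2 - np.sum(Ks * np.linalg.solve(Ko, Ks), axis=0)
    return mean, var
```

The sketch reproduces the behavior described next for Fig. 2a: the mean interpolates the observations with near-zero variance at the training points, and reverts to the zero-mean prior with variance σ_f² far away from them.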
In Fig. 2a we demonstrate a GP model by plotting predictions based on two observations of the noiseless target function t = sin(x). The conditioning of Eq. (22) guarantees that y_* coincides with the observations at the training points X_o. Away from the training space D the GP returns to its zero-mean prior with high variance. This behavior is desirable for our application, since this uncertainty can be used as a trigger for either making new observations at the existing full-order integration points or adding new points to the full-order set when necessary.

3.4. Predicting derivatives and including derivative observations
In building constitutive model surrogates, in addition to predicting new function values (Eq. (25)), their derivatives with respect to the input x_* are also needed. These can be computed by differentiating Eq. (25):

\frac{\partial}{\partial x_{*}}\mathbb{E}\left[y_{*}\,|\,x_{*}\right] = \sum_{i}^{N} a_{i}\,\frac{\partial}{\partial x_{*}} k\left(x_{*}, x_{i}\right) \qquad (27)

where a_i is the i-th component of the vector a = (K_o + σ_n² I)^{-1} t_o and the derivative of the kernel is given by:

k'\left(x_{p}, x_{q}\right) \equiv \frac{\partial}{\partial x_{p}} k\left(x_{p}, x_{q}\right) = -\frac{1}{\ell^{2}}\left(x_{p} - x_{q}\right) k\left(x_{p}, x_{q}\right) \qquad (28)

However, since we can also observe these derivatives (in the form of D from Eq. (5)) at the anchor models, it is also interesting to include this information in the GP in order to improve its predictions. This is achieved by recognizing that the derivative of a GP remains Gaussian and defining the covariance between function values and derivatives as [42]:

\mathrm{cov}\left(y'_{p}, y_{q}\right) = \frac{\partial}{\partial x_{p}} k\left(x_{p}, x_{q}\right), \qquad \mathrm{cov}\left(y'_{p}, y'_{q}\right) = \frac{\partial^{2}}{\partial x_{q}\partial x_{p}} k\left(x_{p}, x_{q}\right) + \sigma_{d}^{2} \qquad (29)

where σ_d² is a noise parameter that represents the uncertainty associated to observing the derivatives, the first-order derivative of the kernel is given in Eq. (28) and its second derivative is a matrix given by:

k''\left(x_{p}, x_{q}\right) \equiv \frac{\partial^{2}}{\partial x_{q}\partial x_{p}} k\left(x_{p}, x_{q}\right) = \frac{1}{\ell^{2}}\left(I - \frac{1}{\ell^{2}}\left(x_{p} - x_{q}\right)\left(x_{p} - x_{q}\right)^{T}\right) k\left(x_{p}, x_{q}\right) \qquad (30)

With these definitions, the joint prior distribution of training targets becomes [43]:

p\begin{pmatrix}\mathbf{t}_{o}\\ \mathbf{t}'_{o}\end{pmatrix} = \mathcal{N}\left(\begin{pmatrix}\mathbf{t}_{o}\\ \mathbf{t}'_{o}\end{pmatrix}\,\middle|\,0, \begin{pmatrix}K_{o} + \sigma_{n}^{2} I & K_{td}\\ K_{td}^{T} & K_{dd}\end{pmatrix}\right) \equiv \mathcal{N}\left(\begin{pmatrix}\mathbf{t}_{o}\\ \mathbf{t}'_{o}\end{pmatrix}\,\middle|\,0, \overline{K}_{o}\right) \qquad (31)

where K_o is the same kernel matrix appearing in Eq. (25), and K_td ∈ R^{N×ND} and K_dd ∈ R^{ND×ND} are composed of blocks of k'(x_p, x_q) and k''(x_p, x_q), respectively, with K̄_o being the resultant N(D+1) × N(D+1) covariance matrix (D is the dimensionality of x). Making point predictions is done in a similar way as in Eq. (25):

\overline{\mathbf{k}}_{*} = \begin{bmatrix}\mathbf{k}_{*} & k'\left(x_{1}, x_{*}\right) & \cdots & k'\left(x_{N}, x_{*}\right)\end{bmatrix}^{T} \qquad (32)

\mathbb{E}\left[y_{*}\,|\,x_{*}\right] = \overline{\mathbf{k}}_{*}^{T}\,\overline{K}_{o}^{-1}\,\overline{\mathbf{t}}_{o}, \qquad \mathbb{V}\left[y_{*}\,|\,x_{*}\right] = k\left(x_{*}, x_{*}\right) - \overline{\mathbf{k}}_{*}^{T}\,\overline{K}_{o}^{-1}\,\overline{\mathbf{k}}_{*} + \sigma_{n}^{2} \qquad (33)

where the target vector now includes the observed derivatives:

\overline{\mathbf{t}}_{o} = \begin{bmatrix}t_{o1} & \cdots & t_{oN} & \mathbf{t}'_{o1} & \cdots & \mathbf{t}'_{oN}\end{bmatrix}^{T} \qquad (34)

and predictions for the tangent are done as in Eq. (27) but now by differentiating k̄_* instead of k_*. Fig. 2 shows GP predictions for t = sin(x) with two training points, with and without including derivative observations. Now the conditioning not only constrains the predictions to agree with the target values in D but also with their derivatives. This makes effective use of the limited amount of information coming from a small number of online observations, as making a single gradient observation is equivalent to adding D extra function observations around the target value.

3.5. Hyperparameter optimization
The process variance σ_f² and length scale ℓ that compose the kernel and the target noise σ_n² are hyperparameters that should be learned from the dataset D. Since a full Bayesian treatment for these parameters (introducing a prior, deriving a posterior and marginalizing) usually demands the use of expensive numerical techniques such as Markov Chain Monte Carlo (MCMC), here we opt for a maximum likelihood solution in order to minimize the computational overhead associated with the online calibration of the GP models. Furthermore, due to the limited amount of data available for estimation, we refrain from optimizing for the derivative noise σ_d² and instead assume derivative observations are noiseless.

The aim here is to maximize the marginal likelihood, obtained by averaging the probability that the model reproduces the training targets over all possible values of y_o associated with t_o:

p\left(\overline{\mathbf{t}}_{o}\,|\,\sigma_{f}^{2}, \sigma_{n}^{2}, \ell\right) = \int p\left(\overline{\mathbf{t}}_{o}\,|\,\mathbf{y}_{o}, \sigma_{n}^{2}\right) p\left(\mathbf{y}_{o}\,|\,\sigma_{f}^{2}, \ell\right)\mathrm{d}\mathbf{y}_{o} \qquad (35)

which is a function that depends only on the hyperparameters. We optimize this marginal likelihood with a BFGS algorithm [44], which requires the computation of the gradient of p(t̄_o). Taking the natural logarithm of both sides of Eq. (35) and differentiating with respect to K̄_o yields an expression for the gradient:

\nabla\ln p\left(\overline{\mathbf{t}}_{o}\right) = -\frac{1}{2}\mathrm{tr}\left(\overline{K}_{o}^{-1}\nabla\overline{K}_{o}\right) + \frac{1}{2}\overline{\mathbf{t}}_{o}^{T}\,\overline{K}_{o}^{-1}\,\nabla\overline{K}_{o}\,\overline{K}_{o}^{-1}\,\overline{\mathbf{t}}_{o} \qquad (36)

and ∇K̄_o can be obtained in a straightforward manner by differentiating Eqs. (19), (28) and (30) with respect to each hyperparameter and reassembling the matrix as in Eq. (31).

4. Surrogate modeling framework
Inthissection,weusethetechniquesofSection3.2tobuildanadaptivesurrogatemodelingframeworkforFE2. Follow-ingFig.1,weintroduceasurrogatemodel
S
trainedonasetofconstitutiveobservationsD
obtainedfromasmallnumber of fully-solved anchor models (gathered in the setA
) subjected to the strain histories from their respective anchoring integrationpoints.Forbulkhomogenization,wedefineS
as:σ
ε
,
D
=
D eε
+ E
σ
ε
,
D
(37) whereDe isaninitialstiffnessobtainedfromthefirstcomputedintegrationpointand
σ
isastresscorrectiongivenbythe meanGPresponse,withthevariancebeingusedexclusivelyforadaptivitypurposes.Theconsistenttangentstiffness isthe combinationofDe andthederivativesoftheexpectedstresscorrections(Eq. (27)).ThedatasetD
ispopulatedwithstress andstiffnessvaluescomingfrompointswiththehighestvaluesfortheuncertaintyoveranycomponentofσ
.Weoptformodelingeachcomponentof
σ
(three components for the 2D examples treated here, with which we implicitly enforce symmetry of the stress tensor) independently, each with its own zero-mean GP model. This strategy implicitly states that the components of the stress tensor are a priori uncorrelated, which greatly simplifies the structure of the covariance matrix of the GP models and reduces the number of hyperparameters to be estimated. Furthermore, we assume for now that σ depends only on the current strain value ε, which means that the GP model loads, unloads and reloads along the same path. This is a limitation, to be addressed in future versions of the framework, that currently hinders its ability to treat general non-monotonic load paths.1 It is worth mentioning, however, that path dependency is already partially accounted for, since different strain histories will lead to the construction of different surrogates.

Note that the generality of the S model given by Eq. (37) is not compromised by the elastic-correction additive decomposition, since σ can take any shape, but for initially linear-elastic materials this split improves the robustness of the surrogate in two ways. Firstly, it helps the GP in reproducing the initial elastic behavior at a moment when the surrogate models have very little data to work with. Secondly, it aids in preventing the occurrence of spurious strain localization as the GP moves back to its zero prior: away from the training points, the model of Eq. (37) returns instead to a linear-elastic response.

At this point, it is relevant to distinguish the present approach from other interesting active learning strategies for constitutive modeling proposed in the literature [45]. In [31,32], the authors construct a constitutive database from which observations are adaptively selected and used to build surrogate GP constitutive models both for stresses and for a number of assumed macroscopic internal variables, creating a one-to-one mapping between inputs and outputs. Here we opt for a more general approach and do not assume the existence of macroscopic internal variables. This choice is motivated by the basic FE2 premise that microscopic behavior is too complex to conform to an a priori assumed macroscopic constitutive model, and it leads to a non-unique mapping involving exclusively stresses and strains. This effectively amounts to a reduction in dimensionality of the feature space of the surrogates that allows us to employ the same framework for completely distinct constitutive models (cf. Eqs. (37) and (43)). This dimensionality reduction implicitly creates the need for the existence of anchor models and for subjecting them to the strain paths seen by their respective anchoring points, allowing the microscopic internal variables (which are not observed by the surrogate) to influence the macroscopic stress-strain response and compensate for the loss of information incurred when avoiding assumptions on macroscopic internal variables.2
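To make the structure above concrete, the surrogate of Eq. (37) can be sketched as a set of independent zero-mean GPs trained on stress corrections σ − De ε. The following is a minimal NumPy sketch under stated assumptions (a squared-exponential kernel and hyperparameters shared across components; all names, e.g. SurrogateStress, are illustrative), not the paper's Jem/Jive implementation:

```python
import numpy as np

def rbf(X1, X2, sf2, ell):
    """Squared-exponential kernel between two sets of strain points."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return sf2 * np.exp(-0.5 * d2 / ell**2)

class SurrogateStress:
    """Independent zero-mean GPs on stress corrections sigma - De @ eps,
    so predictions revert to linear elasticity away from the data."""
    def __init__(self, De, sf2=1.0, ell=1e-2, sn2=1e-8):
        self.De, self.sf2, self.ell, self.sn2 = De, sf2, ell, sn2
        self.X = np.empty((0, De.shape[0]))   # sampled strains
        self.Y = np.empty((0, De.shape[0]))   # stress corrections

    def add_data(self, eps, sigma):
        self.X = np.vstack([self.X, eps])
        self.Y = np.vstack([self.Y, sigma - self.De @ eps])
        K = rbf(self.X, self.X, self.sf2, self.ell) + self.sn2 * np.eye(len(self.X))
        self.L = np.linalg.cholesky(K)        # factorized covariance, reused below

    def predict(self, eps):
        sigma_e = self.De @ eps               # elastic part of Eq. (37)
        if len(self.X) == 0:
            return sigma_e, np.full(len(eps), self.sf2)
        k = rbf(self.X, eps[None, :], self.sf2, self.ell)
        alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, self.Y))
        mean_corr = (k.T @ alpha).ravel()     # E[sigma] per component
        v = np.linalg.solve(self.L, k)
        var = max(self.sf2 - (v**2).sum(0)[0], 0.0)  # shared predictive variance
        return sigma_e + mean_corr, np.full(len(eps), var)
```

Away from the training points the correction reverts to the zero prior, so the prediction falls back to the linear-elastic response, which is exactly the robustness argument made above.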
In order to position the present active learning approach within a general FEM implementation, it is useful to recall the main steps involved in finding equilibrium solutions for nonlinear finite element models. These are shown in Fig. 3. A solution for u is obtained iteratively by minimizing a global force residual r computed from the material response at every integration point (materialUpdate). Upon convergence, and before moving to the next time step, the current solution is checked (checkSolution) and can be rejected if there is a need to adapt the model (e.g. nucleate/propagate cracks, refine the mesh, change constitutive models). Once the solution is converged and accepted, material history is updated (commit) and the model moves to the next time step.

In an FE2 model, Fig. 3 can represent the macroscopic solution loop and the micromodels can be seen as a material-like entity, which leads to another similar solution loop being embedded within the materialUpdate routine. Here we exploit this modularity and discuss the implementation of the learning approach as a surrogate to an arbitrary material model denoted as fullModel. We can therefore implement the approach as a material wrapper that encapsulates fullModel and handles learning and prediction tasks. Algorithms 1, 2, 4 and 5 show how the material routines marked in bold in Fig. 3 are implemented for this wrapper. In the following, we elaborate on each of these components and present the main implementational aspects of the framework in detail.

1 This can be addressed by augmenting the input space, the most straightforward way being to use both ε and ε̇ as inputs to the GP models [33,34].
2 The noise hyperparameter σ_f² learned during the analysis implicitly accounts for the dimensionality reduction error incurred by assuming that two anchor models operating at the same strain level but with different microscopic internal states have the same stress response.
Fig. 3. Schematic analysis flow for finite element problems involving nonlinear material behavior. The subscripts o and n refer to old (converged) and new (current) values, respectively. The proposed acceleration framework focuses on the steps marked in bold.
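The analysis flow of Fig. 3 can be summarized as the following control-flow sketch. The model interface (begin_step, newton_iteration, check_solution, commit, cancel) and the verdict strings are assumptions made for illustration; the actual implementation is the in-house C++ code described in Section 5:

```python
def run_analysis(model, n_steps, max_iter=20):
    """Hypothetical driver mirroring Fig. 3: iterate to equilibrium, then
    accept (commit), continue on the same step, or cancel and retry."""
    step = 0
    while step < n_steps:
        model.begin_step(step)
        for _ in range(max_iter):
            if model.newton_iteration():       # assembles r via materialUpdate
                break
        verdict = model.check_solution()       # 'accept' | 'continue' | 'cancel'
        if verdict == 'accept':
            model.commit()                     # update history, move on
            step += 1
        elif verdict == 'cancel':
            model.cancel()                     # recover state, retry same step
        # 'continue': stay on the same step with the updated surrogate
```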
4.1. Initial sampling
Since there is no offline training, the surrogate model starts with no prior information on the constitutive behavior being approximated (D = ∅) and no anchor models initially present (A = ∅). We therefore introduce an initialization step by solving the first time increment under the assumption that all integration points are deforming elastically, in order to obtain an initial strain distribution which we use to initialize A and D. During this step, we use the fully-solved model only once, in order to obtain the initial stiffness D_e, and we assume all other points have the same stiffness in order to avoid further full-order computations (Algorithm 1). As we will see shortly, this is not a limiting assumption, since this first elastic approximation is rejected and the first time step is revisited after the GP models and initial anchors are initialized.

Algorithm 1: The materialUpdate routine.
Input: strain ε at the integration point p
Output: stress σ and stiffness D at the integration point
1  if initialization step:
2    if D_e is not initialized (i.e. p is the first point ever computed):
3      use original constitutive model: (σ, D) ← fullModel::materialUpdate(ε);
4      initialize elastic stiffness: D_e ← D;
5    else
6      assume p has the same stiffness as the first computed point: σ ← D_e ε; D ← D_e;
7  else
8    compute elastic stresses: σ_e ← D_e ε;
9    use GP to predict a constitutive correction: (E[σ], E[D], V[σ]) ← GP(ε);
10   approximate the response: σ ← σ_e + E[σ], D ← D_e + E[D];
11   compute the uncertainty indicator γ as in Eq. (38);
12   if γ > γ_cancel:
13     cancel time step (Algorithm 5);
14   if softening is detected (D_ii < 0):
15     penalize point uncertainty: γ ← γ − D_ii;
16   if last step has been cancelled:
17     switch to a modified Newton-Raphson strategy: D ← D_e;
18 store values for this point: ε_p^n ← ε; γ_p^n ← γ;
19 return σ, D
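A sketch of the same branching in Python may help clarify Algorithm 1. The state dictionary, the function names and the GP interface are hypothetical, chosen only to mirror the listing above:

```python
import numpy as np

def material_update(state, eps, gp_predict, gamma_cancel):
    """state: dict holding De, flags and per-point values (layout assumed);
    gp_predict(eps) -> (E_sigma, E_D, V_sigma) for the correction terms."""
    if state.get('initializing', False):
        if 'De' not in state:
            sigma, D = state['full_model'](eps)   # first point ever: full model
            state['De'] = D.copy()
        else:
            D = state['De']
            sigma = D @ eps                       # other points: assume elastic
        return sigma, D
    De = state['De']
    E_sigma, E_D, V_sigma = gp_predict(eps)
    sigma = De @ eps + E_sigma                    # Eq. (37)
    D = De + E_D
    gamma = float(np.sqrt(np.max(V_sigma)))       # Eq. (38)
    if gamma > gamma_cancel:
        state['cancel_requested'] = True          # handled by the cancel routine
    if np.any(np.diag(D) < 0.0):                  # softening safeguard
        gamma -= float(np.min(np.diag(D)))        # penalize uncertainty
    if state.get('last_step_cancelled', False):
        D = De                                    # modified Newton-Raphson fallback
    state['gamma'] = gamma
    return sigma, D
```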
After this first approximation for the solution is obtained, the checkSolution routine of Algorithm 2 is called. Since the strains at every integration point are stored, an informed initial choice for A can be made by clustering the integration points using a k-means clustering algorithm [46]. For each cluster, the point closest to the cluster centroid is chosen, added to A and immediately sampled. The GP models (one for each stress component) are then prepared to make predictions by computing and storing factorized versions of their covariance matrices.

Initial values for the hyperparameters σ_f², σ_n² and ℓ can either be provided by the user based on prior knowledge (e.g. from using the same material in other models)3 or be estimated online with an initial training procedure. We do this here by creating a fictitious anchor model for each cluster and loading it monotonically in the strain direction seen by the

3 Hyperparameters reflect intrinsic patterns in the constitutive manifold being inferred and should therefore be fairly insensitive to the macroscopic model being solved.
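The cluster-based selection of initial anchors described above can be sketched as follows, using a plain NumPy k-means for self-containedness. The function and argument names (select_anchors, n_iter, seed) are assumptions, not part of the paper's code:

```python
import numpy as np

def select_anchors(strains, k, n_iter=50, seed=0):
    """Cluster integration-point strains with k-means and return, for each
    cluster, the index of the point closest to the cluster centroid."""
    rng = np.random.default_rng(seed)
    centroids = strains[rng.choice(len(strains), k, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest centroid
        labels = np.argmin(((strains[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for i in range(k):
            if np.any(labels == i):
                centroids[i] = strains[labels == i].mean(axis=0)
    # representative point per cluster: arg min_j ||eps_j - centroid_i||
    anchors = [int(np.argmin(((strains - c) ** 2).sum(-1))) for c in centroids]
    return sorted(set(anchors))
```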
Algorithm 2: The checkSolution routine.
1  if first time step:
2    initialize dataset and anchor point set: D ← ∅, A ← ∅;
3    divide the integration points into k clusters in strain space: (C, ε̄^n) ← KMC(ε^n, k);
4    for every cluster C_i ∈ C:
5      find representative point p = arg min_{j∈C_i} ||ε_j^n − ε̄_i^n||;
6      anchor a fully-solved model at point p: A ← A ∪ {p};
7      solve the newly-created full model once: (σ, D) ← fullModel::materialUpdate(ε_p^n);
8      compute corrections and add data: σ ← σ − D_e ε_p^n; D ← D − D_e; D ← D ∪ {(ε_p^n, σ, D)};
9    update GP and store marginal likelihood: updateGP(D); L ← ln p(y|x);
10   reject solution (revisit the first time step);
11 else
12   update D based on the tolerance γ_tol using Algorithm 3: D ← D ∪ addData(γ_tol);
13   if T has not changed:
14     accept solution as is (Algorithm 4);
15   else
16     refactor covariance matrix: updateGP(D);
17     if L / ln p_t > L_retrain:
18       recompute GP hyperparameters: retrainGP(D); L ← ln p_t;
19     reject solution (continue on the same time step);
point closest to the cluster centroid. In order to avoid redundancy (e.g. adding linear-elastic data coming from multiple models), we first initialize the GP with the standard hyperparameter values σ_f² = 1.0 MPa², ℓ = 1.0×10⁻², σ_n² = 0.0 MPa² and use a variance threshold as a dosing mechanism to ensure the added data is uniformly spaced. After optimizing for the hyperparameters, we discard the data obtained from the fictitious anchors, since there is no guarantee that any of the real integration points will follow the same strain history. Once D and A are initialized, the first time step is then revisited by rejecting the initial elastic approximation done in Algorithm 1. Note that this happens only once, at the beginning of the analysis.

4.2. Active learning
After the initialization step, the GP surrogates are used to compute σ and the response is approximated as in Eq. (37) (refer to Fig. 3 for the analysis flow of a single load increment). Since an independent GP model is used for each stress component, we adopt a single uncertainty indicator γ given by:

γ = max_{i=1,…,n_σ} √(V[σ_i])   (38)

with n_σ being the number of stress components, and use it to drive the adaptive components of the framework. The indicator γ is therefore an absolute measure of uncertainty (expressed in units of stress). We empirically find this choice of indicator to offer the best performance in the examples treated in this work. Nevertheless, other alternatives might perform better depending on the problem being solved, e.g. normalizing V[σ_i] with the process variance, which for the present examples we find to adversely affect the performance of the framework when σ_f² is periodically re-estimated. The literature on active learning and greedy GP approximations can also provide the interested reader with further alternatives for defining γ (refer for instance to Section 8.3.3 of [41] and references therein). Values of
γ for all integration points are updated at every global Newton-Raphson iteration (Algorithm 1), and their final values γ^n (upon global convergence) are used to drive a greedy refinement of the GP models (Algorithm 2) and ensure the surrogate model remains accurate. This is enforced by the tolerance parameter γ_tol through the addData routine of Algorithm 3. Here we define an additional set T of tracked anchors that contains the models for which data has been added during the present time step. The set T is used both to avoid repeatedly adding data from a single anchor model at any given time step and to make sure the single added data point is updated to reflect the latest equilibrium solution (since, from Fig. 3, multiple materialUpdate → checkSolution → continue cycles can occur before history is committed).

The uncertainty level γ is checked at every integration point and the anchor model associated with the point having the highest value is chosen for sampling. In order to keep the number of models in A to a minimum, addData gives priority to models already in A and only considers other potential anchoring points if every model in A with an uncertainty higher than γ_tol has already been sampled on the current time step. Since the macroscopic problem being solved imposes a degree of similarity between integration points, the idea is that adding data from points in A should also help reduce the uncertainty of nearby integration points. This can therefore be seen as a greedy data selection approach [41] that aims at keeping both A and D as small as possible. If all models in A have already been sampled and a new anchoring point must be chosen, we first recover the material history of the newly-created model by making it revisit the complete cumulative strain history ε_p^h of its anchoring point p. Also note that we avoid adding data from models in A undergoing unloading or reloading. This is a consequence of assuming a unique relationship between strains and stresses: data coming from anchor models unloading through a different path would be erroneously interpreted by the GP models as noise.
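As a small illustration, the indicator of Eq. (38) that drives this refinement reduces to a one-line computation on the per-component predictive variances (the function name is assumed):

```python
import numpy as np

def uncertainty_indicator(variances):
    """Eq. (38): largest predictive standard deviation over the n_sigma stress
    components, an absolute uncertainty measure in units of stress."""
    return float(np.sqrt(np.max(variances)))
```

A point is flagged for sampling when this value exceeds γ_tol, and the whole step is canceled when it exceeds γ_cancel.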
Algorithm 3: The addData routine.
Input: a tolerance γ_tol
Output: a single data point to be included in D
1  update data sampled from T to reflect the new equilibrium strains ε_p^n, p ∈ T;
2  update GP models and uncertainty indicators: updateGP(D); γ^n ← γ(ε^n);
3  try to find an untracked anchor point to sample: p = arg max_{p∈A, p∉T} γ_p^n | γ_p^n > γ_tol;
4  if p cannot be found:
5    try to find a new anchoring point: p = arg max_{p∉A, p∉T} γ_p^n | γ_p^n > γ_tol;
6  update list of tracked points: T ← T ∪ {p};
7  update list of anchor points: A ← A ∪ {p} if p ∉ A;
8  call fullModel::materialUpdate(ε_p^h) to gradually bring the anchor at p to the current time step;
9  compute model response at ε_p^n: (σ, D) ← fullModel::materialUpdate(ε_p^n);
10 compute corrections: σ_p^n ← σ − D_e ε_p^n; D_p^n ← D − D_e;
11 return (ε_p^t, σ_p^t, D_p^t) with t being the latest step on which p is not unloading/reloading;
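Lines 3 to 5 of Algorithm 3 amount to a greedy argmax with priority for existing anchors. A sketch (the helper name is hypothetical; index sets are represented as Python sets):

```python
def pick_point_to_sample(gamma, anchors, tracked, gamma_tol):
    """gamma: uncertainty per integration point; anchors/tracked: index sets.
    Prefer untracked anchors above tolerance; only then consider new points."""
    def best(candidates):
        cands = [p for p in candidates if gamma[p] > gamma_tol and p not in tracked]
        return max(cands, key=lambda p: gamma[p]) if cands else None

    p = best(anchors)                                    # existing anchors first
    if p is None:
        p = best(set(range(len(gamma))) - set(anchors))  # then new anchoring points
    return p
```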
Upon making changes to D, the covariance matrices of all GP models are updated and refactored. The addition of a new observation also leads to a change in marginal likelihood (Eq. (35)). We therefore allow for a re-estimation of the hyperparameters to take place once the likelihood reaches a value L_retrain times lower than the one computed after the latest estimation. If T has not changed during the latest call to addData, the current solution is accepted as it is. Otherwise, we continue with the same time step by rejecting the current solution, which will cause the global Newton-Raphson solver to keep searching for an equilibrium solution (refer to Fig. 3), but now with updated GP models.

Two additional adaptivity safeguards are put in place in the materialUpdate routine of Algorithm 1. Firstly, the time step can be canceled if the uncertainty is higher than a threshold γ_cancel, in order to avoid considering equilibrium solutions that are excessively far from the constitutive behavior being approximated. As we will discuss in Section 4.3, canceling the current load step does not mean giving up on the analysis, with new solution attempts being made after including extra anchor models in A and retraining the GP models. Secondly, the diagonal of the tangent stiffness matrix is checked for possible negative values, providing an indication of softening. This would indicate that a switch from stresses to tractions is in order, but the decision for such a constitutive model switch should always be based on accurate information obtained from models in A. Therefore, we flag the point for sampling by penalizing its uncertainty.4

Apart from the very first update, computed in order to obtain D_e, the expensive full-order model is never computed during materialUpdate (Algorithm 1). An alternative to this approach would be to actually call the models in A every time stresses at the anchoring points are computed. We do not opt for this alternative for two reasons. Firstly, using a single constitutive model (the surrogate S) for the whole mesh avoids potential non-uniqueness issues that would arise, for instance, if points in A are switching between different constitutive regimes (loading/unloading/softening). Secondly, refraining from constantly updating the models in A results in significant gains in terms of acceleration, by making the reduced model insensitive to the number of global Newton-Raphson iterations needed for convergence and allowing full-order models in A to become dormant and essentially be removed from the analysis for as long as the uncertainty at the associated anchoring point does not increase. However, it is worth mentioning in passing that the more expensive alternative has the merit of allowing for an extra novelty detection safeguard to be employed, through which the deviation between predictions for σ
coming from the full-order and surrogate models can be kept in check.

4.3. Solution robustness

Algorithms 2 and 3 ensure that only a single data point is added to D at a time. This is done in order to avoid large perturbations to the current equilibrium solution caused by changes in the constitutive response.5 Once a new observation is added, the current (now unconverged) solution is kept, but the algorithm continues on the same time step until equilibrium is reestablished, at which point checkSolution is once again called (refer to Fig. 3 for the order in which these routines are called within a single analysis step). This process is repeated until γ^n is lower than γ_tol everywhere. The resulting cycle of carefully adding data without significantly drifting away from equilibrium helps keep the global solution scheme robust.

4 In the examples of this paper we do not treat models for which such a switch from bulk to cohesive behavior would be necessary. The safeguard is therefore only triggered on rare occasions when the GP is trying to approximate a perfectly-plastic response.
5 Adding data to the GP affects the response of all points within a hypersphere in strain space with a radius that depends on the length scale ℓ. The resulting change in global response may cause the solver to diverge.
Algorithm 4: The commit routine.
1 clear list of tracked models: T ← ∅;
2 store converged values: γ^o ← γ^n;
3 update persistent strain history: ε^h ← ε^h ∪ ε^n;
4 go to next time step;
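The commit bookkeeping of Algorithm 4, together with the history replay used when a new anchor is created (Algorithm 3, line 8), can be sketched as follows. All names and the layout of the state dictionary are illustrative assumptions:

```python
def commit(state):
    """Algorithm 4 sketch: clear tracked set, store converged uncertainties,
    and append the converged strain of each point to its persistent history."""
    state['tracked'] = set()                    # T <- empty
    state['gamma_old'] = state['gamma_new']     # store converged values
    for p, eps in state['strain_new'].items():  # eps_h <- eps_h U eps_n
        state['history'].setdefault(p, []).append(eps)

def bring_anchor_to_current_step(full_model, history_p):
    """Replay the cumulative strain history of an anchoring point so a freshly
    created full-order model recovers the microscopic internal state."""
    for eps in history_p:
        full_model.material_update(eps)
        full_model.commit()
```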
When the solution converges and γ^n < γ_tol for all integration points, the commit routine is called (Algorithm 4). Before moving to the next step, converged values for γ are updated. If the global solver fails to converge, or if the additional uncertainty threshold γ_cancel of Algorithm 1 is violated, the solution for the current time step is canceled (Algorithm 5). Before making a new solution attempt, converged values are recovered and Algorithm 3 is used to sample the model with the highest uncertainty, even if its value for γ is lower than γ_tol. Furthermore, we switch to a secant solution strategy by fixing the stiffness matrix to D_e (Algorithm 1). We empirically notice that, for certain combinations of hyperparameters, the surrogate model causes the global Newton-Raphson solver to lose robustness due to rapid changes in stiffness. Switching to a secant strategy after a canceled step assures a solution is found under this scenario, albeit with a lower convergence rate.

Algorithm 5: The cancel routine.
1 recover converged values: γ^n ← γ^o;
2 add data from the point with the highest variance (Algorithm 3): D ← D ∪ addData(0);
3 refactor covariance matrix: updateGP(D);
4 if L / ln p_t > L_retrain:
5   recompute GP hyperparameters: retrainGP(D); L ← ln p_t;
6 go back to the beginning of the time step;
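The retraining condition appearing in both Algorithm 2 and Algorithm 5 can be sketched as a simple ratio test on log marginal likelihoods (the function name is assumed; both values are negative in practice, so a degrading likelihood drives the ratio above one):

```python
def should_retrain(log_lik, log_lik_at_training, L_retrain):
    """True once the current log marginal likelihood has degraded by a factor
    L_retrain relative to the value stored after the latest estimation."""
    return log_lik / log_lik_at_training > L_retrain
```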
5. Numerical examples

In this section, we put the proposed framework to the test on a number of numerical examples. The algorithms of Section 4 have been implemented in an in-house finite element code using the open-source Jem/Jive C++ numerical analysis library [47]. We begin by demonstrating the applicability of the framework for FE2 analysis and move on to an in-depth investigation of its performance with fast single-scale homogeneous examples that allow for extensive parametric studies to be performed. It is emphasized that, although it is our vision to use the surrogate model in a multiscale context, it does not matter for the purpose of testing the active learning framework whether the material update of the full model involves solving a micromechanical BVP or just evaluating a nonlinear constitutive relation.
5.1. FE2 demonstration

The first example concerns a fiber-reinforced composite tapered specimen loaded in transverse tension. The geometry, mesh and boundary conditions are shown in Fig. 4. A 4-fiber RVE model is embedded at each macroscopic integration point. The geometry and mesh of the micromodel are also shown in Fig. 4. The fibers are modeled as linear elastic, with properties E = 74000 MPa and ν = 0.2. For the matrix we employ the pressure-dependent elastoplastic model proposed by Melro et al. [48], with E = 3130 MPa, ν = 0.37, ν_p = 0.32 (plastic Poisson's ratio) and yield stresses given by:

σ_t = 64.80 − 33.6 e^(−ε_eq^p / 0.003407) − 10.21 e^(−ε_eq^p / 0.06493)   (39)
σ_c = 81.00 − 42.0 e^(−ε_eq^p / 0.003407) − 12.77 e^(−ε_eq^p / 0.06493)   (40)

where ε_eq^p is the equivalent plastic strain. The model is solved for 100 load steps, at which point the global macroscopic response is almost perfectly plastic and the strain localizes around the center of the specimen.

We run the reduced model with k
= 1, γ_tol = 0.3 MPa, γ_cancel = 20 MPa and with hyperparameters estimated using a single micromodel loaded in the direction of the single k-means clustering centroid (in this case the average strain in the specimen). The analysis starts with A = ∅ and D = ∅ and ends with a total of 14 anchor models in A (out of a total of 134 points) and |D| = 73 data points.

We plot the resultant load-displacement response for both full- and reduced-order models in Fig. 5. The active learning approach is able to capture the correct global model response with negligible loss of accuracy while running 22 times faster than the full-order model. Of the 35 s execution time of the reduced model, a total of 15 s is spent on updating the GP models and using them for new predictions. For larger micromodels with denser meshes, the execution time of the reduced model will be dominated by computing the few micromodels included in